
ADR 0021 — PURE — Production URL Replay Engine + Discovery Hub + PVI/PVP + PIE-PA

  • Status: Accepted (formalized 2026-05-12 with v3.7.0 — TS scaffolds Wave 4-5; Go enforcement layer shipping 2026-05-13 Tier-3 batch F at pkg/pie-pa-executor/ — patent claims #11/#12/#13 reduced-to-practice)
  • Date: 2026-05-10
  • Deciders: TLSStress.Art project
  • Targets: v5.x (Phase 1 Materialization scaffolds: PURE-1..10 already merged Wave 4-5)
  • Patent claim family: claims #6..#13 (PURE + Discovery Hub + PVI/PVP + PIE-PA)

Context

The bench's Test Kinds 1-6 cover synthetic + lab scenarios. Customers asked: "Can you replay MY production URLs through MY DUT and validate that nothing breaks?"

Two pieces were missing:

  1. Discovery: sourcing the URL set from the customer's real environment (Syslog / PCAP / HAR / API / Curated lists)
  2. Production-safety: replaying real URLs against a production DUT carries MITM risk because bench personas host real public IPs (200.130.x.x range per project_ip_addressing_v43)

Decision

Introduce PURE (Production URL Replay Engine) as Test Kind #7, plus the Discovery Hub, PVI/PVP validators, and the PIE-PA 3-layer defense.

Discovery Hub — 8 sources

#  Source            Origin
A  Syslog            NGFW syslog ingestion (Cisco FTD URL field parser)
B  Vendor API        PAN Cortex + Cisco SCC direct REST query
C  PCAP              gopacket-based URL extraction
D  HAR               browser-recorded session replay
E  Curated           pre-curated Tranco/Umbrella/Majestic monthly snapshot + cuts
F  SPAN              live mirror via MÓDULO SPAN.Art (richest source)
G  Cloud-derived     from operator OBP cloud-egress observations
H  KALI nmap import  operator-supplied scan results

PVI — Pre-flight Validation, Ingestion-time

Every URL ingested into the Discovery Hub passes through PVI before it reaches the test plan generator. CLONER fn #9 runs PVI via ephemeral K6/Playwright probe pods, in a 3-stage cascade:

Stage 1 — HEAD probe (cheap, ~50ms)
  is the URL alive at all?
  internet-direct via CLONER egress (NOT bench data plane)
  drop URL if 4xx/5xx

Stage 2 — TLS handshake (~200ms)
  cert valid? SNI match? TLS version negotiated?
  drop URL if cert chain rejected

Stage 3 — Full HTTP fetch (~500ms-3s)
  navigate via PW; capture protocol, response size, CSP
  drop URL if heuristic fails (e.g. requires login wall)
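
The cascade above can be sketched as a short-circuiting loop: each URL is tried against the stages in order, and the first failing stage drops it, so the cheap HEAD probe shields the slower TLS and full-fetch stages. This is a minimal sketch, not the CLONER fn #9 implementation. The names `run_pvi_cascade`, `PviResult`, and the three stand-in probes are hypothetical; real probes would make network requests from the probe pods.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional, Tuple

# A probe takes a URL and returns True when the URL passes that stage.
Probe = Callable[[str], bool]


@dataclass(frozen=True)
class PviResult:
    url: str
    passed: bool
    failed_stage: Optional[str]  # None when every stage passed


def run_pvi_cascade(urls: Iterable[str],
                    stages: List[Tuple[str, Probe]]) -> List[PviResult]:
    """Run each URL through the stages in order, stopping at the first failure."""
    results = []
    for url in urls:
        failed = None
        for name, probe in stages:
            if not probe(url):
                failed = name
                break  # later, more expensive stages are never run for this URL
        results.append(PviResult(url, failed is None, failed))
    return results


# Stand-in probes for illustration only (assumptions, not real checks).
def head_ok(url: str) -> bool:
    return not url.endswith("/dead")


def tls_ok(url: str) -> bool:
    return url.startswith("https://")


def fetch_ok(url: str) -> bool:
    return "/login" not in url


STAGES = [("head", head_ok), ("tls", tls_ok), ("fetch", fetch_ok)]
```

Recording the failed stage alongside each URL keeps the drop reason available for later review, which is a design choice of this sketch rather than something the ADR prescribes.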

K-anonymity ≥ 10 is enforced across the PVI batch: URLs with fewer than 10 distinct customer Syslog mentions are dropped (privacy protection).
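
A minimal sketch of the k-anonymity filter follows. It assumes that a "distinct mention" means a mention from a distinct origin, so the input is a hypothetical stream of `(url, origin_id)` pairs; how an origin is identified (host, user, site) is not specified in this ADR and is an assumption here. The function name `k_anonymous_urls` is also hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

K_MIN = 10  # threshold stated in this ADR


def k_anonymous_urls(mentions: Iterable[Tuple[str, str]],
                     k: int = K_MIN) -> List[str]:
    """Keep only URLs mentioned by at least k distinct origins."""
    origins: Dict[str, Set[str]] = defaultdict(set)
    for url, origin_id in mentions:
        origins[url].add(origin_id)  # repeated mentions from one origin count once
    return sorted(url for url, seen in origins.items() if len(seen) >= k)
```

Counting distinct origins rather than raw log lines means a single chatty host cannot push a URL over the threshold on its own.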

PVP — Pre-flight Validation, Pre-test

Right before each PURE run, PVP checks the DUT-delta scope: which URLs the DUT can plausibly inspect, as opposed to URLs already cached by the customer's CDN, which the DUT will never see.
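
As a rough illustration, the scope check can be viewed as a partition of the URL list. The sketch below assumes a hypothetical predicate `bypasses_dut` that reports whether a URL is served without traversing the DUT (for example, from a CDN cache); how that is determined is not described in this ADR.

```python
from typing import Callable, Iterable, List, Tuple


def dut_delta_scope(urls: Iterable[str],
                    bypasses_dut: Callable[[str], bool]) -> Tuple[List[str], List[str]]:
    """Split URLs into (in_scope, skipped) according to whether the DUT would see them."""
    in_scope: List[str] = []
    skipped: List[str] = []
    for url in urls:
        (skipped if bypasses_dut(url) else in_scope).append(url)
    return in_scope, skipped
```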

PIE-PA — 3-layer defense for PURE in production

MANDATORY in production DOM mode (per ADR 0014). PURE replay against production DUTs would otherwise risk MITM because personas serve real public IPs:

Layer 1 — Pod scale-to-0
  Bench persona pods (172.19.0.0/16 + 10.1.0.0/16 + 10.2.0.0/16)
  scaled to 0 replicas. No bench artifact answering on real public IPs.

Layer 2 — BGP withdraw
  Bench BGP advertisements (synthetic prefixes from MÓDULO BGP-1..4)
  withdrawn from upstream peers. No path to bench from real Internet.

Layer 3 — DNS sanity check
  External resolver (8.8.8.8) queried for each PURE URL.
  Result MUST resolve to real-world IP, NOT bench (200.130.x.x).
  If ANY URL resolves to bench → abort production PURE run.

DDPB chain layer 5 (RELAY pre-flight) gates production PURE runs on all 3 layers being green.
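
The Layer 3 check and the all-green gate can be sketched as below. This is an illustration under stated assumptions, not the code in pkg/pie-pa-executor/: the bench range is read from the "200.130.x.x" notation as 200.130.0.0/16, the resolver is injected as a plain function (a real one would query the external resolver), and a host that does not resolve at all is treated as a failure because the rule says the result MUST resolve to a real-world IP. The names `dns_sanity_failures` and `pie_pa_gate` are hypothetical.

```python
import ipaddress
from typing import Callable, Iterable, List

# Assumption: "200.130.x.x" is interpreted as a /16.
BENCH_NET = ipaddress.ip_network("200.130.0.0/16")

# Maps a hostname to the list of IP address strings it resolves to.
Resolver = Callable[[str], List[str]]


def dns_sanity_failures(hosts: Iterable[str], resolve: Resolver) -> List[str]:
    """Return hosts that fail Layer 3: unresolvable, or resolving into the bench range."""
    bad = []
    for host in hosts:
        addrs = resolve(host)
        if not addrs or any(ipaddress.ip_address(a) in BENCH_NET for a in addrs):
            bad.append(host)
    return bad


def pie_pa_gate(pods_scaled_to_zero: bool,
                bgp_withdrawn: bool,
                dns_failures: List[str]) -> bool:
    """Green only when all three layers are green; any single failure blocks the run."""
    return bool(pods_scaled_to_zero and bgp_withdrawn and not dns_failures)
```

Returning the list of failing hosts, rather than a bare boolean, lets the abort path record which URLs tripped the check.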

CLONER fn count: 9 (was 7)

PURE introduces 2 new CLONER functions:

  • fn #8: Cloud proxy (used by OBP for semi-air-gap operator Internet egress)
  • fn #9: PVI orchestrator (spins up K6/PW probe pods, drives the 3-stage validation cascade)

Updated CLONER catalog (per discuss_cloner_platform_2026_05_08):

  1. Web clone
  2. NTP server
  3. Feedback channel
  4. Catalog refresh
  5. API discovery
  6. Patch + TopURL fetch
  7. Upgrade channel poll
  8. Cloud proxy (NEW)
  9. PVI orchestrator (NEW)

Architecture

Discovery → PVI → Test Plan flow

operator → Discovery Hub UI
        → picks source(s) A/B/C/D/E/F/G/H
        → batch URL list emitted
        ↓
  CLONER fn #9 — PVI orchestrator
  ├ Stage 1 HEAD probe
  ├ Stage 2 TLS handshake
  └ Stage 3 full HTTP fetch
        ↓
  k-anonymity ≥ 10 filter
        ↓
  Test Plan generator (test_kind=pure)
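
The flow above can be read as a simple composition. The sketch below assumes that batches from several sources are merged and de-duplicated before PVI (the ADR does not say how multi-source batches are combined) and that the plan is a plain dictionary; only the `test_kind=pure` tag comes from the diagram. `build_pure_plan` and its parameters are hypothetical names.

```python
from typing import Callable, Dict, Iterable, List


def build_pure_plan(batches: Dict[str, Iterable[str]],
                    pvi: Callable[[List[str]], List[str]],
                    k_filter: Callable[[List[str]], List[str]]) -> dict:
    """Merge per-source batches, then apply PVI and the k-anonymity filter in that order."""
    merged: List[str] = []
    seen = set()
    for source in sorted(batches):  # deterministic order by source letter
        for url in batches[source]:
            if url not in seen:  # assumption: a URL seen in two sources is probed once
                seen.add(url)
                merged.append(url)
    survivors = k_filter(pvi(merged))
    return {"test_kind": "pure", "urls": survivors}
```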

PURE run → PIE-PA gate (production mode)

operator → "run PURE" → DOM-aware check
                     → if production:
                         ├ PIE-PA Layer 1 (pod scale-to-0)
                         ├ PIE-PA Layer 2 (BGP withdraw)
                         └ PIE-PA Layer 3 (DNS sanity)
                           all green? → run
                           any red?   → abort + audit
                     → PVP DUT-delta scope check
                     → execute (replay URLs via PW/K6 agents → DUT)
                     → restore (pods up, BGP re-advertise)
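
A minimal sketch of this run sequence is shown below, with every step injected as a callable so the control flow is visible on its own. One assumption goes beyond the diagram: the restore step is placed in a `finally` block so that pods and BGP advertisements come back even when the gate is red or the replay raises. The diagram itself only shows restore after a successful execute. All names here are hypothetical.

```python
from typing import Callable, List


def run_pure(production: bool,
             gate: Callable[[], bool],
             scope: Callable[[], List[str]],
             execute: Callable[[List[str]], None],
             restore: Callable[[], None],
             audit: Callable[[str], None]) -> str:
    """Run PURE; in production mode the PIE-PA gate must be green first."""
    if not production:
        execute(scope())  # no PIE-PA gate outside production mode
        return "completed"
    try:
        if not gate():  # applies and verifies PIE-PA layers 1-3
            audit("PIE-PA gate red; run aborted")
            return "aborted"
        execute(scope())  # PVP scope check, then replay through the DUT
        return "completed"
    finally:
        restore()  # assumption: always undo scale-to-0 and BGP withdraw
```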

Consequences

Pros

  • Test Kind #7 answers the customer request for real production URL replay
  • 8-source Discovery Hub = differentiator vs single-source competitors
  • PIE-PA = production-safe (patent moat)
  • Cross-vendor URL parsing (Forti / PAN / CKP / Sophos) = breadth
  • 8 patent claims in the PURE family (#6..#13)

Cons / risks

  • PIE-PA layer 1 (pod scale-to-0) means bench is unavailable for other tests during a production PURE run (acceptable trade-off)
  • Curated list refresh requires Internet (CLONER fn #4) — air-gap benches fall back to bundled Tranco snapshot
  • K-anonymity ≥ 10 may eliminate small-customer-specific URLs (by design — privacy over completeness)

Compatibility

  • Air-gap deployments: source E (bundled curated snapshot) is the only source that works without operator input; sources A/B/F/G remain available with operator-supplied data
  • Pre-Wave-4 deployments: Discovery Hub unavailable; manual URL list upload supported as fallback

References

  • Memory: discuss_pure_real_url_replay_2026_05_10.md
  • Memory: discuss_oobi_immutable_gateway_art_2026_05_10.md (3 trust zones interplay with PVI internet egress)
  • Code: dashboard/src/lib/pure/ (PURE-1..10 scaffolds, Wave 4-5)
  • ADR cross-ref: 0014 (DOM modes — PIE-PA gates), 0019 (OOBI trust zones — CLONER egress trust), 0018 (Compliance — k-anonymity)
  • Patent claims: #6..#13