HAR Replay (L7 application) — primer

Help Center primer for the HAR session replay engine. Pairs with the har-engine/ module and ADR 0021 (PURE — HAR is one PURE ingestion source).

What it tests

Browser-engine agents are the gold standard for realism but do not scale arbitrarily: each agent costs hundreds of MB of RAM plus a real Chromium process. A HAR-replay engine carries only lightweight per-session state, making it orders of magnitude cheaper:

  • Browser-engine agents: ~50 sessions / host realistic ceiling
  • HAR replay: ~10 000 sessions / host realistic ceiling

The trade is realism: HAR replay reuses captured network sequences, so it cannot fire new dynamic flows. For regression and capacity testing against an inspection profile, this is the right tool.

What HAR is

HAR (HTTP Archive) is a JSON format every browser DevTools and proxy can emit. Each HAR captures:

  • All HTTP/2 + HTTP/3 requests in a session
  • Headers, status codes, timings (DNS, connect, TLS, wait, receive)
  • Response sizes (bodies stripped for privacy)

The engine reads a HAR, materialises N parallel session replays against a target URL set, and lets the DUT inspect them.
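The ingestion step can be sketched as follows, assuming a standard HAR 1.2 layout (`log.entries[].request` / `response` / `timings`). The function and dict shape below are illustrative, not the engine's actual API:

```python
import json

def load_har_entries(har_text: str):
    """Parse a HAR document and yield per-entry replay metadata.

    Bodies are assumed stripped (see the privacy section), so only
    method, URL, status, and timings are read.
    """
    log = json.loads(har_text)["log"]
    for entry in log["entries"]:
        req, resp = entry["request"], entry["response"]
        yield {
            "method": req["method"],
            "url": req["url"],
            "status": resp["status"],
            "timings": entry.get("timings", {}),
        }

# A tiny inline HAR with bodies already stripped:
SAMPLE = json.dumps({
    "log": {"version": "1.2", "entries": [{
        "request": {"method": "GET", "url": "https://example.test/a"},
        "response": {"status": 200},
        "timings": {"dns": 3, "connect": 9, "ssl": 12, "wait": 40, "receive": 5},
    }]}
})

sessions = list(load_har_entries(SAMPLE))
```

The replay loop then schedules N copies of this entry sequence concurrently against the target URL set.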

Three-axis configuration

| Axis | Options |
|---|---|
| har_source | path under cloned-personas/*.har / inline JSON / Discovery Hub feed |
| target_concurrency | 100 / 1000 / 5000 (default) / 10000 |
| loop_mode | once / continuous (default) — re-loop HAR for sustained load |
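The three axes can be modelled as a small config object. The class name and error messages below are illustrative; only the axis names, allowed values, and defaults come from the table above:

```python
from dataclasses import dataclass

ALLOWED_CONCURRENCY = (100, 1000, 5000, 10000)

@dataclass
class HarRunConfig:
    har_source: str                 # path, inline JSON, or Discovery Hub feed
    target_concurrency: int = 5000  # default per the axis table
    loop_mode: str = "continuous"   # "once" or "continuous"

    def validate(self) -> "HarRunConfig":
        if self.target_concurrency not in ALLOWED_CONCURRENCY:
            raise ValueError(f"target_concurrency must be one of {ALLOWED_CONCURRENCY}")
        if self.loop_mode not in ("once", "continuous"):
            raise ValueError("loop_mode must be 'once' or 'continuous'")
        return self

cfg = HarRunConfig(har_source="cloned-personas/checkout.har").validate()
```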

The dashboard pre-validates the (HAR file, target_concurrency) pair and warns if memory headroom is insufficient.
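The headroom check amounts to a projected-footprint comparison. This is a sketch of the idea only: the per-session byte estimate and safety factor are invented placeholders, not shipped numbers:

```python
def memory_headroom_ok(target_concurrency: int,
                       per_session_bytes: int,
                       free_bytes: int,
                       safety_factor: float = 1.5) -> bool:
    """Warn-level check: the projected replay footprint must fit in free
    memory with a safety margin. All inputs are caller-supplied estimates."""
    projected = target_concurrency * per_session_bytes * safety_factor
    return projected <= free_bytes

# e.g. 10 000 sessions at an assumed 64 KiB of replay state each,
# against 2 GiB of free memory:
ok = memory_headroom_ok(10_000, 64 * 1024, free_bytes=2 * 1024**3)
```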

Privacy — captured-at-customer-site assumption

A HAR captured on the customer's prod network may contain real URLs, cookies, and tokens. CLONER Function 9 + the pkg/pcap-extractor/ extractor strip src IP fields before the HAR is admitted to the test bench (mirrors the RELAY ingress contract). Cookies + auth headers are scrubbed during HAR ingestion (PURE-5a follow-up implementing rule-based scrub).
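The ingestion-time scrub can be sketched as a rule-based pass over each HAR entry. The rule list below is a minimal illustration; the actual PURE-5a rules live in the ingestion pipeline, not here:

```python
SCRUB_HEADERS = {"cookie", "set-cookie", "authorization", "proxy-authorization"}

def scrub_entry(entry: dict) -> dict:
    """Drop cookies, auth headers, and source-IP metadata from one HAR entry."""
    for side in ("request", "response"):
        part = entry.get(side, {})
        part["headers"] = [
            h for h in part.get("headers", [])
            if h.get("name", "").lower() not in SCRUB_HEADERS
        ]
        part.pop("cookies", None)
    entry.pop("serverIPAddress", None)   # IP fields, per the ingress contract
    entry.pop("_clientIPAddress", None)  # vendor extension, if present
    return entry

entry = scrub_entry({
    "request": {"headers": [{"name": "Cookie", "value": "sid=1"},
                            {"name": "Accept", "value": "*/*"}]},
    "response": {"headers": [{"name": "Set-Cookie", "value": "sid=2"}]},
    "serverIPAddress": "10.0.0.5",
})
```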

Layered vs standalone

  • Standalone: test_kind = pure with HAR as the URL source.
  • Layered: enable har_layered modifier on any other test to layer realistic L7 sessions on top of L4 / L7 baselines.
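In test-spec terms the two modes differ only in where the HAR knob sits. The dict shapes below are illustrative; only test_kind = pure and the har_layered modifier come from this primer, and the base-test name is hypothetical:

```python
standalone = {
    "test_kind": "pure",              # HAR is the URL source
    "har_source": "cloned-personas/checkout.har",
}

layered = {
    "test_kind": "l4_baseline",       # hypothetical base test name
    "modifiers": ["har_layered"],     # layer realistic L7 sessions on top
    "har_source": "cloned-personas/checkout.har",
}
```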

Reading the report

Each HAR run adds an "Annex L (HAR)" block:

  • Source → HAR file name + capture date
  • Run config → 3 axes
  • Throughput envelope → sustained sessions/sec + bytes/sec
  • WAF activity → DUT WAF rule firings, classified by vendor (Cisco FTD / Palo Alto / Fortinet / Check Point / generic catch-all on suspicious-URI heuristic). Exported as Prometheus metric tlsstress_engine_har_waf_rule_fired_total{vendor, rule_id}. See har-engine/internal/waf/. Syslog-source detection is a Wave-B follow-up.
  • Application regression → response-code distribution delta vs HAR reference
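The WAF counter above can be rendered in Prometheus text exposition format without any client library. The metric name and labels match the doc; the tiny hand-rolled renderer and the sample vendor/rule values are a sketch:

```python
from collections import Counter

METRIC = "tlsstress_engine_har_waf_rule_fired_total"

firings = Counter()

def record_waf_firing(vendor: str, rule_id: str) -> None:
    """Count one DUT WAF rule firing, keyed by (vendor, rule_id)."""
    firings[(vendor, rule_id)] += 1

def render_exposition() -> str:
    """Emit the counter in Prometheus text exposition format."""
    lines = [f"# TYPE {METRIC} counter"]
    for (vendor, rule_id), n in sorted(firings.items()):
        lines.append(f'{METRIC}{{vendor="{vendor}",rule_id="{rule_id}"}} {n}')
    return "\n".join(lines)

record_waf_firing("palo_alto", "34001")
record_waf_firing("palo_alto", "34001")
record_waf_firing("generic", "suspicious_uri")
text = render_exposition()
```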

Common patterns

| Symptom | Likely cause |
|---|---|
| Session count drops mid-run | DUT inspection profile changed at a session boundary |
| WAF rule firing > baseline | DUT WAF tuning drifted — primary regression signal |
| Response-code skew → 5xx growth | DUT backend health degrading under load — capture for sales |
| Replay errors "session out of HAR" | HAR file too short — switch loop_mode=continuous |

Last verified against shipping code: v3.7.0 (2026-05-12).