HAR Replay (L7 application) — primer

Help Center primer for the HAR session replay engine. Pairs with the har-engine/ module and ADR 0021 (PURE — HAR is one PURE ingestion source).

What it tests

Browser-engine agents are the gold standard for realism but do not scale arbitrarily: each agent costs hundreds of MB of RAM plus a real Chromium process. A HAR-replay engine carries only lightweight per-session state, making it orders of magnitude cheaper:

  • Browser-engine agents: ~50 sessions / host realistic ceiling
  • HAR replay: ~10 000 sessions / host realistic ceiling

The trade is realism: HAR replay reuses captured network sequences, so it cannot fire new dynamic flows. For regression and capacity testing against an inspection profile, this is the right tool.

What HAR is

HAR (HTTP Archive) is a JSON format every browser DevTools and proxy can emit. Each HAR captures:

  • All HTTP/2 + HTTP/3 requests in a session
  • Headers, status codes, timings (DNS, connect, TLS, wait, receive)
  • Response sizes (bodies stripped for privacy)

The engine reads a HAR, materialises N parallel session replays against a target URL set, and lets the DUT inspect them.
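The ingestion step can be sketched as follows, assuming a standard HAR 1.2 layout (`log.entries[].request` / `response` / `timings`). The function and dict shape below are illustrative, not the engine's actual API:

```python
import json

def load_har_entries(har_text: str):
    """Parse a HAR document and yield per-entry replay metadata.

    Bodies are assumed stripped (see the privacy section), so only
    method, URL, status, and timings are read.
    """
    log = json.loads(har_text)["log"]
    for entry in log["entries"]:
        req, resp = entry["request"], entry["response"]
        yield {
            "method": req["method"],
            "url": req["url"],
            "status": resp["status"],
            "timings": entry.get("timings", {}),
        }

# A tiny inline HAR with bodies already stripped:
SAMPLE = json.dumps({
    "log": {"version": "1.2", "entries": [{
        "request": {"method": "GET", "url": "https://example.test/a"},
        "response": {"status": 200},
        "timings": {"dns": 3, "connect": 9, "ssl": 12, "wait": 40, "receive": 5},
    }]}
})

sessions = list(load_har_entries(SAMPLE))
```

The replay loop then schedules N copies of this entry sequence concurrently against the target URL set.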

Three-axis configuration

| Axis | Options |
|---|---|
| har_source | path under cloned-personas/*.har / inline JSON / Discovery Hub feed |
| target_concurrency | 100 / 1000 / 5000 (default) / 10000 |
| loop_mode | once / continuous (default) — re-loop HAR for sustained load |
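The three axes can be modelled as a small config object. The class name and error messages below are illustrative; only the axis names, allowed values, and defaults come from the table above:

```python
from dataclasses import dataclass

ALLOWED_CONCURRENCY = (100, 1000, 5000, 10000)

@dataclass
class HarRunConfig:
    har_source: str                 # path, inline JSON, or Discovery Hub feed
    target_concurrency: int = 5000  # default per the axis table
    loop_mode: str = "continuous"   # "once" or "continuous"

    def validate(self) -> "HarRunConfig":
        if self.target_concurrency not in ALLOWED_CONCURRENCY:
            raise ValueError(f"target_concurrency must be one of {ALLOWED_CONCURRENCY}")
        if self.loop_mode not in ("once", "continuous"):
            raise ValueError("loop_mode must be 'once' or 'continuous'")
        return self

cfg = HarRunConfig(har_source="cloned-personas/checkout.har").validate()
```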

The dashboard pre-validates the (HAR file, target_concurrency) pair and warns if memory headroom is insufficient.
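The headroom check amounts to a projected-footprint comparison. This is a sketch of the idea only: the per-session byte estimate and safety factor are invented placeholders, not shipped numbers:

```python
def memory_headroom_ok(target_concurrency: int,
                       per_session_bytes: int,
                       free_bytes: int,
                       safety_factor: float = 1.5) -> bool:
    """Warn-level check: the projected replay footprint must fit in free
    memory with a safety margin. All inputs are caller-supplied estimates."""
    projected = target_concurrency * per_session_bytes * safety_factor
    return projected <= free_bytes

# e.g. 10 000 sessions at an assumed 64 KiB of replay state each,
# against 2 GiB of free memory:
ok = memory_headroom_ok(10_000, 64 * 1024, free_bytes=2 * 1024**3)
```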

Privacy — captured-at-customer-site assumption

A HAR captured on the customer's prod network may contain real URLs, cookies, and tokens. CLONER Function 9 + the pkg/pcap-extractor/ extractor strip src IP fields before the HAR is admitted to the test bench (mirrors the RELAY ingress contract). Cookies + auth headers are scrubbed during HAR ingestion (PURE-5a follow-up implementing rule-based scrub).
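The ingestion-time scrub can be sketched as a rule-based pass over each HAR entry. The rule list below is a minimal illustration; the actual PURE-5a rules live in the ingestion pipeline, not here:

```python
SCRUB_HEADERS = {"cookie", "set-cookie", "authorization", "proxy-authorization"}

def scrub_entry(entry: dict) -> dict:
    """Drop cookies, auth headers, and source-IP metadata from one HAR entry."""
    for side in ("request", "response"):
        part = entry.get(side, {})
        part["headers"] = [
            h for h in part.get("headers", [])
            if h.get("name", "").lower() not in SCRUB_HEADERS
        ]
        part.pop("cookies", None)
    entry.pop("serverIPAddress", None)   # IP fields, per the ingress contract
    entry.pop("_clientIPAddress", None)  # vendor extension, if present
    return entry

entry = scrub_entry({
    "request": {"headers": [{"name": "Cookie", "value": "sid=1"},
                            {"name": "Accept", "value": "*/*"}]},
    "response": {"headers": [{"name": "Set-Cookie", "value": "sid=2"}]},
    "serverIPAddress": "10.0.0.5",
})
```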

Layered vs standalone

  • Standalone: test_kind = pure with HAR as the URL source.
  • Layered: enable har_layered modifier on any other test to layer realistic L7 sessions on top of L4 / L7 baselines.
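In test-spec terms the two modes differ only in where the HAR knob sits. The dict shapes below are illustrative; only test_kind = pure and the har_layered modifier come from this primer, and the base-test name is hypothetical:

```python
standalone = {
    "test_kind": "pure",              # HAR is the URL source
    "har_source": "cloned-personas/checkout.har",
}

layered = {
    "test_kind": "l4_baseline",       # hypothetical base test name
    "modifiers": ["har_layered"],     # layer realistic L7 sessions on top
    "har_source": "cloned-personas/checkout.har",
}
```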

Reading the report

Each HAR run adds an "Annex L (HAR)" block:

  • Source → HAR file name + capture date
  • Run config → 3 axes
  • Throughput envelope → sustained sessions/sec + bytes/sec
  • WAF activity → DUT WAF rule firings, classified by vendor (Cisco FTD / Palo Alto / Fortinet / Check Point / generic catch-all on suspicious-URI heuristic). Exported as Prometheus metric tlsstress_engine_har_waf_rule_fired_total{vendor, rule_id}. See har-engine/internal/waf/. Syslog-source detection is a Wave-B follow-up.
  • Application regression → response-code distribution delta vs HAR reference
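The WAF counter above can be rendered in Prometheus text exposition format without any client library. The metric name and labels match the doc; the tiny hand-rolled renderer and the sample vendor/rule values are a sketch:

```python
from collections import Counter

METRIC = "tlsstress_engine_har_waf_rule_fired_total"

firings = Counter()

def record_waf_firing(vendor: str, rule_id: str) -> None:
    """Count one DUT WAF rule firing, keyed by (vendor, rule_id)."""
    firings[(vendor, rule_id)] += 1

def render_exposition() -> str:
    """Emit the counter in Prometheus text exposition format."""
    lines = [f"# TYPE {METRIC} counter"]
    for (vendor, rule_id), n in sorted(firings.items()):
        lines.append(f'{METRIC}{{vendor="{vendor}",rule_id="{rule_id}"}} {n}')
    return "\n".join(lines)

record_waf_firing("palo_alto", "34001")
record_waf_firing("palo_alto", "34001")
record_waf_firing("generic", "suspicious_uri")
text = render_exposition()
```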

Common patterns

| Symptom | Likely cause |
|---|---|
| Session count drops mid-run | DUT inspection profile changed at a session boundary |
| WAF rule firing > baseline | DUT WAF tuning drifted — primary regression signal |
| Response-code skew → 5xx growth | DUT backend health degrading under load — capture for sales |
| Replay errors "session out of HAR" | HAR file too short — switch loop_mode=continuous |

Last verified against shipping code: v3.7.0 (2026-05-12).