ADR 0001: Mandatory TLS 1.3 with strict certificate validation (per-target overrides since 2026-05)
- Status: Accepted (amended 2026-05-02 with per-target exceptions)
- Date: 2026-04-26 (initial), 2026-05-02 (amendment)
- Deciders: André Luiz Gallon
Context
The agent has to load real public sites (g1, NASA, Cisco, …) on behalf of operators who may use the data for performance baselines. Anything below TLS 1.3 is increasingly considered weak (RFC 9325 recommends against TLS 1.2 key-exchange modes that lack forward secrecy, such as static RSA), and silently ignoring certificate errors is a footgun that hides MITM attacks.
Decision
- The agent's Chromium is launched with --ssl-version-min=tls1.3.
- The navigation response is inspected; if securityDetails().protocol is not TLS 1.3, the cycle is aborted and the reason is recorded in error_message.
- ignoreHTTPSErrors is explicitly set to false (which is also Playwright's default), so certificate errors are never ignored.
- The dashboard's httpsOnly flag rejects non-HTTPS targets at the validator (only https:// URLs are accepted on POST /api/admin/targets).
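The two checks in this decision can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the helper names `isHttpsUrl` and `checkNegotiatedProtocol` and the `SecurityDetailsLike` and `CycleVerdict` types are hypothetical, and the wording of the error message is an assumption. Only the `protocol` field, the `error_message` name, and the https-only rule come from the text above.

```typescript
// Shape of the part of Playwright's securityDetails() result used here.
interface SecurityDetailsLike {
  protocol?: string; // e.g. "TLS 1.3"
}

// Hypothetical result type; only the error_message name comes from the ADR.
interface CycleVerdict {
  ok: boolean;
  error_message?: string;
}

// Validator side: accept only https:// URLs.
function isHttpsUrl(raw: string): boolean {
  try {
    return new URL(raw).protocol === "https:";
  } catch {
    return false; // not a parseable absolute URL
  }
}

// Agent side: abort the cycle unless the negotiated protocol is TLS 1.3.
function checkNegotiatedProtocol(details: SecurityDetailsLike | null): CycleVerdict {
  const protocol = details ? details.protocol : undefined;
  if (protocol === "TLS 1.3") {
    return { ok: true };
  }
  const seen = protocol === undefined ? "an unknown protocol" : protocol;
  return {
    ok: false,
    error_message: `TLS policy violation: negotiated ${seen}, required TLS 1.3`,
  };
}
```

In a real runner the `details` argument would presumably come from awaiting `securityDetails()` on the navigation response; a `null` result is treated here as a failure, which is an assumption.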
Consequences
- ✅ The cluster cannot be used to bypass certificate hygiene.
- ✅ Cycles that previously "succeeded" against a broken TLS chain now fail loudly, surfacing real production risk earlier.
- ⚠️ A handful of legacy sites that still negotiate TLS 1.2 cannot be monitored. Operators that need this can opt out with REQUIRE_TLS13=false, but the option is documented as an explicit security trade-off.
Alternatives considered
- Allow TLS 1.2 but block weak ciphers: rejected to keep the policy simple, since TLS 1.3 support is near-universal among public sites in 2026.
- Disable the check via UI toggle per target: initially rejected for fear of false signals; later adopted as a controlled exception (see Amendment below) once the internal benchmark fleet (web-agent-webserver/, Caddy with tls internal) made the blanket policy genuinely impractical for lab use.
Amendment (2026-05-02): per-target TLS overrides
Context
The introduction of the web-agent-webserver Caddy fleet (HTTP/2 +
HTTP/3 benchmark targets, scaled 1..20 from the dashboard) made the
original blanket policy untenable for lab use:
- Caddy's tls internal directive issues per-host certificates from a local CA that nobody else trusts. Every cycle against the internal fleet would fail with net::ERR_CERT_AUTHORITY_INVALID.
- The internal fleet uses Docker DNS aliases like webserver and *.ai_forse.local, which the dashboard's PRIVATE_HOST_RE SSRF safety net already blocked at create-time on POST /api/admin/targets.
- Both restrictions are correct for the public-monitoring use case (g1, NASA, Cisco, …), but become a hard wall for the lab use case that the same UI is also supposed to support.
A global env-var kill-switch (REJECT_INVALID_CERTS=false) was
explicitly rejected — it would degrade posture for every target,
not just the lab ones. We needed per-row granularity that the
operator can see in the UI and that auditors can review.
Decision
Add two per-target columns (migration 0012_target_tls_overrides.sql):
- allow_insecure_tls boolean NOT NULL DEFAULT false: when true, the agent passes ignoreHTTPSErrors: true to Playwright for this target only. The cluster-wide REJECT_INVALID_CERTS env still defaults to true and still applies to every target whose per-row flag is false.
- tls_min_version text NOT NULL DEFAULT 'tls1.3' CHECK (… IN ('tls1.2','tls1.3','any')): per-target floor on the negotiated TLS version.
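On the application side, the two columns map naturally onto a small typed record with the same defaults and the same allowed values as the CHECK constraint. The sketch below is a hand-rolled parser written without Zod so that it stays self-contained; the names `TlsMinVersion`, `TargetTlsOverrides`, and `parseTlsOverrides` are hypothetical, and the project's real schema may differ.

```typescript
type TlsMinVersion = "tls1.2" | "tls1.3" | "any";

interface TargetTlsOverrides {
  allowInsecureTls: boolean;
  tlsMinVersion: TlsMinVersion;
}

// Mirrors the values allowed by the CHECK constraint on tls_min_version.
const TLS_MIN_VERSIONS: TlsMinVersion[] = ["tls1.2", "tls1.3", "any"];

// Both fields are optional on input; missing values fall back to the strict defaults.
function parseTlsOverrides(input: {
  allowInsecureTls?: unknown;
  tlsMinVersion?: unknown;
}): TargetTlsOverrides {
  const allow = input.allowInsecureTls === undefined ? false : input.allowInsecureTls;
  if (typeof allow !== "boolean") {
    throw new TypeError("allowInsecureTls must be a boolean");
  }
  const floor = input.tlsMinVersion === undefined ? "tls1.3" : input.tlsMinVersion;
  if (typeof floor !== "string" || TLS_MIN_VERSIONS.indexOf(floor as TlsMinVersion) === -1) {
    throw new TypeError("tlsMinVersion must be one of 'tls1.2', 'tls1.3', 'any'");
  }
  return { allowInsecureTls: allow, tlsMinVersion: floor as TlsMinVersion };
}
```

For example, `parseTlsOverrides({})` yields the strict posture, while `parseTlsOverrides({ tlsMinVersion: "tls1.1" })` throws because that value falls outside the allowed set.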
Plumbing:
- POST /api/admin/targets accepts both new fields. The private-hostname validator is only bypassed when the request also sets allowInsecureTls=true (see code comments for the SSRF rationale).
- The same logic applies to PATCH /api/admin/targets/[id].
- The agent's Target Zod schema gains the two fields with safe defaults. The runner combines target.tlsMinVersion (authoritative) with the legacy REQUIRE_TLS13 env (now a paranoid safety net that silently upgrades 'any' → 'tls1.2').
- Every cycle that runs with the relaxed posture emits a structured warn log (allowInsecureTls: true, reason: 'per-target opt-in (ADR-0001 exception)') so the SIEM / dashboard can list every target running outside default policy.
- The dashboard UI shows two badges ("⚠ TLS lab" and "TLS 1.2+" / "TLS qualquer", the latter meaning "any TLS") on any target with non-default values, plus controls in the create + edit forms with red-tinted styling and explicit copy ("Lab interno apenas", i.e. "internal lab only").
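The validator bypass and the per-cycle posture described above might look roughly like this. It is a sketch under stated assumptions: the real PRIVATE_HOST_RE is not shown in this ADR, so `EXAMPLE_PRIVATE_HOST_RE` is an illustrative stand-in, and `isHostAccepted`, `planCycle`, and the `level` and `target` log fields are hypothetical. Only the `allowInsecureTls` and `reason` log fields and the `ignoreHTTPSErrors` option come from the text.

```typescript
interface TargetPosture {
  url: string;
  allowInsecureTls: boolean;
}

interface RelaxedPostureLog {
  level: "warn";
  target: string; // hypothetical field so the log line identifies the row
  allowInsecureTls: true;
  reason: string;
}

interface CyclePlan {
  contextOptions: { ignoreHTTPSErrors: boolean };
  warnLog: RelaxedPostureLog | null;
}

// Illustrative pattern only; NOT the project's real PRIVATE_HOST_RE.
const EXAMPLE_PRIVATE_HOST_RE = /^(localhost|webserver|.+\.local)$/i;

// Private hostnames are accepted only when the request also opts in to insecure TLS.
function isHostAccepted(
  hostname: string,
  allowInsecureTls: boolean,
  privateHostRe: RegExp = EXAMPLE_PRIVATE_HOST_RE,
): boolean {
  if (!privateHostRe.test(hostname)) {
    return true; // public hostname: no SSRF concern from this check
  }
  return allowInsecureTls;
}

// Derives the browser-context options and the warn log entry for one cycle.
function planCycle(target: TargetPosture): CyclePlan {
  if (!target.allowInsecureTls) {
    return { contextOptions: { ignoreHTTPSErrors: false }, warnLog: null };
  }
  return {
    contextOptions: { ignoreHTTPSErrors: true },
    warnLog: {
      level: "warn",
      target: target.url,
      allowInsecureTls: true,
      reason: "per-target opt-in (ADR-0001 exception)",
    },
  };
}
```

Tying the SSRF bypass to the same flag that relaxes certificate validation means a private hostname can never be stored without also carrying the visible lab marker.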
Consequences
- ✅ Lab fleets with self-signed CAs are usable from the same UI that drives public-site monitoring, without weakening the default posture for anything the operator hasn't explicitly opted in.
- ✅ Every exception is visible in the database (queryable), in the UI (badged on the row), and in the audit log (structured warn per cycle).
- ⚠️ A misconfigured target marked allow_insecure_tls=true against a public site silently degrades that target's TLS validation; the badge in the UI is the operator's only signal there. Mitigated by red colouring + the explicit "lab interno apenas" copy in the form.
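Because every exception is a non-default column value, listing the targets that run outside the default policy reduces to a simple filter. The sketch below assumes that "TLS 1.2+" corresponds to a floor of 'tls1.2' and "TLS qualquer" to 'any'; that mapping is an inference from the badge labels, and `TargetRow`, `badgesFor`, and `listExceptions` are hypothetical names.

```typescript
interface TargetRow {
  url: string;
  allowInsecureTls: boolean;
  tlsMinVersion: "tls1.2" | "tls1.3" | "any";
}

// Returns the badges a row would carry; an empty list means default policy.
function badgesFor(row: TargetRow): string[] {
  const badges: string[] = [];
  if (row.allowInsecureTls) {
    badges.push("⚠ TLS lab");
  }
  if (row.tlsMinVersion === "tls1.2") {
    badges.push("TLS 1.2+"); // assumed mapping
  } else if (row.tlsMinVersion === "any") {
    badges.push("TLS qualquer"); // assumed mapping
  }
  return badges;
}

// Every row that an auditor would want to review.
function listExceptions(rows: TargetRow[]): TargetRow[] {
  return rows.filter((row) => badgesFor(row).length > 0);
}
```

A public target flagged by mistake would show up in `listExceptions` alongside the lab fleet, which is the review path the mitigation above relies on.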
Migration impact
- Existing rows get the strict defaults (false, 'tls1.3'), byte-for-byte preserving the original ADR-0001 behaviour.
- No code change required for callers that ignore the new fields: the agent's Zod schema makes both optional with defaults.
- The legacy REQUIRE_TLS13 env is kept for backwards compatibility but its meaning is narrowed: it now only acts as a floor on per-target tlsMinVersion='any' (silently upgrading it to 'tls1.2'). Setting it to false no longer disables the per-target check.
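The narrowed meaning of REQUIRE_TLS13 can be illustrated with a small pure function. This sketch assumes that the upgrade from 'any' to 'tls1.2' applies when the env is true and that 'any' is left alone when it is false; the ADR does not spell out that second case. The names `TlsFloor`, `effectiveTlsFloor`, and `protocolMeetsFloor` are hypothetical, and the protocol strings follow the "TLS 1.3" form used earlier in this document.

```typescript
type TlsFloor = "tls1.2" | "tls1.3" | "any";

// The per-target value is authoritative; the env only raises 'any' to 'tls1.2'.
function effectiveTlsFloor(tlsMinVersion: TlsFloor, requireTls13Env: boolean): TlsFloor {
  if (tlsMinVersion === "any" && requireTls13Env) {
    return "tls1.2";
  }
  return tlsMinVersion;
}

// Checks a negotiated protocol string against the effective floor.
function protocolMeetsFloor(protocol: string | undefined, floor: TlsFloor): boolean {
  if (floor === "any") {
    return true;
  }
  if (floor === "tls1.3") {
    return protocol === "TLS 1.3";
  }
  return protocol === "TLS 1.2" || protocol === "TLS 1.3";
}
```

Under these assumptions a target with the default 'tls1.3' floor still rejects TLS 1.2 even when REQUIRE_TLS13 is false, which matches the statement that the env no longer disables the per-target check.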