DOM (DUT Operating Mode) — primer

Help Center primer for the DOM family — the safety system that gates production-blocking operations. Pairs with ADR 0014.

What it does

Every DUT in the bench is classified into one of 5 operating modes. Each mode has a different safety profile: production mode hard-blocks any state-changing operation that could disturb live customer traffic.

This is the safety net that makes pointing the bench at a real customer NGFW a sane thing to do.

The 5 modes

| Mode | When to use | Destructive ops? |
|---|---|---|
| greenfield | Brand-new DUT, no production traffic, free to break | yes |
| staging | Pre-prod mirror, high-realism but can recover | yes (warn-only) |
| lab | Dedicated lab DUT (your bench's home turf) | yes (warn-only) |
| production | Live customer DUT carrying real flows | NO (blocked) |
| prod-partition | Production DUT, but operator carved a quarantined slice | partial (explicit unlock per op) |

Default for new DUTs added to inventory: lab.
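
The mode table can be read as a small policy lookup. The sketch below is illustrative only: the names DomMode, Decision, DESTRUCTIVE_POLICY and decide are hypothetical and not part of the product API, and it assumes read-only operations are allowed in every mode (the primer only states this for production).

```python
from enum import Enum


class DomMode(Enum):
    GREENFIELD = "greenfield"
    STAGING = "staging"
    LAB = "lab"
    PRODUCTION = "production"
    PROD_PARTITION = "prod-partition"


class Decision(Enum):
    ALLOW = "allow"                # proceeds silently
    WARN = "warn"                  # proceeds, but the operator is warned
    BLOCK = "block"                # rejected outright
    NEEDS_UNLOCK = "needs-unlock"  # rejected until an explicit per-op unlock


# Hypothetical encoding of the "Destructive ops?" column of the table.
DESTRUCTIVE_POLICY = {
    DomMode.GREENFIELD: Decision.ALLOW,
    DomMode.STAGING: Decision.WARN,
    DomMode.LAB: Decision.WARN,
    DomMode.PRODUCTION: Decision.BLOCK,
    DomMode.PROD_PARTITION: Decision.NEEDS_UNLOCK,
}

# New DUTs added to inventory default to lab.
DEFAULT_MODE = DomMode.LAB


def decide(mode: DomMode, destructive: bool, unlocked: bool = False) -> Decision:
    """Return the decision for one operation against a DUT in the given mode."""
    if not destructive:
        # Assumption: read-only operations are never gated.
        return Decision.ALLOW
    policy = DESTRUCTIVE_POLICY[mode]
    if policy is Decision.NEEDS_UNLOCK and unlocked:
        return Decision.ALLOW
    return policy
```

For example, decide(DomMode.PRODUCTION, destructive=True) yields Decision.BLOCK, while the same call for DomMode.PROD_PARTITION yields Decision.NEEDS_UNLOCK until unlocked=True is passed.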

How to set / change a DUT's mode

Dashboard: /admin/dut/<id>/mode → dropdown picker. Mode changes are audit-logged.

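CLI: kubectl annotate dut <name> tlsstress.art/dom-mode=staging (emergency only; the dashboard is preferred because it records a richer audit trail). If the annotation is already set, kubectl requires the --overwrite flag to change it.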

What gets blocked in production mode

The DDPB chain (Defense in Depth Production Blocking) enforces 7 layers:

  1. UI gate — destructive buttons are greyed out
  2. API middleware — POST/PUT/DELETE requests rejected with 403
  3. DB constraint — write attempts to test-plan tables rejected
  4. K8s admission webhook — pod scaling / config-map writes blocked
  5. RELAY pre-flight — RELAY.Art refuses to forward write commands
  6. DUT-side BTO check — bidirectional trust orchestration confirms no overlapping intent
  7. Audit trail before-and-after — every blocked attempt logged

Layers 1-4 fail fast on the operator side, before a request leaves the bench's control plane; layers 5-7 catch edge cases that the operator-side layers missed.
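
The point of a layered chain is that skipping one layer (for example, calling the API directly instead of using the UI) still runs into the next one. Here is a minimal sketch of that idea, assuming every enforcement layer applies the same "destructive operation on a production DUT" test. Operation, LAYERS and evaluate are hypothetical names, and the real layers almost certainly check different things at each level.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Operation:
    name: str
    destructive: bool
    dut_mode: str


def blocks_in_production(op: Operation) -> bool:
    """Simplified stand-in for a layer's real check (an assumption)."""
    return op.dut_mode == "production" and op.destructive


Layer = Tuple[str, Callable[[Operation], bool]]

# Layers 1-6 of the chain; layer 7 (the audit trail) is modelled as the log below.
LAYERS: List[Layer] = [
    ("UI gate", blocks_in_production),
    ("API middleware", blocks_in_production),
    ("DB constraint", blocks_in_production),
    ("K8s admission webhook", blocks_in_production),
    ("RELAY pre-flight", blocks_in_production),
    ("DUT-side BTO check", blocks_in_production),
]


def evaluate(op: Operation, layers: List[Layer], audit_log: List[str]) -> Optional[str]:
    """Return the name of the first layer that blocks op, or None if it passes."""
    for name, check in layers:
        if check(op):
            # Every blocked attempt is logged.
            audit_log.append(f"blocked {op.name!r} at layer {name!r}")
            return name
    return None
```

Calling evaluate(op, LAYERS[1:], log) simulates a request that never touched the UI: the API middleware becomes the first layer to reject it.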

What still works in production

  • All read-only operations (show-version, telemetry, observation)
  • PURE Test Kind (Production URL Replay) with PIE-PA gate green
  • Not allowed: KALI / DoYour offensive tools (explicitly hard-blocked)
  • Not allowed without an unlock: BGP saturation against the production DUT

Drift detection (PDD)

PDD watches for mode drift: if you labeled the DUT as lab but it shows live BGP sessions or production-rate syslog, PDD raises a warning and offers to re-classify it.
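
As a rough illustration of the drift rule, the sketch below compares the labeled mode against two observed signals. Everything here is an assumption made for the example: the Observation fields, the PRODUCTION_SYSLOG_RATE threshold, and the choice to treat any non-production label as a candidate for drift. The function only returns a suggestion, since operators must confirm any re-classification.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Observation:
    live_bgp_sessions: int
    syslog_msgs_per_sec: float


# Hypothetical threshold for "production-rate" syslog; the real value is not documented here.
PRODUCTION_SYSLOG_RATE = 100.0

PRODUCTION_MODES = {"production", "prod-partition"}


def suggest_reclassification(labeled_mode: str, obs: Observation) -> Optional[str]:
    """Return a suggested mode if the DUT looks like production, else None.

    This never changes the mode itself; it only produces a suggestion.
    """
    if labeled_mode in PRODUCTION_MODES:
        return None  # already treated as production, nothing to flag
    looks_like_production = (
        obs.live_bgp_sessions > 0
        or obs.syslog_msgs_per_sec >= PRODUCTION_SYSLOG_RATE
    )
    return "production" if looks_like_production else None
```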

CPOS — atomic 2-phase commit

When you push a config change that touches multiple tiers (Personas + DUT + Agents), CPOS coordinates:

PREPARE phase
  every tier acks "can do" → green light
  any nack → abort, no state change

COMMIT phase
  every tier snapshots + applies
  60s self-healing watchdog
  any tier unhealthy → rollback all snapshots

This means a partial config push cannot leave the bench in a half-applied state.
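
The prepare/commit flow above can be sketched as a small in-memory two-phase commit. This is a simplified model, not the CPOS implementation: Tier and push are hypothetical names, state is a plain dictionary, and the 60s watchdog is reduced to a single health probe after the apply step.

```python
from typing import Dict, List, Optional


class Tier:
    """One participant (for example Personas, DUT, or Agents)."""

    def __init__(self, name: str, can_prepare: bool = True, healthy: bool = True) -> None:
        self.name = name
        self.state: Dict[str, str] = {}
        self._can_prepare = can_prepare
        self._healthy = healthy
        self._snapshot: Optional[Dict[str, str]] = None

    def prepare(self, change: Dict[str, str]) -> bool:
        # Ack ("can do") or nack; must not modify state.
        return self._can_prepare

    def snapshot_and_apply(self, change: Dict[str, str]) -> None:
        self._snapshot = dict(self.state)  # copy taken before applying
        self.state.update(change)

    def is_healthy(self) -> bool:
        return self._healthy

    def rollback(self) -> None:
        if self._snapshot is not None:
            self.state = self._snapshot
            self._snapshot = None


def push(tiers: List[Tier], change: Dict[str, str]) -> bool:
    """Apply change to all tiers or to none. Returns True on success."""
    # PREPARE phase: any nack aborts before any state changes.
    if not all(t.prepare(change) for t in tiers):
        return False
    # COMMIT phase: every tier snapshots, then applies.
    for t in tiers:
        t.snapshot_and_apply(change)
    # Stand-in for the watchdog: one health probe instead of a 60s window.
    if all(t.is_healthy() for t in tiers):
        return True
    # Any unhealthy tier rolls back every snapshot.
    for t in tiers:
        t.rollback()
    return False
```

With one tier constructed as Tier("agents", healthy=False), push returns False and every tier's state is restored to its pre-push snapshot.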

PIE family

Auxiliary subsystems that work alongside DOM:

  • RSM (Replay Session Mode): replay a past test exactly
  • IR (Idempotent Restore): restore a checkpoint without side effects
  • HID (Health Indicator Dashboard): operator's at-a-glance view
  • AAE (Auto-Approve Engine): pre-approves low-risk operations according to a per-mode policy

Common questions

Can I disable DDPB to get something through? No. The chain is designed to be tamper-resistant — every layer logs to a separate audit channel. If you legitimately need to bypass, switch the DUT to prod-partition and use the explicit unlock window.

Does PDD ever auto-reclassify? No — it only suggests. Operators must confirm.

My team has 50 DUTs. Do I need to set mode 50 times? Modes can be inherited from a template (Profile Templates v2.0, DOM-6).
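
One plausible reading of template inheritance is a simple fallback chain. The sketch below assumes a precedence that the primer does not spell out: a mode set explicitly on the DUT wins, then the template's mode, then the inventory default of lab. The function name resolve_mode is hypothetical.

```python
from typing import Optional

DEFAULT_MODE = "lab"  # default for new DUTs added to inventory


def resolve_mode(explicit_mode: Optional[str], template_mode: Optional[str]) -> str:
    """Pick the effective mode for a DUT (assumed precedence, see lead-in)."""
    if explicit_mode is not None:
        return explicit_mode
    if template_mode is not None:
        return template_mode
    return DEFAULT_MODE
```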