CPOS — Customizable Profile / Override Stack — primer

Help Center primer for CPOS — the atomic 2-phase commit system that coordinates config changes across Personas + DUT + Agents. Pairs with ADR 0014.

What it is

When you push a config change that touches more than one tier of the bench (e.g. "switch persona TLS profile + push matching inspection profile to DUT + restart agents"), CPOS coordinates the change so it either lands cleanly across all tiers or rolls back entirely.

No half-applied state. Ever.

The 2-phase protocol

Phase 1 — PREPARE

Each participating tier acks "I can do this":

TierAdapter[Personas].prepare(payload) → ack/nack
TierAdapter[DUT].prepare(payload)      → ack/nack
TierAdapter[Agents].prepare(payload)   → ack/nack

If any tier nacks → abort. No tier changes state.
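The prepare phase can be sketched as a fan-out that collects every tier's answer before anything is applied. This is a minimal illustration, not the CPOS coordinator itself: `prepareAll`, `Preparable`, and `PrepareResult` are hypothetical names, and treating a thrown error as a nack is an assumption.

```typescript
// Hypothetical types: the primer only names prepare() and its ack/nack result.
type PrepareResult = { ok: boolean; reason?: string };

interface Preparable {
  prepare(payload: unknown): Promise<PrepareResult>;
}

// Ask every tier to prepare and return the nack reasons.
// An empty list means every tier acked and phase 2 may start.
async function prepareAll(
  tiers: Record<string, Preparable>,
  payload: unknown,
): Promise<string[]> {
  const results = await Promise.all(
    Object.entries(tiers).map(async ([name, tier]) => {
      try {
        return { name, ...(await tier.prepare(payload)) };
      } catch (err) {
        // Assumption: a tier that throws is treated the same as a nack.
        return { name, ok: false, reason: String(err) };
      }
    }),
  );
  return results
    .filter((r) => !r.ok)
    .map((r) => `${r.name}: ${r.reason ?? "nack"}`);
}
```

A caller would abort when the returned list is non-empty; since no tier has committed at that point, there is nothing to roll back.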

Phase 2 — COMMIT

Each tier snapshots current state, applies the change, and starts a 60s self-healing watchdog:

TierAdapter[Personas].commit() → snapshot taken, change applied
TierAdapter[DUT].commit()      → snapshot taken, change applied
TierAdapter[Agents].commit()   → snapshot taken, change applied

60s watchdog

If any tier reports unhealthy within 60s of commit, all tiers roll back to their snapshots. The bench returns to its pre-commit state.

If all 3 tiers stay healthy for 60s, the snapshots are finalized and the change is permanent.
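The commit phase and watchdog described above could be driven by a loop like the following. It is a sketch under stated assumptions: `commitWithWatchdog`, `CommitTier`, the polling interval, and the millisecond parameters are all hypothetical, and a tier whose `commit()` throws is assumed to have left its own state untouched.

```typescript
// Minimal view of the adapter methods used during phase 2.
interface CommitTier {
  commit(): Promise<{ snapshot_id: string }>;
  rollback(snapshot_id: string): Promise<void>;
  healthCheck(): Promise<{ healthy: boolean }>;
}

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Commit every tier, then poll health until the watchdog window closes.
// Resolves to "finalized" if all tiers stayed healthy, else "rolled_back".
async function commitWithWatchdog(
  tiers: CommitTier[],
  watchdogMs = 60_000, // 60s default from the primer
  pollMs = 5_000, // polling interval is an assumption
): Promise<"finalized" | "rolled_back"> {
  // Record each snapshot as soon as its tier commits, so a failure partway
  // through can still undo the tiers that already changed.
  const applied: Array<{ tier: CommitTier; snapshot_id: string }> = [];
  const rollbackAll = async (): Promise<"rolled_back"> => {
    await Promise.all(applied.map((a) => a.tier.rollback(a.snapshot_id)));
    return "rolled_back";
  };

  try {
    for (const tier of tiers) {
      const { snapshot_id } = await tier.commit();
      applied.push({ tier, snapshot_id });
    }
  } catch {
    return rollbackAll();
  }

  const deadline = Date.now() + watchdogMs;
  while (Date.now() < deadline) {
    const checks = await Promise.all(tiers.map((t) => t.healthCheck()));
    if (checks.some((c) => !c.healthy)) return rollbackAll();
    await sleep(pollMs);
  }
  return "finalized";
}
```

Keeping the snapshot list outside the `try` block is the key design choice here: it lets a mid-commit failure and a post-commit health failure share one rollback path.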

Tier adapters

Each tier implements the TierAdapter interface:

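interface TierAdapter {
  // Phase 1: validate the change without modifying tier state.
  prepare(payload: unknown): Promise<{ ok: boolean; reason?: string }>;
  // Phase 2: snapshot current state, then apply the change.
  commit(): Promise<{ snapshot_id: string }>;
  // Restore the state captured under snapshot_id.
  rollback(snapshot_id: string): Promise<void>;
  // Reports tier health during the post-commit watchdog window.
  healthCheck(): Promise<{ healthy: boolean }>;
}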

3 default tiers ship with the bench:

- Personas: Caddy + persona-ca-issuer cert rotation
- DUT: vendor-specific REST/SSH push (FMC, FortiOS, PAN-OS, etc.)
- Agents: browser-engine + synthetic-load hot config reload

You can register additional tiers (e.g. SDWAN routers) by implementing the interface.

Why 2-phase + watchdog

Without 2PC: if you push to Personas successfully but DUT push fails mid-flight, Personas are now expecting a TLS profile that DUT doesn't match. Test results are corrupted.

Without the watchdog: if the DUT applies the change but the change breaks it in a subtle way (e.g. flap), there is no auto-recovery and the operator must roll back manually.

The combination guarantees:

1. All-or-nothing at apply time
2. Self-healing post-apply if config turns out broken

Common patterns

| Scenario | What CPOS does |
|---|---|
| Update inspection profile + persona TLS to match | All 3 tiers prepare → commit together → 60s watchdog → either OK or full rollback |
| Switch one persona's archetype (real-app → skin) | Personas prepare succeeds; DUT no-op; Agents no-op → commit Personas only |
| DUT vendor REST returns 500 mid-commit | Snapshots rolled back; original config restored; alert raised |
| Agents fail health post-commit | DUT + Personas roll back; Agents marked unhealthy; alert raised |

DOM-aware

CPOS respects DOM modes (per ADR 0014):

- production mode → CPOS commits require DDPB chain unlock
- prod-partition → CPOS commits scoped to the carved slice only
- Other modes → standard CPOS flow

Common questions

What if my custom tier doesn't have a snapshot mechanism? The TierAdapter contract is yours to implement — for tiers without native snapshot, capture a config dump in commit() and re-push in rollback(). Not as clean as native snapshots, but functional.
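The config-dump approach described above might look like this for a custom tier. Everything outside the four TierAdapter method names is an assumption made for illustration: `DeviceClient`, its `fetchConfig`/`pushConfig`/`isUp` methods, the string payload, and the `dump-N` snapshot ids are hypothetical and not part of CPOS or any vendor API.

```typescript
// Hypothetical device client; these method names are illustrative only.
interface DeviceClient {
  fetchConfig(): Promise<string>;
  pushConfig(config: string): Promise<void>;
  isUp(): Promise<boolean>;
}

class ConfigDumpAdapter {
  private client: DeviceClient;
  private pending?: string; // payload accepted in prepare()
  private dumps = new Map<string, string>(); // snapshot_id -> config dump
  private nextId = 0;

  constructor(client: DeviceClient) {
    this.client = client;
  }

  async prepare(payload: unknown): Promise<{ ok: boolean; reason?: string }> {
    if (typeof payload !== "string") {
      return { ok: false, reason: "payload must be a config string" };
    }
    this.pending = payload; // staged only; the device is not touched yet
    return { ok: true };
  }

  async commit(): Promise<{ snapshot_id: string }> {
    if (this.pending === undefined) {
      throw new Error("commit() called before a successful prepare()");
    }
    // Capture the dump before applying, so rollback() has something to re-push.
    const snapshot_id = `dump-${++this.nextId}`;
    this.dumps.set(snapshot_id, await this.client.fetchConfig());
    await this.client.pushConfig(this.pending);
    this.pending = undefined;
    return { snapshot_id };
  }

  async rollback(snapshot_id: string): Promise<void> {
    const dump = this.dumps.get(snapshot_id);
    if (dump === undefined) throw new Error(`unknown snapshot ${snapshot_id}`);
    await this.client.pushConfig(dump);
  }

  async healthCheck(): Promise<{ healthy: boolean }> {
    return { healthy: await this.client.isUp() };
  }
}
```

One limitation of this sketch: the dumps live in memory, so a rollback after an adapter restart would need them persisted somewhere durable.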

Can I extend the watchdog beyond 60s? Yes. commit() accepts a watchdog_seconds override. The default 60s balances "long enough to detect real issues" vs "short enough not to lock the bench."

Does CPOS work without DOM? No — CPOS is part of the DOM family and assumes DOM is loaded.