PIE family — primer

Help Center primer for the Production Integration Envelope family — operator-facing safety + automation surfaces around the DOM core. Pairs with ADR 0014.

What it is

PIE is a family of helper subsystems that surround the core DOM discriminator + DDPB chain. They handle the operator-experience side of production safety: replays, restores, health, and auto-approval.

The 4 PIE subsystems

RSM — Replay Session Mode

Re-runs a previous test exactly as it was executed. Useful for:
- Reproducing a customer-reported regression
- Validating that a fix didn't break the original test
- Audit replay (compliance forensics)

UI: /admin/runs/<id>/replay → "Replay" button → fresh run with same plan, same DUT, same persona set.
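
The replay semantics above (a fresh run with the same plan, DUT, and persona set) can be sketched as follows. This is only an illustration: the `Run` fields, the `replay_of` back-reference, and the id counter are hypothetical names, not the actual RSM data model.

```python
from dataclasses import dataclass, replace
from itertools import count
from typing import Optional, Tuple

# Hypothetical id source; the real run-id scheme is not described in this primer.
_next_run_id = count(1)


@dataclass(frozen=True)
class Run:
    run_id: int
    plan: str                         # test plan identifier
    dut: str                          # DUT identifier
    personas: Tuple[str, ...]         # persona set used by the run
    replay_of: Optional[int] = None   # assumed back-reference to the replayed run


def replay(original: Run) -> Run:
    """Create a fresh run that reuses the original's plan, DUT, and persona set."""
    return replace(original, run_id=next(_next_run_id), replay_of=original.run_id)


first = Run(run_id=next(_next_run_id), plan="regression-suite", dut="dut-a",
            personas=("p1", "p2"))
second = replay(first)
```

Only the run id (and the assumed back-reference) differ between `first` and `second`; everything that defines what the test does is carried over unchanged.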

IR — Idempotent Restore

Restores the bench to a known checkpoint without triggering side-effects. Distinct from RSM (which actually runs traffic) — IR just resets state.

When to use: after a failed CPOS commit whose rollback didn't fully clean up, or before a critical demo to ensure a clean baseline.
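
To make "idempotent" concrete: restoring twice should leave the bench in the same state as restoring once. The sketch below models bench state as a plain dictionary, which is an assumption for illustration only; the `restore` function and the state keys are hypothetical.

```python
import copy


def restore(bench_state: dict, checkpoint: dict) -> dict:
    """Reset bench_state in place to match the checkpoint.

    No traffic is run and nothing outside bench_state is touched,
    so calling this repeatedly has the same effect as calling it once.
    """
    bench_state.clear()
    bench_state.update(copy.deepcopy(checkpoint))
    return bench_state


# Hypothetical state left behind by a partially cleaned-up rollback.
checkpoint = {"dut": "idle", "queues": []}
bench = {"dut": "half-committed", "queues": ["job-41"], "lock": "stale"}

restore(bench, checkpoint)
after_first = copy.deepcopy(bench)
restore(bench, checkpoint)
after_second = copy.deepcopy(bench)
```

The deep copy keeps later changes to the bench from leaking back into the stored checkpoint.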

HID — Health Indicator Dashboard

At-a-glance dashboard of every component's health. Three columns:
- Green: running, last seen recently
- Yellow: degraded but operational
- Red: failed, intervention needed

Lives at /admin/health and refreshes every 5 s. Powered by Prometheus alerts and Kubernetes readiness probes.
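
One way to read the three health states is as a small classification function. The sketch below is an assumption-laden illustration, not HID's actual logic: the input signals, the `STALE_AFTER_S` threshold, and the choice to show a running-but-stale component as Yellow are all hypothetical.

```python
from enum import Enum


class Status(Enum):
    GREEN = "green"    # running, last seen recently
    YELLOW = "yellow"  # degraded but operational
    RED = "red"        # failed, intervention needed


# Hypothetical staleness threshold; the primer does not define "recently".
STALE_AFTER_S = 30.0


def classify(running: bool, degraded: bool, seconds_since_seen: float) -> Status:
    """Map simple health signals onto the three dashboard states."""
    if not running:
        return Status.RED
    # Assumption: a running component that has gone quiet is shown as degraded.
    if degraded or seconds_since_seen > STALE_AFTER_S:
        return Status.YELLOW
    return Status.GREEN
```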

AAE — Auto-Approve Engine

Per-mode policy engine that decides which low-risk operations get auto-approved vs require operator click-through.

Default policy:
- greenfield: auto-approve everything
- staging / lab: auto-approve reads + warn-on-writes
- production: auto-approve reads only
- prod-partition: same as production except for the carved slice

Operators with admin role can edit the policy per-mode. Policy changes are audit-logged.
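
The default policy can be pictured as a lookup from (mode, operation kind) to a decision. The sketch below is a minimal illustration under stated assumptions: the `Decision` names, the read/write operation kinds, the treatment of "warn-on-writes" as its own decision, and the fallback to a human for unknown inputs are all hypothetical. The carved-slice exception for prod-partition is not modelled because this primer does not specify it.

```python
from enum import Enum


class Decision(Enum):
    AUTO_APPROVE = "auto-approve"
    WARN = "warn"                # assumed representation of "warn-on-writes"
    NEEDS_HUMAN = "needs-human"  # operator click-through required


_READS_ONLY = {"read": Decision.AUTO_APPROVE, "write": Decision.NEEDS_HUMAN}
_WARN_WRITES = {"read": Decision.AUTO_APPROVE, "write": Decision.WARN}

DEFAULT_POLICY = {
    "greenfield": {"read": Decision.AUTO_APPROVE, "write": Decision.AUTO_APPROVE},
    "staging": _WARN_WRITES,
    "lab": _WARN_WRITES,
    "production": _READS_ONLY,
    # Treated like production here; the carved-slice exception is left out.
    "prod-partition": _READS_ONLY,
}


def decide(mode: str, op_kind: str, policy: dict = DEFAULT_POLICY) -> Decision:
    """Look up the decision for an operation, deferring to a human when unsure."""
    try:
        return policy[mode][op_kind]
    except KeyError:
        # Assumption: anything the policy does not cover goes to an operator.
        return Decision.NEEDS_HUMAN
```

Keeping the policy as data rather than branching logic fits the per-mode editing described above: an edited policy would be a different table passed to the same lookup.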

How they fit together

operator request
      ↓
DOM (mode classifier)
      ↓
AAE (auto-approve gate based on mode + op risk)
      │
      ├─ approved → DDPB chain (production-mode only) → execute
      │                                              → HID updates
      │                                              → audit log
      │
      └─ needs human → present to operator
                    → operator clicks → execute
                                     → HID updates
                                     → audit log

later: operator can RSM the run (replay) or IR (restore baseline)

Common patterns

You want to...                          Use...
Re-run yesterday's test exactly         RSM
Reset to a known-good baseline          IR
Glance-check bench health               HID
Reduce click fatigue for routine ops    AAE policy edit

Common questions

Does AAE bypass DDPB? No. AAE pre-approves the operator click; DDPB still gates the actual execution against DOM mode.

Can I disable HID alerts that are too noisy? Alert thresholds are operator-tunable; adjust them at /admin/health/alerts.

Does IR roll back PURE production runs? No. The PURE PIE-PA gate ensures production runs are bounded; IR is for bench-side state only.