DUT API operations — registering devices and the polling worker¶
Scope status (post-Scope-Freeze 2026-05-10): see ARCHITECTURE.md for the canonical 37 MÓDULOs + 7 Test Kinds + DOM/CPOS/PIE-PA safety architecture. ADRs 0014, 0019-0025 cover post-Freeze additions.
This is the operator-facing guide for the DUT API integration. Read DUT_API_INTEGRATION.md first for the architecture; read API_FEATURE_CATALOG.md for what features the API integration unlocks. This document tells you how to actually use it: how to register a device, how the polling worker behaves, how to trigger a manual snapshot, and how to debug.
Polling worker — what it does¶
Once a device is registered, a polling worker inside the dashboard pod automatically queries it on a schedule. The worker:
- Runs every 30 seconds (scan interval; this does not mean every device is polled every 30 s)
- Selects devices where last_poll_at + poll_interval_seconds < now() OR last_poll_at IS NULL
- For each due device:
  - Decrypts the password using DUT_CRED_ENC_KEY
  - Instantiates the appropriate adapter (Cisco FTD, Cisco Nexus, Cisco UCS, Fortinet)
  - Calls the adapter's collectAll*() method
  - Persists each snapshot to dut_api_snapshots with SHA-256
  - Updates the device row's last_poll_at, last_poll_status, last_poll_error
- Multi-replica safe via Postgres advisory lock: if two dashboard pods are running, only one polls at a time
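The due-device selection above can be sketched in a few lines. This is only an illustration of the predicate, not the worker's actual code: the function name is_due and the sample device rows are assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_due(last_poll_at: Optional[datetime], poll_interval_seconds: int, now: datetime) -> bool:
    """Mirror of: last_poll_at + poll_interval_seconds < now() OR last_poll_at IS NULL."""
    if last_poll_at is None:
        return True  # never polled yet, so always due
    return last_poll_at + timedelta(seconds=poll_interval_seconds) < now


# Hypothetical rows; only the two timing columns matter for the predicate.
now = datetime(2026, 5, 6, 14, 35, 0, tzinfo=timezone.utc)
devices = [
    {"hostname": "ftd-1.lab", "last_poll_at": None, "poll_interval_seconds": 300},
    {"hostname": "nexus-1.lab", "last_poll_at": now - timedelta(seconds=120), "poll_interval_seconds": 300},
    {"hostname": "fortigate-1.lab", "last_poll_at": now - timedelta(seconds=301), "poll_interval_seconds": 300},
]

due = [d["hostname"] for d in devices if is_due(d["last_poll_at"], d["poll_interval_seconds"], now)]
print(due)  # ['ftd-1.lab', 'fortigate-1.lab']
```

Because the worker only scans every 30 seconds, a device with a 300 s interval is actually polled somewhere between 300 s and roughly 330 s after its previous poll.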
Disabling the worker¶
# Set the env var in the dashboard's K8s Deployment:
DUT_API_POLLER_DISABLE=1
The worker logs a one-shot disable notice at startup and returns. Manual snapshots via the API endpoint still work.
Default polling interval¶
poll_interval_seconds defaults to 300 (5 min) at registration. Per-device override is allowed (minimum 30 s). Recommended values:
| Use case | Interval |
|---|---|
| Background inventory refresh | 300 s (default) |
| Active engagement, "DUT Live State" dashboard | 60 s |
| Forensic baseline runs (long-running SOAK) | 30 s |
| Pre-flight check moment (one-shot, not periodic) | use POST /snapshot endpoint instead |
One-time setup — encryption key¶
Credentials are stored AES-256-GCM-encrypted with DUT_CRED_ENC_KEY (env var). Before registering the first device, provision the key:
# 1. Generate the key
KEY=$(node -e "console.log(require('crypto').randomBytes(32).toString('hex'))")
# 2. Provision via K8s Secret
kubectl create secret generic tlsstress-dut-cred \
--from-literal=DUT_CRED_ENC_KEY="$KEY" \
--namespace web-agents
# 3. Wire to the dashboard Deployment env
kubectl patch deployment dashboard --namespace web-agents \
--patch='{"spec":{"template":{"spec":{"containers":[{"name":"dashboard","env":[{"name":"DUT_CRED_ENC_KEY","valueFrom":{"secretKeyRef":{"name":"tlsstress-dut-cred","key":"DUT_CRED_ENC_KEY"}}}]}]}}}}'
# 4. Restart the dashboard to pick up the env var
kubectl rollout restart deployment dashboard --namespace web-agents
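Before creating the Secret, it can help to confirm that the generated value really decodes to 32 bytes. The sketch below assumes the key is consumed as a 64-character hex string, which matches how step 1 generates it; whether the dashboard performs the same check is not stated here, and parse_cred_key is a hypothetical helper name.

```python
import secrets


def parse_cred_key(hex_key: str) -> bytes:
    """Decode a hex-encoded key and insist on exactly 32 bytes (AES-256)."""
    key = bytes.fromhex(hex_key.strip())  # raises ValueError on non-hex input
    if len(key) != 32:
        raise ValueError(f"expected 32 bytes, got {len(key)}")
    return key


# Python equivalent of the node one-liner in step 1: 32 random bytes as hex.
example_key = secrets.token_hex(32)
print(len(example_key), len(parse_cred_key(example_key)))  # 64 32
```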
API — registering devices¶
All endpoints require admin auth (the same auth the rest of the admin UI uses).
List vendors + devices¶
GET /api/admin/dut/devices
Response:
{
"devices": [
{ "id": "...", "hostname": "ftd-1.lab", "vendor": "cisco-ftd", "lastPollStatus": "ok", ... }
],
"vendors": [
{ "key": "cisco-ftd", "displayName": "Cisco FTD (FDM API)", "available": true },
{ "key": "cisco-nexus", "displayName": "Cisco Nexus 9000 (NX-API DME)", "available": true },
{ "key": "cisco-ucs-cimc", "displayName": "Cisco UCS C-Series CIMC", "available": true },
{ "key": "fortinet-fortigate", "displayName": "Fortinet FortiGate (REST v2)", "available": true },
{ "key": "palo-alto-panos", "displayName": "Palo Alto (not yet implemented)","available": false }
]
}
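A client can use this response to decide which vendors to offer and which devices need attention. The sketch below works on a sample shaped like the response above; the device id "dev-1" is a placeholder, and both helper names are assumptions for illustration.

```python
import json

# Sample shaped like the documented response; "dev-1" is a placeholder id.
SAMPLE = """
{
  "devices": [
    {"id": "dev-1", "hostname": "ftd-1.lab", "vendor": "cisco-ftd", "lastPollStatus": "ok"}
  ],
  "vendors": [
    {"key": "cisco-ftd", "displayName": "Cisco FTD (FDM API)", "available": true},
    {"key": "cisco-nexus", "displayName": "Cisco Nexus 9000 (NX-API DME)", "available": true},
    {"key": "cisco-ucs-cimc", "displayName": "Cisco UCS C-Series CIMC", "available": true},
    {"key": "fortinet-fortigate", "displayName": "Fortinet FortiGate (REST v2)", "available": true},
    {"key": "palo-alto-panos", "displayName": "Palo Alto (not yet implemented)", "available": false}
  ]
}
"""


def available_vendor_keys(response: dict) -> list:
    """Vendor keys that can be used in a registration request."""
    return [v["key"] for v in response["vendors"] if v["available"]]


def devices_needing_attention(response: dict) -> list:
    """Hostnames whose last poll did not finish with status 'ok'."""
    return [d["hostname"] for d in response["devices"] if d.get("lastPollStatus") != "ok"]


response = json.loads(SAMPLE)
print(available_vendor_keys(response))
print(devices_needing_attention(response))
```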
Register a Cisco FTD¶
curl -X POST "https://dashboard.example/api/admin/dut/devices" \
-H "Authorization: Bearer $ADMIN_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"hostname": "ftd-1.lab.example.com",
"vendor": "cisco-ftd",
"deviceRole": "ngfw",
"baseUrl": "https://10.10.10.1",
"username": "admin",
"password": "<the-ftd-password>",
"tlsVerifyMode": "self-signed",
"pollIntervalSeconds": 300,
"notes": "Cisco FTD 7.4 on FPR1010 — primary in HA pair"
}'
# {"ok": true, "deviceId": "..."}
Register a Fortinet FortiGate (with API token)¶
curl -X POST "https://dashboard.example/api/admin/dut/devices" \
-H "Authorization: Bearer $ADMIN_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"hostname": "fortigate-1.lab.example.com",
"vendor": "fortinet-fortigate",
"deviceRole": "ngfw",
"baseUrl": "https://10.10.10.5",
"username": "apitoken",
"password": "<the-api-token>",
"tlsVerifyMode": "strict",
"pollIntervalSeconds": 300
}'
Note: setting username to the literal apitoken signals the adapter to use API-token auth (Bearer); otherwise it falls back to cookie-based session auth.
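The username convention can be pictured as a simple branch. This is a hedged sketch of the rule described in the note, not the adapter's code; auth_mode and auth_headers are hypothetical names, and the session-cookie path is left out because its login flow is not described here.

```python
API_TOKEN_USERNAME = "apitoken"  # the literal value the note refers to


def auth_mode(username: str) -> str:
    """Return 'api-token' for the literal username, otherwise 'session-cookie'."""
    return "api-token" if username == API_TOKEN_USERNAME else "session-cookie"


def auth_headers(username: str, password: str) -> dict:
    """Bearer header for token auth; empty for session auth (login would happen first)."""
    if auth_mode(username) == "api-token":
        return {"Authorization": f"Bearer {password}"}
    return {}


print(auth_mode("apitoken"))  # api-token
print(auth_mode("admin"))     # session-cookie
```

The comparison here is exact and case-sensitive, which is an assumption based on the word "literal" in the note.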
Register a Cisco Nexus 9000¶
# Pre-requisite on the switch:
# configure terminal
# feature nxapi
# nxapi https port 443
# nxapi use-vrf management
# end
curl -X POST "https://dashboard.example/api/admin/dut/devices" \
-H "Authorization: Bearer $ADMIN_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"hostname": "nexus-1.lab.example.com",
"vendor": "cisco-nexus",
"deviceRole": "switch",
"baseUrl": "https://10.10.10.10",
"username": "admin",
"password": "<password>",
"tlsVerifyMode": "self-signed",
"pollIntervalSeconds": 300
}'
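The three registration requests share one body shape, so a small builder can catch mistakes before the request is sent. The sketch below assumes only what this guide states: the field names from the curl examples, the three tlsVerifyMode values, the 300 s default, and the 30 s minimum. The function name build_registration and its client-side checks are illustrative; the server may validate differently.

```python
import json
from typing import Optional

ALLOWED_TLS_MODES = ("strict", "self-signed", "insecure")
MIN_POLL_INTERVAL_SECONDS = 30
DEFAULT_POLL_INTERVAL_SECONDS = 300


def build_registration(hostname: str, vendor: str, device_role: str, base_url: str,
                       username: str, password: str,
                       tls_verify_mode: str = "strict",
                       poll_interval_seconds: int = DEFAULT_POLL_INTERVAL_SECONDS,
                       notes: Optional[str] = None) -> dict:
    """Build the JSON body for POST /api/admin/dut/devices with basic checks."""
    if tls_verify_mode not in ALLOWED_TLS_MODES:
        raise ValueError(f"tlsVerifyMode must be one of {ALLOWED_TLS_MODES}")
    if poll_interval_seconds < MIN_POLL_INTERVAL_SECONDS:
        raise ValueError(f"pollIntervalSeconds must be >= {MIN_POLL_INTERVAL_SECONDS}")
    body = {
        "hostname": hostname,
        "vendor": vendor,
        "deviceRole": device_role,
        "baseUrl": base_url,
        "username": username,
        "password": password,
        "tlsVerifyMode": tls_verify_mode,
        "pollIntervalSeconds": poll_interval_seconds,
    }
    if notes is not None:
        body["notes"] = notes  # optional free-text field
    return body


body = build_registration("nexus-1.lab.example.com", "cisco-nexus", "switch",
                          "https://10.10.10.10", "admin", "<password>",
                          tls_verify_mode="self-signed")
print(json.dumps(body, indent=2))
```

The printed JSON can be saved to a file and passed to curl with -d @file.json.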
Test the connection¶
curl -X POST "https://dashboard.example/api/admin/dut/devices/<id>/test" \
-H "Authorization: Bearer $ADMIN_TOKEN"
# {"ok": true, "detail": "auth OK; CPU monitor reachable in 142ms", "latencyMs": 142}
If ok: false, the detail field tells you what failed (auth, TLS, timeout, HTTP error).
Trigger a manual snapshot¶
For ad-hoc inspection or right before a critical run:
curl -X POST "https://dashboard.example/api/admin/dut/devices/<id>/snapshot" \
-H "Authorization: Bearer $ADMIN_TOKEN"
# {
# "ok": true,
# "lastPollStatus": "ok",
# "lastPollError": "collected 7 snapshot(s)",
# "lastPollAt": "2026-05-06T14:35:00.000Z"
# }
This calls pollOneDevice() synchronously; the response confirms how many snapshots were collected.
Browse snapshots¶
# Latest 100 snapshots across all devices
curl "https://dashboard.example/api/admin/dut/snapshots" \
-H "Authorization: Bearer $ADMIN_TOKEN"
# Filter by device + endpoint label
curl "https://dashboard.example/api/admin/dut/snapshots?deviceId=...&endpointLabel=decrypt_policy" \
-H "Authorization: Bearer $ADMIN_TOKEN"
# Snapshots for a specific test run
curl "https://dashboard.example/api/admin/dut/snapshots?testRunExecutionId=<uuid>" \
-H "Authorization: Bearer $ADMIN_TOKEN"
Each snapshot in the response includes payloadSha256; the hash anchors the Test Run Report annexes.
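If you export a snapshot payload and want to confirm it still matches its recorded hash, a check along these lines may be enough. It assumes payloadSha256 is the hex-encoded SHA-256 of the raw payload bytes exactly as stored; this guide does not specify the encoding or any canonicalization, so treat that as an assumption to verify against DUT_API_INTEGRATION.md.

```python
import hashlib
import hmac


def payload_matches(payload: bytes, expected_sha256_hex: str) -> bool:
    """Compare the SHA-256 of the payload bytes with a recorded hex digest."""
    actual = hashlib.sha256(payload).hexdigest()
    # compare_digest avoids leaking where the strings differ; case is normalized first.
    return hmac.compare_digest(actual, expected_sha256_hex.strip().lower())


# Illustrative payload, not real device output.
payload = b'{"cpu": 12}'
recorded = hashlib.sha256(payload).hexdigest()
print(payload_matches(payload, recorded))            # True
print(payload_matches(payload + b" ", recorded))     # False
```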
Update a device¶
# Disable polling temporarily
curl -X PATCH "https://dashboard.example/api/admin/dut/devices/<id>" \
-H "Authorization: Bearer $ADMIN_TOKEN" \
-H "Content-Type: application/json" \
-d '{"enabled": false}'
# Rotate the password
curl -X PATCH "https://dashboard.example/api/admin/dut/devices/<id>" \
-H "Authorization: Bearer $ADMIN_TOKEN" \
-H "Content-Type: application/json" \
-d '{"password": "<new-password>"}'
# Change polling cadence
curl -X PATCH "https://dashboard.example/api/admin/dut/devices/<id>" \
-H "Authorization: Bearer $ADMIN_TOKEN" \
-H "Content-Type: application/json" \
-d '{"pollIntervalSeconds": 60}'
Delete a device¶
curl -X DELETE "https://dashboard.example/api/admin/dut/devices/<id>" \
-H "Authorization: Bearer $ADMIN_TOKEN"
Deleting a device also deletes all of its historical snapshots (FK ON DELETE CASCADE in the migration).
Debugging¶
Poller is not picking up new devices¶
- Check kubectl logs deployment/dashboard | grep '\[dut-api/poller\]' for the startup line
- Verify DUT_API_POLLER_DISABLE is not set to 1
- Verify DUT_CRED_ENC_KEY is provisioned (see one-time setup above)
Device shows last_poll_status: auth_failed¶
- Test the connection manually with the /test endpoint
- Verify the device's API user has the right permissions (FTD: API user role; Nexus: NX-API enabled in the user's profile; Fortinet: API admin profile)
- Check device-side logs (e.g. FTD /var/log/audit.log)
Device shows last_poll_status: tls_error¶
- Set tlsVerifyMode: 'self-signed' for devices with self-signed certs (default strict rejects them)
- For self-signed, the cert SHA-256 is pinned on first connect; if the cert rotates, TLS errors return, and the operator must clear pinned_cert_sha256 to accept the new cert
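The pin-on-first-connect behavior follows a trust-on-first-use pattern, which can be sketched as below. This is an illustration of the pattern under stated assumptions, not the dashboard's implementation: the fingerprint is assumed to be the hex SHA-256 of the DER-encoded certificate, and check_pin is a hypothetical name.

```python
import hashlib
import hmac
from typing import Optional, Tuple


def check_pin(cert_der: bytes, pinned_sha256: Optional[str]) -> Tuple[bool, str]:
    """Return (accepted, pin_to_store) for a presented certificate."""
    fingerprint = hashlib.sha256(cert_der).hexdigest()
    if pinned_sha256 is None:
        return True, fingerprint  # first connect: accept and pin
    # Later connects: accept only if the certificate is unchanged; keep the old pin.
    return hmac.compare_digest(fingerprint, pinned_sha256), pinned_sha256


# Stand-in byte strings; real input would be the peer certificate in DER form.
old_cert, new_cert = b"old-certificate-der", b"new-certificate-der"

accepted, pin = check_pin(old_cert, None)        # first connect pins old_cert
print(accepted)                                  # True
print(check_pin(old_cert, pin)[0])               # True  (same cert)
print(check_pin(new_cert, pin)[0])               # False (rotated cert -> tls_error)
print(check_pin(new_cert, None)[0])              # True  (pin cleared, new cert accepted)
```

The last line corresponds to the operator clearing pinned_cert_sha256: with no stored pin, the next connection pins whatever certificate the device presents, so clear it only when the rotation is expected.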
Snapshots stop arriving but no error logged¶
- Check Postgres advisory lock contention: if two dashboard pods are racing, only one polls. Confirm replicas=1 OR accept that snapshots come at twice the configured interval (each pod gets the lock half the time).
- Check disk space on the Postgres volume: dut_api_snapshots grows append-only.
Security model¶
- Credentials at rest: AES-256-GCM in Postgres BYTEA. Key is DUT_CRED_ENC_KEY env var (32 bytes), stored in K8s Secret.
- Credentials in transit (dashboard → device): HTTPS. tlsVerifyMode is per-device (strict / self-signed / insecure).
- Credentials in memory: only during one polling cycle; cleared after disconnect().
- API endpoints require admin auth (same as the rest of the admin UI).
- The Test Connection / Manual Snapshot endpoints are write-side actions; admin auth is re-checked per call.
Performance¶
The default cadence (each device polled once every 5 min, i.e. 12 polls per hour) generates roughly:
- 7 snapshots per FTD per poll → 84 snapshots/hour/device
- 6 snapshots per Nexus per poll → 72 snapshots/hour/device
- 8 snapshots per UCS per poll → 96 snapshots/hour/device
For a typical 4-device lab (1 NGFW + 1 switch + 2 UCS), expect ~350 snapshots/hour, ~8400/day, ~250k/month. At ~250 bytes per snapshot (with payload), this is ~63 MB/month — trivial for Postgres.
For aggressive 30 s cadence on the same lab: ~3.5k snapshots/hour, ~84k/day, ~2.5M/month, ~625 MB/month — still fine but worth a dut_api_snapshots retention policy.
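The estimates above can be reproduced with a short calculation, which also makes it easy to plug in your own lab. The per-poll snapshot counts and the ~250 bytes per snapshot come from this section; the 30-day month and the helper name are assumptions of the sketch.

```python
SNAPSHOTS_PER_POLL = {"ftd": 7, "nexus": 6, "ucs": 8}  # counts quoted above
BYTES_PER_SNAPSHOT = 250                                # rough size quoted above


def snapshots_per_hour(lab: dict, poll_interval_seconds: int) -> int:
    """Total snapshots per hour for a lab given as {device kind: count}."""
    polls_per_hour = 3600 // poll_interval_seconds
    return sum(SNAPSHOTS_PER_POLL[kind] * count * polls_per_hour
               for kind, count in lab.items())


lab = {"ftd": 1, "nexus": 1, "ucs": 2}  # the 4-device lab from the text

for interval in (300, 30):
    per_hour = snapshots_per_hour(lab, interval)
    per_month = per_hour * 24 * 30
    megabytes = per_month * BYTES_PER_SNAPSHOT / 1_000_000
    print(f"{interval:>3} s: {per_hour}/hour, {per_hour * 24}/day, "
          f"{per_month}/month, {megabytes:.1f} MB/month")
# 300 s: 348/hour, 8352/day, 250560/month, 62.6 MB/month
#  30 s: 3480/hour, 83520/day, 2505600/month, 626.4 MB/month
```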
Retention¶
There is no automatic retention today. To prune old snapshots manually, run:
DELETE FROM dut_api_snapshots
WHERE collected_at < now() - interval '30 days';
Run this as a Postgres CronJob if your lab generates high volume.
Future: a configurable dut_api_snapshot_retention_days field per device + automatic prune (PR-D scope).
Related¶
- DUT_API_INTEGRATION.md: architecture
- API_FEATURE_CATALOG.md: what features the API unlocks (45 cataloged)
- SYSLOG_OPERATIONS.md: second-pillar operations guide
- USAGE_POLICY.md: license restrictions apply