DUT API operations — registering devices and the polling worker

Scope status (post-Scope-Freeze 2026-05-10): see ARCHITECTURE.md for the canonical 37 MÓDULOs + 7 Test Kinds + DOM/CPOS/PIE-PA safety architecture. ADRs 0014 and 0019-0025 cover post-Freeze additions.

This is the operator-facing guide for the DUT API integration. Read DUT_API_INTEGRATION.md first for the architecture, and API_FEATURE_CATALOG.md for the features the integration unlocks. This guide covers day-to-day use: how to register a device, how the polling worker behaves, how to trigger a manual snapshot, and how to debug.

Polling worker — what it does

Once a device is registered, a polling worker inside the dashboard pod automatically queries it on a schedule. The worker:

  1. Runs every 30 seconds. This is the scan interval; it does not mean every device is polled every 30 s.
  2. Selects devices where last_poll_at + poll_interval_seconds < now() OR last_poll_at IS NULL
  3. For each due device:
       a. Decrypts the password using DUT_CRED_ENC_KEY
       b. Instantiates the appropriate adapter (Cisco FTD, Cisco Nexus, Cisco UCS, Fortinet)
       c. Calls the adapter's collectAll*() method
       d. Persists each snapshot to dut_api_snapshots with SHA-256
       e. Updates the device row's last_poll_at, last_poll_status, last_poll_error
  4. Is multi-replica safe via a Postgres advisory lock: if two dashboard pods are running, only one polls at a time
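
The selection rule in step 2 can be sketched as a small predicate. This is an illustration only: the function names and the camelCase field names (lastPollAt, pollIntervalSeconds) are assumptions, and the real worker presumably applies the rule in SQL.

```javascript
// Hypothetical helper mirroring the documented rule:
//   last_poll_at + poll_interval_seconds < now() OR last_poll_at IS NULL
// Field names are assumed, not taken from the codebase.
function isDue(device, now = new Date()) {
  if (device.lastPollAt == null) return true; // never polled yet
  const dueAtMs =
    new Date(device.lastPollAt).getTime() + device.pollIntervalSeconds * 1000;
  return dueAtMs < now.getTime();
}

function selectDueDevices(devices, now = new Date()) {
  return devices.filter((d) => isDue(d, now));
}

const now = new Date("2026-05-06T14:35:00.000Z");
const devices = [
  { hostname: "ftd-1.lab", lastPollAt: null, pollIntervalSeconds: 300 },
  { hostname: "nexus-1.lab", lastPollAt: "2026-05-06T14:34:00.000Z", pollIntervalSeconds: 300 },
  { hostname: "ucs-1.lab", lastPollAt: "2026-05-06T14:29:00.000Z", pollIntervalSeconds: 300 },
];
console.log(selectDueDevices(devices, now).map((d) => d.hostname));
// [ 'ftd-1.lab', 'ucs-1.lab' ]
```

Because the scan runs every 30 seconds, a device becomes eligible at the first scan after its interval has elapsed, so the effective cadence can lag the configured interval by up to one scan period.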

Disabling the worker

# Set the env var in the dashboard's K8s Deployment:
DUT_API_POLLER_DISABLE=1

The worker logs a one-shot disable notice at startup and returns. Manual snapshots via the API endpoint still work.
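
A minimal sketch of what such a startup guard could look like. The function name and the log wording are made up for illustration, and it assumes that only the literal value "1" disables the worker.

```javascript
// Hypothetical startup guard.
// ASSUMPTION: only the literal string "1" disables the worker.
function shouldStartPoller(env = process.env) {
  return env.DUT_API_POLLER_DISABLE !== "1";
}

if (!shouldStartPoller({ DUT_API_POLLER_DISABLE: "1" })) {
  // One-shot notice, then return without scheduling any scans.
  // The message text is illustrative; only the log prefix appears in this guide.
  console.log("[dut-api/poller] disabled via DUT_API_POLLER_DISABLE=1");
}
```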

Default polling interval

poll_interval_seconds defaults to 300 (5 min) at registration. Per-device override is allowed (minimum 30 s). Recommended values:

Use case                                          Interval
Background inventory refresh                      300 s (default)
Active engagement, "DUT Live State" dashboard     60 s
Forensic baseline runs (long-running SOAK)        30 s
Pre-flight check moment (one-shot, not periodic)  use the POST /snapshot endpoint instead
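
The default and minimum above can be expressed as a small validator. This is a sketch under assumptions: the function name is hypothetical, and whether the real API rejects or clamps an out-of-range value is not stated here, so this version rejects.

```javascript
const DEFAULT_POLL_INTERVAL_SECONDS = 300; // documented default at registration
const MIN_POLL_INTERVAL_SECONDS = 30;      // documented per-device minimum

// Hypothetical validator. ASSUMPTION: out-of-range values are rejected, not clamped.
function resolvePollInterval(requested) {
  if (requested === undefined || requested === null) {
    return DEFAULT_POLL_INTERVAL_SECONDS;
  }
  if (!Number.isInteger(requested)) {
    throw new RangeError("pollIntervalSeconds must be an integer");
  }
  if (requested < MIN_POLL_INTERVAL_SECONDS) {
    throw new RangeError(`pollIntervalSeconds must be >= ${MIN_POLL_INTERVAL_SECONDS}`);
  }
  return requested;
}

console.log(resolvePollInterval(undefined)); // 300
console.log(resolvePollInterval(60));        // 60
```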

One-time setup — encryption key

Credentials are stored AES-256-GCM-encrypted with DUT_CRED_ENC_KEY (env var). Before registering the first device, provision the key:

# 1. Generate the key
KEY=$(node -e "console.log(require('crypto').randomBytes(32).toString('hex'))")

# 2. Provision via K8s Secret
kubectl create secret generic tlsstress-dut-cred \
  --from-literal=DUT_CRED_ENC_KEY="$KEY" \
  --namespace web-agents

# 3. Wire to the dashboard Deployment env
kubectl patch deployment dashboard --namespace web-agents \
  --patch='{"spec":{"template":{"spec":{"containers":[{"name":"dashboard","env":[{"name":"DUT_CRED_ENC_KEY","valueFrom":{"secretKeyRef":{"name":"tlsstress-dut-cred","key":"DUT_CRED_ENC_KEY"}}}]}]}}}}'

# 4. Restart the dashboard to pick up the env var
kubectl rollout restart deployment dashboard --namespace web-agents
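
For orientation, here is a sketch of AES-256-GCM encryption with a 32-byte hex key using Node's built-in crypto module. It is not the dashboard's actual code: the function names and the stored layout (12-byte IV, then 16-byte auth tag, then ciphertext) are assumptions chosen for illustration.

```javascript
const crypto = require("crypto");

// ASSUMPTION: DUT_CRED_ENC_KEY holds the 64-character hex string generated in step 1.
function loadKey(hex) {
  if (!/^[0-9a-fA-F]{64}$/.test(hex || "")) {
    throw new Error("DUT_CRED_ENC_KEY must be 64 hex characters (32 bytes)");
  }
  return Buffer.from(hex, "hex");
}

// ASSUMPTION: stored blob layout is iv (12 bytes) || auth tag (16 bytes) || ciphertext.
function encryptPassword(key, plaintext) {
  const iv = crypto.randomBytes(12); // fresh IV for every encryption
  const cipher = crypto.createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return Buffer.concat([iv, cipher.getAuthTag(), ciphertext]);
}

function decryptPassword(key, blob) {
  const iv = blob.subarray(0, 12);
  const tag = blob.subarray(12, 28);
  const ciphertext = blob.subarray(28);
  const decipher = crypto.createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag); // final() throws if the blob was tampered with
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8");
}

const key = loadKey(crypto.randomBytes(32).toString("hex"));
const blob = encryptPassword(key, "s3cret");
console.log(decryptPassword(key, blob) === "s3cret"); // true
```

Because GCM is authenticated, decrypting with a different key fails outright instead of returning garbage. Replacing DUT_CRED_ENC_KEY therefore makes credentials encrypted under the old key unreadable unless they are re-encrypted.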

API — registering devices

All endpoints require admin auth (the same auth the rest of the admin UI uses).

List vendors + devices

GET /api/admin/dut/devices

Response:

{
  "devices": [
    { "id": "...", "hostname": "ftd-1.lab", "vendor": "cisco-ftd", "lastPollStatus": "ok", ... }
  ],
  "vendors": [
    { "key": "cisco-ftd",          "displayName": "Cisco FTD (FDM API)",          "available": true },
    { "key": "cisco-nexus",        "displayName": "Cisco Nexus 9000 (NX-API DME)", "available": true },
    { "key": "cisco-ucs-cimc",     "displayName": "Cisco UCS C-Series CIMC",       "available": true },
    { "key": "fortinet-fortigate", "displayName": "Fortinet FortiGate (REST v2)",  "available": true },
    { "key": "palo-alto-panos",    "displayName": "Palo Alto (not yet implemented)","available": false }
  ]
}

Register a Cisco FTD

curl -X POST "https://dashboard.example/api/admin/dut/devices" \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "hostname": "ftd-1.lab.example.com",
    "vendor": "cisco-ftd",
    "deviceRole": "ngfw",
    "baseUrl": "https://10.10.10.1",
    "username": "admin",
    "password": "<the-ftd-password>",
    "tlsVerifyMode": "self-signed",
    "pollIntervalSeconds": 300,
    "notes": "Cisco FTD 7.4 on FPR1010 — primary in HA pair"
  }'
# {"ok": true, "deviceId": "..."}

Register a Fortinet FortiGate (with API token)

curl -X POST "https://dashboard.example/api/admin/dut/devices" \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "hostname": "fortigate-1.lab.example.com",
    "vendor": "fortinet-fortigate",
    "deviceRole": "ngfw",
    "baseUrl": "https://10.10.10.5",
    "username": "apitoken",
    "password": "<the-api-token>",
    "tlsVerifyMode": "strict",
    "pollIntervalSeconds": 300
  }'

Note: setting username to the literal apitoken signals the adapter to use API-token auth (Bearer); otherwise it falls back to cookie-based session auth.
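
The rule in the note can be sketched as follows. This is a hypothetical helper, not the adapter's real code; the function name and the returned shape are assumptions.

```javascript
// Hypothetical sketch of the documented rule:
// username === "apitoken" selects API-token (Bearer) auth, anything else session auth.
function fortigateAuthPlan(username, password) {
  if (username === "apitoken") {
    // The value in the "password" field is the API token itself.
    return { mode: "api-token", headers: { Authorization: `Bearer ${password}` } };
  }
  // Cookie-based session auth: a login request would run first and set the cookie.
  return { mode: "session-cookie", headers: {} };
}

console.log(fortigateAuthPlan("apitoken", "tok123").mode); // api-token
console.log(fortigateAuthPlan("admin", "pw").mode);        // session-cookie
```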

Register a Cisco Nexus 9000

# Pre-requisite on the switch:
#   configure terminal
#   feature nxapi
#   nxapi https port 443
#   nxapi use-vrf management
#   end

curl -X POST "https://dashboard.example/api/admin/dut/devices" \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "hostname": "nexus-1.lab.example.com",
    "vendor": "cisco-nexus",
    "deviceRole": "switch",
    "baseUrl": "https://10.10.10.10",
    "username": "admin",
    "password": "<password>",
    "tlsVerifyMode": "self-signed",
    "pollIntervalSeconds": 300
  }'

Test the connection

curl -X POST "https://dashboard.example/api/admin/dut/devices/<id>/test" \
  -H "Authorization: Bearer $ADMIN_TOKEN"
# {"ok": true, "detail": "auth OK; CPU monitor reachable in 142ms", "latencyMs": 142}

If ok: false, the detail field tells you what failed (auth, TLS, timeout, HTTP error).

Trigger a manual snapshot

For ad-hoc inspection or right before a critical run:

curl -X POST "https://dashboard.example/api/admin/dut/devices/<id>/snapshot" \
  -H "Authorization: Bearer $ADMIN_TOKEN"
# {
#   "ok": true,
#   "lastPollStatus": "ok",
#   "lastPollError": "collected 7 snapshot(s)",
#   "lastPollAt": "2026-05-06T14:35:00.000Z"
# }

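This calls pollOneDevice() synchronously, so the request returns only after collection has finished. On success, the number of snapshots collected is reported in the lastPollError field, as in the example response.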
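
When scripting around this endpoint, the count can be pulled out of the response like this. It is a sketch that assumes the success message keeps the "collected N snapshot(s)" wording from the example; that wording is not a documented contract, and the function name is hypothetical.

```javascript
// ASSUMPTION: on success, lastPollError reads "collected N snapshot(s)" as in the example.
function snapshotCount(response) {
  if (!response.ok || response.lastPollStatus !== "ok") return null;
  const match = /^collected (\d+) snapshot\(s\)$/.exec(response.lastPollError || "");
  return match ? Number(match[1]) : null;
}

const exampleResponse = {
  ok: true,
  lastPollStatus: "ok",
  lastPollError: "collected 7 snapshot(s)",
  lastPollAt: "2026-05-06T14:35:00.000Z",
};
console.log(snapshotCount(exampleResponse)); // 7
```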

Browse snapshots

# Latest 100 snapshots across all devices
curl "https://dashboard.example/api/admin/dut/snapshots" \
  -H "Authorization: Bearer $ADMIN_TOKEN"

# Filter by device + endpoint label
curl "https://dashboard.example/api/admin/dut/snapshots?deviceId=...&endpointLabel=decrypt_policy" \
  -H "Authorization: Bearer $ADMIN_TOKEN"

# Snapshots for a specific test run
curl "https://dashboard.example/api/admin/dut/snapshots?testRunExecutionId=<uuid>" \
  -H "Authorization: Bearer $ADMIN_TOKEN"

The response includes payloadSha256 for each snapshot; the hash anchors the Test Run Report annexes.
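
A sketch of re-checking a snapshot against its recorded hash. It assumes the hash is taken over the UTF-8 bytes of the payload text exactly as stored; this guide does not specify the serialization, so confirm that before relying on a mismatch. The function names are hypothetical.

```javascript
const crypto = require("crypto");

// ASSUMPTION: payloadSha256 is the SHA-256 of the payload text's UTF-8 bytes,
// hex-encoded. The real serialization may differ.
function sha256Hex(payloadText) {
  return crypto.createHash("sha256").update(payloadText, "utf8").digest("hex");
}

function matchesRecordedHash(payloadText, payloadSha256) {
  return sha256Hex(payloadText) === payloadSha256.toLowerCase();
}

console.log(sha256Hex("abc"));
// ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad
```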

Update a device

# Disable polling temporarily
curl -X PATCH "https://dashboard.example/api/admin/dut/devices/<id>" \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"enabled": false}'

# Rotate the password
curl -X PATCH "https://dashboard.example/api/admin/dut/devices/<id>" \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"password": "<new-password>"}'

# Change polling cadence
curl -X PATCH "https://dashboard.example/api/admin/dut/devices/<id>" \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"pollIntervalSeconds": 60}'

Delete a device

curl -X DELETE "https://dashboard.example/api/admin/dut/devices/<id>" \
  -H "Authorization: Bearer $ADMIN_TOKEN"

Deleting a device also deletes all of its historical snapshots (FK ON DELETE CASCADE in the migration).

Debugging

Poller is not picking up new devices

  • Check kubectl logs deployment/dashboard --namespace web-agents | grep '\[dut-api/poller\]' for the startup line
  • Verify DUT_API_POLLER_DISABLE is not set to 1
  • Verify DUT_CRED_ENC_KEY is provisioned (see one-time setup above)

Device shows last_poll_status: auth_failed

  • Test the connection manually with the /test endpoint
  • Verify the device's API user has the right permissions (FTD: API user role; Nexus: NX-API enabled in the user's profile; Fortinet: API admin profile)
  • Check device-side logs (e.g. FTD /var/log/audit.log)

Device shows last_poll_status: tls_error

  • Set tlsVerifyMode: 'self-signed' for devices with self-signed certs (default strict rejects them)
  • For self-signed, the cert SHA-256 is pinned on first connect; if the cert rotates, TLS errors return — operator must clear pinned_cert_sha256 to accept the new cert
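
The pin-on-first-connect behaviour described above is a trust-on-first-use check, which can be sketched as below. The function names, the camelCase field pinnedCertSha256, and the choice to hash the DER-encoded certificate bytes are assumptions for illustration.

```javascript
const crypto = require("crypto");

// ASSUMPTION: the pin is the hex SHA-256 of the DER-encoded certificate.
function certFingerprint(certDer) {
  return crypto.createHash("sha256").update(certDer).digest("hex");
}

// Returns the pin to store and whether the connection may proceed.
function checkPin(device, certDer) {
  const seen = certFingerprint(certDer);
  if (device.pinnedCertSha256 == null) {
    return { ok: true, firstUse: true, pinnedCertSha256: seen }; // pin on first connect
  }
  if (device.pinnedCertSha256 === seen) {
    return { ok: true, firstUse: false, pinnedCertSha256: seen };
  }
  // Cert rotated: keep failing until an operator clears the pin.
  return { ok: false, firstUse: false, pinnedCertSha256: device.pinnedCertSha256 };
}

const oldCert = Buffer.from("old-cert-bytes"); // stand-ins for real DER bytes
const newCert = Buffer.from("new-cert-bytes");
const device = { pinnedCertSha256: null };
device.pinnedCertSha256 = checkPin(device, oldCert).pinnedCertSha256;
console.log(checkPin(device, oldCert).ok); // true
console.log(checkPin(device, newCert).ok); // false
```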

Snapshots stop arriving but no error logged

  • Check Postgres advisory lock contention: if two dashboard pods are racing, only one polls. Confirm replicas=1 OR accept that snapshots come at twice the configured interval (each pod gets the lock half the time).
  • Check disk space on the Postgres volume — dut_api_snapshots grows append-only.
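
To make the advisory-lock behaviour concrete, here is a sketch of a non-blocking lock guard. pg_try_advisory_lock and pg_advisory_unlock are standard Postgres functions; everything else is assumed: the lock key is made up, the worker may use a different locking variant, and `client` stands for any object with a pg-style async query(text, params) method.

```javascript
const POLLER_LOCK_KEY = 771001; // made-up key, for illustration only

// Runs pollFn only if this session obtains the advisory lock.
// Session-level advisory locks must be released on the same connection that took them.
async function withPollerLock(client, pollFn) {
  const res = await client.query(
    "SELECT pg_try_advisory_lock($1) AS locked", [POLLER_LOCK_KEY]);
  if (!res.rows[0].locked) return { ran: false }; // another pod is polling
  try {
    await pollFn();
    return { ran: true };
  } finally {
    await client.query("SELECT pg_advisory_unlock($1)", [POLLER_LOCK_KEY]);
  }
}

// Fake client for the demo: grants the lock only while nobody holds it.
function makeFakeClient(state) {
  return {
    async query(text) {
      if (text.includes("pg_try_advisory_lock")) {
        const locked = !state.held;
        if (locked) state.held = true;
        return { rows: [{ locked }] };
      }
      state.held = false; // pg_advisory_unlock
      return { rows: [{}] };
    },
  };
}

const shared = { held: false };
withPollerLock(makeFakeClient(shared), async () => {
  const podB = await withPollerLock(makeFakeClient(shared), async () => {});
  console.log("pod B ran while pod A held the lock:", podB.ran); // false
}).then((podA) => console.log("pod A ran:", podA.ran)); // true
```

To see whether a lock is currently held on a live system, the pg_locks view lists advisory locks with locktype = 'advisory'.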

Security model

  • Credentials at rest: AES-256-GCM in Postgres BYTEA. Key is DUT_CRED_ENC_KEY env var (32 bytes), stored in K8s Secret.
  • Credentials in transit (dashboard → device): HTTPS. tlsVerifyMode is per-device (strict / self-signed / insecure).
  • Credentials in memory: only during one polling cycle; cleared after disconnect().
  • API endpoints require admin auth (same as the rest of the admin UI).
  • The Test Connection / Manual Snapshot endpoints are write-side actions — admin auth re-checked per call.

Performance

At the default cadence (each device polled every 5 min, i.e. 12 polls per hour), each device generates roughly:

  • 7 snapshots per FTD per poll → 84 snapshots/hour/device
  • 6 snapshots per Nexus per poll → 72 snapshots/hour/device
  • 8 snapshots per UCS per poll → 96 snapshots/hour/device

For a typical 4-device lab (1 NGFW + 1 switch + 2 UCS), taking the FTD figure for the NGFW, expect 84 + 72 + 2 × 96 = 348, or ~350 snapshots/hour, ~8,400/day, ~250k/month. At ~250 bytes per snapshot (with payload), this is ~63 MB/month, which is trivial for Postgres.

For an aggressive 30 s cadence on the same lab (10× the poll rate): ~3.5k snapshots/hour, ~84k/day, ~2.5M/month, ~625 MB/month. That is still fine, but worth adding a dut_api_snapshots retention policy.
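
The estimates above can be reproduced with a short calculation. The per-poll counts and the ~250-byte figure come from this section; the function name, the 30-day month, and treating the lab's NGFW as an FTD are assumptions.

```javascript
// Snapshots per poll, as listed above (no figure is given for Fortinet).
const SNAPSHOTS_PER_POLL = { "cisco-ftd": 7, "cisco-nexus": 6, "cisco-ucs-cimc": 8 };

// ASSUMPTIONS: 30-day month, ~250 bytes per snapshot, 1 MB = 1e6 bytes.
function estimateVolume(vendors, pollIntervalSeconds, bytesPerSnapshot = 250) {
  const pollsPerHour = 3600 / pollIntervalSeconds;
  const perPoll = vendors.reduce((sum, v) => sum + SNAPSHOTS_PER_POLL[v], 0);
  const perHour = perPoll * pollsPerHour;
  const perDay = perHour * 24;
  const perMonth = perDay * 30;
  return { perHour, perDay, perMonth, megabytesPerMonth: (perMonth * bytesPerSnapshot) / 1e6 };
}

// 1 NGFW (assumed FTD) + 1 switch + 2 UCS
const lab = ["cisco-ftd", "cisco-nexus", "cisco-ucs-cimc", "cisco-ucs-cimc"];
console.log(estimateVolume(lab, 300));
// perHour 348, perDay 8352, perMonth 250560, megabytesPerMonth 62.64
console.log(estimateVolume(lab, 30));
// perHour 3480, perDay 83520, perMonth 2505600, megabytesPerMonth 626.4
```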

Retention

There is no automatic retention today. Operators who want to prune old snapshots can run:

DELETE FROM dut_api_snapshots
WHERE collected_at < now() - interval '30 days';

Run this on a schedule (for example, from a Kubernetes CronJob that invokes psql) if your lab generates high volume.

Future: a configurable dut_api_snapshot_retention_days field per device + automatic prune (PR-D scope).