ADR 0009 — L2 BPDU isolation: bidirectional Spanning-Tree filter at every test-bench boundary

  • Status: Accepted
  • Date: 2026-05-07
  • Deciders: TLSStress.Art project
  • Targets: v4.3.1 (hotfix, prerequisite for v4.4 Branch Office Simulation)

Context

The TLSStress.Art test bench operates at L2 in several places. Without explicit configuration, Linux bridges, the VyOS pod, the Nexus 9000 trunk, and the NGFW under test all participate in or generate Spanning-Tree Protocol (STP) traffic. This creates two production-grade risks every time the lab is cabled to a customer-adjacent switch:

  1. STP topology corruption of the customer network. Our internal switches and bridges generate BPDUs by default. A BPDU leaking onto a customer uplink can cause the customer's switches to elect our lab as a candidate root bridge, triggering an STP reconvergence event (~30s blackhole + a suboptimal post-convergence topology). This is a customer-visible outage caused by the lab.
  2. Internal L2 loop. A redundant path inside the test bench (e.g. two macvlan parents on the same physical interface) without STP becomes an infinite broadcast loop. Bridge CPU saturates, the lab dies, and the diagnosis is painful.

The threat is bidirectional:

  • BPDUs can ENTER the lab from the customer-facing uplink (incoming threat)
  • BPDUs ARE GENERATED inside the lab by our own equipment (outgoing threat):
      • Nexus 9000 generates BPDUs every 2s (Hello timer) on every STP-enabled port; that is the switch's job
      • NGFW (Cisco FTD) in transparent mode forwards BPDUs end-to-end by default; in routed mode it does not, but HA failover heartbeats are sometimes confused with BPDUs by client-side IDS
      • Linux bridges with stp_state=1 participate in STP
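
To make "a BPDU" concrete for both directions of the threat, the following is a minimal sketch of how a raw Ethernet frame could be classified. It is not taken from the project's code: the function name `is_bpdu` is hypothetical, and the constants are the well-known IEEE bridge group address, the Cisco PVST+ destination address, and the STP LLC header (DSAP 0x42, SSAP 0x42, control 0x03). It handles at most one 802.1Q tag.

```python
import struct

STP_MAC = bytes.fromhex("0180c2000000")   # IEEE 802.1D bridge group address
PVST_MAC = bytes.fromhex("01000ccccccd")  # Cisco PVST+ destination address
LLC_STP = bytes([0x42, 0x42, 0x03])       # DSAP, SSAP, control field for STP
VLAN_TPID = 0x8100                        # 802.1Q tag protocol identifier


def is_bpdu(frame: bytes) -> bool:
    """Return True if a raw Ethernet frame looks like an STP or PVST+ BPDU.

    Hypothetical helper for illustration only.
    """
    # 6 (dst) + 6 (src) + 2 (length) + 3 (LLC) bytes is the minimum we inspect.
    if len(frame) < 17:
        return False
    dst = frame[:6]
    if dst == PVST_MAC:
        return True
    if dst != STP_MAC:
        return False
    offset = 12
    (type_or_len,) = struct.unpack_from("!H", frame, offset)
    if type_or_len == VLAN_TPID:
        # Skip a single 802.1Q tag and re-read the length field.
        offset += 4
        if len(frame) < offset + 5:
            return False
        (type_or_len,) = struct.unpack_from("!H", frame, offset)
    # An 802.3 length (<= 1500) followed by the STP LLC header.
    return type_or_len <= 1500 and frame[offset + 2:offset + 5] == LLC_STP
```

A classifier of this kind applies equally to frames arriving from the customer uplink and to frames emitted by lab equipment, which is why the filtering below has to be bidirectional.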

Annex G ("air-gap attestation") of the run report currently attests 4 layers of isolation (physical, BGP, NetworkPolicy, DNS NXDOMAIN). It does not attest the L2 boundary.

Decision

Implement bidirectional Spanning-Tree filtering at every test-bench boundary, in 4 configuration layers (Linux bridges, Nexus 9000 trunk, VyOS pod, NGFW), and extend Annex G with a 5th attestation layer that verifies this isolation per run.

Layer 1 — Linux bridges (Multus / macvlan parents)

For every Linux bridge that touches the test-bench data path:

# Disable STP on the bridge itself — we do not participate.
ip link set <bridge> type bridge stp_state 0

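# For each port, enable BPDU guard (the port is disabled if a BPDU arrives) and root_block.
# bridge(8) spells the flag "guard" (sysfs: bpdu_guard) and prints port names as "eth1:" or "eth1@if4:".
for port in $(bridge link show master <bridge> | awk '{sub(/(@.*)?:$/, "", $2); print $2}'); do
  bridge link set dev "$port" guard on
  bridge link set dev "$port" root_block on
done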

A privileged DaemonSet runs at node-bring-up and applies these to every relevant bridge. The DaemonSet is idempotent and re-runs on bridge creation events.
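
As an illustration of the idempotent pass described above, the sketch below parses `bridge link show` text output and builds the command list for one bridge. It is an assumption about how such a helper might look, not the DaemonSet's actual code: `parse_bridge_ports` and `hardening_commands` are hypothetical names, and the sample line format is the usual iproute2 one-line-per-port output. Note that `bridge link set` names the BPDU guard flag `guard`.

```python
import re

# Matches lines such as:
#   3: eth1@if4: <BROADCAST,MULTICAST,UP> mtu 1500 master br0 state forwarding
PORT_LINE = re.compile(r"^\d+:\s+([^:@\s]+)(?:@\S+)?:\s.*\bmaster\s+(\S+)")


def parse_bridge_ports(output: str, bridge: str) -> list[str]:
    """Return the port names enslaved to `bridge` (hypothetical helper)."""
    ports = []
    for line in output.splitlines():
        match = PORT_LINE.match(line)
        if match and match.group(2) == bridge:
            ports.append(match.group(1))
    return ports


def hardening_commands(bridge: str, ports: list[str]) -> list[str]:
    """Build the commands for one bridge.

    Every command sets an absolute value, so running the list twice
    leaves the bridge in the same state as running it once.
    """
    commands = [f"ip link set {bridge} type bridge stp_state 0"]
    for port in ports:
        commands.append(f"bridge link set dev {port} guard on")
        commands.append(f"bridge link set dev {port} root_block on")
    return commands
```

Keeping command generation separate from execution makes the pass easy to dry-run and to log before anything is applied to a node.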

Layer 2 — Nexus 9000 trunk

Two port classes, two configurations:

Lab-internal ports (Nexus → Linux bridges, macvlan parents): we want to detect inconsistencies if our internal config drifts (e.g. someone re-enables STP on a Linux bridge).

interface Ethernet1/X
  spanning-tree port type edge          ! PortFast equivalent: skip listening/learning
                                        ! (the port still sends BPDUs)
  spanning-tree bpduguard enable        ! receive BPDU → err-disable the port
                                        ! (canary for our own misconfig)
  storm-control broadcast level 1.0
  storm-control multicast level 1.0
  storm-control unicast level 1.0       ! all 3 storm types

Customer-facing uplink ports (Nexus → customer's switch): we MUST NOT generate BPDUs that pollute their topology. We MUST NOT propagate inbound BPDUs into our lab.

interface Ethernet1/49
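  spanning-tree bpdufilter enable       ! BIDIRECTIONAL silent block
  spanning-tree bpduguard enable        ! backstop: err-disable on BPDU receipt
                                        ! if the filter is ever removed
  storm-control broadcast level 1.0
  storm-control multicast level 1.0
  storm-control unicast level 1.0

The distinction matters: bpdufilter silently drops in both directions (it stops the Nexus from sending Hellos and ignores incoming BPDUs); bpduguard err-disables the port on receipt but does not stop the Nexus from sending. The boundary uplink configures both. Because the interface-level filter is generally evaluated before the guard on Cisco platforms, bpduguard is expected to act only as a backstop if the filter is removed or fails to apply; confirm this precedence on the NX-OS release in use.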
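
Because the two port classes differ by a single line, a small generator can reduce the chance of applying the wrong block to a port. The sketch below is illustrative only: `render_port` and the class labels `lab-internal` and `customer-uplink` are hypothetical names, and the emitted lines simply mirror the two interface blocks shown above.

```python
LAB_INTERNAL = "lab-internal"        # hypothetical label for Nexus -> Linux bridge ports
CUSTOMER_UPLINK = "customer-uplink"  # hypothetical label for Nexus -> customer switch ports

STORM_CONTROL = [
    "storm-control broadcast level 1.0",
    "storm-control multicast level 1.0",
    "storm-control unicast level 1.0",
]


def render_port(name: str, port_class: str) -> str:
    """Render the interface stanza for one port, refusing unknown classes."""
    if port_class == LAB_INTERNAL:
        stp = ["spanning-tree port type edge", "spanning-tree bpduguard enable"]
    elif port_class == CUSTOMER_UPLINK:
        stp = ["spanning-tree bpdufilter enable", "spanning-tree bpduguard enable"]
    else:
        # An unclassified port must never silently fall back to a default.
        raise ValueError(f"unknown port class for {name}: {port_class!r}")
    body = ["  " + line for line in stp + STORM_CONTROL]
    return "\n".join([f"interface {name}"] + body)


if __name__ == "__main__":
    print(render_port("Ethernet1/49", CUSTOMER_UPLINK))
```

Raising on an unknown class is a deliberate choice in this sketch: a port with no explicit label produces an error instead of a configuration.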

Layer 3 — VyOS pod

set interfaces bridge br0 stp false
set interfaces bridge br0 member interface eth1 bpdu-guard
set interfaces bridge br0 member interface eth1 root-guard

Apply to every VyOS bridge (the main br0 and any per-country macvlan parent bridges that VyOS owns).

Layer 4 — NGFW (Cisco FTD)

Routed mode (current deployment): no action required, BPDUs are not forwarded by routed-mode interfaces.

Transparent mode (not currently used, future-proofed): explicit ethertype deny:

firewall transparent
no ethertype 0x4242 permit          ; IEEE 802.1D BPDU
no ethertype 0x4243 permit          ; PVST+ variant

This decision is documented for the v4.5 work where transparent-mode tests might be introduced.

Annex G — extended to 5 layers

The 5th layer: L2 BPDU isolation. Verified per run by:

report.airgap.layer5_l2_bpdu:
  bridges_audited[]:
    - name: br0
      stp_state: 0
      ports[]:
        - name: eth1
          bpdu_guard: on
          root_block: on
  nexus_ports_audited[]:
    - name: Ethernet1/1
      port_type: edge
      bpdu_guard: enable
      bpdu_filter: enable             # uplink-class only
  bpdu_capture_60s:
    bpdu_packets_observed: 0          # MUST be 0
    capture_interface: any
    capture_filter: "ether proto 0x0026 or stp"
  all_layer5_passed: bool

A 60-second tcpdump capture is the smoking gun: zero BPDU packets observed during 60 s of normal lab activity means the isolation works. Any non-zero count blocks the test run from starting (gating in production; an observational warning in development environments).
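
The pass/fail rule for the 5th layer can be sketched as a pure function over the report structure. This is an assumption-laden illustration, not the project's preflight code: `layer5_passed` is a hypothetical name, the report is assumed to arrive as a parsed dict whose list keys drop the `[]` schema suffix, and the values are assumed to be the literal strings shown in the schema.

```python
def layer5_passed(report: dict) -> bool:
    """Return True only if every audited item and the capture are clean."""
    bridges_ok = all(
        bridge["stp_state"] == 0
        and all(
            port["bpdu_guard"] == "on" and port["root_block"] == "on"
            for port in bridge["ports"]
        )
        for bridge in report["bridges_audited"]
    )
    nexus_ok = all(
        port["bpdu_guard"] == "enable" for port in report["nexus_ports_audited"]
    )
    # A single observed BPDU in the 60-second capture fails the layer.
    capture_ok = report["bpdu_capture_60s"]["bpdu_packets_observed"] == 0
    return bridges_ok and nexus_ok and capture_ok


example = {
    "bridges_audited": [
        {"name": "br0", "stp_state": 0,
         "ports": [{"name": "eth1", "bpdu_guard": "on", "root_block": "on"}]},
    ],
    "nexus_ports_audited": [
        {"name": "Ethernet1/1", "port_type": "edge", "bpdu_guard": "enable"},
    ],
    "bpdu_capture_60s": {"bpdu_packets_observed": 0},
}
```

The sketch does not check `bpdu_filter`, because the schema marks it as uplink-class only and does not show a field that identifies the port class.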

Consequences

Positive

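  • Customer outages caused by lab-originated BPDUs are prevented at the L2 boundary, provided every uplink port is classified correctly. The lab can then be cabled safely to any customer network.
  • Internal L2 loops are contained instead of being resolved by STP: no bridge participates in STP, so storm control caps broadcast, multicast, and unicast floods on every Nexus port, and bpduguard on lab-internal ports flags configuration drift.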
  • Annex G grows from 4 to 5 layers of cryptographic attestation per run. This is a quantifiable improvement to the report's defensibility.
  • The 60-second BPDU capture is reproducible evidence the lab can show to compliance auditors or to a customer who asks "how do you know you didn't affect my network?".

Negative

  • Every Nexus port re-classification is manual upfront work. Operators must label which physical ports are lab-internal vs customer-uplink and apply the right interface block. Misclassifying a customer-facing port as lab-internal would re-introduce the leak.
  • The DaemonSet for Linux bridges adds privileged container surface. We mitigate by scoping securityContext.capabilities.add: ["NET_ADMIN"] to only this DaemonSet and gating its image with the same supply-chain policy as the rest of the stack.
  • The 60-second BPDU capture adds ~1 minute to test bring-up. Acceptable trade-off given the scenarios it prevents.

Neutral

  • On the Nexus side the methodology applies to any Cisco NX-OS platform (all of them support spanning-tree port type edge, bpduguard, bpdufilter). Arista and Juniper switches offer equivalent features, so adapting the configuration is mechanical.

Operator-facing rules

  1. Every new lab cabling change must be followed by scripts/airgap-l2-verify.sh, which runs the 60-second capture and prints PASS/FAIL.
  2. Test runs refuse to start in production environments unless the layer-5 attestation has passed within the last 5 minutes.
  3. In development environments (AIRGAP_GATING=observational), a layer-5 failure logs a WARNING but does not block the run.
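
Rules 2 and 3 amount to a small decision table, sketched below. The function name `gate` and the return values are hypothetical; the only value taken from this document is `observational` for AIRGAP_GATING. Two assumptions are made explicit in the code: any other mode is treated as production, and a stale attestation is treated like a failed one.

```python
MAX_ATTESTATION_AGE_SECONDS = 5 * 60  # "within the last 5 minutes"


def gate(passed: bool, age_seconds: float, mode: str) -> str:
    """Return "start", "warn", or "block" for a pending test run.

    Assumption: every mode other than "observational" is gated as production.
    Assumption: an attestation older than 5 minutes counts as not passed.
    """
    fresh_pass = passed and 0 <= age_seconds <= MAX_ATTESTATION_AGE_SECONDS
    if fresh_pass:
        return "start"
    if mode == "observational":
        # Development: log a WARNING but let the run proceed.
        return "warn"
    return "block"
```

Defaulting unknown modes to the production path is the conservative choice in this sketch: a typo in AIRGAP_GATING would tighten gating, not loosen it.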

Alternatives considered

Alternative A — Disable STP only at the Linux bridge level

Reject. Does not address the Nexus-side BPDU generation, which is guaranteed to happen every 2s on every STP-enabled port. The customer uplink would still leak.

Alternative B — Use VLAN-level filtering instead of port-level BPDU filter

Reject. VLAN filtering does not prevent the BPDU itself from being generated by the Nexus on the trunk port; it only filters tagged frames. BPDU-class frames are generated regardless of VLAN configuration.

Alternative C — Leave STP on, just monitor for unexpected BPDUs

Reject. Monitoring is reactive; the failure scenarios cause customer outages BEFORE the operator notices the alert. We need preventive isolation, not detective monitoring.

Implementation references

  • scripts/nexus/01-apply-tuning.nxos — Nexus configuration extension (PR-B)
  • k8s/dut/47-bpdu-guard-daemonset.yaml — Linux bridge BPDU guard DaemonSet (PR-C)
  • k8s/dut/45-vyos-isp-router.yaml — VyOS bridge config update (PR-C)
  • dashboard/templates/annex-g-airgap-attestation.md.tmpl — Layer 5 schema extension (PR-D)
  • scripts/airgap-l2-verify.sh — 60-second BPDU capture + JSON output (PR-D)
  • dashboard/src/lib/preflight/airgap-checks.ts — Layer 5 preflight integration (PR-D)
  • docs/L2_ISOLATION.md (+ .pt-BR.md, .es.md) — Operator-facing methodology document (this PR)

References

  • IEEE 802.1D — Spanning Tree Protocol
  • Cisco NX-OS Configuration Guide — spanning-tree bpdufilter, bpduguard, port type edge
  • Linux bridge man page (man 8 bridge) — bpdu_guard, root_block, stp_state
  • VyOS handbook — Bridge interfaces with STP/RSTP and BPDU guard
  • ADR 0007 (Public-Internet Realism) — air-gap layers 1-4 baseline that this ADR extends