NGFW DUT Test — End-to-End Quickstart Checklist¶
Who this is for: anyone setting up TLSStress.Art to run a TLS inspection performance test on a physical NGFW for the first time. No prior Kubernetes experience required — every command is ready to copy.
Scope status (post-Scope-Freeze 2026-05-10) — See ARCHITECTURE.md for the canonical 37 MÓDULOs + 7 Test Kinds + DOM/CPOS/PIE-PA safety architecture. ADRs 0014, 0019-0025 cover post-Freeze additions.
Time to complete: ~2 hours (hardware cabling + NGFW config + software install)
Author: André Luiz Gallon — agallon@Cisco.com | Version: v3.6.0
Before You Start — What You Need¶
Hardware¶
- 1 Ubuntu server (test-bed host) — Ubuntu 22.04 LTS, minimum 8 cores / 32 GB RAM / 100 GB disk
- 1 Cisco Nexus 9000 (or any managed switch with 802.1q VLAN trunking)
- 1 NGFW (the device being tested — Cisco FTD, FortiGate, Palo Alto, Check Point, etc.)
- Network cables: 2–3 cables between Ubuntu ↔ Nexus; 2 cables between Nexus ↔ NGFW
Software (on the Ubuntu server)¶
- Ubuntu 22.04 LTS installed and SSH accessible
- sudo access on the Ubuntu server
- git installed: sudo apt-get install -y git
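Before cabling anything, you can sanity-check the server against the hardware minimums listed above. This is an optional convenience sketch, not part of the installer; the thresholds are the 8-core / 32 GB / 100 GB figures from the Hardware list:

```shell
#!/usr/bin/env bash
# Preflight: warn if this host is below the documented minimums
# (8 cores / 32 GB RAM / 100 GB free disk on /).
set -u

cores=$(nproc)
mem_gb=$(awk '/MemTotal/ {printf "%d", $2/1024/1024}' /proc/meminfo)
disk_gb=$(df --output=avail -BG / | tail -1 | tr -dc '0-9')

check() {  # check <label> <actual> <minimum>
  if [ "$2" -ge "$3" ]; then
    echo "[OK]   $1: $2 (minimum $3)"
  else
    echo "[WARN] $1: $2 is below the minimum of $3"
  fi
}

check "CPU cores"      "$cores"   8
check "RAM (GB)"       "$mem_gb"  32
check "Free disk (GB)" "$disk_gb" 100
```

A [WARN] line does not block the install, but undersized hosts will struggle once the full agent fleet is scaled up.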
Knowledge required¶
- You know how to SSH into a Linux server
- You know how to log in to your NGFW web GUI or CLI
- You know your NGFW's intercept CA (the CA it uses to resign HTTPS certificates for TLS inspection)
Phase 1 — Cable the Hardware¶
Step 1 — Connect Ubuntu server to the Nexus 9000¶
Connect eth1 on the Ubuntu server to a trunk port on the Nexus 9000.
Ubuntu eth1 ──── (trunk, all VLANs) ──── Nexus 9000
eth0 on Ubuntu is for SSH and management — do NOT use it for test traffic.
Step 2 — Connect NGFW to the Nexus 9000¶
Connect two cables: one for the NGFW inside interface (agent side) and one for the outside interface (webserver side).
Nexus 9000 ──── (trunk VLANs 20,30) ──── NGFW inside
Nexus 9000 ──── (trunk VLANs 101-120) ── NGFW outside
Optionally connect a third cable for SNMP management (VLAN 99).
Phase 2 — Configure the Nexus 9000¶
Step 3 — Configure VLANs and trunk ports on the Nexus 9000¶
SSH or console into the Nexus 9000 and run:
configure terminal
! Create all VLANs
vlan 20,30,99,101-120
name test-vlans
exit
! Trunk toward Ubuntu server
interface <interface-toward-ubuntu>
switchport mode trunk
switchport trunk allowed vlan 20,30,99,101-120
no shutdown
exit
! Trunk toward NGFW inside (agents)
interface <interface-toward-ngfw-inside>
switchport mode trunk
switchport trunk allowed vlan 20,30
no shutdown
exit
! Trunk toward NGFW outside (webservers)
interface <interface-toward-ngfw-outside>
switchport mode trunk
switchport trunk allowed vlan 101-120
no shutdown
exit
! Optional: access port for NGFW SNMP management
interface <interface-toward-ngfw-mgmt>
switchport mode access
switchport access vlan 99
no shutdown
exit
copy running-config startup-config
Replace <interface-toward-...> with your actual interface names (e.g., Ethernet1/1, Ethernet1/2, Ethernet1/3).
Expected result: show vlan brief shows VLANs 20, 30, 99, 101–120 as active.
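If you are configuring several switches (or just want to avoid typos), the Step 3 block can be rendered from a small shell heredoc and pasted into the Nexus. This is a convenience sketch; the three interface names are placeholders you must replace with your own:

```shell
#!/usr/bin/env bash
# Render the Nexus 9000 trunk config from Step 3 with real interface
# names substituted. These three values are placeholders -- edit them.
IF_UBUNTU="Ethernet1/1"
IF_NGFW_IN="Ethernet1/2"
IF_NGFW_OUT="Ethernet1/3"

config=$(cat <<EOF
vlan 20,30,99,101-120
interface ${IF_UBUNTU}
  switchport mode trunk
  switchport trunk allowed vlan 20,30,99,101-120
  no shutdown
interface ${IF_NGFW_IN}
  switchport mode trunk
  switchport trunk allowed vlan 20,30
  no shutdown
interface ${IF_NGFW_OUT}
  switchport mode trunk
  switchport trunk allowed vlan 101-120
  no shutdown
EOF
)
echo "$config"
```

Paste the output into `configure terminal` on the switch, then save with copy running-config startup-config as in Step 3.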
Phase 3 — Install the Software on Ubuntu¶
Step 4 — Clone the repository¶
git clone https://github.com/nollagluiz/AI_forSE.git
cd AI_forSE
Step 5 — Run the automated installer¶
sudo bash scripts/k8s-install.sh --mode=single --data-iface=eth1
This installs k3s, Helm, cert-manager, and Multus CNI, configures 802.1q VLANs on eth1, and deploys the full web agent cluster. Takes about 15–20 minutes.

Have more than one UCS available? This checklist walks through single-node (one Ubuntu host runs everything). Three other deployment modes exist:

- Dual-node (2 UCS) — UCS-1 runs the agent fleet, UCS-2 runs personas + services. See UBUNTU_K3S_DUALNODE_QUICKSTART_DEPLOY.en.md.
- Tri-node (3 UCS) — UCS-1 = browser engine, UCS-2 = synthetic-load engine, UCS-3 = personas + services. Adds runtime isolation between the two agent runtimes. See UBUNTU_K3S_TRINODE_QUICKSTART_DEPLOY.en.md.
- Multi-node (4 UCS) — one role per UCS for maximum throughput. See UBUNTU_K3S_MULTINODE_QUICKSTART_DEPLOY.en.md.

All alternatives use k8s-install.sh with different --mode flags. The remainder of this checklist (NGFW PKI, verification, test execution) applies to all four modes — only Phase 3 differs.
Expected result at the end of the script:
[OK] k3s cluster is running
[OK] cert-manager is ready
[OK] Multus is ready
[OK] TLSStress.Art deployed
[OK] All pods are Running
Step 6 — Verify all pods are running¶
kubectl get pods -n web-agents
All pods should show Running or Completed. If any pod is Pending or Error, wait 2 more minutes and try again.
kubectl get pods -n web-agents --watch
# Press Ctrl+C when all pods show Running
Step 7 — Activate DUT mode (places NGFW in the data path)¶
sudo bash scripts/k8s-dut-up.sh up
Expected result:
[OK] DUT overlay applied
[OK] PKI ready (persona-ca-issuer active, persona-ca-bundle secret available)
[OK] Caddy pods running with macvlan net1
[OK] DUT mode active
Phase 4 — PKI Exchange (the most important step)¶
This phase sets up mutual certificate trust between the cluster and the NGFW. Do not skip this.
Step 8 — Export persona-ca from the cluster → import into the NGFW¶
This CA signs all persona webserver certificates (both Synthetic Personas VLANs 101–120 and Cloned Persona slots VLANs 200–209). The NGFW must trust it to validate TLS Leg 2 connections.
8a. Export the CA from the cluster:
kubectl get secret persona-ca-bundle -n web-agents \
-o jsonpath='{.data.ca\.crt}' | base64 -d > /tmp/persona-ca.pem
# Confirm it looks like a valid certificate
openssl x509 -in /tmp/persona-ca.pem -noout -subject -dates
Expected output:
subject=O=Web Agent Cluster, OU=Persona PKI, CN=Persona CA
notBefore=<date>
notAfter=<date 10 years later>
8b. Copy the file to your laptop:
# From your laptop (not from the server):
scp ubuntu@<server-ip>:/tmp/persona-ca.pem ~/Downloads/persona-ca.pem
8c. Import into the NGFW — use the steps for your vendor from docs/NGFW_CONFIGURATION_REFERENCE.en.md, section 8. The general location is:
| Vendor | Where to import |
|---|---|
| Cisco FTD (FMC) | Objects → PKI → Trusted CAs → Add Trusted CA |
| Cisco FTD (FDM) | Objects → Certificates → Trusted CA → Add |
| Cisco ASA | crypto ca authenticate Persona-CA (paste PEM) |
| FortiGate | Security Profiles → SSL/SSH Inspection → CA Certificate |
| Palo Alto | Device → Certificate Management → Certificates → Import (mark as Trusted Root CA) |
| Check Point | SmartConsole → Objects → Certificate Authority → Trusted CA |
Step 9 — Export ngfw-ca from the NGFW → install in the cluster¶
This is the NGFW's own intercept CA. Agents must trust it so they accept the NGFW's re-signed certificates.
9a. Export from the NGFW (see section 6.2 of the NGFW reference for your vendor). Save it as ngfw-ca.pem on your laptop.
9b. Copy to the Ubuntu server:
# From your laptop:
scp ~/Downloads/ngfw-ca.pem ubuntu@<server-ip>:/tmp/ngfw-ca.pem
9c. Install in the cluster:
kubectl create configmap ngfw-ca \
-n web-agents \
--from-file=ngfw-ca.crt=/tmp/ngfw-ca.pem \
--dry-run=client -o yaml | kubectl apply -f -
9d. Restart the agents to pick up the new CA:
kubectl rollout restart deployment/web-agent -n web-agents
kubectl rollout restart deployment/k6-agent -n web-agents
# Wait for restart to complete
kubectl rollout status deployment/web-agent -n web-agents
kubectl rollout status deployment/k6-agent -n web-agents
Expected output:
deployment "web-agent" successfully rolled out
deployment "k6-agent" successfully rolled out
Phase 5 — Configure the NGFW¶
Step 10 — Configure VLAN subinterfaces on the NGFW¶
For each VLAN in the table below, create a subinterface with the listed IP address. See section 8 of docs/NGFW_CONFIGURATION_REFERENCE.en.md for vendor-specific CLI/GUI steps.
| VLAN | IP to configure on NGFW | Purpose |
|---|---|---|
| 20 | 172.16.0.1/16 | browser-engine agent gateway |
| 30 | 172.17.0.1/16 | synthetic-load agent gateway |
| 99 | 192.168.90.3/24 | SNMP monitoring (optional) |
| 101 | 10.1.1.1/27 | shop persona gateway |
| 102 | 10.1.2.1/27 | news persona gateway |
| 103 | 10.1.3.1/27 | blog persona gateway |
| 104 | 10.1.4.1/27 | docs persona gateway |
| 105 | 10.1.5.1/27 | gallery persona gateway |
| 106 | 10.1.6.1/27 | stream persona gateway |
| 107 | 10.1.7.1/27 | download persona gateway |
| 108 | 10.1.8.1/27 | edu persona gateway |
| 109 | 10.1.9.1/27 | gov persona gateway |
| 110 | 10.1.10.1/27 | cdn persona gateway |
| 111 | 10.1.11.1/27 | api-rest persona gateway |
| 112 | 10.1.12.1/27 | api-graphql persona gateway |
| 113 | 10.1.13.1/27 | chat persona gateway |
| 114 | 10.1.14.1/27 | webhook persona gateway |
| 115 | 10.1.15.1/27 | telemetry persona gateway |
| 116 | 10.1.16.1/27 | ads persona gateway |
| 117 | 10.1.17.1/27 | har-saas persona gateway |
| 118 | 10.1.18.1/27 | har-social persona gateway |
| 119 | 10.1.19.1/27 | har-webmail persona gateway |
| 120 | 10.1.20.1/27 | har-media persona gateway |
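The persona rows above follow a fixed pattern: VLAN N (101–120) gets gateway 10.1.(N−100).1/27. If you are scripting vendor CLI configuration, you can generate the full mapping rather than typing 20 rows by hand — a minimal sketch:

```shell
#!/usr/bin/env bash
# Persona gateway pattern from the Step 10 table:
# VLAN N (101-120) -> gateway 10.1.(N-100).1/27
mapping=$(for vlan in $(seq 101 120); do
  octet=$((vlan - 100))
  echo "VLAN ${vlan} -> 10.1.${octet}.1/27"
done)
echo "$mapping"
```

VLANs 20, 30, and 99 do not follow this pattern — configure those three from the table directly.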
Step 11 — Create security zones on the NGFW¶
| Zone | Assign VLANs |
|---|---|
| agents | VLAN 20, VLAN 30 |
| personas | VLANs 101–120 |
Step 12 — Create the TLS inspection policy on the NGFW¶
Create a policy that decrypts and inspects all HTTPS traffic from agents to personas:
| Rule | Source | Destination | Ports | Action |
|---|---|---|---|---|
| Decrypt all | 172.16.0.0/16, 172.17.0.0/16 | 10.1.0.0/16 | TCP 443, UDP 443 | Decrypt & Inspect |
- Use the persona-ca you imported in step 8 as the trusted CA for outbound connections (Leg 2)
- Use your NGFW intercept CA (the one you exported in step 9) to resign certificates for agents (Leg 1)
Phase 6 — Verify Everything Works¶
Step 13 — Verify NGFW gateway is reachable from agents¶
# Get into a browser-engine agent pod
kubectl exec -it -n web-agents deployment/web-agent -- sh
# Ping the NGFW gateway
ping -c 3 172.16.0.1
Expected output:
3 packets transmitted, 3 received, 0% packet loss
If pings fail: check VLAN 20 is configured on both the Nexus trunk and the NGFW inside interface.
Step 14 — Verify TLS inspection is active¶
# Still inside the agent pod:
echo | openssl s_client -connect shop.persona.internal:443 \
-servername shop.persona.internal 2>/dev/null \
| openssl x509 -noout -issuer -subject
Expected (TLS inspection working):
issuer=CN=<Your NGFW intercept CA name>
subject=CN=shop.persona.internal
Problem (TLS inspection NOT working — NGFW not in path):
issuer=CN=Persona CA (or similar persona-ca-issuer CN — not the NGFW CA)
subject=CN=shop.persona.internal
If the issuer is the Persona CA (not the NGFW's intercept CA), the NGFW is not intercepting traffic. Check: NGFW TLS policy is deployed, zones are correct, traffic is routing through the NGFW.
Problem (certificate error):
SSL certificate verify error: unable to get local issuer certificate
If you see a certificate error, the ngfw-ca is not installed or agents were not restarted. Re-run step 9.
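To avoid eyeballing the issuer string, the three Step 14 outcomes can be classified by a small helper. This is an illustrative sketch; NGFW_CA_CN is an assumed placeholder — set it to the CN of your actual intercept CA:

```shell
#!/usr/bin/env bash
# Classify the issuer line printed by the openssl check in Step 14.
# NGFW_CA_CN is a placeholder -- set it to your intercept CA's CN.
NGFW_CA_CN="My-NGFW-Intercept-CA"

classify_issuer() {  # classify_issuer "<issuer line from openssl>"
  case "$1" in
    *"${NGFW_CA_CN}"*) echo "INTERCEPTED: TLS inspection is active" ;;
    *"Persona CA"*)    echo "NOT INTERCEPTED: NGFW is not in the data path" ;;
    "")                echo "NO CERT: connection or trust failure (re-run step 9)" ;;
    *)                 echo "UNKNOWN issuer: $1" ;;
  esac
}

classify_issuer "issuer=CN=My-NGFW-Intercept-CA"   # prints "INTERCEPTED: ..."
classify_issuer "issuer=CN=Persona CA"             # prints "NOT INTERCEPTED: ..."
```

Inside the agent pod you would feed it the real issuer, e.g. `classify_issuer "$(echo | openssl s_client -connect shop.persona.internal:443 -servername shop.persona.internal 2>/dev/null | openssl x509 -noout -issuer)"`.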
Step 15 — Verify all 20 personas are reachable¶
# From inside the agent pod:
for persona in shop news blog docs gallery stream download edu gov cdn \
api-rest api-graphql chat webhook telemetry ads \
har-saas har-social har-webmail har-media; do
code=$(curl -sk -o /dev/null -w "%{http_code}" \
https://${persona}.persona.internal/ --max-time 3)
echo "${persona}: HTTP ${code}"
done
Expected: all 20 personas return HTTP 200. Any 000 means the persona is unreachable (check NGFW routing for that VLAN).
Phase 7 — Run the Test and Read Results¶
Step 16 — Open the dashboard¶
The dashboard runs on port 3000 of the Ubuntu server:
http://<ubuntu-server-ip>:3000
From the dashboard you can:
- See which personas are active (green = running, red = error)
- Start and stop browser-engine and synthetic-load traffic
- Adjust the number of concurrent agent sessions
Step 17 — Start the load test¶
From the dashboard, click Start Test or use the CLI:
# Start browser-engine agents (real browser simulation)
kubectl scale deployment/web-agent -n web-agents --replicas=10
# Start synthetic-load agents (HTTP load)
kubectl scale deployment/k6-agent -n web-agents --replicas=5
Increase replicas gradually to ramp up load.
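One way to ramp gradually is to script the steps. The sketch below only prints the kubectl commands it would run (a dry run — remove the quoting and echo to execute for real); the step sizes and 60-second soak interval are illustrative, not tuned values:

```shell
#!/usr/bin/env bash
# Dry-run ramp plan for browser-engine agents: print each scale step.
# Step sizes (2..10) and the 60s soak are illustrative choices.
plan=$(for replicas in 2 4 6 8 10; do
  echo "kubectl scale deployment/web-agent -n web-agents --replicas=${replicas}"
  echo "sleep 60   # soak at ${replicas} replicas before the next step"
done)
echo "$plan"
```

Watch the Grafana dashboards (Step 18) at each soak step; the plateau where throughput stops scaling with replicas is your NGFW's inspection ceiling.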
Step 18 — Open Grafana to see performance metrics¶
Grafana runs on port 3001:
http://<ubuntu-server-ip>:3001
Default credentials: admin / admin (change on first login)
Key dashboards to watch:
| Dashboard | What it shows | What to look for |
|---|---|---|
| TLS Throughput | MB/s of inspected traffic | Drops under load = NGFW bottleneck |
| TLS Latency | ms added per connection by inspection | Spikes = NGFW CPU struggling |
| HTTP/3 vs HTTP/2 | QUIC vs TCP inspection comparison | QUIC drop rate = NGFW QUIC support limit |
| Error Rate | % of connections dropped | >1% error rate = NGFW overloaded |
| NGFW CPU (SNMP) | NGFW CPU utilization | Correlate CPU% with throughput drops |
Step 19 — Interpret the results¶
| Metric | Good result | Warning | Critical |
|---|---|---|---|
| TLS Throughput | Stable at target | Gradual decline | Sudden drop >20% |
| TLS Latency added | < 5ms per connection | 5–20ms | > 20ms |
| Error rate | < 0.1% | 0.1–1% | > 1% |
| NGFW CPU | < 70% | 70–85% | > 85% (throttling) |
The test results show the maximum TLS inspection capacity of the NGFW before performance degrades.
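The latency column of the Step 19 table can be turned into a tiny pass/warn/fail check, useful when post-processing exported metrics. A minimal sketch using the thresholds above (<5 ms good, 5–20 ms warning, >20 ms critical):

```shell
#!/usr/bin/env bash
# Classify added TLS latency per the Step 19 table:
# <5 ms = GOOD, 5-20 ms = WARNING, >20 ms = CRITICAL.
classify_latency() {  # classify_latency <whole milliseconds>
  if [ "$1" -lt 5 ]; then
    echo "GOOD"
  elif [ "$1" -le 20 ]; then
    echo "WARNING"
  else
    echo "CRITICAL"
  fi
}

classify_latency 3    # GOOD
classify_latency 12   # WARNING
classify_latency 42   # CRITICAL
```

The same shape works for the error-rate and CPU columns; only the thresholds change.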
Cleanup — Stop the Test¶
Step 20 — Scale down agents¶
kubectl scale deployment/web-agent -n web-agents --replicas=0
kubectl scale deployment/k6-agent -n web-agents --replicas=0
Step 21 — Take down DUT mode (optional)¶
sudo bash scripts/k8s-dut-up.sh down
Step 22 — Verify the cloner storage stack¶
The cloner depends on three DUT-overlay components: the CNI DHCP daemon (so VLAN 40 IPAM works), the in-cluster NFS server (so cloner writes are visible to slot pods cross-node), and 11 static PV/PVC pairs that bind every slot's cloned-sites claim to the same NFS export.
# CNI DHCP daemon — must be Ready on every node hosting a pod with IPAM=dhcp
kubectl get ds cni-dhcp-daemon -n web-agents
# NFS server — single replica, prefers role=infra (multi-node) or any node (single)
kubectl get pod -n web-agents -l app.kubernetes.io/name=nfs-server -o wide
# 11 PV/PVC pairs — 1 RWX writer + 10 ROX slot readers, all Bound
kubectl get pv | grep cloned-sites
kubectl get pvc -A | grep cloned-sites
# All NFS traffic must take OOBI — the Service has no Multus annotation
kubectl get svc nfs-server -n web-agents -o yaml | grep 'k8s.v1.cni.cncf.io' || echo "OK — OOBI only"
Step 23 — Apply host-side tuning on every node hosting persona stacks¶
REQUIRED for the Synthetic Persona and Cloned Persona Caddy webservers. Without this tuning kernel UDP buffers cap QUIC at ~30 Mbps per replica and TCP cwnd resets on every HTTP/2 idle window — neither stack will hit its target throughput.
The in-cluster node-tuning DaemonSet covers runtime; the script scripts/host-tuning.sh covers reboot persistence + CPU pinning that the DaemonSet cannot configure.
# UCS-1 (role=ngfw-dut) and UCS-4 (role=infra) — the persona-hosting nodes:
sudo scripts/host-tuning.sh apply --enable-cpu-pinning
# UCS-2 + UCS-3 (agent fleet) — sysctls only (no kubelet restart):
sudo scripts/host-tuning.sh apply
# Verify on every node (coloured report)
sudo scripts/host-tuning.sh status
--enable-cpu-pinning flips cpuManagerPolicy: static on the kubelet and restarts it — plan a maintenance window. See PERFORMANCE_TUNING_HOST.md for the values and the rationale.
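For orientation, the reboot-persistent part of that tuning is a sysctl fragment raising the UDP socket buffer ceilings that cap QUIC throughput. The sketch below shows the shape only; the 32 MB values are placeholders, not the tuned numbers — PERFORMANCE_TUNING_HOST.md and scripts/host-tuning.sh are authoritative:

```shell
#!/usr/bin/env bash
# Illustrative only: emit a sysctl fragment for the UDP buffer limits
# that cap QUIC/HTTP3 persona throughput. The 32 MB values below are
# placeholders -- use the values from PERFORMANCE_TUNING_HOST.md.
quic_sysctls() {
  cat <<'EOF'
# Raise UDP socket buffer ceilings for QUIC/HTTP3 persona traffic
net.core.rmem_max = 33554432
net.core.wmem_max = 33554432
EOF
}

# On each persona-hosting node (as root), persist and reload:
#   quic_sysctls > /etc/sysctl.d/90-tlsstress-quic.conf && sysctl --system
quic_sysctls
```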
Troubleshooting Quick Reference¶
| Problem | First thing to check | Command |
|---|---|---|
| Agents can't ping NGFW (172.16.0.1) | VLAN 20 on Nexus trunk toward Ubuntu | show vlan brief on Nexus |
| Certificate error on agents | ngfw-ca not installed or agents not restarted | Re-run step 9 |
| All personas return 000 | NGFW not routing or TLS policy blocking | Check NGFW access policy allows 172.16.0.0/16 → 10.1.0.0/16 |
| Grafana shows no NGFW CPU data | SNMP not configured or VLAN 99 issue | snmpwalk -v2c -c public 192.168.90.3 1.3.6.1.2.1.1.1.0 |
| HTTP/3 sessions fail | NGFW doesn't support QUIC inspection | Block UDP 443 on NGFW to test HTTP/2 only |
| Some personas fail, others work | NGFW missing subinterface for that VLAN | Check NGFW has IP configured for the failing persona's VLAN |
| Pods stuck in Pending | Node resources exhausted | kubectl describe pod <pod-name> -n web-agents |
| Cloner pod has no net1 IP | CNI DHCP daemon not Ready on cloner's node | kubectl get ds cni-dhcp-daemon -n web-agents |
| Slot pods stuck MountVolume.SetUp failed | NFS server pod not Ready or on the wrong node | kubectl -n web-agents get pod -l app.kubernetes.io/name=nfs-server -o wide |
| cloned-sites PVC Pending | Static PV not bound (volumeName mismatch) | kubectl describe pv cloned-sites-writer and the slot PVs |
| Slots Ready but serve 404 | Cloner has not finished a job for that personaName yet | Check clone_jobs table |
Full NGFW configuration reference (all VLANs, IPs, vendor examples): docs/NGFW_CONFIGURATION_REFERENCE.en.md
Automated installer details: scripts/k8s-install.sh
Architecture overview: docs/ARCHITECTURE.md