NGFW DUT Test — End-to-End Quickstart Checklist¶
Who this is for: anyone setting up TLSStress.Art to run a TLS inspection performance test on a physical NGFW for the first time. No prior Kubernetes experience required — every command is ready to copy.
Scope status (post-Scope-Freeze 2026-05-10) — See ARCHITECTURE.md for the canonical 37 MÓDULOs + 7 Test Kinds + DOM/CPOS/PIE-PA safety architecture. ADRs 0014, 0019-0025 cover post-Freeze additions.
Time to complete: ~2 hours (hardware cabling + NGFW config + software install)
Author: André Luiz Gallon — agallon@Cisco.com | Version: v3.6.0
Before You Start — What You Need¶
Hardware¶
- 1 Ubuntu server (test-bed host) — Ubuntu 22.04 LTS, minimum 8 cores / 32 GB RAM / 100 GB disk
- 1 Cisco Nexus 9000 (or any managed switch with 802.1q VLAN trunking)
- 1 NGFW (the device being tested — Cisco FTD, FortiGate, Palo Alto, Check Point, etc.)
- Network cables: 2–3 cables between Ubuntu ↔ Nexus; 2 cables between Nexus ↔ NGFW
Software (on the Ubuntu server)¶
- Ubuntu 22.04 LTS installed and SSH accessible
- sudo access on the Ubuntu server
- git installed: sudo apt-get install -y git
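Before cabling anything, you can sanity-check the server against the hardware minimums listed above. This is an optional convenience sketch, not part of the installer; the thresholds are the 8-core / 32 GB / 100 GB figures from the Hardware list:

```shell
#!/usr/bin/env bash
# Preflight: warn if this host is below the documented minimums
# (8 cores / 32 GB RAM / 100 GB free disk on /).
set -u

cores=$(nproc)
mem_gb=$(awk '/MemTotal/ {printf "%d", $2/1024/1024}' /proc/meminfo)
disk_gb=$(df --output=avail -BG / | tail -1 | tr -dc '0-9')

check() {  # check <label> <actual> <minimum>
  if [ "$2" -ge "$3" ]; then
    echo "[OK]   $1: $2 (minimum $3)"
  else
    echo "[WARN] $1: $2 is below the minimum of $3"
  fi
}

check "CPU cores"      "$cores"   8
check "RAM (GB)"       "$mem_gb"  32
check "Free disk (GB)" "$disk_gb" 100
```

A [WARN] line does not block the install, but undersized hosts will struggle once the full agent fleet is scaled up.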
Knowledge required¶
- You know how to SSH into a Linux server
- You know how to log in to your NGFW web GUI or CLI
- You know your NGFW's intercept CA (the CA it uses to resign HTTPS certificates for TLS inspection)
Phase 1 — Cable the Hardware¶
Step 1 — Connect Ubuntu server to the Nexus 9000¶
Connect eth1 on the Ubuntu server to a trunk port on the Nexus 9000.
Ubuntu eth1 ──── (trunk, all VLANs) ──── Nexus 9000
eth0 on Ubuntu is for SSH and management — do NOT use it for test traffic.
Step 2 — Connect NGFW to the Nexus 9000¶
Connect two cables: one for the NGFW inside interface (agent side) and one for the outside interface (webserver side).
Nexus 9000 ──── (trunk VLANs 20,30) ──── NGFW inside
Nexus 9000 ──── (trunk VLANs 101-120) ── NGFW outside
Optionally connect a third cable for SNMP management (VLAN 99).
Phase 2 — Configure the Nexus 9000¶
Step 3 — Configure VLANs and trunk ports on the Nexus 9000¶
SSH or console into the Nexus 9000 and run:
configure terminal
! Create all VLANs
vlan 20,30,99,101-120
name test-vlans
exit
! Trunk toward Ubuntu server
interface <interface-toward-ubuntu>
switchport mode trunk
switchport trunk allowed vlan 20,30,99,101-120
no shutdown
exit
! Trunk toward NGFW inside (agents)
interface <interface-toward-ngfw-inside>
switchport mode trunk
switchport trunk allowed vlan 20,30
no shutdown
exit
! Trunk toward NGFW outside (webservers)
interface <interface-toward-ngfw-outside>
switchport mode trunk
switchport trunk allowed vlan 101-120
no shutdown
exit
! Optional: access port for NGFW SNMP management
interface <interface-toward-ngfw-mgmt>
switchport mode access
switchport access vlan 99
no shutdown
exit
copy running-config startup-config
Replace <interface-toward-...> with your actual interface names (e.g., Ethernet1/1, Ethernet1/2, Ethernet1/3).
Expected result: show vlan brief shows VLANs 20, 30, 99, 101–120 as active.
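If you are configuring several switches (or just want to avoid typos), the Step 3 block can be rendered from a small shell heredoc and pasted into the Nexus. This is a convenience sketch; the three interface names are placeholders you must replace with your own:

```shell
#!/usr/bin/env bash
# Render the Nexus 9000 trunk config from Step 3 with real interface
# names substituted. These three values are placeholders -- edit them.
IF_UBUNTU="Ethernet1/1"
IF_NGFW_IN="Ethernet1/2"
IF_NGFW_OUT="Ethernet1/3"

config=$(cat <<EOF
vlan 20,30,99,101-120
interface ${IF_UBUNTU}
  switchport mode trunk
  switchport trunk allowed vlan 20,30,99,101-120
  no shutdown
interface ${IF_NGFW_IN}
  switchport mode trunk
  switchport trunk allowed vlan 20,30
  no shutdown
interface ${IF_NGFW_OUT}
  switchport mode trunk
  switchport trunk allowed vlan 101-120
  no shutdown
EOF
)
echo "$config"
```

Paste the output into `configure terminal` on the switch, then save with copy running-config startup-config as in Step 3.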
Phase 3 — Install the Software on Ubuntu¶
Step 4 — Clone the repository¶
git clone https://github.com/nollagluiz/AI_forSE.git
cd AI_forSE
Step 5 — Run the automated installer¶
sudo bash scripts/k8s-install.sh --mode=single --data-iface=eth1
This installs k3s, Helm, cert-manager, and Multus CNI, configures 802.1q VLANs on eth1, and deploys the full web agent cluster. Takes about 15–20 minutes.

Have more than one UCS available? This checklist walks through single-node (one Ubuntu host runs everything). Three other deployment modes exist:

- Dual-node (2 UCS) — UCS-1 runs the agent fleet, UCS-2 runs personas + services. See UBUNTU_K3S_DUALNODE_QUICKSTART_DEPLOY.en.md.
- Tri-node (3 UCS) — UCS-1 = browser engine, UCS-2 = synthetic-load engine, UCS-3 = personas + services. Adds runtime isolation between the two agent runtimes. See UBUNTU_K3S_TRINODE_QUICKSTART_DEPLOY.en.md.
- Multi-node (4 UCS) — one role per UCS for maximum throughput. See UBUNTU_K3S_MULTINODE_QUICKSTART_DEPLOY.en.md.

All alternatives use k8s-install.sh with different --mode flags. The remainder of this checklist (NGFW PKI, verification, test execution) applies to all four modes — only Phase 3 differs.
Expected result at the end of the script:
[OK] k3s cluster is running
[OK] cert-manager is ready
[OK] Multus is ready
[OK] TLSStress.Art deployed
[OK] All pods are Running
Step 6 — Verify all pods are running¶
kubectl get pods -n web-agents
All pods should show Running or Completed. If any pod is Pending or Error, wait 2 more minutes and try again.
kubectl get pods -n web-agents --watch
# Press Ctrl+C when all pods show Running
Step 7 — Activate DUT mode (places NGFW in the data path)¶
sudo bash scripts/k8s-dut-up.sh up
Expected result:
[OK] DUT overlay applied
[OK] PKI ready (persona-ca-issuer active, persona-ca-bundle secret available)
[OK] Caddy pods running with macvlan net1
[OK] DUT mode active
Phase 4 — PKI Exchange (the most important step)¶
This phase sets up mutual certificate trust between the cluster and the NGFW. Do not skip this.
Step 8 — Export persona-ca from the cluster → import into the NGFW¶
This CA signs all persona webserver certificates (both Synthetic Personas VLANs 101–120 and Cloned Persona slots VLANs 200–209). The NGFW must trust it to validate TLS Leg 2 connections.
8a. Export the CA from the cluster:
kubectl get secret persona-ca-bundle -n web-agents \
-o jsonpath='{.data.ca\.crt}' | base64 -d > /tmp/persona-ca.pem
# Confirm it looks like a valid certificate
openssl x509 -in /tmp/persona-ca.pem -noout -subject -dates
Expected output:
subject=O=Web Agent Cluster, OU=Persona PKI, CN=Persona CA
notBefore=<date>
notAfter=<date 10 years later>
8b. Copy the file to your laptop:
# From your laptop (not from the server):
scp ubuntu@<server-ip>:/tmp/persona-ca.pem ~/Downloads/persona-ca.pem
8c. Import into the NGFW — use the steps for your vendor from docs/NGFW_CONFIGURATION_REFERENCE.en.md, section 8. The general location is:
| Vendor | Where to import |
|---|---|
| Cisco FTD (FMC) | Objects → PKI → Trusted CAs → Add Trusted CA |
| Cisco FTD (FDM) | Objects → Certificates → Trusted CA → Add |
| Cisco ASA | crypto ca authenticate Persona-CA (paste PEM) |
| FortiGate | Security Profiles → SSL/SSH Inspection → CA Certificate |
| Palo Alto | Device → Certificate Management → Certificates → Import (mark as Trusted Root CA) |
| Check Point | SmartConsole → Objects → Certificate Authority → Trusted CA |
Step 9 — Export ngfw-ca from the NGFW → install in the cluster¶
This is the NGFW's own intercept CA. Agents must trust it so they accept the NGFW's re-signed certificates.
9a. Export from the NGFW (see section 6.2 of the NGFW reference for your vendor). Save it as ngfw-ca.pem on your laptop.
9b. Copy to the Ubuntu server:
# From your laptop:
scp ~/Downloads/ngfw-ca.pem ubuntu@<server-ip>:/tmp/ngfw-ca.pem
9c. Install in the cluster:
kubectl create configmap ngfw-ca \
-n web-agents \
--from-file=ngfw-ca.crt=/tmp/ngfw-ca.pem \
--dry-run=client -o yaml | kubectl apply -f -
9d. Restart the agents to pick up the new CA:
kubectl rollout restart deployment/web-agent -n web-agents
kubectl rollout restart deployment/k6-agent -n web-agents
# Wait for restart to complete
kubectl rollout status deployment/web-agent -n web-agents
kubectl rollout status deployment/k6-agent -n web-agents
Expected output:
deployment "web-agent" successfully rolled out
deployment "k6-agent" successfully rolled out
Phase 5 — Configure the NGFW¶
Step 10 — Configure VLAN subinterfaces on the NGFW¶
For each VLAN in the table below, create a subinterface with the listed IP address. See section 8 of docs/NGFW_CONFIGURATION_REFERENCE.en.md for vendor-specific CLI/GUI steps.
| VLAN | IP to configure on NGFW | Purpose |
|---|---|---|
| 20 | 172.16.0.1/16 | browser-engine agent gateway |
| 30 | 172.17.0.1/16 | synthetic-load agent gateway |
| 99 | 192.168.90.3/24 | SNMP monitoring (optional) |
| 101 | 10.1.1.1/27 | shop persona gateway |
| 102 | 10.1.2.1/27 | news persona gateway |
| 103 | 10.1.3.1/27 | blog persona gateway |
| 104 | 10.1.4.1/27 | docs persona gateway |
| 105 | 10.1.5.1/27 | gallery persona gateway |
| 106 | 10.1.6.1/27 | stream persona gateway |
| 107 | 10.1.7.1/27 | download persona gateway |
| 108 | 10.1.8.1/27 | edu persona gateway |
| 109 | 10.1.9.1/27 | gov persona gateway |
| 110 | 10.1.10.1/27 | cdn persona gateway |
| 111 | 10.1.11.1/27 | api-rest persona gateway |
| 112 | 10.1.12.1/27 | api-graphql persona gateway |
| 113 | 10.1.13.1/27 | chat persona gateway |
| 114 | 10.1.14.1/27 | webhook persona gateway |
| 115 | 10.1.15.1/27 | telemetry persona gateway |
| 116 | 10.1.16.1/27 | ads persona gateway |
| 117 | 10.1.17.1/27 | har-saas persona gateway |
| 118 | 10.1.18.1/27 | har-social persona gateway |
| 119 | 10.1.19.1/27 | har-webmail persona gateway |
| 120 | 10.1.20.1/27 | har-media persona gateway |
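The persona rows above follow a fixed pattern: VLAN N (101–120) gets gateway 10.1.(N−100).1/27. If you are scripting vendor CLI configuration, you can generate the full mapping rather than typing 20 rows by hand — a minimal sketch:

```shell
#!/usr/bin/env bash
# Persona gateway pattern from the Step 10 table:
# VLAN N (101-120) -> gateway 10.1.(N-100).1/27
mapping=$(for vlan in $(seq 101 120); do
  octet=$((vlan - 100))
  echo "VLAN ${vlan} -> 10.1.${octet}.1/27"
done)
echo "$mapping"
```

VLANs 20, 30, and 99 do not follow this pattern — configure those three from the table directly.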
Step 11 — Create security zones on the NGFW¶
| Zone | Assign VLANs |
|---|---|
| agents | VLAN 20, VLAN 30 |
| personas | VLANs 101–120 |
Step 12 — Create the TLS inspection policy on the NGFW¶
Create a policy that decrypts and inspects all HTTPS traffic from agents to personas:
| Rule | Source | Destination | Ports | Action |
|---|---|---|---|---|
| Decrypt all | 172.16.0.0/16, 172.17.0.0/16 | 10.1.0.0/16 | TCP 443, UDP 443 | Decrypt & Inspect |
- Use the persona-ca you imported in step 8 as the trusted CA for outbound connections (Leg 2)
- Use your NGFW intercept CA (the one you exported in step 9) to resign certificates for agents (Leg 1)
Phase 6 — Verify Everything Works¶
Step 13 — Verify NGFW gateway is reachable from agents¶
# Get into a browser-engine agent pod
kubectl exec -it -n web-agents deployment/web-agent -- sh
# Ping the NGFW gateway
ping -c 3 172.16.0.1
Expected output:
3 packets transmitted, 3 received, 0% packet loss
If pings fail: check VLAN 20 is configured on both the Nexus trunk and the NGFW inside interface.
Step 14 — Verify TLS inspection is active¶
# Still inside the agent pod:
echo | openssl s_client -connect shop.persona.internal:443 \
-servername shop.persona.internal 2>/dev/null \
| openssl x509 -noout -issuer -subject
Expected (TLS inspection working):
issuer=CN=<Your NGFW intercept CA name>
subject=CN=shop.persona.internal
Problem (TLS inspection NOT working — NGFW not in path):
issuer=CN=Persona CA (or similar persona-ca-issuer CN — not the NGFW CA)
subject=CN=shop.persona.internal
If the issuer is the Persona CA (not the NGFW's intercept CA), the NGFW is not intercepting traffic. Check: NGFW TLS policy is deployed, zones are correct, traffic is routing through the NGFW.
Problem (certificate error):
SSL certificate verify error: unable to get local issuer certificate
If you see a certificate error, the ngfw-ca is not installed or agents were not restarted. Re-run step 9.
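To avoid eyeballing the issuer string, the three Step 14 outcomes can be classified by a small helper. This is an illustrative sketch; NGFW_CA_CN is an assumed placeholder — set it to the CN of your actual intercept CA:

```shell
#!/usr/bin/env bash
# Classify the issuer line printed by the openssl check in Step 14.
# NGFW_CA_CN is a placeholder -- set it to your intercept CA's CN.
NGFW_CA_CN="My-NGFW-Intercept-CA"

classify_issuer() {  # classify_issuer "<issuer line from openssl>"
  case "$1" in
    *"${NGFW_CA_CN}"*) echo "INTERCEPTED: TLS inspection is active" ;;
    *"Persona CA"*)    echo "NOT INTERCEPTED: NGFW is not in the data path" ;;
    "")                echo "NO CERT: connection or trust failure (re-run step 9)" ;;
    *)                 echo "UNKNOWN issuer: $1" ;;
  esac
}

classify_issuer "issuer=CN=My-NGFW-Intercept-CA"   # prints "INTERCEPTED: ..."
classify_issuer "issuer=CN=Persona CA"             # prints "NOT INTERCEPTED: ..."
```

Inside the agent pod you would feed it the real issuer, e.g. `classify_issuer "$(echo | openssl s_client -connect shop.persona.internal:443 -servername shop.persona.internal 2>/dev/null | openssl x509 -noout -issuer)"`.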
Step 15 — Verify all 20 personas are reachable¶
# From inside the agent pod:
for persona in shop news blog docs gallery stream download edu gov cdn \
api-rest api-graphql chat webhook telemetry ads \
har-saas har-social har-webmail har-media; do
code=$(curl -sk -o /dev/null -w "%{http_code}" \
https://${persona}.persona.internal/ --max-time 3)
echo "${persona}: HTTP ${code}"
done
Expected: all 20 personas return HTTP 200. Any 000 means the persona is unreachable (check NGFW routing for that VLAN).
Phase 7 — Run the Test and Read Results¶
Step 16 — Open the dashboard¶
The dashboard runs on port 3000 of the Ubuntu server:
http://<ubuntu-server-ip>:3000
From the dashboard you can:
- See which personas are active (green = running, red = error)
- Start and stop browser-engine and synthetic-load traffic
- Adjust the number of concurrent agent sessions
Step 17 — Start the load test¶
From the dashboard, click Start Test or use the CLI:
# Start browser-engine agents (real browser simulation)
kubectl scale deployment/web-agent -n web-agents --replicas=10
# Start synthetic-load agents (HTTP load)
kubectl scale deployment/k6-agent -n web-agents --replicas=5
Increase replicas gradually to ramp up load.
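One way to ramp gradually is to script the steps. The sketch below only prints the kubectl commands it would run (a dry run — remove the quoting and echo to execute for real); the step sizes and 60-second soak interval are illustrative, not tuned values:

```shell
#!/usr/bin/env bash
# Dry-run ramp plan for browser-engine agents: print each scale step.
# Step sizes (2..10) and the 60s soak are illustrative choices.
plan=$(for replicas in 2 4 6 8 10; do
  echo "kubectl scale deployment/web-agent -n web-agents --replicas=${replicas}"
  echo "sleep 60   # soak at ${replicas} replicas before the next step"
done)
echo "$plan"
```

Watch the Grafana dashboards (Step 18) at each soak step; the plateau where throughput stops scaling with replicas is your NGFW's inspection ceiling.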
Step 18 — Open Grafana to see performance metrics¶
Grafana runs on port 3001:
http://<ubuntu-server-ip>:3001
Default credentials: admin / admin (change on first login)
Key dashboards to watch:
| Dashboard | What it shows | What to look for |
|---|---|---|
| TLS Throughput | MB/s of inspected traffic | Drops under load = NGFW bottleneck |
| TLS Latency | ms added per connection by inspection | Spikes = NGFW CPU struggling |
| HTTP/3 vs HTTP/2 | QUIC vs TCP inspection comparison | QUIC drop rate = NGFW QUIC support limit |
| Error Rate | % of connections dropped | >1% error rate = NGFW overloaded |
| NGFW CPU (SNMP) | NGFW CPU utilization | Correlate CPU% with throughput drops |
Step 19 — Interpret the results¶
| Metric | Good result | Warning | Critical |
|---|---|---|---|
| TLS Throughput | Stable at target | Gradual decline | Sudden drop >20% |
| TLS Latency added | < 5ms per connection | 5–20ms | > 20ms |
| Error rate | < 0.1% | 0.1–1% | > 1% |
| NGFW CPU | < 70% | 70–85% | > 85% (throttling) |
The test results show the maximum TLS inspection capacity of the NGFW before performance degrades.
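The latency column of the Step 19 table can be turned into a tiny pass/warn/fail check, useful when post-processing exported metrics. A minimal sketch using the thresholds above (<5 ms good, 5–20 ms warning, >20 ms critical):

```shell
#!/usr/bin/env bash
# Classify added TLS latency per the Step 19 table:
# <5 ms = GOOD, 5-20 ms = WARNING, >20 ms = CRITICAL.
classify_latency() {  # classify_latency <whole milliseconds>
  if [ "$1" -lt 5 ]; then
    echo "GOOD"
  elif [ "$1" -le 20 ]; then
    echo "WARNING"
  else
    echo "CRITICAL"
  fi
}

classify_latency 3    # GOOD
classify_latency 12   # WARNING
classify_latency 42   # CRITICAL
```

The same shape works for the error-rate and CPU columns; only the thresholds change.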
Cleanup — Stop the Test¶
Step 20 — Scale down agents¶
kubectl scale deployment/web-agent -n web-agents --replicas=0
kubectl scale deployment/k6-agent -n web-agents --replicas=0
Step 21 — Take down DUT mode (optional)¶
sudo bash scripts/k8s-dut-up.sh down
Step 22 — Verify the cloner storage stack¶
The cloner depends on three DUT-overlay components: the CNI DHCP daemon (so VLAN 40 IPAM works), the in-cluster NFS server (so cloner writes are visible to slot pods cross-node), and 11 static PV/PVC pairs that bind every slot's cloned-sites claim to the same NFS export.
# CNI DHCP daemon — must be Ready on every node hosting a pod with IPAM=dhcp
kubectl get ds cni-dhcp-daemon -n web-agents
# NFS server — single replica, prefers role=infra (multi-node) or any node (single)
kubectl get pod -n web-agents -l app.kubernetes.io/name=nfs-server -o wide
# 11 PV/PVC pairs — 1 RWX writer + 10 ROX slot readers, all Bound
kubectl get pv | grep cloned-sites
kubectl get pvc -A | grep cloned-sites
# All NFS traffic must take OOBI — the Service has no Multus annotation
kubectl get svc nfs-server -n web-agents -o yaml | grep 'k8s.v1.cni.cncf.io' || echo "OK — OOBI only"
Step 23 — Apply host-side tuning on every node hosting persona stacks¶
REQUIRED for the Synthetic Persona and Cloned Persona Caddy webservers. Without this tuning kernel UDP buffers cap QUIC at ~30 Mbps per replica and TCP cwnd resets on every HTTP/2 idle window — neither stack will hit its target throughput.
The in-cluster node-tuning DaemonSet covers runtime; the script scripts/host-tuning.sh covers reboot persistence + CPU pinning that the DaemonSet cannot configure.
# UCS-1 (role=ngfw-dut) and UCS-4 (role=infra) — the persona-hosting nodes:
sudo scripts/host-tuning.sh apply --enable-cpu-pinning
# UCS-2 + UCS-3 (agent fleet) — sysctls only (no kubelet restart):
sudo scripts/host-tuning.sh apply
# Verify on every node (coloured report)
sudo scripts/host-tuning.sh status
--enable-cpu-pinning flips cpuManagerPolicy: static on the kubelet and restarts it — plan a maintenance window. See PERFORMANCE_TUNING_HOST.md for the values and the rationale.
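For orientation, the reboot-persistent part of that tuning is a sysctl fragment raising the UDP socket buffer ceilings that cap QUIC throughput. The sketch below shows the shape only; the 32 MB values are placeholders, not the tuned numbers — PERFORMANCE_TUNING_HOST.md and scripts/host-tuning.sh are authoritative:

```shell
#!/usr/bin/env bash
# Illustrative only: emit a sysctl fragment for the UDP buffer limits
# that cap QUIC/HTTP3 persona throughput. The 32 MB values below are
# placeholders -- use the values from PERFORMANCE_TUNING_HOST.md.
quic_sysctls() {
  cat <<'EOF'
# Raise UDP socket buffer ceilings for QUIC/HTTP3 persona traffic
net.core.rmem_max = 33554432
net.core.wmem_max = 33554432
EOF
}

# On each persona-hosting node (as root), persist and reload:
#   quic_sysctls > /etc/sysctl.d/90-tlsstress-quic.conf && sysctl --system
quic_sysctls
```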
Troubleshooting Quick Reference¶
| Problem | First thing to check | Command |
|---|---|---|
| Agents can't ping NGFW (172.16.0.1) | VLAN 20 on Nexus trunk toward Ubuntu | show vlan brief on Nexus |
| Certificate error on agents | ngfw-ca not installed or agents not restarted | Re-run step 9 |
| All personas return 000 | NGFW not routing or TLS policy blocking | Check NGFW access policy allows 172.16.0.0/16 → 10.1.0.0/16 |
| Grafana shows no NGFW CPU data | SNMP not configured or VLAN 99 issue | snmpwalk -v2c -c public 192.168.90.3 1.3.6.1.2.1.1.1.0 |
| HTTP/3 sessions fail | NGFW doesn't support QUIC inspection | Block UDP 443 on NGFW to test HTTP/2 only |
| Some personas fail, others work | NGFW missing subinterface for that VLAN | Check NGFW has IP configured for the failing persona's VLAN |
| Pods stuck in Pending | Node resources exhausted | kubectl describe pod <pod-name> -n web-agents |
| Cloner pod has no net1 IP | CNI DHCP daemon not Ready on cloner's node | kubectl get ds cni-dhcp-daemon -n web-agents |
| Slot pods stuck MountVolume.SetUp failed | NFS server pod not Ready or on the wrong node | kubectl -n web-agents get pod -l app.kubernetes.io/name=nfs-server -o wide |
| cloned-sites PVC Pending | Static PV not bound (volumeName mismatch) | kubectl describe pv cloned-sites-writer and the slot PVs |
| Slots Ready but serve 404 | Cloner has not finished a job for that personaName yet | Check clone_jobs table |
Full NGFW configuration reference (all VLANs, IPs, vendor examples): docs/NGFW_CONFIGURATION_REFERENCE.en.md
Automated installer details: scripts/k8s-install.sh
Architecture overview: docs/ARCHITECTURE.md