
Air-gapped installation — TLSStress.Art


Audience: Operators installing TLSStress.Art in a data center where the UCS host has no internet access — typical of regulated / classified / OOB environments.

Onboarding sequence: Access → Clone → Install (air-gap path; you are here) · standard alternative: Runbook First Install · maintainer setup: Private Repo Setup

Scope status (post-Scope-Freeze 2026-05-10) — Air-gap installs now leverage OBP (Operator Bridge Proxy) for one-shot internet egress via operator notebook (ADR 0022). Self-Upgrade Meraki-style channel polling supports air-gap fallback per ADR 0013. See Lab Deployment Staging primer for the 8-phase install wizard.

The problem

The standard install assumes the UCS can reach:

  • GitHub (to clone the source)
  • Docker Hub / GHCR / Quay (to pull container images)
  • Ubuntu apt mirrors (to install host dependencies)
  • get.helm.sh, k3s.io (to download Kubernetes)

In a fully air-gapped environment, none of these work. The operator must bring everything in by sneakernet.

The solution — pre-staging on a connected machine

You will use two machines:

  1. Staging machine — your laptop (or any host) with full internet access. Runs airgap-stage.sh. Produces a single ~5–10 GB tarball.
  2. Target UCS — the air-gapped host inside the data center. Runs airgap-deploy.sh after the bundle is transported in (USB drive, secure transfer, etc.).
┌──────────────────┐                ┌──────────────────────┐
│  Staging machine │                │  Air-gapped UCS      │
│  (your laptop)   │                │  (data center)       │
│                  │                │                      │
│  ./airgap-stage  │  ┌──USB──┐    │  ./airgap-deploy     │
│  →  bundle.tar.gz├─→│ 16 GB ├───→│  ←  bundle.tar.gz    │
│                  │  └───────┘    │                      │
│  full internet   │                │  zero internet       │
└──────────────────┘                └──────────────────────┘

Prerequisites

On the staging machine

  • bash 4+, curl, jq, sha256sum, tar
  • Docker (or Podman with docker-cli wrapper) — used to pull/save container images
  • Helm 3.x — for cert-manager chart download
  • A GitHub fine-grained PAT with Contents: Read on nollagluiz/AI_forSE — exported as GH_TOKEN
  • ~15 GB free disk space (working set + final tarball)
  • ~30 minutes of bandwidth-dependent download time
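
A quick preflight check catches a missing tool before the half-hour download begins. A minimal sketch (this helper is illustrative, not part of airgap-stage.sh; the tool list mirrors the prerequisites above):

```shell
#!/usr/bin/env bash
# Hypothetical preflight helper: prints one line per missing tool,
# prints nothing when everything is present.
check_tools() {
  local t
  for t in "$@"; do
    command -v "$t" >/dev/null 2>&1 || echo "MISSING: $t"
  done
}

# Check the staging-machine prerequisite list before starting the stage run:
check_tools bash curl jq sha256sum tar docker helm

# GH_TOKEN must also be exported for the source-tarball download:
[ -n "${GH_TOKEN:-}" ] || echo "MISSING: GH_TOKEN env var"
```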

On the air-gapped UCS

  • Ubuntu 22.04+ Server, fresh install
  • root or sudo access
  • 32 GB RAM, 16 vCPU, 200 GB disk (same as standard install)
  • USB port or other transport for the bundle

Hardware in the data center (no change from standard runbook)

  • Cisco Nexus 9000 with management network
  • NGFW DUT (Cisco FTD/ASA/Firepower, Palo Alto, Fortinet, etc.)
  • Required VLAN trunks, IPs, and the NGFW CA cert in PEM format

Step 1 — Stage the bundle on your laptop (~30 min)

# On your connected machine:
git clone https://github.com/nollagluiz/AI_forSE.git
cd AI_forSE
export GH_TOKEN="<your-fine-grained-PAT>"

./scripts/airgap-stage.sh \
  --version v3.6.0 \
  --output /media/usb/tlsstress-airgap-v3.6.0.tar.gz

What happens — the script (in this order):

  1. Downloads the K3s binary + airgap images (v1.30.5+k3s1 by default)
  2. Pulls 9 TLSStress.Art container images (agent, dashboard, webserver, …) from GHCR
  3. Pulls 15 third-party images (Postgres, Prometheus, Grafana, Loki, etc.)
  4. Downloads the cert-manager Helm chart + 3 cert-manager images
  5. Downloads the Multus CNI manifest + image
  6. Stages 11 apt .deb packages (iproute2, nftables, NFS, SNMP, etc.) using a one-shot Ubuntu 22.04 container
  7. Downloads the source-code tarball at the requested version tag
  8. Generates a SHA-256 manifest of every file and seals everything into one .tar.gz
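
The final manifest-and-seal step follows a standard hash-then-tar pattern. A minimal sketch of that shape on a throwaway tree (the paths and file names are illustrative, not the script's actual layout):

```shell
# Illustrative sketch of the manifest-and-seal pattern (step 8);
# the real script's directory layout may differ.
set -e
work=$(mktemp -d)
mkdir -p "$work/stage/images" "$work/stage/manifest"
echo "fake-image-layer" > "$work/stage/images/agent.tar"
echo "fake-k3s-binary"  > "$work/stage/k3s"

# Hash every staged file (except the manifest itself) into one list...
( cd "$work/stage" && \
  find . -type f ! -path './manifest/*' -exec sha256sum {} + \
    > manifest/sha256sum.txt )

# ...then seal the tree into one tarball plus a sidecar hash of the bundle.
tar -C "$work/stage" -czf "$work/bundle.tar.gz" .
sha256sum "$work/bundle.tar.gz" > "$work/bundle.tar.gz.sha256"
```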

Final output:

  • <output> — the bundle (typically 5–10 GB)
  • <output>.sha256 — sidecar file with the bundle's SHA-256 (for integrity verification later)

Step 2 — Transport the bundle (whatever your policy allows)

USB drive, encrypted external SSD, secure file-transfer station — whatever your data-center policy permits. Prefer USB drives that have been wiped + dedicated to this purpose to avoid cross-contamination.

If your data-center policy requires two-person integrity for media crossing the air gap: have a second engineer compute sha256sum tlsstress-airgap-v3.6.0.tar.gz on the staging laptop and on the UCS after copy. Both numbers must match.
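
The comparison itself is a one-liner on each side. A sketch of the check, demonstrated here on a throwaway file rather than the real bundle:

```shell
# Two-person integrity sketch: each engineer computes the hash independently
# and the two values must match exactly. Demonstrated on a throwaway file.
set -e
bundle=$(mktemp)
echo "pretend this is the airgap bundle" > "$bundle"

hash_staging=$(sha256sum "$bundle" | awk '{print $1}')   # engineer A, staging laptop
hash_ucs=$(sha256sum "$bundle" | awk '{print $1}')       # engineer B, UCS after copy

if [ "$hash_staging" = "$hash_ucs" ]; then
  echo "MATCH: bundle integrity verified"
else
  echo "MISMATCH: re-transport the bundle" >&2
  exit 1
fi
```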

Step 3 — Deploy on the UCS (~15 min)

Once the bundle is on the UCS (typically /media/usb/... or /tmp/):

sudo bash /tmp/scripts/airgap-deploy.sh \
  --bundle /tmp/tlsstress-airgap-v3.6.0.tar.gz

What happens — the script (in this order):

  1. Verifies the bundle's SHA-256 against the .sha256 sidecar
  2. Extracts to /opt/tlsstress-airgap/
  3. Verifies the internal manifest against the extracted files
  4. Sets up a local apt repository at /var/local/tlsstress-apt/ from the .deb files
  5. Installs host dependencies (iproute2, nftables, NFS client, SNMP, etc.) from the local repo
  6. Installs K3s offline using the staged binary + airgap images
  7. Pre-loads ALL container images into K3s' bundled containerd (no pull will happen later)
  8. Applies the Multus CNI manifest
  9. Installs cert-manager from the staged Helm chart
  10. Extracts the source tarball to /home/<user>/tlsstress/AI_forSE/

After completion, you are at the same state as step 1.3 of RUNBOOK_FIRST_INSTALL.md. Continue with:

cd ~/tlsstress
kubectl apply -f k8s/
kubectl apply -k platform/
kubectl apply -k personas/
# then proceed to step 2 of the standard runbook (DUT connection)

No image pulls will be attempted — every image is already in containerd, tagged with the names the K8s manifests expect.

What the bundle contains (for audit + compliance)

The bundle SHA-256 manifest (manifest/sha256sum.txt inside the tarball) lists every file. After extraction, verify with:

cd /opt/tlsstress-airgap
sha256sum -c manifest/sha256sum.txt

A pass means every file in the bundle matches what the staging machine computed. This is your forensic chain-of-custody for the install.
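
To see what a failure looks like, tamper with a file in a throwaway tree — `sha256sum -c` names the exact file that changed. A sketch (operating on a demo tree, not the real bundle):

```shell
# Demo: sha256sum -c pinpoints a corrupted file. Throwaway tree only.
set -e
demo=$(mktemp -d)
mkdir -p "$demo/manifest"
echo "original payload" > "$demo/k3s"
( cd "$demo" && sha256sum ./k3s > manifest/sha256sum.txt )

# Intact tree: verification passes (prints "./k3s: OK").
( cd "$demo" && sha256sum -c manifest/sha256sum.txt )

# Flip one byte: verification now reports "./k3s: FAILED".
echo "tampered" > "$demo/k3s"
( cd "$demo" && sha256sum -c manifest/sha256sum.txt ) \
  || echo "corruption detected in demo tree"
```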

The manifest/MANIFEST.txt file inside the bundle records:

  • Version tag installed
  • K3s, cert-manager, Multus versions
  • Total file count + size
  • Generation timestamp + hostname
  • License terms (PolyForm Noncommercial 1.0.0 + Appendix A)

Air-gapped operations after install

Once installed, the test bed itself runs without internet — it is fully self-contained:

  • Personas serve traffic to themselves (no external DNS, no external CDN)
  • Prometheus scrapes only local targets
  • Grafana renders only local data
  • The Dashboard's License Acceptance Modal fetches no external assets

What you DO NOT have in air-gap mode:

  • ❌ git pull for upgrades — produce a new bundle on the staging machine for each version upgrade
  • ❌ Live Dependabot security alerts — apply via apt-mirror snapshots periodically
  • ❌ External DNS for cloned personas — cloned personas need their target domains resolved internally; configure CoreDNS overrides or accept "best-effort" domain replay
  • ❌ NTP sync (often) — if the data center forbids public NTP, point K3s + the host at the lab's internal NTP server
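
For the CoreDNS-override option, one possible shape, assuming K3s's coredns-custom ConfigMap mechanism is available; the domain and service IP below are placeholders for a cloned persona's replayed target, not values shipped with this project:

```yaml
# Hypothetical CoreDNS override: answer queries for a cloned persona's
# replayed domain from a static hosts block instead of external DNS.
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns-custom
  namespace: kube-system
data:
  persona-overrides.server: |
    www.example.com:53 {
        hosts {
            10.42.0.50 www.example.com   # in-cluster persona service IP
            fallthrough
        }
    }
```

Apply with kubectl, then restart the coredns pod so the extra server block is loaded.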

Upgrades to a newer version

Re-run the same flow:

# On the staging laptop:
./scripts/airgap-stage.sh --version v3.7.0 --output /media/usb/tlsstress-airgap-v3.7.0.tar.gz

# On the UCS:
sudo bash scripts/airgap-deploy.sh --bundle /media/usb/tlsstress-airgap-v3.7.0.tar.gz
# the script is idempotent — pre-loaded images for v3.7.0 are added alongside v3.6.0
# After the script finishes, edit your manifests / kustomize overlays to point at the new tag

The deploy script is intentionally idempotent — running it twice is safe.

Troubleshooting

Stage failures (on your laptop)

  • docker pull fails with 401. Cause: not logged in to GHCR. Fix: echo $GH_TOKEN | docker login ghcr.io -u <user> --password-stdin
  • Helm chart pull fails. Cause: old Helm version. Fix: update to Helm 3.13+ (brew install helm or your distro's equivalent)
  • Source tarball 404. Cause: PAT lacks Contents: Read on AI_forSE. Fix: regenerate the fine-grained PAT with the right scope
  • apt-get download returns 0 packages. Cause: running on macOS without the docker run ubuntu:22.04 proxy. Fix: re-run inside a Linux container as instructed in the script
  • Disk full. Cause: bundle ~10 GB plus the Docker layer cache. Fix: clear with docker system prune -a between runs

Deploy failures (on the UCS)

  • sha256sum -c fails. Cause: bundle corrupted in transport. Fix: re-transport from the staging machine; verify the SHA on both sides
  • K3s does not start. Cause: airgap images path wrong / version mismatch. Fix: check that /var/lib/rancher/k3s/agent/images/ exists and that k3s --version matches
  • containerd image import fails. Cause: image tarball format mismatch. Fix: re-stage with the same Docker version as the deploy host
  • Pods stuck in ImagePullBackOff. Cause: image tag in a K8s manifest does not match what is loaded. Fix: edit the manifest tag, or re-stage with the same tag
  • Multus DaemonSet stuck Pending. Cause: node taints from K3s init. Fix: kubectl taint nodes --all node.kubernetes.io/not-ready-
  • cert-manager webhook timeouts. Cause: DNS resolution inside the cluster. Fix: cluster DNS settles after ~60 s; retry; if persistent, check the coredns logs

When you cannot use this method

A few scenarios need a different approach:

  • Operator forbidden from connecting laptop to internet at all — pre-staged bundles must be produced by a separate engineer in a different facility and shipped on physical media. Same scripts; different person runs them.
  • Site does not allow USB drives — substitute with the site's approved transfer mechanism (Cross-Domain Solution, courier-tape, …). The bundle is just a .tar.gz; transport agnostic.
  • Fully classified environment with no GitHub access on the staging side either — you need a one-time delivery from someone with access to GitHub. Pre-staged bundles can be re-bundled and signed by a chain of authorized custodians.
  • Multiple air-gapped sites needing simultaneous installs — produce one bundle, ship copies; or set up a local image registry inside the air-gap and have airgap-deploy.sh push to it instead of containerd directly (script does not do this today; future enhancement).