
Air-gapped installation — TLSStress.Art


Audience: Operators installing TLSStress.Art in a data center where the UCS host has no internet access — typical of regulated / classified / OOB environments.

Onboarding sequence: Access → Clone → Install (air-gap path; you are here) · standard alternative: Runbook First Install · maintainer setup: Private Repo Setup

Scope status (post-Scope-Freeze 2026-05-10) — Air-gap installs now leverage OBP (Operator Bridge Proxy) for one-shot internet egress via operator notebook (ADR 0022). Self-Upgrade Meraki-style channel polling supports air-gap fallback per ADR 0013. See Lab Deployment Staging primer for the 8-phase install wizard.

The problem

The standard install assumes the UCS can reach:

  • GitHub (to clone the source)
  • Docker Hub / GHCR / Quay (to pull container images)
  • Ubuntu apt mirrors (to install host dependencies)
  • get.helm.sh, k3s.io (to download Kubernetes)

In a fully air-gapped environment, none of these work. The operator must bring everything in by sneakernet.

The solution — pre-staging on a connected machine

You will use two machines:

  1. Staging machine — your laptop (or any host) with full internet access. Runs airgap-stage.sh. Produces a single ~5–10 GB tarball.
  2. Target UCS — the air-gapped host inside the data center. Runs airgap-deploy.sh after the bundle is transported in (USB drive, secure transfer, etc.).
┌──────────────────┐                ┌──────────────────────┐
│  Staging machine │                │  Air-gapped UCS      │
│  (your laptop)   │                │  (data center)       │
│                  │                │                      │
│  ./airgap-stage  │  ┌──USB──┐    │  ./airgap-deploy     │
│  →  bundle.tar.gz├─→│ 16 GB ├───→│  ←  bundle.tar.gz    │
│                  │  └───────┘    │                      │
│  full internet   │                │  zero internet       │
└──────────────────┘                └──────────────────────┘

Prerequisites

On the staging machine

  • bash 4+, curl, jq, sha256sum, tar
  • Docker (or Podman with docker-cli wrapper) — used to pull/save container images
  • Helm 3.x — for cert-manager chart download
  • A GitHub fine-grained PAT with Contents: Read on nollagluiz/AI_forSE — exported as GH_TOKEN
  • ~15 GB free disk space (working set + final tarball)
  • ~30 minutes of bandwidth-dependent download time
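
A quick preflight check catches a missing tool before the half-hour download begins. A minimal sketch (this helper is illustrative, not part of airgap-stage.sh; the tool list mirrors the prerequisites above):

```shell
#!/usr/bin/env bash
# Hypothetical preflight helper: prints one line per missing tool,
# prints nothing when everything is present.
check_tools() {
  local t
  for t in "$@"; do
    command -v "$t" >/dev/null 2>&1 || echo "MISSING: $t"
  done
}

# Check the staging-machine prerequisite list before starting the stage run:
check_tools bash curl jq sha256sum tar docker helm

# GH_TOKEN must also be exported for the source-tarball download:
[ -n "${GH_TOKEN:-}" ] || echo "MISSING: GH_TOKEN env var"
```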

On the air-gapped UCS

  • Ubuntu 22.04+ Server, fresh install
  • root or sudo access
  • 32 GB RAM, 16 vCPU, 200 GB disk (same as standard install)
  • USB port or other transport for the bundle

Hardware in the data center (no change from standard runbook)

  • Cisco Nexus 9000 with management network
  • NGFW DUT (Cisco FTD/ASA/Firepower, Palo Alto, Fortinet, etc.)
  • Required VLAN trunks, IPs, and the NGFW CA cert in PEM format

Step 1 — Stage the bundle on your laptop (~30 min)

# On your connected machine:
git clone https://github.com/nollagluiz/AI_forSE.git
cd AI_forSE
export GH_TOKEN="<your-fine-grained-PAT>"

./scripts/airgap-stage.sh \
  --version v3.6.0 \
  --output /media/usb/tlsstress-airgap-v3.6.0.tar.gz

What happens — the script (in this order):

  1. Downloads the K3s binary + airgap images (v1.30.5+k3s1 by default)
  2. Pulls 9 TLSStress.Art container images (agent, dashboard, webserver, …) from GHCR
  3. Pulls 15 third-party images (Postgres, Prometheus, Grafana, Loki, etc.)
  4. Downloads the cert-manager Helm chart + 3 cert-manager images
  5. Downloads the Multus CNI manifest + image
  6. Stages 11 apt .deb packages (iproute2, nftables, NFS, SNMP, etc.) using a one-shot Ubuntu 22.04 container
  7. Downloads the source-code tarball at the requested version tag
  8. Generates a SHA-256 manifest of every file and seals everything into one .tar.gz
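
The final manifest-and-seal step follows a standard hash-then-tar pattern. A minimal sketch of that shape on a throwaway tree (the paths and file names are illustrative, not the script's actual layout):

```shell
# Illustrative sketch of the manifest-and-seal pattern (step 8);
# the real script's directory layout may differ.
set -e
work=$(mktemp -d)
mkdir -p "$work/stage/images" "$work/stage/manifest"
echo "fake-image-layer" > "$work/stage/images/agent.tar"
echo "fake-k3s-binary"  > "$work/stage/k3s"

# Hash every staged file (except the manifest itself) into one list...
( cd "$work/stage" && \
  find . -type f ! -path './manifest/*' -exec sha256sum {} + \
    > manifest/sha256sum.txt )

# ...then seal the tree into one tarball plus a sidecar hash of the bundle.
tar -C "$work/stage" -czf "$work/bundle.tar.gz" .
sha256sum "$work/bundle.tar.gz" > "$work/bundle.tar.gz.sha256"
```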

Final output:

  • <output> — the bundle (typically 5–10 GB)
  • <output>.sha256 — sidecar file with the bundle's SHA-256 (for integrity verification later)

Step 2 — Transport the bundle (whatever your policy allows)

USB drive, encrypted external SSD, secure file-transfer station — whatever your data-center policy permits. Prefer USB drives that have been wiped + dedicated to this purpose to avoid cross-contamination.

If your data-center policy requires two-person integrity for media crossing the air gap: have a second engineer compute sha256sum tlsstress-airgap-v3.6.0.tar.gz on the staging laptop and on the UCS after copy. Both numbers must match.
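
The comparison itself is a one-liner on each side. A sketch of the check, demonstrated here on a throwaway file rather than the real bundle:

```shell
# Two-person integrity sketch: each engineer computes the hash independently
# and the two values must match exactly. Demonstrated on a throwaway file.
set -e
bundle=$(mktemp)
echo "pretend this is the airgap bundle" > "$bundle"

hash_staging=$(sha256sum "$bundle" | awk '{print $1}')   # engineer A, staging laptop
hash_ucs=$(sha256sum "$bundle" | awk '{print $1}')       # engineer B, UCS after copy

if [ "$hash_staging" = "$hash_ucs" ]; then
  echo "MATCH: bundle integrity verified"
else
  echo "MISMATCH: re-transport the bundle" >&2
  exit 1
fi
```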

Step 3 — Deploy on the UCS (~15 min)

Once the bundle is on the UCS (typically /media/usb/... or /tmp/):

sudo bash /tmp/scripts/airgap-deploy.sh \
  --bundle /tmp/tlsstress-airgap-v3.6.0.tar.gz

What happens — the script (in this order):

  1. Verifies the bundle's SHA-256 against the .sha256 sidecar
  2. Extracts to /opt/tlsstress-airgap/
  3. Verifies the internal manifest against the extracted files
  4. Sets up a local apt repository at /var/local/tlsstress-apt/ from the .deb files
  5. Installs host dependencies (iproute2, nftables, NFS client, SNMP, etc.) from the local repo
  6. Installs K3s offline using the staged binary + airgap images
  7. Pre-loads ALL container images into K3s' bundled containerd (no pull will happen later)
  8. Applies the Multus CNI manifest
  9. Installs cert-manager from the staged Helm chart
  10. Extracts the source tarball to /home/<user>/tlsstress/AI_forSE/

After completion, you are at the same state as step 1.3 of RUNBOOK_FIRST_INSTALL.md. Continue with:

cd ~/tlsstress
kubectl apply -f k8s/
kubectl apply -k platform/
kubectl apply -k personas/
# then proceed to step 2 of the standard runbook (DUT connection)

No image pulls will be attempted — every image is already in containerd, tagged with the names the K8s manifests expect.

What the bundle contains (for audit + compliance)

The bundle SHA-256 manifest (manifest/sha256sum.txt inside the tarball) lists every file. After extraction, verify with:

cd /opt/tlsstress-airgap
sha256sum -c manifest/sha256sum.txt

A pass means every file in the bundle matches what the staging machine computed. This is your forensic chain-of-custody for the install.
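
To see what a failure looks like, tamper with a file in a throwaway tree — `sha256sum -c` names the exact file that changed. A sketch (operating on a demo tree, not the real bundle):

```shell
# Demo: sha256sum -c pinpoints a corrupted file. Throwaway tree only.
set -e
demo=$(mktemp -d)
mkdir -p "$demo/manifest"
echo "original payload" > "$demo/k3s"
( cd "$demo" && sha256sum ./k3s > manifest/sha256sum.txt )

# Intact tree: verification passes (prints "./k3s: OK").
( cd "$demo" && sha256sum -c manifest/sha256sum.txt )

# Flip one byte: verification now reports "./k3s: FAILED".
echo "tampered" > "$demo/k3s"
( cd "$demo" && sha256sum -c manifest/sha256sum.txt ) \
  || echo "corruption detected in demo tree"
```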

The manifest/MANIFEST.txt file inside the bundle records:

  • Version tag installed
  • K3s, cert-manager, Multus versions
  • Total file count + size
  • Generation timestamp + hostname
  • License terms (PolyForm Noncommercial 1.0.0 + Appendix A)

Air-gapped operations after install

Once installed, the test bed itself runs without internet — it is fully self-contained:

  • Personas serve traffic to themselves (no external DNS, no external CDN)
  • Prometheus scrapes only local targets
  • Grafana renders only local data
  • The Dashboard's License Acceptance Modal fetches no external assets

What you DO NOT have in air-gap mode:

  • ❌ git pull for upgrades — produce a new bundle on the staging machine for each version upgrade
  • ❌ Live Dependabot security alerts — apply via apt-mirror snapshots periodically
  • ❌ External DNS for cloned personas — cloned personas need their target domains resolved internally; configure CoreDNS overrides or accept "best-effort" domain replay
  • ❌ NTP sync (often) — if the data center forbids public NTP, point K3s + the host at the lab's internal NTP server
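
For the CoreDNS-override option, one possible shape, assuming K3s's coredns-custom ConfigMap mechanism is available; the domain and service IP below are placeholders for a cloned persona's replayed target, not values shipped with this project:

```yaml
# Hypothetical CoreDNS override: answer queries for a cloned persona's
# replayed domain from a static hosts block instead of external DNS.
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns-custom
  namespace: kube-system
data:
  persona-overrides.server: |
    www.example.com:53 {
        hosts {
            10.42.0.50 www.example.com   # in-cluster persona service IP
            fallthrough
        }
    }
```

Apply with kubectl, then restart the coredns pod so the extra server block is loaded.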

Upgrades to a newer version

Re-run the same flow:

# On the staging laptop:
./scripts/airgap-stage.sh --version v3.7.0 --output /media/usb/tlsstress-airgap-v3.7.0.tar.gz

# On the UCS:
sudo bash scripts/airgap-deploy.sh --bundle /media/usb/tlsstress-airgap-v3.7.0.tar.gz
# the script is idempotent — pre-loaded images for v3.7.0 are added alongside v3.6.0
# After the script finishes, edit your manifests / kustomize overlays to point at the new tag

The deploy script is intentionally idempotent — running it twice is safe.

Troubleshooting

Stage failures (on your laptop)

  • docker pull fails with 401. Cause: not logged in to GHCR. Fix: echo $GH_TOKEN | docker login ghcr.io -u <user> --password-stdin
  • Helm chart pull fails. Cause: old Helm version. Fix: update to Helm 3.13+ (brew install helm or your distro's equivalent)
  • Source tarball 404. Cause: PAT lacks Contents: Read on AI_forSE. Fix: regenerate the fine-grained PAT with the right scope
  • apt-get download returns 0 packages. Cause: running on macOS without the docker run ubuntu:22.04 proxy. Fix: re-run inside a Linux container as instructed in the script
  • Disk full. Cause: bundle ~10 GB plus the Docker layer cache. Fix: clear with docker system prune -a between runs

Deploy failures (on the UCS)

  • sha256sum -c fails. Cause: bundle corrupted in transport. Fix: re-transport from the staging machine; verify the SHA on both sides
  • K3s does not start. Cause: airgap images path wrong / version mismatch. Fix: check that /var/lib/rancher/k3s/agent/images/ exists and that k3s --version matches
  • containerd image import fails. Cause: image tarball format mismatch. Fix: re-stage with the same Docker version as the deploy host
  • Pods stuck in ImagePullBackOff. Cause: image tag in a K8s manifest does not match what is loaded. Fix: edit the manifest tag, or re-stage with the same tag
  • Multus DaemonSet stuck Pending. Cause: node taints from K3s init. Fix: kubectl taint nodes --all node.kubernetes.io/not-ready-
  • cert-manager webhook timeouts. Cause: DNS resolution inside the cluster. Fix: cluster DNS settles after ~60 s; retry; if persistent, check the coredns logs

When you cannot use this method

A few scenarios need a different approach:

  • Operator forbidden from connecting laptop to internet at all — pre-staged bundles must be produced by a separate engineer in a different facility and shipped on physical media. Same scripts; different person runs them.
  • Site does not allow USB drives — substitute with the site's approved transfer mechanism (Cross-Domain Solution, courier-tape, …). The bundle is just a .tar.gz; transport agnostic.
  • Fully classified environment with no GitHub access on the staging side either — you need a one-time delivery from someone with access to GitHub. Pre-staged bundles can be re-bundled and signed by a chain of authorized custodians.
  • Multiple air-gapped sites needing simultaneous installs — produce one bundle, ship copies; or set up a local image registry inside the air-gap and have airgap-deploy.sh push to it instead of containerd directly (script does not do this today; future enhancement).