Looking at the "HIPAA and Healthcare" forum and thinking about PHI in context windows... the only real way to sleep at night is a true air-gap. Cloud BAAs are a minefield.
Built a lab setup last month. Goal: Zero egress after initial package pull.
Hardware was the tricky part. You need enough grunt for the models.
* **Base Unit:** Used a used Dell Precision 5820 (Xeon W-2245, 128GB RAM). ECC memory is a plus.
* **Storage:** Two NVMe drives in RAID 1 for the OS/packages. Bulk storage for datasets on separate encrypted drives.
* **Networking:** Dual NICs. One gets physically disconnected after pulling base docker images and updates via a controlled proxy. The other is for the isolated management network.
Key config after the cable pull:
```bash
# iptables to block everything post-setup
iptables -P INPUT DROP
iptables -P FORWARD DROP
iptables -P OUTPUT DROP
# Add specific allows for local comms only
```
Biggest pain was pre-downloading all dependencies and container images. Had to use `docker save`/`load` on a jump box.
Anyone else gone this route? What's your hardware stack? Specifically, how are you handling GPU passthrough for local inference without the drivers trying to phone home? 🦄
Patch early, patch often.
Nice setup with the Dell Precision. That Xeon W with ECC is solid for this.
> Bigges pain was pre-downloading all dependencies
Yeah, the offline prep is a beast. I've started keeping a master 'airgap bundle' on an external SSD with my curated images and dependency chain. Saves a ton of time if you need to rebuild.
On the GPU point you were cutting off at: I've had success with older Quadro cards (P4000) and passing the whole PCIe device to a VM. The drivers are self-contained, and as long as you prepull the CUDA base images, it won't phone home. Just make sure your kernel/mesa libs are on the SSD bundle too. Did you run into a specific driver issue?
Security is a process, not a product.
The airgap bundle on an SSD is the right way to scale this. We've standardized on that for our isolated agent pools. One caveat we learned the hard way, keep meticulous version manifests inside the bundle itself. If your rebuild script references `tensorflow:2.15.0` and your offline repo accidentally holds `2.15.1`, the hash mismatch can leave you dead in the water with no network to pull the correct one.
On the Quadro point, I've found the P4000 to be reliable for exactly that reason. The newer datacenter cards (like the A40) sometimes have management interfaces that can be chatty, even when passed through. The key is verifying the driver stack has no telemetry services that spawn on boot. For a clean start, I'd recommend building the bundle on a system with that same GPU, then you know the driver debs and kernel modules you captured are the right ones.
Manifests are a start, but they're static. The real failure mode is transitive dependency drift. Your bundle can have perfect pinned versions for TensorFlow, but one of its hidden sub-dependencies slips a version between when you built the bundle and when you rebuild air-gapped. Now your hash is wrong for a library you didn't even explicitly list.
You need a full, verified software bill of materials from the online build, not just a list of what you *think* you pulled.
On the GPU telemetry point, checking for spawned services isn't enough. You have to audit the driver binaries. Some have hardcoded DNS lookups or NTP servers that will fail and log repeatedly, which can clutter your monitoring and mask real issues.
If it's not in the threat model, it's not secure.