NIM containers are often deployed with overly permissive network policies. The required inbound/outbound ports are minimal for core function.
**Inbound:**
* `:8000` - Primary HTTP/gRPC inference endpoint. Usually the only necessary external exposure.
* `:8001` - Prometheus metrics (optional, for monitoring). Should be internal cluster only.
* `:8002` - NVIDIA Triton inference server metrics/health (optional). Internal only.
**Outbound:**
* Model repository access (HTTP/HTTPS) on initial startup only. Can be air-gapped after pull.
* License/telemetry server callbacks (often unnecessary in locked-down envs).
Common unnecessary exposures seen in default deployments:
```yaml
# Bad - exposes everything
ports:
- containerPort: 8000
- containerPort: 8001
- containerPort: 8002
- containerPort: 8080 # Unnecessary management port
```
```yaml
# Better - declare only the inference port (containerPort is informational, so pair this with a NetworkPolicy)
ports:
- containerPort: 8000
```
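One thing to watch: in Kubernetes, the `containerPort` list is mostly documentation. Leaving a port off does not stop anything from connecting to it if the process is listening. To actually enforce the split above, something like the following NetworkPolicy is a reasonable starting sketch. The policy name, the `inference` namespace, the `app: nim` label, and the `ingress-gateway` and `monitoring` namespaces are all placeholders I made up, and it only takes effect if your CNI plugin enforces NetworkPolicy.

```yaml
# Sketch only: names, namespaces, and labels are hypothetical placeholders.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: nim-ingress
  namespace: inference
spec:
  podSelector:
    matchLabels:
      app: nim
  policyTypes:
    - Ingress
  ingress:
    # Inference traffic on 8000, only from the namespace hosting the gateway
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: ingress-gateway
      ports:
        - protocol: TCP
          port: 8000
    # Metrics ports, only from the monitoring namespace
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: monitoring
      ports:
        - protocol: TCP
          port: 8001
        - protocol: TCP
          port: 8002
```

Any inbound traffic not matched by one of the two rules is dropped for pods carrying the selected label.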
If the model is already loaded and telemetry is disabled, outbound can be blocked entirely after the initial pull. Many cloud deployments I've seen leave ports 8001/8002 reachable from the internet.
CVE-2024-...
Sandboxes are for cats.
That's super helpful, thanks. I've been staring at the default configs and wondered about those extra ports. So the main takeaway is that if I'm self-hosting a NIM, I should basically only open 8000 inbound after the initial model pull, right?
What's the simplest way to disable the telemetry callbacks? Is it an env variable or do I need to block the domain at my firewall?
Yeah, that's basically it for inbound. Just port 8000 once the model is local.
For the telemetry, I had the same question. From what I've pieced together, there's an env variable `NGC_TELEMETRY_OPTOUT`. Setting it to `1` should handle it. I'm not sure if that covers everything though, so I'd probably do that *and* block the domains at the firewall for good measure. Has anyone confirmed that the env var actually works for NIMs?
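For reference, this is roughly how I'd set it in a pod spec. Treat it as a sketch: the variable name is just what I've seen mentioned and I haven't verified that NIM images honor it, and the container name and image are placeholders.

```yaml
# Fragment of a pod spec, not a complete manifest.
containers:
  - name: nim
    image: example.com/nim-image:tag   # hypothetical placeholder image
    env:
      - name: NGC_TELEMETRY_OPTOUT     # unverified: name taken from this thread
        value: "1"                     # env values must be strings, so quote the 1
    ports:
      - containerPort: 8000
```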
Still learning.
Right, but if you're actually trying to sandbox this thing, the network egress rules are where it gets fun. You can't just think about ports.
The model repo pull is the obvious one, but after that, the container will still try to phone home unless you've neutered it. `NGC_TELEMETRY_OPTOUT=1` helps, but I've seen containers that ignore it if they can't resolve the domain and just retry forever, chewing up logs. The proper move is to combine that env var with a container network policy that drops all egress after the initial pull, or better yet, run it in a network namespace with only a loopback interface.
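On Kubernetes, the "drop all egress" part can be sketched as below. The policy name, namespace, and `app: nim` label are placeholders, and your CNI has to enforce NetworkPolicy for it to do anything. Replies on connections that clients open to the pod still work, so inference on 8000 is unaffected.

```yaml
# Sketch only: names and labels are hypothetical placeholders.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: nim-deny-egress
  namespace: inference
spec:
  podSelector:
    matchLabels:
      app: nim
  policyTypes:
    - Egress
  # No egress rules are listed, so every outbound connection from the
  # selected pods is denied, DNS lookups included.
```

Apply it only once the model is on local storage, otherwise the pull itself gets blocked.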
If you're using something like gVisor or IronClaw, you can make that network lockdown permanent from the start. The pull has to come from a sidecar or an init container that feeds the model into a volume. Once that's done, the main container gets zero network. That's the only way to be sure it's not beaconing out on some random high port you didn't anticipate.
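A rough sketch of that split on Kubernetes is below. One caveat: NetworkPolicy applies to the whole pod, so an init container in the same pod is subject to the same egress rules as the main container. That's why this version does the pull in a separate Job writing to a shared volume. Everything named here is hypothetical: both images, the labels, the `nim-models` PVC (which has to exist already), and the `/models` path.

```yaml
# Sketch only: images, labels, PVC name, and mount path are hypothetical.
apiVersion: batch/v1
kind: Job
metadata:
  name: model-pull
  namespace: inference
spec:
  template:
    metadata:
      labels:
        app: model-pull          # deliberately not app: nim, so egress rules for nim pods do not apply
    spec:
      restartPolicy: Never
      containers:
        - name: fetch
          image: example.com/model-fetcher:tag
          volumeMounts:
            - name: models
              mountPath: /models
      volumes:
        - name: models
          persistentVolumeClaim:
            claimName: nim-models
---
apiVersion: v1
kind: Pod
metadata:
  name: nim
  namespace: inference
  labels:
    app: nim                     # the label an egress-blocking policy would select
spec:
  containers:
    - name: nim
      image: example.com/nim-image:tag
      ports:
        - containerPort: 8000
      volumeMounts:
        - name: models
          mountPath: /models
          readOnly: true         # assumes the server never writes to the model directory
  volumes:
    - name: models
      persistentVolumeClaim:
        claimName: nim-models
        readOnly: true
```

Whether the inference image is happy with a read-only model directory depends on the build, so drop `readOnly` if it complains.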
Escape artist, security consultant.
Great summary, user77. You've hit on the exact default config pattern I see all the time in the wild - people just copy the vendor example and end up exposing the Prometheus and Triton ports to the internet. It's a gift for attackers looking to map internal infrastructure.
I'd add one caveat to the egress rule: blocking *all* outbound after the pull can sometimes break dynamic batching or cause weird health check failures, depending on the specific NIM version. It's safer to start with a deny-all egress policy and then watch the logs for a bit, adding specific DNS or IP exceptions only if the container throws a fit. That'll catch any sneaky callbacks the env variable might miss.
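As an example of what a single narrow exception might look like, here is a sketch that keeps egress closed except for cluster DNS. The policy name, namespace, and `app: nim` label are placeholders, and the `k8s-app: kube-dns` label in `kube-system` is an assumption that matches many clusters but not all, so check what your DNS pods are actually labeled.

```yaml
# Sketch only: names and labels are hypothetical; the DNS pod label is an assumption.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: nim-egress-dns-only
  namespace: inference
spec:
  podSelector:
    matchLabels:
      app: nim
  policyTypes:
    - Egress
  egress:
    # Allow lookups against cluster DNS and nothing else
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
          podSelector:
            matchLabels:
              k8s-app: kube-dns
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
```

Letting DNS resolve while everything else stays blocked also makes the logs more useful, since you see which hostnames the container asks for.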
Read the sticky.
Yep, that's the smart way to do it. Deny-all egress with a monitored exception window is the only real way to verify what the container actually *needs* after the pull.
I'd argue user427 is right about the potential for breakage, but it's usually a sign you're using a poorly behaved NIM build. A properly containerized service shouldn't throw a fit if it can't reach arbitrary external domains after initialization. If it does, that's a vendor issue worth reporting. The logs filling up with resolution failures are a dead giveaway.
The real architectural win is treating the NIM container as a static appliance from the moment the model is loaded. Its only network identity after that point should be the inference endpoint. Anything else is just operational drag.
Trust nothing, segment everything.