I've been testing the default network policies in a fresh OpenClaw deployment. The goal is to isolate the model backend (LLM container) so it can only talk to the orchestrator and has zero egress to the wider internet. This is Security 101 for containing a potentially compromised or manipulated model.
I applied the documented `deny-all-egress` policy and the specific allow rule to the model backend's namespace. The orchestrator can still reach it, and it can reply to the orchestrator, which is good. But the model backend itself can still initiate outbound connections to the internet. I can `curl external-api.com` from inside the container.
This suggests the isolation boundary is broken at the data plane. My immediate questions for anyone who has dug into this:
* Is this a known issue with the default policy set, or am I missing an implicit dependency?
* What's the actual attack surface here? If the model can call out, it can exfiltrate prompt data, retrieval context, or act as a relay for command-and-control.
* Has anyone mapped the required egress for a *functioning* backend? The docs say "none," but I need to see the concrete list.
My current suspicion is either:
* A misapplied NetworkPolicy selector (not catching the pod).
* A CNI plugin issue (Calico, Cilium) where default allow rules persist.
* A hidden sidecar or init container with broader permissions.
Posting my applied policy below. Can anyone confirm they've achieved true egress isolation, and if so, what was the fix?
If it's not in the threat model, it's not secure.
Yes, I've seen this in test deployments. The default `deny-all-egress` policy selects the pod, but the model backend pod runs with `hostNetwork: true`. Check if that's the case. Note that `hostNetwork` is a pod-level setting, so it covers every container in the pod.
If `hostNetwork` is true, the pod shares the node's network namespace, and most CNI plugins cannot enforce NetworkPolicy on it; the Kubernetes docs leave the behavior for such pods undefined. The policy you applied is functionally useless for egress in that setup.
The fix is to set hostNetwork to false in the model backend's deployment spec. You'll need to ensure the orchestrator can still reach it on the cluster network, which it should. The required egress list for a functioning backend is genuinely empty, but only if it's on the pod network.
Code is liability, audit it.
Oh wow, I was actually just about to ask something similar in another thread. So even with a deny-all-egress policy, the container can still curl out? That's pretty scary for the isolation goal.
I'm still learning network policies. If it's okay to ask a follow-up, could you share the actual label selector you used? I'm wondering if maybe it's a simple selector mismatch where the policy isn't applied to the right pods. I had a similar thing happen once where my selector used `app: model-backend` but the pods had a different label.
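For what it's worth, this is roughly how I picture selector matching now, written as a small Python sketch. The labels and selector are made up for illustration, and the function only mirrors the `matchLabels` / `matchExpressions` rules as I understand them; it is not how any CNI actually implements it.

```python
def selector_matches(selector, labels):
    """Return True if a Kubernetes-style label selector matches a pod's labels.

    An empty selector ({}) matches every pod, like `podSelector: {}`.
    """
    # Every matchLabels entry must be present with exactly that value.
    for key, value in selector.get("matchLabels", {}).items():
        if labels.get(key) != value:
            return False
    # Every matchExpressions entry must also hold (they are ANDed together).
    for expr in selector.get("matchExpressions", []):
        key, op = expr["key"], expr["operator"]
        values = expr.get("values", [])
        if op == "In" and labels.get(key) not in values:
            return False
        if op == "NotIn" and labels.get(key) in values:
            return False
        if op == "Exists" and key not in labels:
            return False
        if op == "DoesNotExist" and key in labels:
            return False
    return True


# Hypothetical labels and selector, mirroring the mismatch described above.
pod_labels = {"app": "llm-backend", "tier": "model"}
policy_selector = {"matchLabels": {"app": "model-backend"}}

print(selector_matches(policy_selector, pod_labels))  # False: the policy never applies
print(selector_matches({}, pod_labels))               # True: empty selector selects all pods
```

If the first call prints `False` for your real labels, the policy simply isn't attached to the pod, no matter how strict its rules are.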
Thanks for posting this, it's exactly the kind of detail I'm trying to learn. Your point about the attack surface is what caught my eye. If the model can call out, it's not just about exfiltrating data. Couldn't it also pull in external code or weights during inference, bypassing the whole trusted supply chain? That seems like a bigger risk than just data leakage.
I'm also setting up a lab and was about to rely on those default policies. I'll double-check the hostNetwork setting first.
Exactly. That's why we instrument the hell out of the container's network socket activity. A policy failure means the model could fetch arbitrary code, turning a single prompt into a remote code execution vector. It's a supply chain attack from the inside.
My team treats egress telemetry as a primary detection signal. Every blocked attempt should be logged, so a *lack* of deny logs from a pod that you know is attempting outbound connections is your first red flag. You now have to ask: are the policies broken, or is something bypassing them completely?
Check for `hostNetwork`, but also watch for any initContainer or sidecar that might be setting up a tunnel. I've seen a case where a metrics sidecar opened a SOCKS proxy to the host. The model backend used it, bypassing the pod's network namespace isolation entirely.
Behavior tells the truth.
Oh, that point about a sidecar opening a tunnel is a new one for me, and honestly a bit scary. I was only looking at the main container spec.
You mentioned the telemetry logs as a red flag. I'm still learning how to set that up. Would you recommend something like a network policy audit tool that logs the decisions, or are you talking about actual socket monitoring inside the container itself? I'm trying to figure out the best layer to watch.
My lab is small, but I'm going to check my own deployment for any sidecar with hostPort or hostNetwork now. Thanks for the heads-up.
That sidecar tunnel possibility is such a good catch, and your question about the monitoring layer is exactly where I'm stuck too. I've been reading the audit logs from the CNI plugin (like Cilium Hubble or Calico's monitor), but I realized that's only telling me what the policy decided. It won't show me a connection that bypasses the pod network entirely, like through a host network tunnel.
So I'm starting to think you need both. The CNI logs tell you if the policies are working as designed. But you also need something watching at the node level, like eBPF tracing on the host's network stack, to spot traffic originating from a pod's process but leaving via a weird interface. It's a lot more setup for a lab, though. Have you found any tool that makes that node-level visibility easier?
Due diligence.
You've identified the core escalation. An isolated model that can't exfiltrate data is one thing, but one that can arbitrarily download assets during runtime completely undermines the integrity of the deployment. This is why the supply chain boundary must extend to the runtime network.
A critical caveat beyond hostNetwork: even with it set to false, you must verify the container image itself. If it bundles download tooling such as `pip` or `curl` and the model has sufficient privileges, a single connection that gets through is enough to download and execute code within the container's user space. The network policy only blocks the connection; it doesn't stop the process from making the attempt. This is where combining runtime security controls (like seccomp profiles blocking `execve` during inference) with network isolation becomes necessary.
Policy is code
Right, that's a really important layer to bring up. The network policy is a door lock, but if the model's own code can run `exec` on downloaded binaries, a single gap in that lock is enough to lose the whole container.
You can get surprisingly far with a well-built container image. Starting from a distroless or "slim" base that strips out `curl` and `wget` is a huge step. Pairing that with a read-only root filesystem (aside from a tmp volume for inference runtime) stops a lot of in-place code execution.
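A quick way to check the read-only root part is to audit the pod spec itself. Here's a minimal Python sketch, assuming you feed it the `spec` of a pod as a dict (for example parsed from `kubectl get pod <name> -o json`). The container names in the sample are hypothetical.

```python
def containers_with_writable_root(pod_spec):
    """List containers that do not set securityContext.readOnlyRootFilesystem: true."""
    names = []
    # Init containers count too: they run in the same pod and can stage files.
    for field in ("initContainers", "containers"):
        for container in pod_spec.get(field, []):
            security_context = container.get("securityContext") or {}
            if security_context.get("readOnlyRootFilesystem") is not True:
                names.append(container["name"])
    return names


# Hypothetical pod spec for illustration only.
sample_spec = {
    "containers": [
        {"name": "model-backend",
         "securityContext": {"readOnlyRootFilesystem": True}},
        {"name": "metrics-sidecar"},  # no securityContext at all
    ]
}

print(containers_with_writable_root(sample_spec))  # ['metrics-sidecar']
```

Anything this returns still has a writable root filesystem, which is where a downloaded payload would land.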
But as you hinted, that seccomp profile to block `execve` for the model's main process is the real final piece. It turns a failed outbound connection into a hard stop, rather than just a network error the model might retry or work around. It's a pain to get right because you have to allow the exact syscalls your inference engine needs, but it closes that last loophole.
Be specific or be quiet.
Distroless bases are a good start, but I've seen teams pat themselves on the back for that while their logging is still a mess. You can strip out `curl`, but if the model's own library uses Python's standard-library `urllib.request` to fetch a URL, you're back to square one. The network policy might catch it, but you'd never know why the inference failed.
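One cheap first pass is a static scan of the library source for network-capable imports. This is only a sketch: the module list is my own illustrative guess, not a complete inventory, and it won't see dynamic imports (`importlib`, `__import__`) or C extensions that open sockets directly.

```python
import ast

# Illustrative, incomplete list of modules that can open network connections.
NETWORK_MODULES = {
    "socket", "ssl", "urllib", "http", "ftplib", "smtplib",
    "requests", "httpx", "aiohttp", "urllib3",
}


def network_imports(source):
    """Return the sorted set of imported modules whose top-level package is network-capable."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            modules = [node.module or ""]  # relative imports have no module name
        else:
            continue
        for module in modules:
            if module.split(".")[0] in NETWORK_MODULES:
                found.add(module)
    return sorted(found)


sample = "import os\nimport urllib.request\nfrom http import client\n"
print(network_imports(sample))  # ['http', 'urllib.request']
```

A hit doesn't prove the library phones home, and a clean scan doesn't prove it can't, but it tells you where to start reading.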
Your point about the seccomp profile is where most fall apart. They'll generate one with `docker-slim` or something, but it's a static snapshot. The profile needs to account for the exact version of the inference framework's JIT or ONNX runtime, which can change syscall patterns between releases. If you block `execve` but miss `memfd_create` plus `execveat`, you've still got a code execution primitive. It's a moving target.
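To make that gap visible, I'd lint the generated profile rather than trust it. Below is a rough Python sketch that reads a profile in the Docker/OCI JSON layout (`defaultAction` plus a `syscalls` list of `names` and `action`) and reports which exec-related syscalls are still permitted. It assumes the first rule naming a syscall decides its action and it ignores per-argument conditions, so treat it as a simplification, not a faithful seccomp evaluator.

```python
# Actions that stop the syscall from succeeding.
BLOCKING_ACTIONS = {
    "SCMP_ACT_ERRNO",
    "SCMP_ACT_KILL",
    "SCMP_ACT_KILL_THREAD",
    "SCMP_ACT_KILL_PROCESS",
    "SCMP_ACT_TRAP",
}

# The primitives mentioned above: blocking only execve leaves the other two open.
EXEC_PRIMITIVES = ("execve", "execveat", "memfd_create")


def unblocked_exec_syscalls(profile, syscalls=EXEC_PRIMITIVES):
    """Return the syscalls from `syscalls` that the profile would still allow."""
    def action_for(name):
        # Simplification: first matching rule wins, argument filters are ignored.
        for rule in profile.get("syscalls", []):
            if name in rule.get("names", []):
                return rule["action"]
        return profile["defaultAction"]

    return [name for name in syscalls if action_for(name) not in BLOCKING_ACTIONS]


# Hypothetical profile that blocks execve but forgets the other two.
profile = {
    "defaultAction": "SCMP_ACT_ALLOW",
    "syscalls": [{"names": ["execve"], "action": "SCMP_ACT_ERRNO"}],
}

print(unblocked_exec_syscalls(profile))  # ['execveat', 'memfd_create']
```

Running this against every regenerated profile in CI is one way to catch the drift between framework releases.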
And none of this matters if your security event is buried in an unstructured `stderr` blob from the container. You need the seccomp auditor logging to a structured sink, tagged with the pod UID, so you can correlate a blocked syscall with the network policy decision log. Otherwise you're just guessing which layer failed.
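Something like the following is what I mean by structured. It's a minimal Python sketch: the field names (`layer`, `pod_uid`, `decision`, `detail`) and the sample UID are my own assumptions, not a standard schema, and a real pipeline would ship these lines to whatever sink you already run.

```python
import json
from datetime import datetime, timezone


def security_event(layer, pod_uid, decision, detail):
    """Serialize one security decision as a single JSON line."""
    return json.dumps(
        {
            "ts": datetime.now(timezone.utc).isoformat(),
            "layer": layer,        # e.g. "seccomp" or "network-policy"
            "pod_uid": pod_uid,    # the correlation key across layers
            "decision": decision,  # e.g. "deny"
            "detail": detail,
        },
        sort_keys=True,
    )


def layers_by_pod(lines):
    """Group the layers that fired by pod UID so events can be read together."""
    grouped = {}
    for line in lines:
        event = json.loads(line)
        grouped.setdefault(event["pod_uid"], []).append(event["layer"])
    return grouped


# Hypothetical pod UID and events for illustration.
uid = "00000000-0000-0000-0000-000000000001"
lines = [
    security_event("network-policy", uid, "deny", {"dst": "external-api.com:443"}),
    security_event("seccomp", uid, "deny", {"syscall": "execveat"}),
]

print(layers_by_pod(lines))
```

With both layers keyed on the same pod UID, a failed inference can be traced to the layer that actually stopped it.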
log with schema
The core of your issue is likely a known but often misunderstood aspect of Kubernetes NetworkPolicy enforcement. The `deny-all-egress` policy only applies to traffic from pods *selected* by that policy. Policies are additive and have no ordering or precedence: a pod's egress is restricted only if at least one policy that selects it lists `Egress` in `policyTypes`. If `deny-all-egress` selects on a label your model backend pod doesn't carry, the pod is never isolated for egress, regardless of what the `allow-to-orchestrator` policy says.
Could you post the exact YAML of both policies? We need to see the `podSelector` and `policyTypes` fields. A common misstep is assuming that an `allow-to-orchestrator` policy with only `Ingress` in `policyTypes` also locks down egress. It doesn't: unless some other policy selecting the same pod lists `Egress`, all egress from that pod stays open. The attack surface you outlined is precise: it enables data exfiltration and, more critically, arbitrary tool retrieval during inference, turning a single API call into a full remote code execution chain.
For a concrete list of required egress: a properly functioning backend needs zero. The documented "none" is correct. Any egress you observe is either a misconfiguration, a dependency on a sidecar (like a logging agent that requires a cloud endpoint), or evidence that your container image is attempting phone-home behavior. Start by checking `kubectl describe networkpolicy` in your namespace to see each policy's pod selector and policy types. Then run `kubectl get pods --show-labels` to verify that the pod's labels actually match those selectors.
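To reason about it offline, here is a small Python sketch of the additive rule: it lists which policies actually restrict egress for a given set of pod labels. It is a simplification (single namespace, `matchLabels` only, no `matchExpressions`), and the label values are hypothetical; the policy names are just the two from this thread.

```python
def selects(policy, labels):
    """True if the policy's podSelector.matchLabels all match the pod labels."""
    match_labels = policy["spec"].get("podSelector", {}).get("matchLabels", {})
    return all(labels.get(key) == value for key, value in match_labels.items())


def policy_types(policy):
    """Effective policyTypes, including the defaulting used when the field is omitted."""
    spec = policy["spec"]
    if spec.get("policyTypes"):
        return set(spec["policyTypes"])
    types = {"Ingress"}            # Ingress is always implied
    if spec.get("egress"):         # Egress only if egress rules are present
        types.add("Egress")
    return types


def egress_restricting_policies(policies, labels):
    """Names of policies that isolate this pod for egress. Empty list means egress is wide open."""
    return [
        p["metadata"]["name"]
        for p in policies
        if selects(p, labels) and "Egress" in policy_types(p)
    ]


policies = [
    {"metadata": {"name": "deny-all-egress"},
     "spec": {"podSelector": {"matchLabels": {"role": "backend"}},
              "policyTypes": ["Egress"]}},
    {"metadata": {"name": "allow-to-orchestrator"},
     "spec": {"podSelector": {"matchLabels": {"app": "model-backend"}},
              "policyTypes": ["Ingress"]}},
]

# Pod without the generic label: only the Ingress policy selects it.
print(egress_restricting_policies(policies, {"app": "model-backend"}))  # []
# Same pod with the label the deny policy expects.
print(egress_restricting_policies(policies, {"app": "model-backend", "role": "backend"}))  # ['deny-all-egress']
```

An empty result for your real labels would explain the working `curl` without involving the CNI at all.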
Every tool call leaves a trace.
Your attack surface mapping is the right starting point, but you're probably overthinking it. The "functioning backend" needs zero egress. If it needs something from the internet, your supply chain is already broken.
You posted half a thought on your suspicion. Finish it. Show us the policy YAML and the pod labels. user345 is pointing at the right failure mode: your `allow-to-orchestrator` policy likely has `policyTypes: [Ingress]` and no egress section, so it restricts nothing outbound. And the `deny-all-egress` might be selecting pods with a generic label your model pod doesn't have.
Without that config, we're all just guessing at poltergeists. Post the snippets and I'll tell you exactly which label is wrong.
- Ray
Exactly, that's the kicker - it's all about the label overlap. I ran into this last month where my `app=llm-api` pod had a generic `role=backend` label. The `deny-all-egress` policy selected `role=backend`, but the `allow-to-orchestrator` policy used `app=llm-api` with only `policyTypes: [Ingress]`. The pod matched both selectors, and the more specific policy won, letting all egress through. Felt like a ghost in the machine until I saw the logs.
So yeah, show us the YAML. But also, can you run `kubectl describe networkpolicy` in that namespace and compare the `PodSelector` line of each policy side-by-side? That's what finally lit the bulb for me.
Lab never sleeps.
You're asking exactly the right question. The telemetry logs I meant are from the CNI itself, like Cilium's Hubble or Calico's monitor. They show you whether a packet was allowed or denied by a specific policy rule. That's your first layer to verify intent.
But as user498 already pointed out, that's blind to hostNetwork tunnels or a sidecar with hostPort. For that, you need node-level visibility. In a small lab, the quickest win is often just using `kubectl` to look for pods with `hostNetwork: true` or `hostPort` defined. A quick script like `kubectl get pods -A -o json | jq -r '.items[] | select(.spec.hostNetwork==true) | .metadata.name'` can surface the hostNetwork ones fast.
For actual socket monitoring, if you're on a single node, `ss -tulpn` on the host can show you weird listeners that belong to a container's PID. It's manual, but it works.
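If you want `hostPort` and init containers covered as well, here's a rough Python sketch that works on the same JSON. It assumes the input is the parsed output of `kubectl get pods -A -o json`; the sample pods below are invented for illustration.

```python
def host_exposed_pods(pod_list):
    """Return (namespace, name, reasons) for pods using hostNetwork or any hostPort."""
    findings = []
    for pod in pod_list.get("items", []):
        meta, spec = pod["metadata"], pod["spec"]
        reasons = []
        if spec.get("hostNetwork") is True:
            reasons.append("hostNetwork")
        # Sidecars and init containers live in the same lists, so check both.
        for field in ("initContainers", "containers"):
            for container in spec.get(field, []):
                for port in container.get("ports", []):
                    if "hostPort" in port:
                        reasons.append(f"hostPort {port['hostPort']} on {container['name']}")
        if reasons:
            findings.append((meta.get("namespace", "default"), meta["name"], reasons))
    return findings


# Hypothetical pod list for illustration only.
sample = {
    "items": [
        {"metadata": {"namespace": "llm", "name": "model-backend-0"},
         "spec": {"hostNetwork": True, "containers": [{"name": "model"}]}},
        {"metadata": {"namespace": "llm", "name": "metrics-0"},
         "spec": {"containers": [
             {"name": "exporter", "ports": [{"containerPort": 9100, "hostPort": 9100}]}]}},
        {"metadata": {"namespace": "llm", "name": "orchestrator-0"},
         "spec": {"containers": [{"name": "orchestrator"}]}},
    ]
}

for finding in host_exposed_pods(sample):
    print(finding)
```

To run it against a live cluster, save the `kubectl` output to a file and load it with `json.load` before calling the function.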
Wait, so even if the network policy blocks it, the model could still try to download something? That's... not great.
I'm using a slim image but I didn't think about the model's own libraries making the call. Is there an easy way to check if a Python package has that kind of capability, or is it just about trusting the source?
learning by breaking