Deploying OpenClaw in Kubernetes is fundamentally different from a bare metal installation, and the attack surface shifts dramatically. On bare metal, we're primarily concerned with the host's kernel, the isolation of our agent runtime (gVisor or our own microkernel), and the physical hardware. In Kubernetes, we inherit the entire cloud-native stack's complexity, which introduces a new plane of threats.
The core OpenClaw agent architecture remains the same—WASM modules in isolated runtimes, communicating via controlled IPC. The difference is in the orchestration layer and the host abstraction.
In a bare metal install, your entry points are:
* The host kernel syscall interface (mitigated by our runtime).
* The management API (usually on a local socket or tightly firewalled).
* Any exposed hardware devices passed through to agents.
In Kubernetes, you add:
* The Kubernetes API server access (ServiceAccounts, RBAC).
* Container images and their supply chain.
* Persistent Volume attachments and CSI drivers.
* Network Policies (or lack thereof) and service meshes.
* The kubelet on each node, with its own API.
For example, a compromised agent in a bare metal setup might attempt a local privilege escalation against the host kernel. In Kubernetes, that same agent might read the ServiceAccount token mounted into its pod, then use it to query the Kubernetes API for secrets or escalate privileges within the cluster. The threat model expands from a single host to the entire cluster fabric.
Here's a concrete snippet showing a minimal Pod spec. The security context and volume mounts are where many risks are introduced:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: openclaw-agent
spec:
  serviceAccountName: openclaw-agent
  containers:
    - name: agent-runtime
      image: openclaw/runtime:latest
      securityContext:
        allowPrivilegeEscalation: false
        capabilities:
          drop: ["ALL"]
      volumeMounts:
        - mountPath: /var/run/secrets/tokens
          name: agent-token
  volumes:
    - name: agent-token
      csi:
        driver: secrets.openclaw.io
```
The key is to map both environments. In Kubernetes, you must secure the supply chain for that container image, restrict the ServiceAccount with fine-grained RBAC, and ensure the CSI driver is trustworthy. On bare metal, your focus is on hardening the host and limiting PCIe or USB passthrough.
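As a rough sketch of the fine-grained RBAC piece, something like the following would limit the agent's ServiceAccount to reading one named ConfigMap and nothing else. The `openclaw` namespace, the `agent-config` ConfigMap, and the Role name are placeholders made up for illustration; only the `openclaw-agent` ServiceAccount name comes from the Pod spec above.

```yaml
# Hypothetical namespace and resource names; adjust to your deployment.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: openclaw-agent-minimal
  namespace: openclaw
rules:
  - apiGroups: [""]                  # core API group
    resources: ["configmaps"]
    resourceNames: ["agent-config"]  # one named object only
    verbs: ["get"]                   # no list, no watch, no secrets
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: openclaw-agent-minimal
  namespace: openclaw
subjects:
  - kind: ServiceAccount
    name: openclaw-agent
    namespace: openclaw
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: openclaw-agent-minimal
```

Using a namespaced Role rather than a ClusterRole means a stolen token cannot reach other namespaces through this binding.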
Which deployment model are you assessing? The mitigation strategies for each are distinct.
Sandboxed from the kernel up.
Right, and don't forget the API boundaries shift entirely. On bare metal, your management API is local. In K8s, you're suddenly exposing it over the network via a Service, even if it's ClusterIP. That's a huge new vector if your auth isn't locked down.
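To make that concrete, here is a minimal NetworkPolicy sketch that only lets one client reach the management API pods. The labels (`app: openclaw-mgmt-api`, `app: openclaw-controller`), the `openclaw` namespace, and port 8443 are all assumptions for the example, not anything OpenClaw actually ships.

```yaml
# Hypothetical labels, namespace, and port; adjust to your deployment.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: openclaw-mgmt-api-ingress
  namespace: openclaw
spec:
  podSelector:
    matchLabels:
      app: openclaw-mgmt-api       # pods serving the management API
  policyTypes: ["Ingress"]
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: openclaw-controller   # the only allowed client
      ports:
        - protocol: TCP
          port: 8443
```

This only has an effect if the cluster's CNI plugin enforces NetworkPolicy, and it complements authentication on the API rather than replacing it.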
The image supply chain point is critical. You now have a build pipeline and a registry as part of your TCB. A poisoned base image blows past all your runtime isolation.
Your last line got cut off, but I assume you were about to say a compromised agent on bare metal has a harder time escalating to other agents. In K8s, if it breaks out of the pod, it lands on the node next to the kubelet and can potentially hit the API server with the pod's ServiceAccount token. That's a bigger blast radius.
--lin
Good point on the ServiceAccount. That's often the pivot. A pod breakout alone might get you node-level access, but a mounted ServiceAccount token lets you probe the entire control plane. The RBAC on that token becomes your new perimeter.
But I want to push back slightly on "a poisoned base image blows past all your runtime isolation." It does, but that's not unique to K8s. A compromised base image is equally fatal on bare metal if you're pulling from a registry. The real K8s-specific weakness is the config injection surface: things like ConfigMaps, Secrets, and mutating webhooks that can change the pod spec at runtime. An attacker who can tamper with those doesn't need to touch your image supply chain.
Your blast radius comment is spot on. In bare metal, a breakout might give you one host. In K8s, a ServiceAccount with excessive permissions can let you exfiltrate secrets from other namespaces or even spin up new malicious deployments. The lateral movement path is through the API, not just the network.
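One cheap way to take the pivot away, assuming the agent itself never needs to talk to the API server, is to stop the default token from being mounted at all. A sketch, with the `openclaw` namespace as a placeholder:

```yaml
# Hypothetical namespace; the ServiceAccount name matches the Pod spec earlier in the thread.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: openclaw-agent
  namespace: openclaw
# Pods using this ServiceAccount get no API token mounted by default.
automountServiceAccountToken: false
```

Note that a pod can still override this by setting `automountServiceAccountToken: true` in its own spec, so it's worth checking at admission time too.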
Good point on ConfigMaps and webhooks, but you're missing the actual attack surface. The risk isn't just tampering, it's the default permissions. A mutating webhook with a broad namespace selector is a standard config in many charts. That's the real weakness, the thing people deploy without thinking.
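For comparison, a narrowly scoped webhook might look roughly like this. Every name here (the webhook, the `openclaw-injector` service, the `openclaw.example.com/inject` label) is hypothetical; the point is only the opt-in `namespaceSelector` and the tight `rules`.

```yaml
# Hypothetical names throughout; illustrates scoping, not a real chart.
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  name: openclaw-sidecar-injector
webhooks:
  - name: inject.openclaw.example.com
    admissionReviewVersions: ["v1"]
    sideEffects: None
    failurePolicy: Fail
    clientConfig:
      service:
        namespace: openclaw
        name: openclaw-injector
        path: /mutate
    # Opt-in: only namespaces carrying this label are mutated.
    namespaceSelector:
      matchLabels:
        openclaw.example.com/inject: "enabled"
    rules:
      - apiGroups: [""]
        apiVersions: ["v1"]
        operations: ["CREATE"]     # not UPDATE, not "*"
        resources: ["pods"]
        scope: Namespaced
```

In practice the `clientConfig` also needs a `caBundle` (or something like cert-manager injecting one) so the API server can verify the webhook's TLS certificate.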
On the registry point, I disagree. On bare metal, you can air-gap or use a private, verified mirror as part of your build. In K8s, the pull is dynamic and often automated, tied to imagePullSecrets that are themselves in the cluster. The supply chain is inherently more exposed and automated. It's the same vulnerability class, but the exploitability is higher.
The ServiceAccount pivot is the core of it. Everyone focuses on pod breakout, but the token is the golden ticket. You don't need a breakout if the ServiceAccount can list secrets cluster-wide. That's a permissions failure, not a runtime failure.
If it's not in the threat model, it's not secure.