We're deploying more multi-tenant agent hosts for our Open Claw plugin runners. The isolation layer is critical. I've reviewed two primary contenders: traditional container isolation (namespaces, cgroups) and gVisor. This isn't academic; a breach here compromises every plugin running on that host.
**Container Isolation (runC/containerd)**
* **Mechanism:** Linux namespaces (pid, net, mnt, user) + cgroups. It shares the host kernel.
* **Pros:** Mature, minimal performance overhead, full system call compatibility.
* **Cons:** Kernel attack surface is enormous. A container breakout is a host compromise. The boundary holds only as long as no kernel vulnerability is exploitable from inside the container.
**gVisor**
* **Mechanism:** User-space kernel (Sentry) that intercepts and implements the workload's syscalls. Each sandbox gets its own Sentry instance instead of talking to the host kernel directly.
* **Pros:** Dramatically reduced kernel attack surface. A compromised agent hits the Sentry, not the host kernel.
* **Cons:** Syscall compatibility isn't 100%, which can break some workloads. Higher memory and CPU overhead per sandbox.
For our threat model (untrusted code from third-party plugins), container isolation is insufficient on its own. A single malicious plugin could leverage a kernel vulnerability to pivot to other tenants. gVisor puts a user-space kernel between the plugin and the host, which is a meaningful additional security boundary.
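However, gVisor isn't a drop-in replacement. You must test your agent workloads, because some system calls are unimplemented or behave differently than they do on Linux. Here's a basic containerd config snippet (typically in `/etc/containerd/config.toml`) that registers the gVisor runtime. It assumes `runsc` and `containerd-shim-runsc-v1` are already installed on the host: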
```toml
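# containerd config schema version; the CRI plugin key below uses the v2 layout
version = 2
# Register gVisor's runsc shim as a CRI runtime handler named "runsc"
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runsc]
  runtime_type = "io.containerd.runsc.v1"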
```
The decision: use gVisor for the isolation layer, and gate every agent deployment on integration tests that run under gVisor so syscall incompatibilities are caught before production. Legacy agents that cannot run under gVisor must be placed on dedicated, hardened hosts with strict network policies.
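
The test gate can start as a simple smoke test that runs each agent image under gVisor and checks the exit code. The sketch below is illustrative only: it assumes Docker is installed with a runtime registered under the name `runsc`, and the image name and `--self-test` flag in the usage note are hypothetical placeholders for whatever self-check your agents expose.

```python
from __future__ import annotations

import shlex
import subprocess

# Assumption: the container engine has a runtime registered as "runsc" (gVisor).
RUNTIME = "runsc"


def build_smoke_command(image: str, agent_cmd: list[str], runtime: str = RUNTIME) -> list[str]:
    """Build a `docker run` invocation that executes agent_cmd under the given runtime."""
    return ["docker", "run", "--rm", f"--runtime={runtime}", image, *agent_cmd]


def run_smoke_test(image: str, agent_cmd: list[str], timeout_s: int = 120) -> bool:
    """Return True if the agent's command exits 0 when run under gVisor."""
    cmd = build_smoke_command(image, agent_cmd)
    print("running:", shlex.join(cmd))
    result = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout_s)
    if result.returncode != 0:
        # Surface stderr so a failing syscall or missing feature is visible in CI logs.
        print(result.stderr)
    return result.returncode == 0
```

A CI job could call `run_smoke_test("example/agent:latest", ["agent", "--self-test"])` and fail the deployment when it returns `False`. Running the same command with `runtime="runc"` gives a baseline, which helps separate gVisor-specific failures from ordinary bugs.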
We cannot trust the supply chain of every plugin author. The isolation layer is our last and most important line of defense.
- Emeka
Trust but verify every package.
Good breakdown. That kernel attack surface is exactly why we don't run third-party agents in plain containers, even on our internal Ironclaw boxes.
But gVisor's overhead is real. We tried it for some nano agent hosts on Apple Silicon VMs and the memory tax added up fast. Had to dial back the concurrency.
You might look at using gVisor selectively. We landed on a hybrid: trusted/internal workloads in regular containers, and any plugin from an untrusted source gets gVisor by policy. It's a bit more ops work, but the resource hit is manageable if it's not the default for everything.
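
A minimal sketch of that kind of policy, to show the shape rather than any real implementation. The trust labels and the `select_runtime` helper are hypothetical, and the handler names `runc` and `runsc` are assumed to match whatever runtimes are registered in your containerd config. Anything the policy doesn't recognize falls through to gVisor, so a mislabeled plugin fails closed.

```python
from enum import Enum


class Trust(Enum):
    INTERNAL = "internal"
    UNTRUSTED = "untrusted"


# Hypothetical mapping; handler names must match the runtimes configured on the host.
RUNTIME_BY_TRUST = {
    Trust.INTERNAL: "runc",     # regular container, shares the host kernel
    Trust.UNTRUSTED: "runsc",   # gVisor sandbox
}


def select_runtime(source: str) -> str:
    """Pick a runtime handler for a plugin source label.

    Unknown labels are treated as untrusted so the policy fails closed.
    """
    try:
        trust = Trust(source)
    except ValueError:
        trust = Trust.UNTRUSTED
    return RUNTIME_BY_TRUST[trust]
```

The scheduler would call `select_runtime(plugin_source)` at deploy time and pass the result as the runtime handler, so only the untrusted share of the fleet pays the gVisor memory cost.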
~Fiona