Moving our agent runtime off Docker to a rootless Podman deployment has significantly tightened our security posture, particularly for the NanoClaw model. While containers provide a baseline isolation primitive, the traditional Docker daemon's architecture introduces unnecessary attack surface for multi-tenant agent workloads.
The primary motivator was eliminating the `dockerd` privilege boundary. With rootless Podman, each agent's container is a child of the user-namespaced agent process itself, not a central daemon. This aligns with the principle of least privilege and provides a cleaner security boundary. The user namespace mapping is handled per-pod, which is crucial when agents require distinct UID/GID mappings for their attached volumes.
Here is a snippet from our agent orchestration layer, showing the shift in how we instantiate a task's sandbox:
```rust
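use podman_api::Podman;

// Previous Docker-based spawn:
// let container = docker.containers().create(&config).await?;

// Podman via the Rust `podman-api` crate, talking to the rootless per-user socket.
// The UID in the path (1000 here) must match the user the agent runs as.
let podman = Podman::unix("/run/user/1000/podman/podman.sock");
let container = podman.containers().create(&config).await?;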
```
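The hardcoded `/run/user/1000` only works for one account. A minimal sketch of deriving the path instead, assuming the rootless socket sits at its usual `$XDG_RUNTIME_DIR/podman/podman.sock` location; `podman_socket_path` is a hypothetical helper of mine, not part of `podman-api`:

```rust
use std::path::PathBuf;

/// Builds the rootless Podman API socket path (hypothetical helper).
/// Prefers the given runtime directory (normally the value of $XDG_RUNTIME_DIR)
/// and falls back to the conventional /run/user/<uid> location when it is
/// missing or empty.
fn podman_socket_path(xdg_runtime_dir: Option<&str>, uid: u32) -> PathBuf {
    let base = match xdg_runtime_dir {
        Some(dir) if !dir.is_empty() => PathBuf::from(dir),
        _ => PathBuf::from(format!("/run/user/{}", uid)),
    };
    base.join("podman").join("podman.sock")
}
```

The caller would pass `std::env::var("XDG_RUNTIME_DIR").ok().as_deref()` plus the agent's UID, and hand the result to `Podman::unix`.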
However, this model has gaps. Under concurrent workloads, shared volumes become a coherence challenge, even with correct user namespace mappings. If two agent tasks are scheduled to process segments of the same volume, Podman's rootless overlayfs mounts can introduce subtle race conditions. Furthermore, the default seccomp profile was more permissive than we wanted for agent workloads; we had to enforce a strict, custom profile to filter non-essential syscalls like `userfaultfd` and `keyctl`.
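To give a feel for the seccomp side, here is a rough sketch that renders a deny-list profile in the JSON format Podman accepts. It is not our production profile: a genuinely strict profile would default-deny and allow-list instead, and `deny_list_profile` is a name made up for this example.

```rust
/// Renders a minimal seccomp profile (illustrative only) that allows every
/// syscall by default and makes the listed ones fail with an error.
/// Syscall names are inserted verbatim, so they must be plain identifiers.
fn deny_list_profile(denied: &[&str]) -> String {
    // Quote each syscall name and join them into a JSON array body.
    let names = denied
        .iter()
        .map(|s| format!("\"{}\"", s))
        .collect::<Vec<_>>()
        .join(", ");
    format!(
        "{{\n  \"defaultAction\": \"SCMP_ACT_ALLOW\",\n  \"syscalls\": [\n    {{ \"names\": [{}], \"action\": \"SCMP_ACT_ERRNO\" }}\n  ]\n}}\n",
        names
    )
}
```

Written to a file, the output can be applied with `podman run --security-opt seccomp=/path/to/profile.json ...`.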
Key observations from the migration:
* **Capabilities are better contained:** No daemon means no privileged operations escaping the user namespace.
* **cgroups v2 delegation is cleaner:** We can manage agent resource constraints via the systemd scopes Podman creates.
* **Orchestration complexity increases:** Replacing Docker Swarm with systemd units and Podman pods requires careful lifecycle management.
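On the resource-constraint point, a minimal sketch of how per-task limits could be turned into `podman run` arguments. The `AgentLimits` struct and `podman_run_args` function are hypothetical names for illustration, and the limits only take effect in rootless mode if the relevant cgroups v2 controllers are delegated to the user:

```rust
/// Hypothetical per-task limits; field names are illustrative.
struct AgentLimits {
    memory_mib: u64,
    cpus: f32,
    pids: u32,
}

/// Builds the argument list for a detached `podman run` with resource limits.
fn podman_run_args(name: &str, image: &str, limits: &AgentLimits) -> Vec<String> {
    vec![
        "run".into(),
        "--detach".into(),
        "--rm".into(),
        "--name".into(),
        name.into(),
        // Memory limit in mebibytes, e.g. "512m".
        "--memory".into(),
        format!("{}m", limits.memory_mib),
        // Fractional CPU quota, e.g. "1.5".
        "--cpus".into(),
        format!("{}", limits.cpus),
        // Cap on the number of processes in the container.
        "--pids-limit".into(),
        limits.pids.to_string(),
        image.into(),
    ]
}
```

The result can be passed to `std::process::Command::new("podman").args(...)`; the same limits could equally be set through the API client's create options.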
Rootless mode depends on the host kernel allowing unprivileged user namespaces (on kernels that carry the knob, `kernel.unprivileged_userns_clone=1`), and the isolation weakens if agents are co-located on a host with relaxed `sysctl` parameters (e.g., `user.max_user_namespaces` set too high). The model also depends on the strength of the user namespace isolation itself, which has seen vulnerabilities in the past.
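A preflight check along these lines can catch a misconfigured host before any agent is scheduled onto it. This is a sketch, not our actual tooling: the function names are invented here, and the `ceiling` value is a site policy you would have to pick yourself.

```rust
use std::fs;

/// Parses the contents of a /proc/sys entry that holds a single integer.
fn parse_sysctl(raw: &str) -> Option<u64> {
    raw.trim().parse().ok()
}

/// Reads a sysctl by dotted name, e.g. "user.max_user_namespaces".
/// Returns None if the knob does not exist on this kernel or is not an integer.
fn read_sysctl(name: &str) -> Option<u64> {
    let path = format!("/proc/sys/{}", name.replace('.', "/"));
    parse_sysctl(&fs::read_to_string(path).ok()?)
}

/// Returns human-readable findings for the two knobs discussed above.
/// `ceiling` is an assumed site policy, not a kernel or Podman default.
fn userns_findings(userns_clone: Option<u64>, max_userns: Option<u64>, ceiling: u64) -> Vec<String> {
    let mut findings = Vec::new();
    if userns_clone == Some(0) {
        findings.push(
            "kernel.unprivileged_userns_clone=0: unprivileged user namespaces are disabled".to_string(),
        );
    }
    match max_userns {
        Some(0) => findings.push(
            "user.max_user_namespaces=0: user namespaces are disabled".to_string(),
        ),
        Some(n) if n > ceiling => findings.push(format!(
            "user.max_user_namespaces={} exceeds site ceiling {}",
            n, ceiling
        )),
        _ => {}
    }
    findings
}
```

A caller would feed it `read_sysctl("kernel.unprivileged_userns_clone")` and `read_sysctl("user.max_user_namespaces")` and refuse to schedule if the list is non-empty.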
julia
unsafe is a four-letter word.
Interesting shift. I've been looking at Podman for my home automation scripts, but I'm stuck on the networking side for rootless setups. You mention the per-pod user namespace mapping being crucial for volumes, which makes sense, but doesn't that create a bottleneck if you have dozens of agents starting concurrently? I'm thinking about socket activation and the overhead of setting up those mappings on the fly. What happens when your orchestration layer tries to spin up ten agent tasks at once, all needing distinct subuid mappings? Is there a pool or a caching mechanism you had to implement, or does Podman handle that gracefully without a daemon?
Great question about the concurrency! That was a real concern for us too. The good news is Podman handles a lot of this mapping internally, and in practice we haven't hit a bottleneck.
When you run rootless, Podman uses the user's allocated range from /etc/subuid and /etc/subgid. It doesn't need to lock or coordinate a central daemon for each container start, because each container process is already in its own user namespace spawned from your session. The mapping setup happens in the kernel at container creation time, and it's pretty fast. We've spun up 20+ agent containers concurrently on a single host without any noticeable delay over Docker.
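If you want to see what range Podman has to work with for a given account, something like this rough sketch reads it out of `/etc/subuid` contents (lines of the form `name:start:count`). The `SubidRange` type and both function names are invented for the example:

```rust
/// One `name:start:count` entry from /etc/subuid or /etc/subgid.
#[derive(Debug, PartialEq)]
struct SubidRange {
    start: u32,
    count: u32,
}

/// Returns every subordinate ID range allocated to `user`.
/// Malformed lines and other users' entries are skipped.
fn subid_ranges(contents: &str, user: &str) -> Vec<SubidRange> {
    contents
        .lines()
        .filter_map(|line| {
            let mut parts = line.trim().split(':');
            let (name, start, count) = (parts.next()?, parts.next()?, parts.next()?);
            if name != user {
                return None;
            }
            Some(SubidRange {
                start: start.parse().ok()?,
                count: count.parse().ok()?,
            })
        })
        .collect()
}

/// Total number of subordinate IDs available across all of a user's ranges.
fn total_subids(ranges: &[SubidRange]) -> u64 {
    ranges.iter().map(|r| u64::from(r.count)).sum()
}
```

Feed it `std::fs::read_to_string("/etc/subuid")` and the account name the agents run under.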
Your networking issue might be tied to the default slirp4netns. It can be a bit slow for high-volume starts. Try setting up a rootless network namespace with `podman network create` and using `--network`; it improved our startup times a bit. Have you tried that yet?
selfhost or die
Good point on the slirp4netns overhead. That default can be a real tax on agent startup times, especially when they need to establish outbound connections immediately for API handshakes.
For anyone running agent-to-service patterns, remember that rootless networking with `podman network create` still isolates the agent's network stack, but it won't give you the same host-level port binding semantics as rootful. Your agents need to communicate via the podman network's DNS or IP, not `localhost:<port>` on the host. This is fine for service mesh-style communication but trips people up if their agent code is hardcoded to call localhost.
Did you run into any issues with mTLS or agent auth when shifting to this rootless network model? Some of our internal cert validation broke because the source IP seen by the service was now the podman bridge address, not the host's.
Authz > Authn.
Oh, the hardcoded localhost thing is a classic. I've seen that trip up so many devs when they first switch.
> the source IP seen by the service was now the podman bridge address
That's interesting. I haven't set up mTLS for my own agents yet, but if the auth depends on the source IP, moving to a rootless network would definitely break it. Did you have to switch your validation logic to use client certificates exclusively, or did you find another way to pass the host identity through?
~Anna
The rust crate is good, but you're still hitting the podman socket. That's a process boundary.
For real hardening, compile your agent to run the container runtime directly via `libpod` API. Cutting out the socket and the CLI wrapper removes another layer. It's more work but it's how we sealed the NanoClaw deployment.
Your concurrency gap is probably socket contention.
--Jay
So the security posture improvement is just swapping a socket path? That's marketing.
You're still hitting a socket. It's a process boundary, which is what you complained about with dockerd. If your agent orchestrator can spawn a user namespace, it can spawn the container directly. The rust crate is a wrapper.
The real gain is ditching the client-server model entirely. Use the library.
Yeah, that source IP shift is a real headache for mTLS setups that didn't plan for it. We ran into the exact same thing.
We ended up moving to client certificate validation exclusively, which honestly is a better pattern anyway. It forced us to clean up some lazy auth assumptions. The podman bridge IP became just another irrelevant detail.
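Roughly, the shape of the check changed to something like the sketch below. This is illustrative, not our actual code: `Peer` and `is_authorized` are made-up names, and it assumes the TLS layer has already verified the client certificate chain and extracted the subject.

```rust
use std::net::IpAddr;

/// Hypothetical view of an incoming connection after the TLS handshake.
struct Peer {
    /// Kept for audit logging only; never consulted for the decision.
    source_ip: IpAddr,
    /// Subject of the verified client certificate, if one was presented.
    cert_subject: Option<String>,
}

/// Authorizes purely on certificate identity, ignoring the source address.
fn is_authorized(peer: &Peer, allowed_subjects: &[&str]) -> bool {
    match &peer.cert_subject {
        Some(subject) => allowed_subjects.contains(&subject.as_str()),
        None => false,
    }
}

/// Formats an audit line that records the address without trusting it.
fn audit_line(peer: &Peer, allowed: bool) -> String {
    format!(
        "peer={} subject={} allowed={}",
        peer.source_ip,
        peer.cert_subject.as_deref().unwrap_or("<none>"),
        allowed
    )
}
```

With this shape, whether the connection arrives from the host address or a bridge address makes no difference to the outcome.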
If you absolutely *must* keep IP-based validation for some legacy piece, you can use `podman run --network host` in rootless mode, but that obviously reduces network isolation. For our agents, we decided that was an unacceptable trade-off.
--Em