I've been running a small fleet of LLM-driven research agents for the past quarter, initially using NanoClaw for isolation. While its convenience was excellent for rapid prototyping, I recently transitioned the core agents to a manual Docker and AppArmor stack. The operational overhead is higher, but the security posture is significantly more granular and transparent.
My primary motivations for the switch were:
* **Permission Clarity:** With NanoClaw, the effective "permission boundary" is whatever the NanoClaw environment exposes as a whole. With manual containerization, I start from a minimal base image (e.g., `python:slim`) and explicitly `COPY` in only the agent code and data that are needed. Network egress is allowlisted by port and destination at the Docker host level, outside the container, rather than only within the agent's logic.
* **Defense-in-Depth with AppArmor:** I've written custom AppArmor profiles for each agent type, loaded into Docker with `--security-opt "apparmor=my-agent-profile"`. This allows me to enforce things like:
  * Denying all writes except to a specific, bounded `tmp` directory.
  * Blocking access to `procfs` and `sysfs` except for a minimal subset.
  * Preventing the execution of any binary that is not explicitly allowed (the allowlist is essentially `/usr/bin/python3` plus the specific shared libraries it loads).
* **Cost and Attack Surface Reduction:** By stripping the container to its bare essentials and removing package managers, I've closed off the path where an indirect prompt injection leads to a `pip install` of a malicious package, and sharply reduced the room for resource exhaustion via spawned processes. The container's capabilities are fixed at deploy time.
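
A minimal sketch of how the points above could fit together, assuming Python is used as glue to render a profile and assemble the `docker run` command. This is illustrative only and not the actual profiles from this setup: the scratch directory `/tmp/agent`, the `/app` code path, the `agent-egress` network name, the image tag, and the resource limits are all hypothetical, and the rendered profile has not been validated with `apparmor_parser`.

```python
import shlex
from typing import List


def render_apparmor_profile(name: str, scratch_dir: str, interpreter: str) -> str:
    """Render an allowlist-style AppArmor profile as text (illustrative sketch)."""
    lines = [
        "#include <tunables/global>",
        "",
        f"profile {name} flags=(attach_disconnected,mediate_deleted) {{",
        "  #include <abstractions/base>",
        "",
        "  # File access not listed here is refused, so write access is",
        "  # granted only under the scratch directory.",
        f"  {scratch_dir}/ rw,",
        f"  {scratch_dir}/** rw,",
        "",
        "  # Read-only access to the agent code (assumed to live in /app).",
        "  /app/** r,",
        "",
        "  # Only the interpreter may be executed; 'ix' keeps it in this profile.",
        f"  {interpreter} ix,",
        "",
        "  # Explicit denies for sensitive kernel interfaces.",
        "  deny @{PROC}/sysrq-trigger rwklx,",
        "  deny /sys/** w,",
        "}",
    ]
    return "\n".join(lines) + "\n"


def docker_run_args(image: str, profile: str, scratch_dir: str) -> List[str]:
    """Build a hardened `docker run` argument list (limits are placeholder values)."""
    return [
        "docker", "run", "--rm",
        "--read-only",                                   # root filesystem is immutable
        "--tmpfs", f"{scratch_dir}:rw,noexec,nosuid,size=64m",  # bounded scratch space
        "--cap-drop", "ALL",                             # no Linux capabilities
        "--security-opt", "no-new-privileges",
        "--security-opt", f"apparmor={profile}",         # profile must already be loaded on the host
        "--pids-limit", "64",                            # caps spawned processes
        "--memory", "512m",
        "--network", "agent-egress",                     # hypothetical network, egress filtered on the host
        image,
    ]


if __name__ == "__main__":
    print(render_apparmor_profile("my-agent-profile", "/tmp/agent", "/usr/bin/python3"))
    print(shlex.join(docker_run_args("agent:1.0", "my-agent-profile", "/tmp/agent")))
```

Note that the profile text would still need to be loaded on the host (for example with `apparmor_parser`) before `--security-opt` can reference it. The `--pids-limit` and `--memory` flags address the spawned-process concern directly, which removing package managers alone does not.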
The trade-off is, of course, maintenance. I now have a CI pipeline that builds, signs, and deploys these containers. For teams without dedicated infra security resources, NanoClaw remains a robust default. However, for high-value or sensitive agent operations, I believe manual, tailored containerization is the next logical step when you need tighter control.
I'm curious if others have made similar moves. Specifically:
* Have you encountered any blind spots in manual isolation that managed solutions like NanoClaw caught better?
* What are your strategies for managing the profile and image lifecycle across multiple agent versions?
* For those who haven't switched, what are the decisive factors keeping you on an integrated platform?
- Tracy