Unpopular opinion: If you need a sandbox, your agent design is already flawed.

Sandbox Escapes and Breakout Research

Viktor Petrov

(@hardening_syscall)

June 22, 2026 9:31 pm [#488]

I feel compelled to challenge a prevalent assumption in our community, particularly as we discuss breakout research. The increasing complexity of our sandboxing stacks (seccomp-bpf filters layered with an LSM such as AppArmor or SELinux, unprivileged user namespaces, cgroups v2, and pledge-like mechanisms such as Landlock) is often celebrated as robust defense-in-depth. However, I propose this is frequently a symptom of a deeper architectural failure: the agent's threat model and privilege decomposition were not adequately considered from first principles.

The kernel's attack surface exposed to a sandboxed process is vast and historically brittle. Consider:
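* **Syscall filtering** relies on complete knowledge of all possible paths to a resource. CVE-2022-0492 (cgroup v1 `release_agent`) is a case in point: a missing capability check let a process that could `unshare` into new user and cgroup namespaces and mount cgroupfs write to `release_agent` and run code as root on the host. It was stopped only where a seccomp filter or LSM profile happened to block `unshare` or `mount`, a path many hand-written filters do not consider.
* **Namespace isolation** is undermined by kernel objects and code shared across boundaries (e.g., `pidfd_getfd()` abuse, or CVE-2021-22555, a netfilter heap out-of-bounds write reachable from within a user namespace).
* **Capability dropping** often occurs *after* the process has already performed sensitive operations or touched untrusted input, leaving a window in which a compromise still holds the full privilege set.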
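To make the first bullet concrete, here is a toy model of why a filter that enumerates "bad" syscalls needs complete knowledge while an allowlist does not. The syscall names are real Linux syscalls, but both policies are hypothetical and nothing here installs a real seccomp filter; it is only a sketch of the reasoning.

```python
# Toy model only: compares filter styles, does not touch the kernel.
# Real Linux syscalls that can hand back a file descriptor for a path.
FILE_OPENING_SYSCALLS = {"open", "openat", "openat2", "creat", "open_by_handle_at"}

# Hypothetical denylist written by someone who only thought of the classic calls.
DENYLIST = {"open", "creat"}

# Hypothetical allowlist for a component that only shuffles bytes on inherited fds.
ALLOWLIST = {"read", "write", "exit_group"}


def passes_denylist(syscall: str) -> bool:
    """A denylist permits everything it did not name."""
    return syscall not in DENYLIST


def passes_allowlist(syscall: str) -> bool:
    """An allowlist permits only what it named (fail-closed)."""
    return syscall in ALLOWLIST


def reachable_file_openers(is_allowed) -> list:
    """File-opening syscalls that a given filter still lets through."""
    return sorted(s for s in FILE_OPENING_SYSCALLS if is_allowed(s))


leaked = reachable_file_openers(passes_denylist)
closed = reachable_file_openers(passes_allowlist)
print(leaked)  # ['open_by_handle_at', 'openat', 'openat2']
print(closed)  # []
```

The denylist fails open on every path its author forgot; the allowlist fails closed, but only stays short if the component behind it needs very little to begin with.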

If your agent requires a Linux sandbox of this complexity to be safe, it likely means the agent itself is monolithic and over-privileged. The correct approach is to decompose the agent into distinct components with minimal, precisely defined privileges *at the process level*, communicating via simple, auditable IPC. A component that only parses untrusted data should not need filesystem write capabilities, nor should it share a memory space with credential-handling code. The sandbox then becomes a final, fail-closed enforcement layer on an already sound design, not the primary security boundary.
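A minimal sketch of that split, under stated assumptions: the parser runs as a separate process, receives one untrusted line on stdin, and answers with one JSON line on stdout. The `key=value` input format, the `AGENT_API_TOKEN` variable, and the function names are all hypothetical. This shows only the process and IPC boundary; a real deployment would still apply a seccomp or LSM policy to the child.

```python
import json
import os
import subprocess
import sys

# Hypothetical parser component. It has its own address space and sees only
# the bytes the parent writes to its stdin.
PARSER_SOURCE = r'''
import json, os, sys
raw = sys.stdin.readline().strip()
try:
    key, value = raw.split("=", 1)
    reply = {"ok": True, "key": key, "value": value}
except ValueError:
    reply = {"ok": False, "error": "expected key=value"}
# Report whether the parent's credential leaked into this process.
reply["sees_token"] = "AGENT_API_TOKEN" in os.environ
sys.stdout.write(json.dumps(reply) + "\n")
'''

# Only these variables are passed to the child; credentials are not on the list.
ENV_ALLOWLIST = ("PATH", "LD_LIBRARY_PATH", "SYSTEMROOT")


def parse_in_child(untrusted: str) -> dict:
    """Parse one untrusted line in a separate process and return its JSON reply."""
    child_env = {k: v for k, v in os.environ.items() if k in ENV_ALLOWLIST}
    proc = subprocess.run(
        [sys.executable, "-I", "-c", PARSER_SOURCE],  # -I: isolated mode
        input=untrusted + "\n",
        capture_output=True,
        text=True,
        env=child_env,
        timeout=30,
        check=True,
    )
    return json.loads(proc.stdout)


if __name__ == "__main__":
    os.environ["AGENT_API_TOKEN"] = "not-a-real-secret"  # held by the parent only
    print(parse_in_child("user=alice"))
    print(parse_in_child("garbage"))
```

The IPC surface is one line in and one JSON object out, which is small enough to audit by reading it. A bug in the parser then exposes whatever the parser process holds, which here is neither the token nor the parent's memory.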

I observe many designs that take a monolithic, "root-like" binary and attempt to constrain it after the fact with a sandbox policy. This is inherently fragile. The policy must account for every kernel attack vector the binary can reach, and one missed codepath (e.g., a forgotten `ioctl` command on a seemingly innocuous fd) can lead to escape. A better paradigm is exemplified by minimal, single-purpose microservices:
* One process with `CAP_NET_BIND_SERVICE` but no filesystem access.
* Another with write access to only a specific `tmpfs` subdirectory, but no network capabilities.
* A third that performs complex parsing, running with a `seccomp` policy that denies all but `read`, `write`, `mmap`, and `exit`/`exit_group`.

These components are orchestrated by a supervisor. Their individual policies are simple, and a breakout from one compartment does not automatically grant the privileges of another.
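The claim about compartments can be sketched as a small model. The three compartments mirror the bullets above; the compartment names, the `/run/agent/scratch` path, and the `blast_radius` helper are hypothetical, and the model only records declared privileges rather than enforcing them.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Compartment:
    name: str
    capabilities: frozenset = frozenset()
    writable_paths: frozenset = frozenset()
    network: bool = False
    # None means "no syscall allowlist modeled for this compartment".
    syscall_allowlist: Optional[frozenset] = None


COMPARTMENTS = {
    "listener": Compartment(
        "listener",
        capabilities=frozenset({"CAP_NET_BIND_SERVICE"}),
        network=True,
    ),
    "writer": Compartment(
        "writer",
        writable_paths=frozenset({"/run/agent/scratch"}),  # hypothetical tmpfs subdir
    ),
    "parser": Compartment(
        "parser",
        syscall_allowlist=frozenset({"read", "write", "mmap", "exit"}),
    ),
}


def blast_radius(compromised) -> dict:
    """Union of declared privileges held after compromising the named compartments."""
    caps, paths, net = set(), set(), False
    for name in compromised:
        c = COMPARTMENTS[name]
        caps |= c.capabilities
        paths |= c.writable_paths
        net = net or c.network
    return {"capabilities": caps, "writable_paths": paths, "network": net}


# Breaking out of the parser alone yields no capability, no writable path, no network.
print(blast_radius(["parser"]))
# A monolith is equivalent to compromising every compartment at once.
print(blast_radius(COMPARTMENTS))
```

In this model the monolithic design is simply the case where any single bug yields the union of all three rows.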

In summary, while sandbox escape research is vital for hardening these mechanisms, we must not let it distract from the superior strategy: designing agents that are fundamentally unprivileged and decomposed. The sandbox should be a verification of your minimalism, not a compensation for your overreach. I am interested in cases where this decomposition is genuinely infeasible, as those are the truly challenging, and interesting, problems.

-- vp

strace -f -e trace=all
