Help: My agent can still fork-bomb the host even with the default process limits.

Default Sandbox Configurations Are Insufficient

Sam L.

(@network_seg_sam)

Eminent Member

Joined: 1 week ago

Posts: 15

Topic starter

June 28, 2026 8:01 am [#1085]

I've been analyzing the default seccomp-bpf and AppArmor profiles shipped with several popular agent sandboxes, and a concerning pattern has emerged regarding process creation. Many default configurations focus on blocking network egress or filesystem writes but leave the `clone` and `fork` syscalls inadequately filtered.

The specific case in the title is a classic fork bomb, but the underlying issue is broader: an agent with compromised logic can trigger a denial-of-service condition against the host, even within its supposed constraints.

**Typical Defaults & The Gap:**
Most sandboxes apply an `RLIMIT_NPROC` (user process limit). However, the kernel counts this limit per real *user* ID, not per sandboxed instance. If the agent runs as a shared, non-unique user (e.g., `nobody`), a single compromised agent can consume the entire process budget for that user, starving other services that run under it. More critically, the limit is often high (e.g., 1024) or not set at all, and a private `PID` namespace does nothing to change how it is counted.
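
To make the shared-bucket problem concrete, here is a minimal Python sketch, assuming a Linux host and the standard `resource` module. The helper name `nproc_limit_summary` is made up for illustration. It only reports the limit the current process inherits; every other process running under the same UID is charged against the same number.

```python
import os
import resource


def nproc_limit_summary() -> dict:
    """Return the RLIMIT_NPROC soft/hard limits for the calling process.

    The kernel charges processes against the real UID, so two sandboxes
    that share a UID also share this budget.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)

    def fmt(value: int) -> str:
        # RLIM_INFINITY means no limit is configured at all.
        return "unlimited" if value == resource.RLIM_INFINITY else str(value)

    return {"uid": os.getuid(), "soft": fmt(soft), "hard": fmt(hard)}


if __name__ == "__main__":
    print(nproc_limit_summary())
```

Running this inside two sandboxes that both use `nobody` would show the same UID in each, which is the whole problem.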

The real control lies in the syscall filter. Here's an example of an insufficient default seccomp rule often seen:

```json
{
  "names": ["clone", "fork", "vfork"],
  "action": "SCMP_ACT_ALLOW",
  "args": []
}
```

This allows unrestricted process creation. A defensible baseline must restrict these calls based on flags.

**Required Changes for a Defensible Baseline:**

1. **Tighten Seccomp Policies:** The `clone` syscall must be scrutinized via its flags argument. For most agent workloads you only need thread creation, which glibc requests with `CLONE_THREAD` alongside `CLONE_VM | CLONE_FS | CLONE_FILES | CLONE_SIGHAND`. A fork-style `clone`, by contrast, carries none of those sharing flags (typically just `SIGCHLD` plus TID bookkeeping), so the filter should allow `clone` only when `CLONE_THREAD` is set and deny everything else. This requires argument inspection in the filter.

2. **Implement Namespace Isolation:** A private `PID` namespace is non-negotiable. It does not cap process counts by itself, but it hides host processes from the agent and lets the whole process tree be torn down by killing the namespace's init. Combined with a cgroup `pids.max` limit, this contains a fork bomb to the agent's own instance without host impact.

3. **Apply Cgroup v2 `pids.max`:** This is the most direct and effective control. Set `pids.max` to 64 (or a number appropriate to the agent's function) on the agent's cgroup. The kernel enforces it as a hard limit on the number of tasks (processes and threads) in that cgroup.
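
The cgroup control in item 3 comes down to writing one number into one file. A minimal Python sketch, assuming cgroup v2 is mounted and the agent's cgroup directory already exists and is writable by the caller; the helper names and the example path `/sys/fs/cgroup/agents/agent-1` are hypothetical:

```python
from pathlib import Path


def set_pids_max(cgroup_dir: str, limit: int) -> Path:
    """Write a task-count cap into an existing cgroup v2 directory.

    cgroup_dir is assumed to be a cgroup the caller may write to, for
    example a hypothetical /sys/fs/cgroup/agents/agent-1.
    """
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    target = Path(cgroup_dir) / "pids.max"
    target.write_text(f"{limit}\n")
    return target


def pids_current(cgroup_dir: str) -> int:
    """Read how many tasks the cgroup currently holds."""
    return int((Path(cgroup_dir) / "pids.current").read_text())
```

Once the cgroup reaches the cap, further `fork`/`clone` calls from inside it fail with `EAGAIN`, regardless of which UID the agent runs as.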

A minimal improved seccomp rule for a single-threaded agent that should create no child processes would be:
```json
{
  "names": ["clone", "clone3", "fork", "vfork"],
  "action": "SCMP_ACT_ERRNO",
  "errnoRet": 1
}
```

For an agent that requires threading but not forking, you need a more complex rule that checks the `clone` flags argument, which many default sandbox generators currently lack.
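
As a rough illustration of what such a flag check could look like, the Python sketch below builds a rule in the OCI-style JSON layout used above and evaluates its comparison in user space. It assumes a default-deny profile, an architecture where the flags are argument 0 of `clone` (as on x86-64), and the OCI convention that `value` is the mask and `valueTwo` the expected result for `SCMP_CMP_MASKED_EQ`. The names `THREAD_ONLY_RULE` and `rule_allows` are made up for this example.

```python
import json

# Flag values from <linux/sched.h>.
CLONE_VM = 0x00000100
CLONE_FS = 0x00000200
CLONE_FILES = 0x00000400
CLONE_SIGHAND = 0x00000800
CLONE_THREAD = 0x00010000
CLONE_SYSVSEM = 0x00040000
CLONE_SETTLS = 0x00080000
CLONE_PARENT_SETTID = 0x00100000
CLONE_CHILD_CLEARTID = 0x00200000
CLONE_CHILD_SETTID = 0x01000000
SIGCHLD = 17  # the low byte of the flags word carries the exit signal

# Allow clone only when the CLONE_THREAD bit is set.
THREAD_ONLY_RULE = {
    "names": ["clone"],
    "action": "SCMP_ACT_ALLOW",
    "args": [
        {
            "index": 0,
            "op": "SCMP_CMP_MASKED_EQ",
            "value": CLONE_THREAD,     # mask applied to the argument
            "valueTwo": CLONE_THREAD,  # required result after masking
        }
    ],
}


def rule_allows(flags: int, rule: dict = THREAD_ONLY_RULE) -> bool:
    """Evaluate the rule's MASKED_EQ comparison in user space."""
    arg = rule["args"][0]
    return (flags & arg["value"]) == arg["valueTwo"]


# Flags glibc typically passes for pthread_create and for fork.
PTHREAD_FLAGS = (CLONE_VM | CLONE_FS | CLONE_FILES | CLONE_SIGHAND
                 | CLONE_THREAD | CLONE_SYSVSEM | CLONE_SETTLS
                 | CLONE_PARENT_SETTID | CLONE_CHILD_CLEARTID)
FORK_FLAGS = CLONE_CHILD_SETTID | CLONE_CHILD_CLEARTID | SIGCHLD

if __name__ == "__main__":
    print(json.dumps(THREAD_ONLY_RULE, indent=2))
    print("thread allowed:", rule_allows(PTHREAD_FLAGS))
    print("fork allowed:", rule_allows(FORK_FLAGS))
```

This only covers `clone`. `clone3` passes its flags inside a struct that seccomp-bpf cannot dereference, so it needs separate handling, and `fork`/`vfork` still need their own deny rule.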

The takeaway is that default configurations prioritize containment of data exfiltration over resource exhaustion. For agent workloads, both must be addressed. Relying solely on a user-based `RLIMIT_NPROC` is insufficient. The triad of a tightened seccomp policy, a private PID namespace, and a cgroup process count limit creates a defensible baseline.

Segment everything.

Zoe L.

(@crypto_audit_zoe)

Active Member

Joined: 1 week ago

Posts: 12

June 29, 2026 12:01 pm

You've accurately identified the core architectural flaw. The per-user `RLIMIT_NPROC` is essentially useless for multi-tenant isolation on a single host. The real solution requires coupling the syscall filter with namespace isolation.

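Your example of an unconditional allow on `clone`/`fork` is the root of the problem. A robust filter must examine the flags argument to `clone`. (`clone3` passes its flags in a struct that seccomp-bpf cannot dereference, so it is usually answered with `ENOSYS` to force the libc fallback to `clone`.) Allowing `CLONE_THREAD`, together with the `CLONE_VM | CLONE_FS | CLONE_FILES | CLONE_SIGHAND` flags that always accompany thread creation, might be permissible for an agent's own internal threading, but a `clone` without `CLONE_THREAD` (a typical `fork` boils down to `clone` with little more than `SIGCHLD` in the flags) should be denied unless explicitly required, which for most agents it isn't.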

The namespace piece is critical. Even with a strict seccomp policy, an agent that keeps `CAP_SYS_ADMIN` (or can create user namespaces) can still move into a fresh PID namespace via `unshare` or `setns` unless those calls are also blocked. The correct model is to place the agent in its own PID namespace and cgroup *first*, then cap the process count *for that instance* with `pids.max`; `RLIMIT_NPROC` stays keyed to the UID, so it only becomes instance-specific if each agent also gets a unique UID. The seccomp policy then becomes a final backstop to prevent escape from that namespace. Most default profiles do this sequencing backwards or omit the namespace step entirely.

Don't roll your own.

Mike T.

(@homelab_sec_mike)

Active Member

Joined: 1 week ago

Posts: 15

June 29, 2026 5:01 pm

Spot on about the per-user `RLIMIT_NPROC` being a shared bucket. I hit this myself in my homelab with a misbehaving container.

Your example of the overly permissive seccomp rule is everywhere. The key is adding those args checks, like you said. For anyone reading, a quick and dirty fix in your own profiles is to at least refuse any `clone` that isn't creating a thread. Something like this for clone can stop the obvious attack:

```json
{
  "names": ["clone"],
  "action": "SCMP_ACT_ALLOW",
  "args": [
    {"index": 0, "op": "SCMP_CMP_MASKED_EQ", "value": 65536, "valueTwo": 65536}
  ]
}
```
That mask (65536 is `CLONE_THREAD`, i.e. `0x00010000`; JSON has no hex literals) only lets `clone` through when it's creating a thread, so under a default-deny profile a plain fork-style `clone` gets rejected. It's not perfect (you still have to handle `fork`, `vfork`, and `clone3` separately), but it'll stop a simple `while true; fork(); done`.

But you're right, the real fix is PID namespaces plus the filter.

-- Mike

Marc Thorne

(@marc_threat)

Eminent Member

Joined: 1 week ago

Posts: 20

June 30, 2026 3:34 pm

What are we defending against? You've hit on the core blind spot: runtime profile authors are modeling the intended workload, not the compromised one. An agent doesn't need `clone`, period. It's an LLM, not an application server.

The example insufficient rule is worse than you describe because it's often layered with a default-deny posture, making that unconditional allow list the *only* path. This creates a single-point-of-failure in the attack tree where all other controls become irrelevant.

Your point about the shared user `RLIMIT_NPROC` is the operational consequence. The syscall filter is the primary control; the resource limit is a safety net that fails in multi-tenant scenarios. We should invert that: the filter must be absolute, and the limit is just for accounting. Most deployments get this backwards.

Trust but verify. Actually, just verify.

Ray Moussa

(@ray_crypto)

Eminent Member

Joined: 1 week ago

Posts: 19

June 30, 2026 7:01 pm

You've correctly identified the insufficient syscall rule as the primary failure. However, even correcting the `clone` args check is insufficient without addressing the key management lifecycle. A compromised agent that can spawn a process can also potentially extract keys from memory, even if that process is short-lived.

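The argument validation for `clone` must also account for `CLONE_VM` to prevent shared memory attacks on sensitive key material. Note that `CLONE_THREAD` requires `CLONE_SIGHAND`, which in turn requires `CLONE_VM`, so threads necessarily share the address space; what the filter can and should reject is `CLONE_VM` *without* `CLONE_THREAD`, i.e. a separate process that shares the agent's memory. A profile that allows thread creation but denies every other `CLONE_VM` use (including `vfork`) is a minimum for any agent handling cryptographic operations.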
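
For reference, the flag dependencies involved can be checked with a few lines of Python. This is only a sketch that mirrors two of the `EINVAL` conditions documented in `clone(2)`; the function names are invented for illustration and nothing here talks to the kernel.

```python
# Flag values from <linux/sched.h>.
CLONE_VM = 0x00000100
CLONE_SIGHAND = 0x00000800
CLONE_VFORK = 0x00004000
CLONE_THREAD = 0x00010000


def kernel_accepts(flags: int) -> bool:
    """Mirror two EINVAL checks documented in clone(2)."""
    # CLONE_THREAD is only valid together with CLONE_SIGHAND.
    if (flags & CLONE_THREAD) and not (flags & CLONE_SIGHAND):
        return False
    # CLONE_SIGHAND is only valid together with CLONE_VM.
    if (flags & CLONE_SIGHAND) and not (flags & CLONE_VM):
        return False
    return True


def shares_memory_across_processes(flags: int) -> bool:
    """True for CLONE_VM without CLONE_THREAD (e.g. vfork-style clones)."""
    return bool(flags & CLONE_VM) and not (flags & CLONE_THREAD)
```

The second helper describes the case a key-handling agent's filter would want to deny: a new process, not a thread, that still maps the parent's memory.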

Consider this: if the agent holds an attestation key, a single forked process could exfiltrate it via a covert channel before hitting any process limit. The syscall filter is your first and last line of defense for key isolation.

Don't roll your own crypto. Unless you have a spec.
