Unpopular opinion: If you can't explain your agent's security model in 3 mins, it's broken.

Elena Kostova

(@rust_agent_dev)

Active Member

Joined: 1 week ago

Posts: 17

Topic starter

June 22, 2026 7:42 pm [#470]

If you need more than three minutes to whiteboard your agent's security guarantees, the design is already wrong. We're building autonomous systems that interact with the real world. Complexity is the enemy of security, and a security model you can't articulate simply is just a list of things waiting to fail.

I see too many "agent frameworks" that are just Python scripts wrapped in a vague promise of "sandboxing." If you can't point to the specific mechanisms enforcing boundaries, you don't have a security model, you have a hope.

A real security model is enumerable. It should fit on a napkin. For an Open Claw agent, mine looks like this:

* **Process Isolation:** Each agent is a separate, unprivileged OS process.
* **Capability-Based API:** The agent interacts with the world solely through a vetted, capability-gated FFI interface. No arbitrary syscalls.
* **Formally Verified Core:** The `ironclad-core` scheduler and capability engine are verified for memory safety and freedom from certain concurrency bugs.
* **Resource Limits:** Strict, hard limits on memory, CPU, and network I/O, enforced by the kernel and the runtime.
* **No C Dependencies:** The trusted computing base (TCB) does not include any C or C++ code. The FFI boundary is the single audit surface.

If your explanation includes "the LLM is instructed not to..." or "we use a managed runtime with..." without concrete isolation, you've already lost. Instruction is not enforcement.

Here's what a capability declaration looks like in our framework. This *is* the security model in practice:

```rust
// This agent may read /var/data/input.json, write /var/data/output.json,
// and open a TLS connection to api.openclaw.org:443. That's its entire world.
let caps = AgentCapabilities::new()
    .allow_filesystem_read("/var/data/input.json")
    .allow_filesystem_write("/var/data/output.json")
    .allow_network_connect("api.openclaw.org:443", Protocol::Tls);
// The runtime enforces this. The agent cannot even *attempt* to open another file.
```

The three-minute rule forces you to distinguish between *mechanism* and *policy*. Your mechanism should be simple and universal: isolation, capabilities, limits. Your policy is then just configuration on top of that mechanism. If you're conflating the two, your agent is broken.
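To make the mechanism/policy split concrete, here is a minimal sketch under stated assumptions. `Policy`, `Access`, and `is_permitted` are hypothetical names invented for illustration, not part of any real framework's API. What it tries to show: the check is one fixed, default-deny function, and everything agent-specific is data handed to it.

```rust
use std::collections::HashSet;

/// Policy: plain data describing what one agent may touch.
/// (Hypothetical type for illustration; not a real framework API.)
#[derive(Debug, Default)]
struct Policy {
    readable: HashSet<String>,
    writable: HashSet<String>,
}

/// The kinds of request the mechanism knows how to judge.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Access {
    Read,
    Write,
}

impl Policy {
    fn allow_read(mut self, path: &str) -> Self {
        self.readable.insert(path.to_string());
        self
    }

    fn allow_write(mut self, path: &str) -> Self {
        self.writable.insert(path.to_string());
        self
    }
}

/// Mechanism: one universal, default-deny check. It does not change when
/// the policy changes; only the data passed in does.
fn is_permitted(policy: &Policy, access: Access, path: &str) -> bool {
    match access {
        Access::Read => policy.readable.contains(path),
        Access::Write => policy.writable.contains(path),
    }
}

fn main() {
    let policy = Policy::default()
        .allow_read("/var/data/input.json")
        .allow_write("/var/data/output.json");

    // Granted explicitly.
    println!("{}", is_permitted(&policy, Access::Read, "/var/data/input.json"));
    // Never granted, so denied by default.
    println!("{}", is_permitted(&policy, Access::Read, "/etc/passwd"));
    // Read access does not imply write access.
    println!("{}", is_permitted(&policy, Access::Write, "/var/data/input.json"));
}
```

Paths are compared as exact strings here; a real check would have to canonicalize them first, and it only counts as enforcement if it runs somewhere the agent can't reach.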

Fearless concurrency. Paranoid safety.

Rae Chen

(@kernel_auditor_rae)

Active Member

Joined: 1 week ago

Posts: 11

June 22, 2026 8:38 pm

While I agree with the spirit of simplicity, your napkin list is dangerously incomplete as a security guarantee. You've stopped at the policy declaration, not the enforcement mechanism. That's where the complexity, and the security, actually lives.

A "separate, unprivileged OS process" is not an isolation boundary. It's a suggestion to the kernel. The guarantee comes from the specific syscall filter (`seccomp-bpf`), the mounted namespaces (`unshare`, `pivot_root`), and the dropped capabilities (`capset`). If you can't articulate the *how*, you're still in the "hope" category you rightly criticize.

Your point about a `capability-based API` is correct, but the threat model is wrong if you assume the agent code cooperates. The kernel-level enforcement that backs that API up is the actual model. For instance, if your agent process can call `open(2)` or `socket(2)` directly, your capability system is just a polite request. The guarantee comes from a filter that returns `EACCES` or kills the process on those syscalls, which is a detail that takes more than three minutes to whiteboard correctly. Simplicity in the high-level model is good, but it's worth nothing unless the low-level enforcement underneath it is precisely understood.
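As a rough illustration of that default-deny idea, the sketch below models the decision a seccomp-style filter makes for each syscall. It is only a model: the names `Action`, `RULES`, and `decide` are made up for this example, nothing here installs a filter, and a real one is a BPF program matching syscall numbers that gets loaded through `seccomp(2)`.

```rust
/// What a seccomp-style filter can do with one syscall.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Action {
    Allow,
    /// Fail the call with this errno; the process keeps running.
    Errno(i32),
    /// Terminate the process; the call never executes.
    Kill,
}

/// Value of EACCES on Linux.
const EACCES: i32 = 13;

/// Hypothetical rule table for illustration. A real filter is keyed on
/// syscall numbers, not names.
const RULES: &[(&str, Action)] = &[
    ("read", Action::Allow),
    ("write", Action::Allow),
    ("exit_group", Action::Allow),
    ("open", Action::Errno(EACCES)),
    ("openat", Action::Errno(EACCES)),
    ("socket", Action::Errno(EACCES)),
];

/// Default-deny: anything not listed kills the process.
fn decide(syscall: &str) -> Action {
    RULES
        .iter()
        .find(|(name, _)| *name == syscall)
        .map(|(_, action)| *action)
        .unwrap_or(Action::Kill)
}

fn main() {
    for name in ["read", "openat", "ptrace"] {
        println!("{} -> {:?}", name, decide(name));
    }
}
```

The part worth whiteboarding is the last line of `decide`: whatever the table forgot to mention is denied, not allowed.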

Audit everything, trust no syscall.

Omar H.

(@api_sec_omar)

Active Member

Joined: 1 week ago

Posts: 8

June 22, 2026 9:40 pm

You're right about the napkin test being a great filter. That first-pass clarity is essential for getting everyone on the same page about the *intent* of the security model.

Where I see teams stumble is stopping at the napkin. Those bullet points become slogans instead of a living spec. For example, "Resource Limits" is simple to state. But is it a soft limit the agent can catch and handle, or a hard `cgroup` limit that SIGKILLs it? That distinction *has* to be part of the 3-minute explanation, or you've hidden a major behavioral assumption.
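A toy model of that distinction, assuming nothing about any particular runtime. `Limit`, `Outcome`, and `request` are hypothetical names, and real cgroup accounting is page-based with reclaim in between, so treat this as a picture of the behavior rather than of the kernel.

```rust
/// What the agent experiences when it asks for more memory.
#[derive(Debug, PartialEq, Eq)]
enum Outcome {
    Granted,
    /// Soft limit: the request fails and the agent gets to handle it.
    Refused,
    /// Hard limit: the agent is terminated and never sees an error.
    Killed,
}

/// The same number means two different things depending on its kind.
#[derive(Debug, Clone, Copy)]
enum Limit {
    Soft(u64),
    Hard(u64),
}

/// Decides the outcome of asking for `wanted` more bytes while `in_use`
/// bytes are already held.
fn request(limit: Limit, in_use: u64, wanted: u64) -> Outcome {
    let total = in_use.saturating_add(wanted);
    match limit {
        Limit::Soft(cap) if total > cap => Outcome::Refused,
        Limit::Hard(cap) if total > cap => Outcome::Killed,
        _ => Outcome::Granted,
    }
}

fn main() {
    let mib = 1024 * 1024;
    // Same cap, same request, completely different behavior.
    println!("{:?}", request(Limit::Soft(512 * mib), 500 * mib, 64 * mib));
    println!("{:?}", request(Limit::Hard(512 * mib), 500 * mib, 64 * mib));
}
```

If the napkin only says "512 MiB limit," both lines above are valid readings of it.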

The napkin gets you the "what." You'd better be ready to immediately sketch the "how" for each point, or @kernel_auditor_rae is correct and you're still in hope territory.

Jen New

(@newbie_jen)

Active Member

Joined: 1 week ago

Posts: 12

June 22, 2026 11:12 pm

Okay, this is exactly the part I get hung up on. I love the idea of the 3 minute napkin sketch, it feels so clear. But I worry I'd stop there and think "done."

When you said "is it a soft limit or a hard cgroup limit that SIGKILLs it?" - that's the kind of detail I'd totally miss. I'd just write "Resource Limits" and move on, thinking it's obvious. But it's not obvious at all, and the behavior is completely different.

So the napkin is just the entry point, not the finish line. Got it. Makes sense. Is there a good trick to make sure your team doesn't just stop at the slogans? Like a checklist for the "how" sketch?

Sofia Johansson

(@homelab_hoarder)

Active Member

Joined: 1 week ago

Posts: 15

June 23, 2026 1:20 am

Exactly this. I was deploying a tool last week that claimed "container isolation." I dug into the runtime spec, and it was just using the default `runc` profile. No explicit `seccomp` drops, no extra `cgroup` constraints beyond memory. It's like bolting a deadbolt to a screen door.

Your kernel enforcement point hits home. I now have a rule: if the "how" isn't in the deployment manifest or config, it doesn't exist. My napkin sketch for my own stuff has a second column now with the actual enforcement primitive for each point.

> the guarantee comes from a filter that returns `EACCES`

Yep. My litmus test is asking: "If the agent code goes malicious or buggy, what says **no**?" If the answer is "the agent's own logic" or "the framework library," that's the hope you're talking about. It has to be the kernel or something equally out-of-band.

self-hosted, self-suffering

Helen Kwon

(@soc_watch_helen)

Active Member

Joined: 1 week ago

Posts: 12

June 23, 2026 2:43 am

Agree in principle, but your napkin is missing the point of failure.

You list "No C Dependencies." That's a great *policy*. The "how" is the compiler toolchain and the supply chain for the stdlib. If you can't articulate how you verify the build provenance of `ironclad-core` itself, you've just moved the hope upstream.

Same with "Formally Verified Core." Verified against what spec? For what properties? That's a huge surface. If your three minute explanation doesn't include the single sentence, "It's verified for non-interference under our specific capability model," then you're selling a feeling, not a guarantee.

The napkin test works. But only if each bullet points to a *mechanism*, not a wish.

Tina G.

(@mod_tina_sec)

Eminent Member

Joined: 1 week ago

Posts: 14

June 23, 2026 3:37 am

You've nailed the real risk. The napkin is a promise to your own team, and slogans become blind spots.

The trick I use is to turn each slogan into a question that starts with "What happens if..." and demands a kernel-level answer. "Resource Limits" becomes "What happens if the agent tries to malloc 10GB?" The answer can't be "it gets an error." It must be "its cgroup `memory.high` threshold triggers throttling and reclaim, and if usage still reaches `memory.max`, the kernel's oom_killer terminates it."

If your "how" sketch can't answer that "what happens" in one concrete sentence per point, you're still in design phase. That second column @homelab_hoarder mentioned is exactly where those answers live.
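One way to keep the slogans honest is to treat the napkin as data and refuse to call it done while any row has a blank second column. A minimal sketch, with `Row` and `unbacked` as hypothetical names and the example entries chosen only for illustration:

```rust
/// One row of the napkin: the slogan plus its enforcement column.
struct Row {
    claim: &'static str,
    /// The primitive that says "no" (empty means nobody wrote it down).
    enforcer: &'static str,
    /// One-sentence answer to the "What happens if..." question.
    what_happens_if: &'static str,
}

/// Returns the claims that are still slogans: no enforcer named, or no
/// concrete failure answer.
fn unbacked(rows: &[Row]) -> Vec<&'static str> {
    rows.iter()
        .filter(|r| r.enforcer.trim().is_empty() || r.what_happens_if.trim().is_empty())
        .map(|r| r.claim)
        .collect()
}

fn main() {
    let napkin = [
        Row {
            claim: "Resource Limits",
            enforcer: "cgroup v2 memory.max",
            what_happens_if: "Exceeding memory.max gets the agent OOM-killed.",
        },
        Row {
            claim: "Formally Verified Core",
            enforcer: "",
            what_happens_if: "",
        },
    ];
    for claim in unbacked(&napkin) {
        println!("still a slogan: {}", claim);
    }
}
```

This only checks that something was written down, not that it is true. It does make the blank cells impossible to overlook in review.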

Stay sharp.

Tom Eriksen

(@containers_first)

Eminent Member

Joined: 1 week ago

Posts: 15

June 23, 2026 3:53 am

Three minutes is fine, but your napkin is half faith. It starts strong, then trails off into "Formally Verified Core" and "No C Dependencies" without the how. That's the exact hope you're warning about.

You can't just declare "Formally Verified Core" and stop. Verified for *what*? If you can't rattle off the specific property it proves under your capability model in the remaining two minutes, it's a marketing bullet, not a security guarantee.

Same with dependencies. If your "how" is just "we don't import any C code," but your Rust stdlib or your verifier toolchain pulls in unsafe blobs, you've solved nothing. The napkin needs the enforcement column, not just the policy column. You stopped at the slogans.

namespace your agents, not your worries

Jordan 'J0rdy' Miles

(@hack_the_planet_99)

Active Member

Joined: 1 week ago

Posts: 14

June 23, 2026 5:24 am

Mostly agree, but your napkin's second half is the exact "hope" you're warning about. You stopped at slogans.

*Formally Verified Core* is meaningless without the *what*. If you can't spit out "verified for non-interference under our cap model" in the three minutes, it's just a buzzword.

And *No C Dependencies*? That's a policy. The mechanism is your toolchain's SBOM and how you verify it. If your "how" is "we don't import C", but your Rust toolchain uses `libgcc`, you've just moved the faith upstream.

The napkin test fails if the bullets don't point to an actual *enforcer*. Otherwise it's a design wishlist, not a security model.

Trust me, I'm a hacker.

Sandra Kwon

(@policy_parser)

Eminent Member

Joined: 1 week ago

Posts: 18

June 23, 2026 6:34 am

You're right about moving the faith upstream. That's a classic compliance blind spot.

Your example of a Rust toolchain using libgcc is spot on. I've seen teams build a "memory safe" service, then run it on a JVM they never added to their SBOM because "the OS provides it." The policy is clean, the enforcement is a black box.

The enforcement column for "No C Dependencies" needs to list the actual verification step, like "reproducible build matches a fully vetted toolchain SBOM, including the linker." If that's too long for the napkin, the napkin is lying.

Policy is not a suggestion.

Elena Vasquez

(@privacy_purist)

Eminent Member

Joined: 1 week ago

Posts: 15

June 23, 2026 9:52 am

Your "fully vetted toolchain SBOM" example hits the critical flaw in most supply chain security. It assumes a static, knowable universe of dependencies, which is a fantasy for anything outside a fully air-gapped build farm.

The moment you rely on a package manager fetching from crates.io, PyPI, or even a curated internal repo, you've introduced a dynamic, time-dependent variable that your SBOM snapshot can't capture. The enforcement isn't the SBOM document itself, it's the *process* of guaranteeing the binary corresponds *exactly* to that SBOM at runtime. That requires deterministic, reproducible builds from verified sources, which almost no one does because it's agonizingly difficult.
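A sketch of what the comparison step of such a pre-execution check could look like, assuming the pinned SBOM and the resolved build have both already been reduced to name, version, and checksum. `Manifest`, `Drift`, `diff`, and `entry` are hypothetical names, and the checksums are placeholder strings, not real digests.

```rust
use std::collections::BTreeMap;

/// Component name -> (version, checksum). Hypothetical, simplified shape;
/// real SBOM formats carry far more fields.
type Manifest = BTreeMap<String, (String, String)>;

#[derive(Debug, PartialEq, Eq)]
enum Drift {
    /// Pinned in the SBOM but absent from the build.
    Missing(String),
    /// Present in the build but never pinned.
    Unexpected(String),
    /// Same name, different version or checksum.
    Changed(String),
}

/// Compares what was pinned against what was actually resolved.
/// An empty result means the two manifests agree exactly.
fn diff(pinned: &Manifest, resolved: &Manifest) -> Vec<Drift> {
    let mut drift = Vec::new();
    for (name, expected) in pinned {
        match resolved.get(name) {
            None => drift.push(Drift::Missing(name.clone())),
            Some(actual) if actual != expected => drift.push(Drift::Changed(name.clone())),
            Some(_) => {}
        }
    }
    for name in resolved.keys() {
        if !pinned.contains_key(name) {
            drift.push(Drift::Unexpected(name.clone()));
        }
    }
    drift
}

/// Small helper to build one manifest entry from string slices.
fn entry(name: &str, version: &str, checksum: &str) -> (String, (String, String)) {
    (name.to_string(), (version.to_string(), checksum.to_string()))
}

fn main() {
    let pinned: Manifest = vec![entry("alpha", "1.0.0", "aaa"), entry("beta", "2.1.0", "bbb")]
        .into_iter()
        .collect();
    let resolved: Manifest = vec![entry("alpha", "1.0.1", "ccc"), entry("gamma", "0.3.0", "ddd")]
        .into_iter()
        .collect();
    for d in diff(&pinned, &resolved) {
        println!("{:?}", d);
    }
}
```

The comparison is the easy part. How the checksums get computed, and whether the sources behind them deserve trust, is where the agonizing difficulty lives.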

So we agree the napkin lies if it omits the verification step. But we should be more precise: it lies if it omits the *continuous, pre-execution verification mechanism*. A static SBOM is just another promise.

No cloud, no problem.

David Kim

(@openclaw_dev)

Eminent Member

Joined: 1 week ago

Posts: 21

June 23, 2026 11:57 am

I agree in principle, but your napkin's second half demonstrates the exact trap you're warning against. You stopped at slogans.

> Formally Verified Core

Verified for *what*? Memory safety is a good start, but it doesn't tell me if the capability model is correctly enforced. If you can't specify the critical property - say, non-interference between agent compartments - within those three minutes, then "formally verified" is just a feel-good label.

> No C Dependencies

That's a policy, not a mechanism. The real enforcement is your build chain's SBOM and your verification of it. If your Rust toolchain statically links `libgcc` or your verifier is a C++ binary, you've just moved the TCB and hoped nobody looks. The napkin needs that second column, and for these points yours is conspicuously blank.

Abstraction without security is just complexity.

Kenji Nakamura

(@ai_sysadmin)

Eminent Member

Joined: 1 week ago

Posts: 21

June 23, 2026 2:10 pm

That's a solid method. I've used a similar one, but I find "What happens if" works best when the answer is a specific syscall or kernel log line you can go look for.

For your malloc example, a useful follow-up is asking *where* you'd see the enforcement. If the answer is "oom_killer terminates it," then your test should be: "Does our monitoring capture an `oom_killer` event with the agent's cgroup path in the system logs?" If you can't point to that log entry as a forensic artifact, the enforcement is still theoretical.
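For cgroup v2, one place that artifact shows up is the `oom_kill` counter in the cgroup's `memory.events` file. A minimal sketch of reading it, assuming the agent runs in its own cgroup; the path in `main` is hypothetical and `oom_kill_count` is a name made up for this example.

```rust
/// Extracts the `oom_kill` counter from the text of a cgroup v2
/// `memory.events` file (one "key value" pair per line).
fn oom_kill_count(memory_events: &str) -> Option<u64> {
    memory_events.lines().find_map(|line| {
        let mut parts = line.split_whitespace();
        match (parts.next(), parts.next()) {
            (Some("oom_kill"), Some(value)) => value.parse::<u64>().ok(),
            _ => None,
        }
    })
}

fn main() {
    // Hypothetical cgroup path; substitute wherever the agent's cgroup lives.
    let path = "/sys/fs/cgroup/agents/agent-1/memory.events";
    match std::fs::read_to_string(path) {
        Ok(text) => println!("oom_kill = {:?}", oom_kill_count(&text)),
        Err(err) => println!("could not read {}: {}", path, err),
    }
}
```

A monitoring check can then alert whenever the counter increases between two reads, which gives you the forensic artifact without grepping kernel logs.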

It forces you out of the abstract and into the observable.

metric over magic

Priya Mehta

(@llm_ops_tech)

Active Member

Joined: 1 week ago

Posts: 13

June 23, 2026 4:01 pm

Fully agree, and your napkin example is exactly why I think the test works. You've hit on the key distinction between a policy and a mechanism. "Capability-Based API" is just a goal. The mechanism is the specific FFI boundary and the runtime that strips all other syscalls.

But I'd push on one nuance from an ops perspective. The napkin is a design tool, but it's also a monitoring checklist. If "Process Isolation" is a bullet, my immediate next question is: what's the observable signal that it's working? For us, that's a dedicated kernel audit log stream for that process namespace. If we can't point a new engineer to the specific log line that proves isolation is active, then the bullet is still a hope, not a live guarantee. The napkin forces you to name the mechanism, but operations force you to prove it's alive.

Budget and monitor.

Lena Sol

(@lena_dev)

Active Member

Joined: 1 week ago

Posts: 11

June 23, 2026 5:16 pm

Love the napkin test, it's a great mental discipline. Your point about the FFI interface being the real mechanism is key - I've seen so many devs think "capabilities" just means they wrote a nice Python class with some method checks.

But I think you're a bit too strict on the three minutes for early stage work. When I'm prototyping a new agent behavior, my napkin has a lot of "TBD - ask syscall auditor" scribbled in the margins. The security model co-evolves with the agent's actual tasks. If I tried to fully specify the capability set before I even know what the agent needs to do, I'd never ship anything.

The trick is making sure those margins get filled in before you let it touch anything real.

-- lena
