Beginner mistake: I gave my agent NET_ADMIN and now it's doing weird things

Jake Riley · 2026-06-22T18:06:30Z

So I'm finally kicking the tires on NanoClaw, trying to move my little home cluster away from the usual bloated suspects. The whole 'container-per-task' model seemed sane, so I started porting over my basic network monitor – you know, the one that pings critical stuff and logs when my off-grid node drops off the mesh.

The docs, in their infinite wisdom, suggest you might need `CAP_NET_ADMIN` for anything that sniffs traffic or messes with routes. My agent needed to tweak some `iptables` rules for a custom probe. I thought, "Sure, what's the worst that could happen?" and slapped this in my agent spec:

```yaml
securityContext:
  capabilities:
    add:
      - NET_ADMIN
```

Famous last words. Now the agent container isn't just running my simple script. I'm seeing weird ARP broadcasts on the Tailscale interface, my custom routing table for the mesh network got flushed, and the logs show the agent trying – and failing – to bring up a dummy network interface. It's like it's having a nervous system meltdown.

The isolation model breaks down the second you hand out a capability like that. The container isn't just an isolated process anymore; it's got a master key to the network stack. If you've got concurrent tasks that *also* need network tweaks, or you've mounted `/proc` or `/sys` without thinking, a misbehaving script can hose the whole host's networking. Not exactly the "secure by default" selling point.

Lesson learned, I guess. The model works until you punch a hole in it yourself. Now I'm rewriting the probe to use a privileged helper container, locked down to a single netns, instead of giving the main agent the keys to the castle. Works for me, but it's extra yak shaving. Anyone else run into this with their own janky setups?

Page 2 / 2 Prev

Container Isolation Model and Gaps

Last Post by Kira Freak 6 days ago

Aisha Khan

(@agent_sandbox)

Eminent Member

Joined: 1 week ago

Posts: 18

June 24, 2026 1:12 am

>NET_ADMIN isn't a capability, it's a skeleton key.

This is so perfectly put. It's like you finally got the key to the server room, only to realize the door you unlocked was labeled "all the plumbing and wiring for the whole building."

I was debugging a similar issue last week where my agent, also with NET_ADMIN, started responding to ICMP requests it shouldn't have. Turns out it was the `nftables` package in the base image, installed as a dependency for something else, auto-loading its own ruleset. The strace output was a mess of netlink socket calls from processes I didn't even know were running. That's the "crowded room" effect in action.

It made me rewrite my entire Dockerfile to a multi-stage build that only copies the agent binary. Even then, I'm still nervous. You really do have to treat the entire image as a trusted code base once that cap is added.

run agent --sandbox

Mia Hardener

(@harden_ops_mia)

Active Member

Joined: 1 week ago

Posts: 10

June 24, 2026 3:36 am

Multi-stage is the right move, but you're still trusting the toolchain that builds it. The compiler, linker, and libc all run in a context that can influence the final binary.

Even a statically linked Go binary can have netlink baked in if the standard library decides to probe interfaces on startup. With NET_ADMIN, that innocent probe becomes a write.

You need a seccomp filter that blocks *specific* network syscalls instead of relying on the cap alone. Only allow the socket ops your agent actually uses. That's the real lock on the utility closet.
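For anyone who wants a starting point, here's a rough sketch of what such a filter could look like. It generates a Docker/OCI-style seccomp profile in JSON. Big assumptions: I don't know whether NanoClaw accepts this profile format, and `deny_socket_families` is just a helper name made up for this post. It's also a deny-list (default allow, reject `socket()` for selected address families), which is the weaker cousin of a true allowlist.

```python
import json

# Linux address-family numbers from <sys/socket.h>, hardcoded so the
# sketch does not depend on the host's socket module.
AF_NETLINK = 16
AF_PACKET = 17


def deny_socket_families(families):
    """Build a Docker/OCI-style seccomp profile (hypothetical helper).

    Everything is allowed by default, except socket() calls whose first
    argument (the address family) matches one of `families`.
    """
    return {
        "defaultAction": "SCMP_ACT_ALLOW",
        "syscalls": [
            {
                "names": ["socket"],
                "action": "SCMP_ACT_ERRNO",
                # index 0 is the first syscall argument: the address family
                "args": [{"index": 0, "value": fam, "op": "SCMP_CMP_EQ"}],
            }
            for fam in families
        ],
    }


if __name__ == "__main__":
    profile = deny_socket_families([AF_NETLINK, AF_PACKET])
    print(json.dumps(profile, indent=2))
```

A stricter version would flip `defaultAction` to `SCMP_ACT_ERRNO` and list every syscall the agent legitimately needs, which takes some tracing to get right. Also keep in mind that blocking `AF_NETLINK` outright can break code that only wants to read interface state.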

Kira Freak

(@kernel_freak)

Active Member

Joined: 1 week ago

Posts: 15

June 24, 2026 4:09 am

>specific network syscalls

This is correct, but incomplete. The seccomp filter is the last line of defense, but you still have to survive the trip to main(). The Go runtime's netlink probe on init is the classic example. If the binary installs its own seccomp profile, the filter won't load until after the ELF constructors and runtime init have run.

If you're truly paranoid, you need to split the privilege:
- Parent process with NET_ADMIN and a tight seccomp filter that only allows the exact socket()/bind() sequence.
- fork()/clone() a child into a new network namespace (creating the netns itself needs CAP_SYS_ADMIN or a user namespace, not just NET_ADMIN).
- drop NET_ADMIN, *then* exec your actual agent binary.
The agent never holds the cap, only the stripped-down launcher does for the 3 syscalls it needs.

Even a static binary can't probe what it doesn't have the key for.
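If you want to check that the exec'd agent really ended up without the cap, one option is to decode the `CapEff` line from `/proc/<pid>/status`. A minimal sketch, assuming the usual Linux layout of that file; `effective_caps` and `has_net_admin` are names made up for this example, and bit 12 is `CAP_NET_ADMIN` per `<linux/capability.h>`.

```python
CAP_NET_ADMIN = 12  # bit index from <linux/capability.h>


def effective_caps(status_text):
    """Parse the CapEff line of /proc/<pid>/status into an int bitmask."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            # Line looks like "CapEff:\t0000000000001000" (hex, no 0x prefix)
            return int(line.split()[1], 16)
    raise ValueError("no CapEff line found")


def has_net_admin(status_text):
    """True if CAP_NET_ADMIN is set in the effective capability set."""
    return bool(effective_caps(status_text) & (1 << CAP_NET_ADMIN))


if __name__ == "__main__":
    # Inspect the current process; swap "self" for the agent's PID.
    with open("/proc/self/status") as f:
        print("NET_ADMIN effective:", has_net_admin(f.read()))
```

The effective set alone doesn't tell the whole story. The same parsing works for `CapPrm` and `CapBnd`, and those are worth checking too, since a cap left in the permitted set can be raised again.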

cat /proc/self/status
