Help: need to detect if an agent is trying to write to /tmp and then send it.

Detecting Agent Exfiltration Attempts

Last Post by Jordan Weiss 3 days ago

6 Posts

6 Users

0 Reactions

3 Views

Marcus Webb

(@hype_checker_marcus)

Active Member

Joined: 1 week ago

Posts: 13

Topic starter

June 25, 2026 8:01 am [#879]

We're testing OpenClaw agents in a sandbox. Our policy forbids writing to `/tmp` and then making outbound connections. We need to detect attempts, not just block them.

Current ideas:
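* inotify watch on `/tmp`, alerting on write. (Caveat: inotify events don't carry the writer's PID, so tying a write to our agent's process would need something like fanotify.)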
* eBPF hook on `connect()` after a `write()` to that path from same process.
* Parsing auditd logs for the sequence of syscalls.

Problem: performance hit. Need something lightweight for high-throughput servers.

Has anyone built this? What's the actual overhead?
* What monitoring method gave the least latency penalty?
* Any false positives from legitimate cached writes?
* Can OpenClaw's own telemetry be configured to flag this behavior, or is that trusting the agent too much?

Looking for concrete, tested approaches. Benchmarks or gtfo.

marcus

Numbers or it didn't happen.

Sam K.

(@hype_hunter_sam)

Eminent Member

Joined: 1 week ago

Posts: 19

June 25, 2026 1:51 pm

So you're trying to detect *attempts*, not block them. That's just expensive logging with extra steps.

Your eBPF hook idea is the least bad. Attaching to a syscall tracepoint like sys_enter_connect is cheap. The expensive part is the state tracking that links the connect back to a /tmp write from the same PID. Do that in user space, not the kernel, or you'll run up against the BPF stack limit and map sizing.
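A rough sketch of what that user-space state tracking could look like, assuming the tracer hands over events as `(pid, kind, path)` tuples. The event names and the `SequenceDetector` class are made up for illustration. Note that `write()` itself only carries a file descriptor, so the "write" event here stands in for whatever the tracer derives from the open call.

```python
from dataclasses import dataclass, field


@dataclass
class SequenceDetector:
    """Flags connect() calls from PIDs that previously wrote under /tmp."""
    tainted: set = field(default_factory=set)

    def on_event(self, pid: int, kind: str, path: str = "") -> bool:
        """Return True if this event should raise an alert."""
        if kind == "write" and (path == "/tmp" or path.startswith("/tmp/")):
            self.tainted.add(pid)          # remember the writer
            return False
        if kind == "connect":
            return pid in self.tainted     # alert only after a /tmp write
        if kind == "exit":
            self.tainted.discard(pid)      # avoid stale state on PID reuse
        return False


detector = SequenceDetector()
events = [
    (101, "write", "/var/log/agent.log"),
    (101, "connect", ""),
    (102, "write", "/tmp/stage.bin"),
    (102, "connect", ""),
]
alerts = [pid for pid, kind, path in events if detector.on_event(pid, kind, path)]
print(alerts)  # [102]
```

The flag here is sticky until the process exits, which keeps the logic simple but means one early /tmp write marks every later connect from that PID.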

False positives? If your agent does any legit file ops, you'll get noise. Cache writes don't matter - you're tracing syscalls, not page faults.

And trusting OpenClaw's telemetry for this is like asking the fox to report on henhouse security. You're threat-modeling the agent, so you can't use its own reported behavior as ground truth.

anomaly_watcher

(@agent_behavior_analyst)

Active Member

Joined: 1 week ago

Posts: 12

June 25, 2026 4:39 pm

> What monitoring method gave the least latency penalty?

I've run a similar trace for a different project. The eBPF tracepoint approach was surprisingly light, but you have to filter aggressively at the source. Only attach to PIDs for your sandboxed agents, not system-wide. The state tracking cost is real, though. If you keep a small ring buffer map in-kernel just for "PIDs that wrote to /tmp in the last X seconds" (in practice a keyed map such as an LRU hash, since you need lookups by PID), the lookup on connect is trivial.
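To make the "wrote to /tmp in the last X seconds" idea concrete, here is a minimal model of the lookup logic in plain Python. In-kernel this would be a BPF map rather than a dict. The `RecentWriters` name and the explicit `now` timestamps are assumptions chosen to keep the sketch deterministic.

```python
class RecentWriters:
    """PIDs that wrote under /tmp within the last `window` seconds."""

    def __init__(self, window: float):
        self.window = window
        self.last_write = {}  # pid -> timestamp of most recent /tmp write

    def record_write(self, pid: int, now: float) -> None:
        self.last_write[pid] = now

    def check_connect(self, pid: int, now: float) -> bool:
        """True if pid wrote to /tmp within the window before this connect."""
        ts = self.last_write.get(pid)
        if ts is None:
            return False
        if now - ts > self.window:
            del self.last_write[pid]  # expire lazily on lookup
            return False
        return True


tracker = RecentWriters(window=30.0)
tracker.record_write(pid=200, now=0.0)        # e.g. a PID file at startup
print(tracker.check_connect(200, now=5.0))    # True: inside the window
print(tracker.check_connect(200, now=120.0))  # False: entry expired
```

The window is the only knob: it decides both how much state is kept and how slow an exfil attempt can be before it slips past.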

The real overhead came from copying path strings for the write check. If you can limit it to checking the first few characters of the path for "/tmp", it's cheaper than a full string compare; that saved about 15% in my crude test.
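For illustration, the bounded prefix check might look like this. It is a sketch of the comparison logic only, not of how the bytes are read in a BPF program, and `looks_like_tmp` is a hypothetical name.

```python
TMP_PREFIX = b"/tmp/"


def looks_like_tmp(path_head: bytes) -> bool:
    """Cheap check on the first few bytes of a path argument.

    Only the first len(TMP_PREFIX) bytes are compared, mimicking a bounded
    read of the string instead of copying the whole path.
    """
    head = path_head[:len(TMP_PREFIX)]
    return head == TMP_PREFIX or path_head == b"/tmp"


print(looks_like_tmp(b"/tmp/stage.bin"))  # True
print(looks_like_tmp(b"/tmpfoo/data"))    # False: the trailing slash matters
print(looks_like_tmp(b"/var/tmp/x"))      # False
print(looks_like_tmp(b"stage.bin"))       # False, even if the cwd is /tmp
```

Including the trailing slash in the prefix avoids matching sibling directories like `/tmpfoo`, at the cost of one extra byte compared.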

Legit cached writes shouldn't trigger because you're at the syscall boundary, like Sam said. But watch for temp files created *outside* /tmp that get moved into it. That's a sneaky bypass your sequence might miss 😅

Zoe M.

(@claw_newbie_zoe)

Active Member

Joined: 1 week ago

Posts: 11

June 26, 2026 1:01 am

Yeah, calling it expensive logging feels right. But if the goal is just to *know* the attempt happened for post-incident review, maybe that's okay? You just want the red flag in your SIEM, not a real-time block.

The fox-and-henhouse analogy is perfect. We're already assuming the agent is potentially malicious, so its own logs are just another output it could manipulate.

I'm wondering about the noise problem. If the agent does one legitimate /tmp write at startup (like a pid file), does that mean every single outbound connect afterwards triggers an alert? That's a lot of false positives to sift through.

~zoe

Tomás Rojas

(@tom_skeptic)

Active Member

Joined: 1 week ago

Posts: 11

June 27, 2026 4:34 am

Benchmarks or gtfo is right. You won't get them here.

eBPF tracepoint is the least bad for overhead, but you're still adding a lookup for every connect. That's the tax for detecting the sequence.

The real problem is you're asking for a generic detector. A smart agent will split the sequence across processes or use memfd, and then your whole detection scheme is pointless. Are you trying to catch lazy attempts or sophisticated ones?
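One hedged way to narrow the cross-process gap (it does nothing for memfd) is to key the state on a sandbox identifier instead of a PID, so a write in one process taints connects from its siblings and children. This is only a sketch: `sandbox_id` is hypothetical and could be something like a cgroup ID in practice, and the fork events are assumed to come from the tracer.

```python
class SandboxTaint:
    """Tracks /tmp writes per sandbox rather than per PID."""

    def __init__(self):
        self.pid_to_sandbox = {}  # pid -> sandbox identifier
        self.tainted = set()      # sandboxes with at least one /tmp write

    def register(self, pid: int, sandbox_id: str) -> None:
        self.pid_to_sandbox[pid] = sandbox_id

    def on_fork(self, parent: int, child: int) -> None:
        # A child inherits its parent's sandbox, if the parent is tracked.
        if parent in self.pid_to_sandbox:
            self.pid_to_sandbox[child] = self.pid_to_sandbox[parent]

    def on_tmp_write(self, pid: int) -> None:
        sandbox = self.pid_to_sandbox.get(pid)
        if sandbox is not None:
            self.tainted.add(sandbox)

    def on_connect(self, pid: int) -> bool:
        """True if any process in this PID's sandbox wrote to /tmp."""
        return self.pid_to_sandbox.get(pid) in self.tainted


taint = SandboxTaint()
taint.register(300, "agent-7")
taint.on_fork(parent=300, child=301)
taint.on_tmp_write(300)       # the parent stages the file
print(taint.on_connect(301))  # True: the child connects, same sandbox
print(taint.on_connect(999))  # False: untracked PID
```

This trades precision for coverage: any /tmp write anywhere in the sandbox marks every connect from it, so it is noisier than per-PID tracking.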

PoC or it didn't happen

Jordan Weiss

(@vendor_eye_roll)

Eminent Member

Joined: 1 week ago

Posts: 14

June 27, 2026 8:01 am

> copying path strings for the write check... checking the first few path chars for "/tmp"

That 15% savings sounds right, but you're trusting the path string from the syscall argument. A trivial `chdir` to `/tmp` and then a relative path write bypasses your check entirely. You'd need to resolve the full path, which is where the real overhead is.
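A small sketch of what resolving the path adds, assuming the tracer can also report the process's current working directory. The resolution here is purely lexical, so it does not follow symlinks and can disagree with the kernel when a symlink sits in the path. The function names are made up for illustration.

```python
import posixpath


def resolve_against_cwd(cwd: str, path: str) -> str:
    """Lexically resolve `path` relative to `cwd` (symlinks are not followed)."""
    return posixpath.normpath(posixpath.join(cwd, path))


def is_under_tmp(cwd: str, path: str) -> bool:
    full = resolve_against_cwd(cwd, path)
    return full == "/tmp" or full.startswith("/tmp/")


print(is_under_tmp("/tmp", "stage.bin"))                  # True: the chdir bypass
print(is_under_tmp("/home/agent", "../../tmp/x"))         # True: dot-dot traversal
print(is_under_tmp("/home/agent", "/tmp/../etc/passwd"))  # False: escapes /tmp
print(is_under_tmp("/home/agent", "notes.txt"))           # False
```

Even this cheap version needs the full path string and the cwd, which is exactly the copying cost the prefix trick was avoiding.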

Also, your ring buffer map for recent PIDs is a decent hack, but you'll need to tune that X-second window. Set it too short and you miss slow exfil. Set it too long and you keep bloated state for processes that just wrote a PID file once. It's a noisy detector either way.

And you're right about the move operation. An agent could write to `/var/tmp/foo` and `rename()` it into `/tmp`. Your sequence detector wouldn't catch the write.
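One way to cover the move case, sketched under the assumption that the tracer also reports rename and link events with both paths: treat anything whose destination lands under /tmp as a write into /tmp. The event kinds (`open_write`, `rename`, `link`) are hypothetical labels, not real tracer output.

```python
def under_tmp(path: str) -> bool:
    return path == "/tmp" or path.startswith("/tmp/")


def taints_pid(kind: str, args: tuple) -> bool:
    """Decide whether one file event counts as 'wrote into /tmp'."""
    if kind == "open_write":
        (path,) = args
        return under_tmp(path)
    if kind in ("rename", "link"):
        _src, dst = args
        return under_tmp(dst)  # only the destination matters
    return False


print(taints_pid("open_write", ("/var/tmp/foo",)))         # False
print(taints_pid("rename", ("/var/tmp/foo", "/tmp/foo")))  # True
print(taints_pid("rename", ("/tmp/a", "/var/tmp/a")))      # False: moving out
```

This assumes absolute paths, so the relative-path problem above still applies to both arguments.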
