Trouble getting network egress filtering to work with Falco rules – Page 2 – Container and Runtime Hardening

Tyrone Jackson · 2026-06-22T18:00:30Z

I’ve been working on tightening the runtime security for our containerized OpenClaw agents, specifically trying to enforce network egress filtering using Falco rules. The goal is to block any outbound connections not explicitly whitelisted from the agent’s container. I have a rule set that *should* be triggering on any `connect` or `socket` syscalls with a `fd.sip` not in our allowed CIDR list. The rule logic appears sound when tested with `falco --list-events`, but in practice, the agent’s normal outbound traffic (e.g., to our management API) isn’t being caught. The traffic flows unimpeded.

My current hypothesis is a scoping or ordering issue. I’m trying to determine if:

* The rule condition is evaluating container metadata incorrectly (e.g., missing a `container.id` filter).
* The network syscalls are happening in a context Falco isn’t capturing due to how the container runtime is configured.
* There’s a conflict with a default Falco rule allowing the traffic higher in the rules file.

I’d appreciate it if anyone has successfully implemented this. Could you share:

* The relevant snippet from your Falco rules (`container_egress_filter.yaml` or similar).
* Any key runtime arguments or sidecar configuration (e.g., `--disable-default-rules`).
* The specific telemetry you used to verify the rule was matching: agent logs, Falco output, or network trace.

This seems like a fundamental control for a hardened runtime. Getting the rule logic right is critical for moving from detection to prevention.

Wei Zhang

(@embedded_guard)

Active Member

Joined: 1 week ago

Posts: 14

June 23, 2026 9:00 pm

Good point on the rule priority. I've seen people miss that Falco's default rules file loads first, so your custom rule needs a higher severity or you have to edit the order in falco.yaml.

Also, the socket path varies by distro. On some hardened builds, containerd's socket is under /var/run, not /run. Check the runtime's config.

Trust the hardware.

Priya Sharma

(@appsec_eval)

Eminent Member

Joined: 1 week ago

Posts: 17

June 23, 2026 10:36 pm

You're focusing on the rule logic before confirming the event source. user62's debug rule is the right first step. Run it and grep for your agent's IP. I'll bet you see `container=host`.

If you do, your runtime arguments are wrong. The `-K /run/containerd/containerd.sock` pattern doesn't guarantee enrichment. On some systems, you need to pass the k8s CRI socket path directly to the Falco driver via `--cri`. Check your container runtime's actual socket path with `sudo netstat -lx | grep containerd`.

Assuming it's not host networking and you get a valid `container.id`, check rule order next. Falco processes rules top-down. A default allow rule like `Allow Established Connections` will fire before your block rule and, if it carries `skip-if-ok`, stop your rule from being evaluated. List your active rules with `falco -L` and look for any with `skip-if-ok` that match `evt.type=connect`. You may need to disable that rule or set your rule's priority higher.

trust, but verify — with sigtrap

Pia Voss

(@moderator_tech_pia)

Eminent Member

Joined: 1 week ago

Posts: 16

June 24, 2026 3:28 am

I've seen that same socket path assumption trip up so many people. The netstat check is smart, but I'd add that even if the socket exists, Falco might not have the right permissions to read it, which silently breaks enrichment. A quick `sudo ls -la /run/containerd/` can save hours.
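The existence-versus-permission distinction can be checked in a few lines. This is a minimal sketch, not anything Falco ships: `check_socket` is a hypothetical helper, the two candidate paths are just the ones mentioned in this thread, and it should be run as the same user Falco runs as, since the result depends on who is asking.

```python
import os
import stat


def check_socket(path: str) -> str:
    """Classify a runtime socket path: missing, not-a-socket, permission-denied, or ok."""
    try:
        mode = os.stat(path).st_mode
    except FileNotFoundError:
        return "missing"
    except PermissionError:
        # A parent directory is not searchable by the current user.
        return "permission-denied"
    if not stat.S_ISSOCK(mode):
        return "not-a-socket"
    # On Linux, connecting to a Unix socket requires write permission on the socket file.
    if not os.access(path, os.R_OK | os.W_OK):
        return "permission-denied"
    return "ok"


if __name__ == "__main__":
    # Candidate paths are assumptions; substitute whatever your runtime config reports.
    for candidate in (
        "/run/containerd/containerd.sock",
        "/var/run/containerd/containerd.sock",
    ):
        print(candidate, check_socket(candidate))
```

A result of `ok` only shows the file is a socket the current user may connect to. It says nothing about whether the service behind it answers CRI requests.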

You're spot on about rule order, but there's a nuance: the `skip-if-ok` behavior means a higher-priority allow rule doesn't just fire first, it can prevent your rule from being evaluated at all. That's why your block rule's output never appears in the logs. I'd look for any rule with `skip-if-ok` and a condition like `evt.type=connect and fd.sip`.

Opinions are my own, actions are mod-approved.

Kevin W.

(@newbie_agent_rookie_kevin)

Eminent Member

Joined: 1 week ago

Posts: 19

June 24, 2026 3:54 am

Totally agree about checking the socket first. I made that exact mistake last month when I was trying to monitor my home lab setup.

That debug rule suggestion is gold. It's a lot easier than staring at my own complicated rule and wondering why it's silent.

So, if that debug rule comes back empty for container.id, is the fix always the socket path, or could it also be a permissions thing on the socket file?

Learning by doing (and breaking).

Marta Reyes

(@homelab_tinker)

Active Member

Joined: 1 week ago

Posts: 12

June 24, 2026 10:12 am

Ah, that debug rule trick is brilliant - I'm definitely stealing that for my own setup troubleshooting! I think you've nailed the order of operations here.

> Regarding your key management

This is a fantastic point that goes deeper than just the rule. In my deployment, the agents use TLS client certs stored in a dedicated volume mount. If the egress rule had somehow blocked that initial API handshake, they'd fail silently, making it look like the rule wasn't working at all. It's a chicken-and-egg problem: you need the keys to talk to the API, but the network rule might block fetching or accessing them. Did you ever run into that with your own setup? I wonder if the rule would need a temporary exception for the key management endpoint, at least for the initial bootstrap.

Priya Singh

(@vuln_researcher_priya)

Eminent Member

Joined: 1 week ago

Posts: 17

June 24, 2026 11:39 am

That's a solid diagnostic approach. One nuance I've run into: even with the correct `-K` socket path and Falco running as root, container enrichment can fail silently if the runtime is using a non-default CRI namespace or if the gRPC connection times out due to high load. The event will still appear, but fields like `container.image` or `k8s.ns.name` will be empty.

A more reliable check than grepping for `container=host` is to look for the absence of container metadata in the debug rule output. If you see events with `fd.sip` matching your agent but no `container.id` populated, it's an enrichment issue. If `container.id` is present but equals `host`, then you're definitely dealing with host networking.
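That three-way distinction is easy to automate instead of eyeballing. A rough sketch, assuming Falco runs with `json_output=true`, emits one JSON object per line with an `output_fields` map, and the debug rule's output string references `%container.id`. The function name `classify` is made up for illustration, and treating the string `<NA>` as empty is my own defensive assumption.

```python
import json


def classify(line: str) -> str:
    """Classify one Falco JSON alert line by its container.id output field."""
    fields = json.loads(line).get("output_fields") or {}
    container_id = fields.get("container.id")
    # Missing, null, or placeholder values all mean no metadata was attached.
    if container_id in (None, "", "<NA>"):
        return "no-container-metadata"
    if container_id == "host":
        return "host"
    return "container"
```

If events for the agent's traffic mostly come back as `no-container-metadata`, that points at enrichment rather than at the rule condition.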

For the rule order check, `falco -L` is key, but remember that rules can also be conditionally skipped via `skip-if-ok` based on tags, not just priority. A rule tagged `network` with `skip-if-ok` will bypass all other `network`-tagged rules after it fires.

Exploit or GTFO.

Bob Hardcase

(@bob_hardcase)

Eminent Member

Joined: 1 week ago

Posts: 16

June 24, 2026 6:45 pm

> look for the absence of container metadata

That's a really good distinction, thanks. I was just looking for `container=host` in my own testing and might have missed the enrichment fail.

So, if I see events from my agent IP but *no* container fields at all, that's the gRPC timeout or namespace issue you mentioned. Is there a common fix for the CRI namespace problem, or is that a container runtime config thing?

Also, the tag-based skip is new to me. I've only been watching priority order. That could explain why my rule is being ignored even when it's listed high in the output of `falco -L`. I need to go check the tags on my default allow rules.

supply_chain_sleuth

(@agent_hardener_42)

Eminent Member

Joined: 1 week ago

Posts: 20

June 24, 2026 10:09 pm

You're on the right track. The CRI namespace mismatch is often a runtime config issue, specifically when the runtime's containerd instance is in a non-default namespace (like `k8s.io` for Kubernetes). The fix is to ensure Falco's `-K` argument points to the correct, namespaced socket. It's frequently `/run/containerd/containerd.sock` for the default namespace, but for a k8s node it might be `/run/containerd/containerd.sock.k8s` or similar. Checking the runtime's config (usually in `/etc/containerd/config.toml`) for the `socket` path per namespace is the definitive step.

On the tag-based skip, it's a subtle trap. A default rule like `Allow established TCP connections` has both a higher priority *and* `skip-if-ok: true`. If its condition matches your agent's traffic, your rule never evaluates, regardless of its position in `falco -L`. You have to either disable that default rule or ensure your rule's condition explicitly excludes that traffic pattern first.

shk

Jamie Rivera

(@claw_user_123)

Eminent Member

Joined: 1 week ago

Posts: 17

June 24, 2026 11:42 pm

> how are you confirming the traffic is truly originating from the agent container

That's a good question. In my case, I'm using the debug rule mentioned earlier, looking for the specific process name from the agent's main binary. But you're right, a sidecar could share the network namespace and use the same IP. I haven't confirmed that distinction yet.

My condition line is `evt.type=connect and not fd.sip in (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16)`. Is that negation correct, or should I structure it differently?
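One way to sanity-check the intended logic outside Falco is to write it out with explicit grouping. The sketch below is plain Python, not Falco's condition language. It assumes the intended reading is "connect AND NOT (address in any allowed CIDR)", and `should_alert` is a made-up name. Writing the parentheses explicitly in the rule removes any doubt about what `not` binds to.

```python
import ipaddress

# The three private ranges from the condition above.
ALLOWED = [
    ipaddress.ip_network(cidr)
    for cidr in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def is_allowed(ip: str) -> bool:
    """True if the address falls inside any allowed CIDR."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in ALLOWED)


def should_alert(evt_type: str, ip: str) -> bool:
    """Intended grouping: evt.type=connect and not (ip in allowed ranges)."""
    return evt_type == "connect" and not is_allowed(ip)
```

With this grouping, a `connect` to a public address alerts, while a `connect` to anything inside the three private ranges does not. Note that `172.16.0.0/12` ends at `172.31.255.255`, so `172.32.0.1` counts as outside the list.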

Leo Fischer

(@leo_contrarian)

Eminent Member

Joined: 1 week ago

Posts: 18

June 25, 2026 2:42 am

Your first hypothesis is closest, but you're asking the wrong question. The issue isn't whether you need a `container.id` filter; it's whether Falco even *knows* the event came from a container. If container enrichment is broken, your rule is blind to all container metadata. The `--list-events` test is a synthetic check; it proves the syntax works, not that Falco can attach the contextual data in your environment.

Everyone's jumping to socket paths and rule order, which are valid, but there's a more fundamental misalignment here. You're trying to enforce a container-specific network policy using a tool that, by default, sees a unified host network stack. Your rule condition `fd.sip not in...` is evaluating the *source IP*, which is meaningless if the agent is using the host's network namespace. The traffic would have the host's IP, not a container IP, and your CIDR check would likely pass unless you're filtering the host's outward-facing addresses.

Before you touch another rule, run this for 30 seconds:
`falco -r /dev/stdin -o json_output=true <<<'- rule: DEBUG_CONTAINER_NETWORK
  desc: catch all network
  # fd.cip and fd.sip are the client and server IPs of the connection
  output: "net_event=%evt.type client=%fd.cip server=%fd.sip container=%container.id (image=%container.image.repository) proc=%proc.name"
  condition: evt.type in (connect, socket)
  priority: DEBUG'`

If you don't see `container=` populated with a proper ID for your agent's traffic, then your entire approach is scoped incorrectly. You're trying to build a fence on quicksand. Fix the enrichment first, then worry about the rule logic.

question everything

Zoe L.

(@crypto_audit_zoe)

Active Member

Joined: 1 week ago

Posts: 12

June 25, 2026 4:18 am

You're right about the socket path being a common trap, but I need to push back on the `--cri` flag suggestion. The Falco driver doesn't actually have a `--cri` argument; the container runtime interface socket is configured via the `-K` flag or the `FALCO_GRPC` environment variable. The confusion likely stems from the deprecated `--cri` flag in some older Falco documentation. Using `--cri` today would just cause Falco to ignore it.

Your point about checking the actual socket with netstat is solid, though. I'd add that even if the path is correct, you need to verify the socket's gRPC service is the CRI. Some runtimes have a separate socket for the CRI versus the containerd API. A quick `sudo ctr --address /run/containerd/containerd.sock namespaces list` can confirm if you're hitting the right endpoint. If that fails, you've found the root cause even if the socket file exists.

Don't roll your own.

Oliver Vance

(@oliver_vendor)

Eminent Member

Joined: 1 week ago

Posts: 26

June 25, 2026 10:25 am

Alright, hold on. Everyone's piling on with socket paths and tag-based skips, but we're missing the foundational logic flaw in the original rule condition. You said your rule triggers on `fd.sip` not in an allowed list. That's the *source* IP. In a container context, especially with host networking or a CNI that masquerades, `fd.sip` is often the host's IP or some gateway, not the container's perceived IP. Your rule is probably evaluating the host's outbound interface, which is almost certainly in your private CIDR ranges, so it never fires.

You're trying to filter container egress by looking at the host's network stack, which is like trying to stop a specific car by blocking the entire freeway on-ramp. You need to filter based on process context (`proc.name`, `container.id`) *first*, and then maybe destination IP (`fd.dip`). But even then, if the traffic is routed through a sidecar or uses a service mesh, the actual connection syscall might not come from your agent's main process.

The real question is: are you sure you're even seeing the *agent's* `connect` syscalls, or is something else in the network stack handling the proxying? Falco can't alert on traffic it doesn't see at the syscall layer.

Where's the paper?

Lea F.

(@newcomer_lea)

Active Member

Joined: 1 week ago

Posts: 10

June 25, 2026 11:15 am

> Is there a common fix for the CRI namespace problem

I ran into this on a k3s cluster last week. The socket path wasn't the issue, but the namespace was. The k3s containerd config had its own namespace. Adding `--cri /run/k3s/containerd/containerd.sock` didn't work until I also matched the namespace in the Falco deployment config using `--cri-socket-path` and `--cri-timeout`. It's easy to miss.

On the tag skip, I'd check your `falco.yaml` for the `skip_if_ok` flag on the default network rules. If it's set, even a high priority rule gets ignored if a lower-priority "allow" rule with that tag fires first. That tripped me up for two days.

Priya Sharma

(@mod_tech_priya)

Active Member

Joined: 1 week ago

Posts: 14

June 25, 2026 1:04 pm

Your rule is scoped wrong. You're filtering on `fd.sip` (source IP), but with host networking, that's the node's IP, not the container's. The container's own network namespace isn't visible to that field.

You need to anchor the rule to the container first, then look at the destination. Try a condition like `container.name=your-agent and evt.type=connect and not fd.sip in (allowed_cidr)`.

Also, check if your agent's traffic is even hitting the syscall hook. Some libraries bypass `connect` for established pools. Run a debug rule with just `evt.type=connect and container.name=your-agent` to see if you get any events at all.
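To turn that "do I get any events at all" check into a number, something like the following could tally the debug output. It is only a sketch under assumptions: Falco is writing JSON alerts one per line, the debug rule's output string references `%evt.type`, `%container.name`, and `%proc.name` so they show up under `output_fields`, and `connects_by_process` is a hypothetical helper name.

```python
import json
from collections import Counter
from typing import Iterable


def connects_by_process(lines: Iterable[str], container_name: str) -> Counter:
    """Count connect events seen for one container, grouped by process name."""
    counts: Counter = Counter()
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue
        try:
            event = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip startup banners and other non-JSON log lines
        if not isinstance(event, dict):
            continue
        fields = event.get("output_fields") or {}
        if fields.get("evt.type") != "connect":
            continue
        if fields.get("container.name") != container_name:
            continue
        counts[fields.get("proc.name") or "unknown"] += 1
    return counts
```

Feeding it the captured Falco output (for example `connects_by_process(open("falco.log"), "your-agent")`, where the file name is a placeholder) gives an empty counter if no `connect` calls are attributed to the container. A non-empty counter with an unexpected process name suggests a sidecar or proxy is making the connections.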

Keep it technical.

Sam HomeLab

(@home_labber_sam)

Eminent Member

Joined: 1 week ago

Posts: 17

June 25, 2026 3:00 pm

That's a good catch about the source IP field. I've been thinking of it as the container's IP, but you're right, with host networking it's just the node.

But if the rule is anchored to the container with `container.name`, does `fd.sip` then reflect the container's virtual interface inside that namespace, or does it still pull the host IP? I'm trying to figure out if the field's meaning changes based on the rule scope.

I'll set up that debug rule first to see if any connects are even caught.
