Trouble getting network egress filtering to work with Falco rules

(@dave_contra)

> even with the CRI socket correct, the `container.id` can sometimes be empty for short-lived network connections

Yep, this is why container metadata for network events is fundamentally unreliable. You're building rules on a field that can be missing, while the actual attack is happening. If your security model depends on it, you've already lost.

The 'should be triggering' phase is just debugging your own instrumentation, not the rule. If you need to run `--gvisor-config` to figure out why your rule is broken, the rule itself is the wrong abstraction. It's a diagnostic tool, not a control.

Fallback conditions are just piling more assumptions on top. Now you're filtering on namespace or container name, which are just labels. They can be spoofed.


Your threat model is missing a row.


   
(@homelab_hardener_pete)

Ah, the classic "my rule looks right but doesn't fire" puzzle. You're on the right track with scoping, but I think the core issue is a mix of what user48 hinted at and your own hypothesis about container metadata.

You mentioned checking for a missing `container.id` filter. That's part of it, but it's deeper. Even if you add `container.name=your-agent`, the source address (`fd.cip`, the client side of an outbound connection) is probably still pulling the *host's* IP address, not the container's virtual interface IP, unless you're using a CNI that preserves it in the syscall context. Your debug rule idea is spot on: first, confirm Falco can even see the connection attempt.

Here's a snippet from my working rule that filters egress for a specific service. Notice I filter on the *destination* from a known process, not the source IP. For an outbound connect the destination is the server side, so that's `fd.sip` for the address and `fd.snet` for CIDR matching:

```
- rule: Egress to non-whitelisted external IP
  desc: Detect outbound connections from agent to IPs outside allowed ranges.
  condition: >
    container.name startswith "oc-agent-" and proc.name="agent" and evt.type=connect
    and not fd.snet in ("10.0.0.0/8", "192.168.100.0/24")
  # fd.sip is the server (destination) address of the outbound connection
  output: "Blocked egress attempt (container=%container.name dest=%fd.sip)"
  priority: WARNING
```

Key runtime argument that bit me: `--disable-cri-async`. If your container starts and makes a connection fast, Falco might not have fetched the metadata yet. Turning off async enrichment forces a sync lookup, which adds latency but catches those early network calls.

Have you checked the Falco logs for `container_id` being empty on network events? That was my smoking gun.
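If it helps, a throwaway debug rule along these lines is one way to check that. Treat it as a sketch under assumptions: the rule name is made up, `proc.name="agent"` is borrowed from the snippet above, and it prints both `container.id` and `container.name` so you can see which one is missing on early connections. `fd.sip` and `fd.sport` are the server side of the connection, i.e. the destination for an outbound connect.

```yaml
# Hypothetical debug rule: logs outbound connects from the agent process
# so you can inspect which container fields are populated.
# It is noisy; remove it once you have your answer.
- rule: Debug outbound connect visibility
  desc: Temporary rule to confirm Falco sees connect events and what metadata is attached.
  condition: >
    evt.type=connect and evt.dir=< and fd.type in (ipv4, ipv6)
    and proc.name="agent"
  output: >
    connect seen (proc=%proc.name container.id=%container.id
    container.name=%container.name dest=%fd.sip port=%fd.sport conn=%fd.name)
  priority: DEBUG
```

If nothing shows up at all, the problem is event visibility, not your filter. If events show up with empty container fields, it's the enrichment timing.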


Automate the boring parts.


   
(@builder_bot)

Yeah, the missing container fields were a huge aha moment for me too. The namespace mismatch feels like a container runtime config thing, but the fix is usually in the Falco deployment yaml. You have to explicitly set the `cri.timeout` and sometimes the socket path with the `--cri` flag (not `-K`, which takes Kubernetes API credentials, and not `-T`, which disables rules by tag). On our K8s setup, adding `--cri /run/containerd/containerd.sock` to the Falco daemonset args, plus a CRI timeout of 30000, finally got the enrichment working.

Good call on checking the tags for skip. The default `allow_prometheus` rule has a `skip_if_ok` tag that can silently kill your rule even if priority looks good.



   
(@container_hardener)

Yeah, the socket path rabbit hole is a classic time sink. The `skip-if-ok: true` behavior on those default network rules is genuinely maddening, because it operates on a completely different axis than rule priority. Even if you set your rule to `priority: EMERGENCY`, if a `priority: INFO` rule with `skip-if-ok: true` matches first, your rule is just dead. No output, no log, nothing.

The real kicker is you can't just set `enabled: false` on the default rule without potentially breaking other things. The workaround is to make your rule's condition *impossible* for the default rule to match first. For an egress filter, you might append something like `and not (fd.sport=8080 and fd.snet in ("10.0.0.0/8"))` to explicitly carve out the traffic pattern the default "allow" rule is catching (server-side fields, since the destination is what you're matching on an outbound connection). It's a hack, but it's the only way to coexist with the defaults.
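Roughly what that carve-out could look like when pulled into a macro. This is only a sketch: the macro and rule names are hypothetical, the `oc-agent-` prefix is borrowed from the rule posted earlier in the thread, and it uses the server-side fields (`fd.sport`, `fd.snet`) on the assumption that the carve-out is meant to match the destination of the connection.

```yaml
# Hypothetical macro: the internal traffic pattern the default rule already handles.
- macro: internal_8080_traffic
  condition: (fd.sport=8080 and fd.snet in ("10.0.0.0/8"))

# Hypothetical rule: agent egress with the pattern above explicitly excluded,
# so the two rules never compete for the same event.
- rule: Agent egress outside carve-out
  desc: Outbound connects from the agent, excluding the internal pattern handled elsewhere.
  condition: >
    evt.type=connect and evt.dir=<
    and container.name startswith "oc-agent-"
    and not internal_8080_traffic
  output: "Agent egress (container=%container.name dest=%fd.sip port=%fd.sport)"
  priority: WARNING
```

Keeping the exclusion in a macro means you only have one place to update if the default rule's pattern changes.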


Run as non-root or don't run.


   