A common failure mode in agent-based frameworks is unconstrained outbound network access. This allows compromised or malicious agents to exfiltrate data, join botnets, or pivot to other hosts. While network namespaces offer coarse isolation, they are often shared across multiple agents within a single sandbox. A more granular, per-process control layer is needed.
eBPF programs attached to the `cgroup/sock_create` and `cgroup/connect4`/`cgroup/connect6` hooks can enforce policy at the socket layer, before a connection is established. The following example demonstrates a simple BPF program that denies outbound TCP connections to non-allowlisted IPv4 ranges. It is designed to be attached to the agent's cgroup.
```c
#include <linux/bpf.h>
#include <linux/in.h>
#include <linux/types.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_endian.h>

// Prefixes and mask are written in host byte order
#define ALLOWED_PREFIX_1 0x0A010000 // 10.1.0.0
#define ALLOWED_PREFIX_2 0xC0A80000 // 192.168.0.0
#define PREFIX_MASK      0xFFFF0000 // /16

SEC("cgroup/connect4")
int connect_v4(struct bpf_sock_addr *ctx)
{
    __u32 remote_addr = ctx->user_ip4;          // network byte order
    __u32 remote_host = bpf_ntohl(remote_addr); // host byte order, for prefix matching
    __u16 remote_port = ctx->user_port;         // network byte order

    // Allow loopback
    if (remote_addr == bpf_htonl(INADDR_LOOPBACK))
        return 1;

    // Check against allowed prefixes (compare in host byte order)
    if ((remote_host & PREFIX_MASK) == ALLOWED_PREFIX_1 ||
        (remote_host & PREFIX_MASK) == ALLOWED_PREFIX_2) {
        return 1;
    }

    // Deny all other outbound TCP connections
    bpf_printk("DENY connect to %pI4:%d", &remote_addr, bpf_ntohs(remote_port));
    return 0;
}

char _license[] SEC("license") = "GPL";
```
This program must be loaded and attached to the agent's cgroup v2 hierarchy, for example with `bpftool` or a loader built on libbpf. The threat model here is a compromised agent attempting to establish a new outbound TCP connection to an unauthorized external endpoint. It does not cover UDP sent without a prior `connect()`, ICMP, or raw sockets, which would require additional programs. This approach complements, but does not replace, seccomp-bpf syscall filtering.
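Since `ctx->user_ip4` arrives in network byte order while the prefix constants are written in host byte order, the comparison is easy to get wrong on little-endian machines. Below is a minimal userspace sketch of the same check that can be compiled and tested without loading anything into the kernel. The names `ipv4_allowed` and `ipv4_text_allowed` are made up for illustration, and the logic is only a stand-in for what the BPF program does, not a replacement for testing the program itself.

```c
#include <arpa/inet.h>
#include <stdint.h>

#define ALLOWED_PREFIX_1 0x0A010000u /* 10.1.0.0, host byte order */
#define ALLOWED_PREFIX_2 0xC0A80000u /* 192.168.0.0, host byte order */
#define PREFIX_MASK      0xFFFF0000u /* /16 */

/* addr_be is an IPv4 address in network byte order, like ctx->user_ip4.
 * Returns 1 to allow and 0 to deny, mirroring the hook's return values. */
static int ipv4_allowed(uint32_t addr_be)
{
    uint32_t addr = ntohl(addr_be); /* convert once, compare in host order */

    if (addr == INADDR_LOOPBACK)
        return 1;
    return (addr & PREFIX_MASK) == ALLOWED_PREFIX_1 ||
           (addr & PREFIX_MASK) == ALLOWED_PREFIX_2;
}

/* Hypothetical helper: parse dotted-quad text and run the check.
 * Returns -1 if the text is not a valid IPv4 address. */
static int ipv4_text_allowed(const char *text)
{
    struct in_addr a;

    if (inet_pton(AF_INET, text, &a) != 1)
        return -1;
    return ipv4_allowed(a.s_addr); /* s_addr is already in network order */
}
```

Feeding it a few addresses on either side of the /16 boundaries (for example `10.1.2.3` and `10.2.0.1`) is a cheap way to catch a byte-order mistake before it ships.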
Interesting approach. The cgroup hook is a good fit for containerized agents. One concern is that your filter only checks IPv4 prefixes. That's fine for IPv4-only internal workloads, but it misses the growing use of IPv6 in data center networks. With nothing attached to `cgroup/connect6`, an agent could bypass this by resolving a domain to an AAAA record and connecting over IPv6.
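To make the IPv6 gap concrete, here is a rough sketch of the decision logic a `cgroup/connect6` companion program would need, written as plain userspace C so it can be compiled and tested. In the BPF program the four words would come from `ctx->user_ip6[0..3]`, which are in network byte order. The policy itself (allow `::1` and the `fd00::/8` ULA range) and the function names are my own assumptions for illustration. I've also included an IPv4-mapped case: as far as I understand, a `connect()` on an `AF_INET6` socket to `::ffff:a.b.c.d` goes through the connect6 hook rather than connect4, but that is worth verifying on your kernel.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* words[] holds the address as four 32-bit words in network byte order,
 * the same layout as ctx->user_ip6[0..3]. Returns 1 to allow, 0 to deny. */
static int ipv6_allowed(const uint32_t words[4])
{
    uint32_t w0 = ntohl(words[0]);
    uint32_t w1 = ntohl(words[1]);
    uint32_t w2 = ntohl(words[2]);
    uint32_t w3 = ntohl(words[3]);

    /* Loopback ::1 */
    if (w0 == 0 && w1 == 0 && w2 == 0 && w3 == 1)
        return 1;

    /* IPv4-mapped ::ffff:a.b.c.d: apply the IPv4 /16 policy to the low word */
    if (w0 == 0 && w1 == 0 && w2 == 0x0000FFFFu)
        return (w3 & 0xFFFF0000u) == 0x0A010000u ||  /* 10.1.0.0/16 */
               (w3 & 0xFFFF0000u) == 0xC0A80000u;    /* 192.168.0.0/16 */

    /* Assumed policy: allow unique local addresses in fd00::/8 */
    return (w0 & 0xFF000000u) == 0xFD000000u;
}

/* Hypothetical helper: parse IPv6 text and run the check.
 * Returns -1 if the text is not a valid IPv6 address. */
static int ipv6_text_allowed(const char *text)
{
    struct in6_addr a;
    uint32_t words[4];

    if (inet_pton(AF_INET6, text, &a) != 1)
        return -1;
    memcpy(words, a.s6_addr, sizeof(words)); /* bytes stay in network order */
    return ipv6_allowed(words);
}
```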
Also, have you considered the interaction with DNS? A truly comprehensive monitor would need to correlate the socket connect with the prior DNS lookup, perhaps with help from a `kprobe/tcp_v6_connect` probe, to block based on resolved domain names, not just IPs. Static IP lists get outdated quickly.
You'd also need a userspace controller to dynamically update the allowlist. Hardcoding prefixes in the BPF program means recompiling and reloading it for every network change. That's not viable for dynamic environments.
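One way to get there, assuming the allowlist is moved out of the `#define`s and into a `BPF_MAP_TYPE_LPM_TRIE` map (not something the posted program does), is to have the controller turn CIDR strings into trie keys and push them with libbpf's `bpf_map_update_elem()`. The sketch below only covers the key-building half, which is pure userspace C. The struct name `lpm_v4_key` and the function `parse_cidr_v4` are hypothetical; the layout follows the LPM trie convention of a 32-bit prefix length followed by the address bytes in network order.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* IPv4 key for an LPM trie map: prefix length first, then the
 * address bytes in network byte order. */
struct lpm_v4_key {
    uint32_t prefixlen;
    uint8_t  addr[4];
};

/* Parse "a.b.c.d/len" into *key.
 * Returns 0 on success, -1 on malformed input. */
static int parse_cidr_v4(const char *cidr, struct lpm_v4_key *key)
{
    char ip[INET_ADDRSTRLEN];
    unsigned int len;
    char extra;
    struct in_addr a;

    /* Exactly two fields must match; a third means trailing junk. */
    if (sscanf(cidr, "%15[0-9.]/%u%c", ip, &len, &extra) != 2)
        return -1;
    if (len > 32 || inet_pton(AF_INET, ip, &a) != 1)
        return -1;

    key->prefixlen = len;
    memcpy(key->addr, &a.s_addr, sizeof(key->addr)); /* network order */
    return 0;
}
```

The BPF side would then build a key with `prefixlen = 32` from `ctx->user_ip4` and do a map lookup instead of the hardcoded mask comparison, so the controller can add or remove prefixes at runtime without touching the program.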
Whoa, this is amazing. I've been reading about eBPF but seeing actual code for hooking into `cgroup/connect4` really makes it click for me. I'm still wrapping my head around how you compile and load something like this onto a live system without bringing things down.
One thing I'm immediately worried about, coming from a web dev background: what happens with HTTPS? If the agent needs to connect to an external API, the initial TCP handshake to, say, port 443 would get blocked by this unless its IP is in the allowed prefixes. But that means you'd have to allowlist entire cloud provider IP ranges, which feels... huge and messy. Is there a pattern for letting the TCP connection happen but then inspecting or blocking at the TLS layer from eBPF too? Or is that a completely different beast?
thanks!