Step-by-step: Configuring seccomp-bpf for the model backend process

(@kernel_watcher)

A common misconception in our architecture discussions is that the model backend, given its primary role in tensor operations, presents a negligible attack surface from a syscall perspective. This is demonstrably false. A compromised model inference process, through a maliciously crafted payload or a supply-chain attack on a framework like PyTorch, can leverage a vast array of syscalls to establish persistence, exfiltrate data, or pivot to the host. The isolation between the orchestrator and the backend is only as strong as the kernel-level constraints we impose.

In OpenClaw's trust boundary model, the backend is the most permissive component by necessity: it requires GPU access, high-resolution timers, and substantial memory mapping capabilities. However, "permissive" must not mean "unconstrained." The goal of seccomp-bpf here is not a hermetic, allow-list-only seal (impractical for this workload), but to surgically remove avenues for *lateral movement* and *scripting engine activation*. We focus on blocking process creation, namespace manipulation, and network socket families not strictly required for IPC with the orchestrator.

Below is a baseline seccomp-bpf profile that we apply to our backend processes, written in the Docker-style JSON profile format that container runtimes such as `runc` translate into `libseccomp` rules. It is a deny-list approach on top of a default `SCMP_ACT_ALLOW`, which is the inverse of our more restrictive orchestrator policy.

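```json
{
  "defaultAction": "SCMP_ACT_ALLOW",
  "architectures": ["SCMP_ARCH_X86_64"],
  "syscalls": [
    {
      "names": [
        "fork", "vfork", "execve", "execveat",
        "creat", "openat2",
        "unshare", "setns", "pivot_root",
        "mount", "umount", "umount2",
        "ptrace", "kcmp",
        "swapon", "swapoff",
        "sethostname", "setdomainname"
      ],
      "action": "SCMP_ACT_ERRNO",
      "args": [],
      "comment": "Block process creation, namespace, and host manipulation."
    },
    {
      "names": ["clone"],
      "action": "SCMP_ACT_ERRNO",
      "args": [{ "index": 0, "value": 65536, "valueTwo": 0, "op": "SCMP_CMP_MASKED_EQ" }],
      "comment": "Deny clone unless CLONE_THREAD (0x10000) is set: threads yes, processes no."
    },
    {
      "names": ["clone3"],
      "action": "SCMP_ACT_ERRNO",
      "errnoRet": 38,
      "comment": "clone3 flags sit behind a pointer; return ENOSYS so glibc falls back to clone."
    },
    {
      "names": ["socket"],
      "action": "SCMP_ACT_ERRNO",
      "args": [{ "index": 0, "value": 1, "op": "SCMP_CMP_NE" }],
      "comment": "Deny every socket family except AF_UNIX (1)."
    },
    {
      "names": ["open"],
      "action": "SCMP_ACT_ERRNO",
      "args": [{ "index": 1, "value": 3, "valueTwo": 1, "op": "SCMP_CMP_MASKED_EQ" }],
      "comment": "Deny open when (flags & O_ACCMODE) == O_WRONLY."
    },
    {
      "names": ["open"],
      "action": "SCMP_ACT_ERRNO",
      "args": [{ "index": 1, "value": 3, "valueTwo": 2, "op": "SCMP_CMP_MASKED_EQ" }],
      "comment": "Deny open when (flags & O_ACCMODE) == O_RDWR."
    },
    {
      "names": ["openat"],
      "action": "SCMP_ACT_ERRNO",
      "args": [{ "index": 2, "value": 3, "valueTwo": 1, "op": "SCMP_CMP_MASKED_EQ" }],
      "comment": "Deny openat when (flags & O_ACCMODE) == O_WRONLY."
    },
    {
      "names": ["openat"],
      "action": "SCMP_ACT_ERRNO",
      "args": [{ "index": 2, "value": 3, "valueTwo": 2, "op": "SCMP_CMP_MASKED_EQ" }],
      "comment": "Deny openat when (flags & O_ACCMODE) == O_RDWR."
    }
  ]
}
```

Key points of this configuration:
* **Process Creation Blocked:** `fork`, `vfork`, and the `execve` family are denied outright, and `clone` is denied unless `CLONE_THREAD` is set. The backend can still create threads, which ML runtimes rely on, but cannot spawn shells or child processes. `clone3` returns `ENOSYS` because its flags sit behind a pointer that seccomp cannot inspect; glibc then falls back to `clone`.
* **Network Isolation:** Only Unix domain sockets for IPC are permissible; the `socket` rule returns an error for every family other than `AF_UNIX`, which covers `AF_INET`/`AF_INET6`.
* **Namespace Containment:** `unshare`, `setns`, `pivot_root` are prohibited, preventing escape from its assigned mount and UTS namespaces.
* **File Access Control:** The `open`/`openat` rules are critical. With `SCMP_CMP_MASKED_EQ`, `value` is the mask and `valueTwo` is the expected result, so each rule tests `flags & O_ACCMODE` and denies `O_WRONLY` and `O_RDWR`; only read-only opens succeed. The flags argument is index 1 for `open` but index 2 for `openat`. `creat` and `openat2` are denied outright, the latter because its flags cannot be inspected. This prevents the backend from opening files for writing, drastically reducing its ability to modify configuration, logs, or drop payloads. Two caveats: a read-only open with `O_CREAT` can still create an empty file, and seccomp cannot dereference the pathname pointer, so it cannot filter by path. In production, you would pair this with path restrictions enforced by the mount namespace or an LSM policy.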
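
To make the masked comparison on the `open` flags concrete, here is a small Python sketch that mimics the `SCMP_CMP_MASKED_EQ` test, `(arg & mask) == value`, in user space. It installs no filter and only evaluates the arithmetic; the constants are the Linux x86_64 values, and the helper names (`masked_eq`, `open_denied`) are made up for illustration.

```python
# Linux x86_64 values of the open(2) flag constants used below.
O_RDONLY = 0o0
O_WRONLY = 0o1
O_RDWR = 0o2
O_ACCMODE = 0o3        # mask that isolates the access-mode bits
O_CREAT = 0o100
O_CLOEXEC = 0o2000000


def masked_eq(arg: int, mask: int, value: int) -> bool:
    """Mimic SCMP_CMP_MASKED_EQ: the rule matches when (arg & mask) == value."""
    return (arg & mask) == value


def open_denied(flags: int) -> bool:
    """True when the access mode requests any form of write access."""
    return (masked_eq(flags, O_ACCMODE, O_WRONLY)
            or masked_eq(flags, O_ACCMODE, O_RDWR))


if __name__ == "__main__":
    for name, flags in [
        ("O_RDONLY|O_CLOEXEC", O_RDONLY | O_CLOEXEC),
        ("O_WRONLY|O_CREAT", O_WRONLY | O_CREAT),
        ("O_RDWR", O_RDWR),
    ]:
        print(f"{name:20s} denied={open_denied(flags)}")
```

One thing the sketch makes visible: a mask of 0 gives `(arg & 0) == 0`, which is true for every possible flags value, so a rule written with a zero mask matches all calls rather than only read-only ones.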

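The profile is applied via the `seccomp` syscall after `unshare(CLONE_NEWUSER | CLONE_NEWPID)`. An unprivileged process must set `PR_SET_NO_NEW_PRIVS` first, or the kernel rejects the filter. Ordering matters: a filter persists across `execve` and also applies to the `execve` call itself, so a profile that denies `execve` cannot be loaded before the model runtime is executed. Either the runtime installs the filter during its own startup, or the launcher uses a variant that still permits the final `execve`. The orchestrator handles this via the `runc` spec, where the same caveat applies because `runc` loads the filter before its final `execve`; for custom integrations, the code path is straightforward. The major challenge is testing: you must profile the exact syscalls your specific ML framework requires during initialization, inference, and cleanup. Tools like `strace` or `scmp_sys_resolver` are indispensable here.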
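
As a starting point for that profiling step, the sketch below extracts syscall names from `strace -f` output and reports which of them collide with a deny list. It is an illustration, not our tooling: the function names and the sample trace lines are invented, and it assumes plain `strace -f` output without timestamp prefixes. It only looks at names, so argument-filtered calls still need their arguments inspected by hand.

```python
import re

# Syscall name at the start of a line, with an optional "[pid N] " or "N "
# prefix (the two prefix styles strace -f uses for multi-process traces).
_SYSCALL_RE = re.compile(r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?([a-z_][a-z0-9_]*)\(")


def syscalls_in_trace(lines):
    """Return the set of syscall names that appear in strace output lines."""
    seen = set()
    for line in lines:
        match = _SYSCALL_RE.match(line)
        if match:
            seen.add(match.group(1))
    return seen


def conflicts(lines, denied):
    """Syscalls the traced workload used that the deny list would block."""
    return sorted(syscalls_in_trace(lines) & set(denied))


if __name__ == "__main__":
    # Invented sample lines in the shape strace prints them.
    sample = [
        'openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3',
        "[pid  4242] ioctl(5, 0xc0104600, 0x7ffc0000) = 0",
        '4243 execve("/bin/sh", ["sh"], 0x7ffd0000 /* 12 vars */) = 0',
        "--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED} ---",
        "+++ exited with 0 +++",
    ]
    print(conflicts(sample, {"execve", "fork", "ptrace"}))
```

Run the workload through initialization, at least one inference, and shutdown before trusting the result; a syscall that only appears during cleanup is easy to miss.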

Failure modes occur when the profile is too restrictive, so that a legitimate syscall fails with `EPERM` and the backend crashes or silently degrades (e.g., an obscure `ioctl` for GPU memory management, should you tighten the profile that far), or too permissive, leaving a door open. The balance is empirical. Remember, this filter is a *layer*. It must be combined with a dedicated user namespace, cgroups, and mount namespaces to form a meaningful trust boundary. A seccomp policy alone will not contain a determined adversary, but its absence makes containment virtually impossible.
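
A cheap guard against the "too permissive because it never loaded" case is to check the kernel's own view of the process. The sketch below parses the `NoNewPrivs` and `Seccomp` fields of `/proc/<pid>/status` (0 = disabled, 1 = strict, 2 = filter). The function name and the sample text are illustrative; only the field names and their values come from the kernel.

```python
_SECCOMP_MODES = {"0": "disabled", "1": "strict", "2": "filter"}


def seccomp_state(status_text: str) -> dict:
    """Extract the seccomp-related fields from /proc/<pid>/status content."""
    fields = {}
    for line in status_text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return {
        "no_new_privs": fields.get("NoNewPrivs") == "1",
        "seccomp_mode": _SECCOMP_MODES.get(fields.get("Seccomp"), "unknown"),
    }


if __name__ == "__main__":
    # Invented excerpt in the format the kernel uses for the status file.
    sample = "Name:\tbackend\nNoNewPrivs:\t1\nSeccomp:\t2\nSeccomp_filters:\t1\n"
    print(seccomp_state(sample))
```

Inside the backend, you would pass the contents of `/proc/self/status` instead of the sample and refuse to load a model unless the mode is `filter`.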

--av

