A common misconception in our architecture discussions is that the model backend, given its primary role in tensor operations, presents a negligible attack surface from a syscall perspective. This is demonstrably false. An inference process compromised through a maliciously crafted payload or a supply-chain attack on a framework like PyTorch can use a wide range of syscalls to establish persistence, exfiltrate data, or pivot to the host. The isolation between the orchestrator and the backend is only as strong as the kernel-level constraints we impose.
In OpenClaw's trust boundary model, the backend is the most permissive component by necessity: it requires GPU access, high-resolution timers, and substantial memory mapping capabilities. However, "permissive" must not mean "unconstrained." The goal of seccomp-bpf here is not a hermetic, allow-list-only seal (impractical for this workload), but the surgical removal of avenues for *lateral movement* and *scripting engine activation*. We focus on blocking process creation, namespace manipulation, and network socket families not strictly required for IPC with the orchestrator.
Below is a baseline seccomp-bpf profile that we apply to our backend processes, written in the Docker/OCI-style JSON profile format that container runtimes such as runc translate into `libseccomp` rules (`libseccomp` itself does not read JSON). It is a deny-list on top of a default `SCMP_ACT_ALLOW`, which is the inverse of our more restrictive orchestrator policy.
```json
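{
  "defaultAction": "SCMP_ACT_ALLOW",
  "architectures": ["SCMP_ARCH_X86_64"],
  "syscalls": [
    {
      "names": [
        "fork", "vfork", "execve", "execveat",
        "creat",
        "unshare", "setns", "pivot_root",
        "mount", "umount", "umount2",
        "ptrace", "kcmp",
        "swapon", "swapoff",
        "sethostname", "setdomainname"
      ],
      "action": "SCMP_ACT_ERRNO",
      "args": [],
      "comment": "Block process creation, namespace, and host manipulation outright."
    },
    {
      "names": ["open"],
      "action": "SCMP_ACT_ERRNO",
      "args": [{ "index": 1, "value": 1, "valueTwo": 1, "op": "SCMP_CMP_MASKED_EQ" }],
      "comment": "Deny open when the O_WRONLY bit (1) is set in flags; value is the mask, valueTwo the datum."
    },
    {
      "names": ["open"],
      "action": "SCMP_ACT_ERRNO",
      "args": [{ "index": 1, "value": 2, "valueTwo": 2, "op": "SCMP_CMP_MASKED_EQ" }],
      "comment": "Deny open when the O_RDWR bit (2) is set in flags."
    },
    {
      "names": ["openat"],
      "action": "SCMP_ACT_ERRNO",
      "args": [{ "index": 2, "value": 1, "valueTwo": 1, "op": "SCMP_CMP_MASKED_EQ" }],
      "comment": "Same O_WRONLY check for openat, whose flags are argument 2."
    },
    {
      "names": ["openat"],
      "action": "SCMP_ACT_ERRNO",
      "args": [{ "index": 2, "value": 2, "valueTwo": 2, "op": "SCMP_CMP_MASKED_EQ" }],
      "comment": "Same O_RDWR check for openat."
    },
    {
      "names": ["clone"],
      "action": "SCMP_ACT_ERRNO",
      "args": [{ "index": 0, "value": 65536, "valueTwo": 0, "op": "SCMP_CMP_MASKED_EQ" }],
      "comment": "Deny clone unless CLONE_THREAD (0x10000) is set, so threads still work."
    },
    {
      "names": ["clone3", "openat2"],
      "action": "SCMP_ACT_ERRNO",
      "errnoRet": 38,
      "args": [],
      "comment": "Flags sit behind a pointer and cannot be filtered; ENOSYS (38) lets callers fall back to clone/openat."
    },
    {
      "names": ["socket"],
      "action": "SCMP_ACT_ERRNO",
      "args": [{ "index": 0, "value": 1, "op": "SCMP_CMP_NE" }],
      "comment": "Deny every socket family except AF_UNIX (1); connect/bind/listen/accept stay available for IPC."
    }
  ]
}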
```
Key points of this configuration:
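* **Process Creation Blocked:** `fork`, `vfork`, and the `execve` family are denied, so the backend cannot spawn shells or child processes. `clone` cannot simply be denied outright, because thread creation uses it too (through `clone3` on recent glibc): the rule has to deny `clone` only when `CLONE_THREAD` is absent, and return `ENOSYS` for `clone3`, whose flags sit behind a pointer, so that libc falls back to `clone`.
* **Network Isolation:** Only Unix domain sockets for IPC are permissible. The `socket` rule therefore has to compare the address-family argument and deny everything except `AF_UNIX`, which covers `AF_INET`/`AF_INET6`; an unconditional deny on `socket` would take the IPC channel down with it.
* **Namespace Containment:** `unshare`, `setns`, and `pivot_root` are prohibited, preventing the backend from leaving its assigned mount and UTS namespaces.
* **File Access Control:** Argument filtering restricts `open`/`openat` to read-only access. `O_RDONLY` is 0, so there is no bit to test for: the filter must mask the flags argument (index 1 for `open`, index 2 for `openat`) with `O_ACCMODE` and reject the `O_WRONLY` and `O_RDWR` modes. Under a default of `SCMP_ACT_ALLOW` this has to be written as a deny rule; an `SCMP_ACT_ALLOW` rule adds nothing, and listing `open`/`openat` in an unconditional deny as well would block reads too. The result prevents the backend from opening files for writing, drastically reducing its ability to modify configuration, tamper with logs, or drop payloads. seccomp cannot dereference the path pointer, so restricting *which* files are readable is a job for the mount namespace (read-only bind mounts) or an LSM such as Landlock, not for this filter.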
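The masked comparison behind the read-only check can be worked through outside the kernel. The sketch below is plain Python that mimics how `SCMP_CMP_MASKED_EQ` evaluates an argument, `(arg & mask) == datum`. The helper names `masked_eq` and `is_read_only` are made up for illustration and are not part of any seccomp library; in the Docker/OCI profile format, `value` carries the mask and `valueTwo` the datum.

```python
import os

def masked_eq(arg: int, mask: int, datum: int) -> bool:
    """Mimic SCMP_CMP_MASKED_EQ: true when (arg & mask) == datum."""
    return (arg & mask) == datum

# Access-mode bits of the open(2) flags argument.
ACCMODE = os.O_ACCMODE

def is_read_only(flags: int) -> bool:
    """True when the access mode is O_RDONLY, whatever other flags are set."""
    return masked_eq(flags, ACCMODE, os.O_RDONLY)

samples = {
    "O_RDONLY": os.O_RDONLY,
    "O_RDONLY|O_CLOEXEC": os.O_RDONLY | os.O_CLOEXEC,
    "O_WRONLY|O_CREAT": os.O_WRONLY | os.O_CREAT,
    "O_RDWR": os.O_RDWR,
}
for name, flags in samples.items():
    print(f"{name:20} read-only={is_read_only(flags)}")

# A mask of 0 makes the comparison vacuous: (arg & 0) == 0 holds for every
# argument, so a rule written that way matches write opens as well.
print(all(masked_eq(flags, 0, 0) for flags in samples.values()))
```

Note that `O_RDONLY | O_CREAT` passes this check while still creating a file, so a stricter filter may also want to test the `O_CREAT` bit.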
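The profile is applied with the `seccomp` syscall after `unshare(CLONE_NEWUSER | CLONE_NEWPID)`, because `unshare` is itself denied once the filter is active; an unprivileged caller must set `PR_SET_NO_NEW_PRIVS` first. The orchestrator handles this via the `runc` spec, but for custom integrations, the code path is straightforward. One ordering constraint deserves attention: filters persist across `execve`, and this profile denies `execve`, so any filter installed before the model runtime is exec'ed blocks that exec too. runc loads a profile immediately before exec'ing the container entrypoint, so the `execve` deny only works when it is installed after the runtime's final exec, for example by the backend on itself during startup. The major challenge is testing: you must profile the exact syscalls your specific ML framework requires during initialization, inference, and cleanup. `strace -f` records the calls actually made, `scmp_sys_resolver` translates between syscall names and numbers for a given architecture, and temporarily swapping `SCMP_ACT_ERRNO` for `SCMP_ACT_LOG` logs would-be denials without breaking the workload.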
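One way to approach the profiling step is to capture a trace with `strace -f -o trace.log <command>` and compare the syscalls it contains against the deny list. The following is a minimal sketch under stated assumptions: it handles only strace's default line format, `DENIED` is an illustrative subset of the profile's deny list, and the trace excerpt (PIDs, paths, addresses) is hypothetical.

```python
import re
from collections import Counter

# Syscall name at the start of an strace line, with the optional PID column
# that `strace -f -o FILE` writes. Signal ("---"), exit ("+++"), and
# "<... resumed>" lines do not match and are skipped.
SYSCALL_RE = re.compile(r"^(?:\d+\s+)?([a-z_][a-z0-9_]*)\(")

# Illustrative subset of the unconditionally denied names in the profile.
DENIED = {"fork", "vfork", "execve", "execveat", "unshare", "setns",
          "pivot_root", "mount", "umount2", "ptrace"}

def syscalls_in_trace(lines):
    """Count syscalls by name across all traced threads and processes."""
    counts = Counter()
    for line in lines:
        match = SYSCALL_RE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts

def conflicts(counts, denied=DENIED):
    """Names the workload called that the profile would deny."""
    return sorted(set(counts) & set(denied))

# Hypothetical trace excerpt; a real one would be read from the log file.
SAMPLE_TRACE = """\
4242 execve("/usr/bin/python3", ["python3", "serve.py"], 0x7ffc1a2b3c40 /* 12 vars */) = 0
4242 openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
4242 mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f3a5c000000
4242 clone3({flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD, stack=0x7f3a5b000000, stack_size=0x7fff00}, 88) = 4243
4243 futex(0x7f3a5c0010a0, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
4242 ioctl(5, _IOC(_IOC_READ|_IOC_WRITE, 0x46, 0x2a, 0x20), 0x7ffc1a2b3b00) = 0
4243 <... futex resumed>) = 0
4242 --- SIGUSR1 {si_signo=SIGUSR1, si_code=SI_USER, si_pid=4100, si_uid=1000} ---
4242 +++ exited with 0 +++
""".splitlines()

counts = syscalls_in_trace(SAMPLE_TRACE)
print("seen:", sorted(counts))
print("would be denied:", conflicts(counts))  # ['execve']
```

The `execve` at the top of any trace is the launcher starting the runtime, so it appears as a conflict whenever the filter would already be in place before that exec. Repeating the comparison for initialization, inference, and cleanup traces separately shows which phase needs which call.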
Failure modes are twofold: a profile that is too restrictive crashes the backend on a legitimate syscall (for example, an obscure `ioctl` used for GPU memory management), while one that is too permissive leaves a door open. The balance is empirical. Remember that this filter is a *layer*: it must be combined with a dedicated user namespace, cgroups, and mount namespaces to form a meaningful trust boundary. A seccomp policy on its own will not contain a determined adversary, but its absence makes containment virtually impossible.
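Some of these failures are structural and can be caught before a profile ever reaches the kernel. The sketch below is a small, hypothetical lint pass over a Docker/OCI-style profile; `lint_profile` and the deliberately flawed `SAMPLE` are illustrative, not part of any existing tool. It flags three authoring mistakes: a rule whose action equals `defaultAction` (it contributes nothing, and `libseccomp` refuses to add such a rule), a `SCMP_CMP_MASKED_EQ` comparison with a mask of 0, and a syscall that has both an unconditional rule and an argument-filtered one.

```python
import json

def lint_profile(profile: dict) -> list:
    """Return human-readable warnings for a Docker/OCI-style seccomp profile."""
    warnings = []
    default = profile.get("defaultAction")
    rules = profile.get("syscalls", [])

    # Syscall name -> action of its argument-free (unconditional) rule.
    unconditional = {}
    for rule in rules:
        if not rule.get("args"):
            for name in rule.get("names", []):
                unconditional[name] = rule.get("action")

    for i, rule in enumerate(rules):
        args = rule.get("args") or []
        if rule.get("action") == default:
            warnings.append(f"rule {i}: action equals defaultAction ({default})")
        for arg in args:
            if arg.get("op") == "SCMP_CMP_MASKED_EQ" and arg.get("value", 0) == 0:
                warnings.append(f"rule {i}: MASKED_EQ with mask 0 matches every argument")
        if args:
            for name in rule.get("names", []):
                if name in unconditional:
                    warnings.append(
                        f"rule {i}: '{name}' also has an unconditional "
                        f"{unconditional[name]} rule"
                    )
    return warnings

# Deliberately flawed sample that triggers all three checks.
SAMPLE = json.loads("""
{
  "defaultAction": "SCMP_ACT_ALLOW",
  "syscalls": [
    {"names": ["execve", "open"], "action": "SCMP_ACT_ERRNO", "args": []},
    {"names": ["open"], "action": "SCMP_ACT_ALLOW",
     "args": [{"index": 1, "value": 0, "valueTwo": 0, "op": "SCMP_CMP_MASKED_EQ"}]}
  ]
}
""")

for warning in lint_profile(SAMPLE):
    print(warning)
```

A check like this only catches malformed rules; whether the profile is too tight or too loose for a given framework still has to be established by tracing the real workload.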
--av