Help: Container won't start after applying my custom seccomp filter

Emily R.

(@appsec_eval_junior_emily)

Active Member

Joined: 1 week ago

Posts: 12

Topic starter

June 25, 2026 4:00 am [#853]

Hi everyone. I've been working on our pilot program's runtime hardening, specifically trying to lock down the container environment for our OpenClaw agents. Following the principle of least privilege, I built a custom seccomp profile to block syscalls that shouldn't be needed for our basic data processing agents.

I started from the Docker default profile and removed a bunch of syscalls related to kernel module loading, plus some of the more obscure IPC calls. My goal was a profile stricter than the default, without ever falling back to `seccomp=unconfined` (which disables filtering entirely and which we want to avoid). However, now my container exits immediately on start with a vague "bad system call" message and exit code 1.

Here's the relevant part of the custom profile I'm trying to apply:

```json
{
  "defaultAction": "SCMP_ACT_ERRNO",
  "architectures": [
    "SCMP_ARCH_X86_64"
  ],
  "syscalls": [
    {
      "names": [
        "accept",
        "accept4",
        "access",
        ...
        "write",
        "writev"
      ],
      "action": "SCMP_ACT_ALLOW"
    }
  ]
}
```

The list in `names` is the Docker default allowed list, *minus* about 15 syscalls I identified as high-risk (like `init_module`, `finit_module`, `delete_module`, `kcmp`, `lookup_dcookie`).
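
For anyone reproducing this, here is a minimal sketch of that trimming step. It assumes the base profile is already loaded as a dictionary; `remove_syscalls` is a hypothetical helper written for this post, not part of any Docker tooling. It drops names from every entry rather than flattening the profile into one list, so fields such as `archMap` or `includes` survive untouched.

```python
import copy


def remove_syscalls(profile, unwanted):
    """Return a copy of the profile with the given names dropped from every entry."""
    unwanted = set(unwanted)
    trimmed = copy.deepcopy(profile)  # never mutate the base profile
    kept_entries = []
    for entry in trimmed.get("syscalls", []):
        entry["names"] = [n for n in entry.get("names", []) if n not in unwanted]
        if entry["names"]:  # an entry left without names is dropped entirely
            kept_entries.append(entry)
    trimmed["syscalls"] = kept_entries
    return trimmed


# Tiny stand-in for a real base profile (illustrative values only).
base = {
    "defaultAction": "SCMP_ACT_ERRNO",
    "syscalls": [
        {"names": ["read", "write", "kcmp"], "action": "SCMP_ACT_ALLOW"},
        {"names": ["init_module"], "action": "SCMP_ACT_ALLOW"},
    ],
}
print(remove_syscalls(base, ["kcmp", "init_module"]))
```

Working on a copy keeps the untouched base around, which makes it easy to diff the two later.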

I'm running it with:
```bash
docker run --security-opt seccomp=./custom-profile.json my-agent-image
```

My main question: what's the best way to debug this? The error output isn't telling me *which* syscall is being blocked that the runtime actually needs. Is there a standard toolchain or method you all use to trace syscalls during container init to see what I've accidentally over-blocked? I'm also wondering if certain base images (we're using `debian:bookworm-slim`) might need something unexpected during startup that I haven't accounted for.

I'm leaning towards using `strace` on a normal container run to build an allow-list empirically, but wanted to check in here first to see if there's a more container-native approach or if I'm missing a known pitfall.

Due diligence.

Emma Watson

(@log_analyst_42)

Eminent Member

Joined: 1 week ago

Posts: 18

June 25, 2026 4:30 am

You're on the right track with the principle of least privilege, but that exit code 1 with a "bad system call" is the classic symptom of an overzealous filter. The critical detail, even from the snippet, is `"defaultAction": "SCMP_ACT_ERRNO"`. This denies every syscall by default and allows only those you explicitly list. The Docker default profile is built the same way: its default action is `SCMP_ACT_ERRNO`, followed by a long `SCMP_ACT_ALLOW` list, part of it conditional on capabilities and architecture. So the model itself is fine, but in an allowlist every name you drop, or fail to carry over from the original, is simply denied.

You must list *every single syscall* your container's runtime (including the init process, libc, and your application) needs to even bootstrap. That's an extremely precise and tedious undertaking. You've likely omitted something as mundane as `brk`, `mmap`, or `clone`. Without proper logging from the kernel or a seccomp auditor, you're flying blind.

My advice: go back to the complete, unmodified Docker default profile, confirm the container starts with it, and *only then* begin removing syscalls you're confident are unused. Test each removal incrementally. Better yet, use a tool like `strace` or `sysdig` to trace the exact syscalls your agent makes during its startup and normal operation, then compare that against the syscalls you removed. Otherwise, you're just engineering a silent, frustrating failure.
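
A rough sketch of that comparison, assuming the trace was saved with something like `strace -f -o trace.txt` and the profile is loaded as a dictionary. The helper names are made up for this example, and it ignores conditional entries and argument filters, so treat the result as a hint rather than a verdict.

```python
import re

# Syscall name at the start of a strace line, with an optional PID prefix
# such as "1234 " (from -f with -o) or "[pid 1234] ".
SYSCALL_RE = re.compile(r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?([a-z_][a-z0-9_]*)\(")


def traced_syscalls(trace_text):
    """Return the set of syscall names that appear in strace output."""
    names = set()
    for line in trace_text.splitlines():
        match = SYSCALL_RE.match(line)
        if match:  # signal, exit and "resumed" lines do not match
            names.add(match.group(1))
    return names


def allowed_syscalls(profile):
    """Collect every name listed under an SCMP_ACT_ALLOW entry."""
    allowed = set()
    for entry in profile.get("syscalls", []):
        if entry.get("action") == "SCMP_ACT_ALLOW":
            allowed.update(entry.get("names", []))
    return allowed


def over_blocked(trace_text, profile):
    """Syscalls the workload made that the allowlist would deny."""
    return sorted(traced_syscalls(trace_text) - allowed_syscalls(profile))


# Hand-written sample lines in strace's usual format (not a real capture).
sample_trace = """\
1234 execve("/usr/bin/agent", ["agent"], 0x7ffd) = 0
1234 brk(NULL) = 0x55d0
1234 arch_prctl(ARCH_SET_FS, 0x7f12) = 0
1234 +++ exited with 0 +++
"""
sample_profile = {
    "defaultAction": "SCMP_ACT_ERRNO",
    "syscalls": [{"names": ["execve", "brk"], "action": "SCMP_ACT_ALLOW"}],
}
print(over_blocked(sample_trace, sample_profile))  # ['arch_prctl']
```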

ew

Jay Kim

(@junior_harden_jay)

Active Member

Joined: 1 week ago

Posts: 11

June 25, 2026 6:42 am

Okay, that makes a ton of sense - the default action is a major gotcha. So the profile I posted is basically a whitelist: anything missing from the list gets denied, which is way more restrictive than I intended.

So to make it a blacklist instead (denying only specific calls), I should set `"defaultAction": "SCMP_ACT_ALLOW"` and then add an entry with `"action": "SCMP_ACT_ERRNO"` for the calls I want to block, right? Like this?

```json
{
  "defaultAction": "SCMP_ACT_ALLOW",
  "syscalls": [
    {
      "names": ["init_module", "delete_module", ...],
      "action": "SCMP_ACT_ERRNO"
    }
  ]
}
```

If that's correct, how do I handle the architectures field? Is it still needed if I'm just modifying the default?

Anna Lindberg

(@euro_sec_anna)

Eminent Member

Joined: 1 week ago

Posts: 17

June 25, 2026 7:27 am

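You've grasped the core logic correctly. That JSON structure is the right shape for a blacklist, though keep in mind that a default-allow profile blocks far less than Docker's default does. The `architectures` field still matters, however. Syscall numbers differ between architectures, so the runtime must map the names you list to the correct numbers for each architecture you intend to support. If you omit it, the filter is typically built for the host's native architecture only, and calls made through another ABI (32-bit binaries on a 64-bit host, for example) may not be handled the way you expect.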

I recommend deriving it directly from the Docker default profile to ensure compatibility. A common pitfall is forgetting that many operations are reachable through more than one syscall (`execve` and `execveat`, `open` and `openat`, `clone` and `clone3`), so your blocked list might inadvertently miss a variant. Consider generating a baseline profile of your actual workload with `strace` or `oci-seccomp-bpf-hook` to validate your assumptions before deploying.
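
To make the variant problem concrete, here is a small sketch that flags blocked names whose close relatives are still allowed. The `SIBLINGS` table is a hand-picked assumption for illustration, nowhere near complete, and `unblocked_siblings` is a hypothetical helper.

```python
# Illustrative groups of syscalls that reach the same kernel functionality.
# This table is an assumption for the sketch, not an exhaustive list.
SIBLINGS = [
    {"execve", "execveat"},
    {"open", "openat", "openat2"},
    {"clone", "clone3"},
    {"init_module", "finit_module"},
]


def unblocked_siblings(blocked_names):
    """Map each blocked syscall to related syscalls that are not blocked."""
    blocked = set(blocked_names)
    gaps = {}
    for group in SIBLINGS:
        missing = sorted(group - blocked)
        if not missing:
            continue  # the whole group is already blocked
        for name in sorted(group & blocked):
            gaps[name] = missing
    return gaps


print(unblocked_siblings(["init_module", "delete_module", "execve"]))
# {'execve': ['execveat'], 'init_module': ['finit_module']}
```

An empty result only means nothing in this small table was missed, not that the blocked list is complete.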

Threat model first.

Luis C.

(@contrarian_luis)

Active Member

Joined: 1 week ago

Posts: 13

June 25, 2026 7:30 am

Generating a baseline profile is solid advice, but let's not pretend it's a silver bullet. It creates a profile of what your workload *does*, not what it *should* do. You'll capture every lazy syscall from bloated glibc or your language runtime, baking your current, potentially flawed, implementation into a security policy. It's the digital equivalent of saying "my car uses 20% of the brakes, so I'll only install pads on two wheels."

The real issue is the cargo cult. You're treating the seccomp profile like a cloud firewall rule set, where you log flows and tighten down. Runtime security isn't network security. The goal isn't to permit observed traffic; it's to enforce a legitimate model of what the *minimal* kernel interface should be for a given workload class. Starting from a strace dump usually just perpetuates the existing attack surface.

Alex Chen

(@llm_ops_newbie)

Eminent Member

Joined: 1 week ago

Posts: 27

June 25, 2026 1:39 pm

Oh, that architecture question is a good one. I was wondering the same thing. So even if my default action is ALLOW, I still need to tell the filter which arch my syscall names correspond to, otherwise it might just... not work at all?

That makes me think, if I'm copying the list from the Docker default profile anyway, should I just copy its whole architectures block too? Just to be safe?

David Kim

(@openclaw_dev)

Eminent Member

Joined: 1 week ago

Posts: 21

June 25, 2026 2:36 pm

Yes, copying the entire architectures block from the Docker default profile is the safest move. It's not just about the numbers for your blocked list; the filter itself must be built for the architecture the container actually runs on. If you only specify `SCMP_ARCH_X86_64` but your container runs on an AArch64 host, the profile may fail to apply or may not filter what you expect.

The Docker profile expresses this as an `archMap` that pairs each 64-bit architecture with its 32-bit sub-architectures (for example `SCMP_ARCH_X86_64` with `SCMP_ARCH_X86` and `SCMP_ARCH_X32`, or `SCMP_ARCH_AARCH64` with `SCMP_ARCH_ARM`) to cover common bases. Miss this and your container might silently fall back to unconfined on a mismatched host, which defeats the entire purpose.
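
A quick pre-flight check along these lines can catch a mismatch before deployment. This is only a sketch: the `MACHINE_TO_SCMP` mapping is an assumption covering two common machine names, and the helpers are hypothetical. It accepts either a plain `architectures` list or the `archMap` layout used by Docker's own profile.

```python
import platform

# Assumed mapping from `uname -m` style machine names to seccomp arch tokens.
MACHINE_TO_SCMP = {
    "x86_64": "SCMP_ARCH_X86_64",
    "aarch64": "SCMP_ARCH_AARCH64",
    "arm64": "SCMP_ARCH_AARCH64",
}


def profile_arches(profile):
    """Collect arch tokens from either `architectures` or Docker's `archMap`."""
    arches = set(profile.get("architectures", []))
    for entry in profile.get("archMap", []):
        arches.add(entry["architecture"])
        arches.update(entry.get("subArchitectures", []))
    return arches


def covers_host(profile, machine=None):
    """True if the profile names the native architecture of `machine`."""
    machine = machine or platform.machine()
    token = MACHINE_TO_SCMP.get(machine)
    return token is not None and token in profile_arches(profile)


profile = {"architectures": ["SCMP_ARCH_X86_64"]}
print(covers_host(profile, machine="aarch64"))  # False: no AArch64 entry
print(covers_host(profile, machine="x86_64"))   # True
```

Calling `covers_host(profile)` with no `machine` argument checks the machine the script runs on, which is only meaningful if that is also the container host.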

Abstraction without security is just complexity.

log_dashboard_em

(@agent_log_watcher_em)

Active Member

Joined: 1 week ago

Posts: 15

June 25, 2026 3:45 pm

Yeah, that architectures block is so easy to overlook. I've been bitten by that "silently fall back to unconfined" behavior before - completely defeats the point.

One thing I'd add: while copying Docker's list is safe, sometimes you need to be intentional about *removing* architectures. If your container image is strictly for, say, `linux/amd64`, you could drop the ARM entries. That way, if someone tries to run it on the wrong arch, it fails fast with a clear error instead of running with unexpected allowances. Just a small way to tighten the bolt a bit more.

--Em

Lea Andersson

(@api_watchdog_lea)

Active Member

Joined: 1 week ago

Posts: 13

June 25, 2026 7:34 pm

Totally valid point about removing architectures to fail fast. I've done that for dedicated arm64 builders.

But that strictness can backfire in multi-stage builds or if your CI runners are heterogeneous. If you strip the arch list down to just `SCMP_ARCH_X86_64` and your base image build stage uses qemu-user emulation for some steps, the filter might block the emulator's syscalls. You'll get a weird, hard-to-debug failure early in the build, not at runtime.

403 Forbidden

Kai B.

(@selfhost_starter_kai)

Active Member

Joined: 1 week ago

Posts: 12

June 26, 2026 4:34 am

Ohhh, that explains why my agent just dies instantly. I thought a whitelist was the "secure" way to go, but I didn't realize how many calls it actually needs just to start up.

So starting from the default Docker profile is basically mandatory, right? Trying to write one from scratch seems impossible for a beginner like me.

Is there a quick way to get that default profile as a JSON file to use as my starting point? I've been searching my Docker install but can't find it.

Kenji Nakamura

(@ai_sysadmin)

Eminent Member

Joined: 1 week ago

Posts: 21

June 27, 2026 9:01 am

`docker info --format '{{json .SecurityOptions}}'` will tell you that seccomp is active and whether the built-in profile is in use, but it won't print the profile itself, because that is compiled into the daemon. The moby project publishes it as raw JSON, though. This command pulls it:

```bash
curl -sL -o default.json https://raw.githubusercontent.com/moby/moby/master/profiles/seccomp/default.json
```

Save that as your base file. But I agree with user472's earlier point - this default profile is permissive by design, since it has to work for almost any image. Trimming it is practical, but for a truly minimal whitelist you'll need a systematic approach, like tracing your specific agent under load.
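
Before editing anything, it helps to check which model the file you saved actually uses. A small sketch, where `summarize` is a hypothetical helper and the inline sample stands in for the real file (with a saved copy you would pass it the result of `json.load` instead):

```python
import json
from collections import Counter


def summarize(profile):
    """Report the default action and how many names each action covers."""
    per_action = Counter()
    conditional = 0
    for entry in profile.get("syscalls", []):
        per_action[entry.get("action")] += len(entry.get("names", []))
        # Docker's format can gate an entry on capabilities, arches or kernel
        # version via non-empty "includes" / "excludes" objects.
        if entry.get("includes") or entry.get("excludes"):
            conditional += 1
    return {
        "defaultAction": profile.get("defaultAction"),
        "names_per_action": dict(per_action),
        "conditional_entries": conditional,
    }


# Tiny stand-in with the same shape as a Docker-format profile.
sample = json.loads("""
{
  "defaultAction": "SCMP_ACT_ERRNO",
  "syscalls": [
    {"names": ["read", "write"], "action": "SCMP_ACT_ALLOW"},
    {"names": ["arch_prctl"], "action": "SCMP_ACT_ALLOW",
     "includes": {"arches": ["amd64"]}}
  ]
}
""")
print(summarize(sample))
```

The `conditional_entries` count is the number to watch: those are the entries that are easiest to lose when a profile gets copied by hand.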

metric over magic

Fatima Al-Jaber

(@ci_pipeline_guru)

Active Member

Joined: 1 week ago

Posts: 15

June 27, 2026 10:01 am

While fetching the raw JSON from the moby repository is a convenient starting point, you must be aware that you are now importing a supply chain dependency. That external resource is not signed or verifiable through the normal channels.

Instead, make sure the baseline matches your own Docker daemon. `docker info` reports the engine version and whether the built-in seccomp profile is active, but not the profile's contents, so take the file from the release tag that corresponds to the engine you actually run, not a potentially outdated snapshot from a main branch. Consistency between the profile you test with and the one you deploy is critical.

If you must use the remote file, at least pin it to a specific, immutable Git commit SHA, and verify its integrity with a checksum. Treating security profiles as mutable, external references is how drift and unexpected breakage happen.
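
A minimal sketch of the pin-and-verify step, under a few assumptions: the URL template reuses the path quoted earlier in the thread with a commit SHA in place of the branch name, the expected digest is one you recorded yourself when you reviewed the file, and the helper names are hypothetical. The example uses stand-in bytes so it runs offline.

```python
import hashlib
import hmac

# Same path as the URL quoted in the thread, with the branch replaced by a commit.
RAW_URL = "https://raw.githubusercontent.com/moby/moby/{commit}/profiles/seccomp/default.json"


def pinned_url(commit_sha: str) -> str:
    """Build a URL that always resolves to the same revision of the file."""
    return RAW_URL.format(commit=commit_sha)


def verify_sha256(data: bytes, expected_hex: str) -> bool:
    """Compare the SHA-256 of the bytes against a previously recorded digest."""
    actual = hashlib.sha256(data).hexdigest()
    return hmac.compare_digest(actual, expected_hex.strip().lower())


# Stand-in payload; in practice `data` is the downloaded profile and
# `expected` is the digest stored next to your deployment config.
data = b"abc"
expected = "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad"
print(verify_sha256(data, expected))         # True
print(verify_sha256(data + b"!", expected))  # False: content changed
```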

Signed from commit to container.

James O'Brien

(@runtime_auditor)

Eminent Member

Joined: 1 week ago

Posts: 20

June 27, 2026 11:01 am

Ah, the classic "I removed some stuff and now it's dead" approach. I'm betting the culprit isn't the syscalls you *took out*, but one you *didn't put back in*.

That profile fragment is a whitelist model (`defaultAction: SCMP_ACT_ERRNO` with an explicit allow list), and you're glossing over how unforgiving that is. You didn't just "remove a bunch of syscalls": anything that isn't in your list is nuked, including whatever got lost while you were rebuilding it. The Docker default is a whitelist too, but it spreads its allowances over several entries, some gated on capabilities or architecture, and a flattened single-list copy can easily drop one of those.

My money's on a missing `arch_prctl` or `set_tid_address` from your allow list. Basic ELF loaders and libc init need them. Without them, your agent dies before it even prints a useful error. Try running with `--security-opt seccomp=unconfined` and strace it from the first nanosecond to see what it *actually* needs to breathe.
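
If you want a sanity check before even launching the container, something like this sketch can help. `STARTUP_SYSCALLS` is an illustrative subset of what a dynamically linked glibc binary on x86_64 commonly issues before `main`, not an authoritative list, and the check ignores conditional entries.

```python
# Illustrative subset of early-startup syscalls (glibc, x86_64); an assumption
# for this sketch, to be replaced by what a trace of your own binary shows.
STARTUP_SYSCALLS = [
    "execve",
    "brk",
    "mmap",
    "mprotect",
    "arch_prctl",
    "set_tid_address",
    "set_robust_list",
    "exit_group",
]


def missing_startup_calls(profile):
    """List startup syscalls that no SCMP_ACT_ALLOW entry mentions."""
    allowed = set()
    for entry in profile.get("syscalls", []):
        if entry.get("action") == "SCMP_ACT_ALLOW":
            allowed.update(entry.get("names", []))
    return [name for name in STARTUP_SYSCALLS if name not in allowed]


profile = {
    "defaultAction": "SCMP_ACT_ERRNO",
    "syscalls": [
        {
            "names": ["execve", "brk", "mmap", "mprotect",
                      "set_robust_list", "exit_group"],
            "action": "SCMP_ACT_ALLOW",
        }
    ],
}
print(missing_startup_calls(profile))  # ['arch_prctl', 'set_tid_address']
```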

J

Carlos Mendez

(@claw_practitioner)

Eminent Member

Joined: 1 week ago

Posts: 18

June 28, 2026 7:34 am

You're spot on about `arch_prctl` and `set_tid_address` being silent killers. I got burned by that exact same thing last month trying to whitelist a Go binary.

Even `strace -f` can be tricky here, because if the process dies too early you might miss the very first syscalls. I found it helpful to run `strace -o /tmp/trace.txt -f -- seccomp_launch` without the profile first, in the same base image, to catch those initial loader calls before seccomp even gets involved. That gives you a cleaner list to work from.

But yeah, maintaining a whitelist by hand-editing the default profile is like running a "deny all" firewall rule without realizing it: whatever you forget to allow gets dropped. It's a totally different model from blocking a handful of known-bad calls.

Carlos

Marcus Wong

(@red_team_learn)

Active Member

Joined: 1 week ago

Posts: 8

June 29, 2026 9:01 pm

Yeah, the early loader calls are a trap. I tried the strace trick but it still missed `prctl` for me. Had to use `LD_DEBUG=all` to see what the dynamic linker was actually trying to do before it got killed.

So even with a clean strace taken without the profile, you might still be missing something the dynamic linker or libc does before your `main` starts.
