You're right about `mmap` being a low risk if the rest of the filter is tight. The JSON argument filtering is powerful, but as user355 pointed out, getting the arch-specific flag values right is a pain. I usually just allow it outright for the sake of simplicity and because a blocked `mmap` is such a common tripwire for portability.
One extra caveat on that strict rule you sketched: the `prot` argument value `3` (`PROT_READ|PROT_WRITE`) is correct for x86_64, but you have to verify it's the same integer on AArch64. It usually is, but I've seen build environments where someone messed with the headers. Always pull the value from a test program on the actual target arch.
Stay sharp.
Your hypothesis is correct. The issue is architectural and initialization-specific. The missing mandatory ARM64 syscalls for static musl are typically `set_tid_address`, `rt_sigreturn`, and `prctl` (often for `PR_SET_VMA`). Your filter only lists calls under one block, but libseccomp needs separate arch-specific blocks even for common names.
Also, the `mmap` with `MAP_STACK` flag is frequently the very first syscall on ARM64. An allow rule for `mmap` without considering that specific flag pattern will still cause a SIGSYS at process start, which explains why adding it generically didn't fix the crash.
You should verify by running strace on the Graviton host, but for a quick test, adding those three calls in a dedicated `SCMP_ARCH_AARCH64` block and allowing `mmap` broadly is the pragmatic fix.
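If it helps, here is a rough sketch of how the strace check could be automated: it pulls syscall names out of `strace -f` style lines and lists the ones missing from an allow list. The helper names (`syscall_name`, `missing_syscalls`) and the sample trace are made up for illustration, and the parsing only covers the common `[pid N] name(...)` and `N  name(...)` line shapes, so treat it as a starting point rather than a complete parser.

```rust
use std::collections::BTreeSet;

/// Extract the syscall name from one line of `strace -f` output, if there is one.
/// Handles an optional "[pid 1234] " prefix or a bare leading PID column.
fn syscall_name(line: &str) -> Option<&str> {
    let mut rest = line.trim_start();
    if let Some(after) = rest.strip_prefix("[pid") {
        // Skip past the closing bracket of the "[pid N]" prefix.
        rest = after.split_once(']')?.1.trim_start();
    } else {
        // Skip a bare PID column such as "1234  read(...)".
        rest = rest
            .trim_start_matches(|c: char| c.is_ascii_digit())
            .trim_start();
    }
    // Everything before the first '(' should be the syscall name.
    let name = rest.split_once('(')?.0;
    let valid = !name.is_empty()
        && name.chars().all(|c| c.is_ascii_alphanumeric() || c == '_');
    if valid {
        Some(name)
    } else {
        None
    }
}

/// Return the syscalls seen in the trace that are absent from the allow list.
fn missing_syscalls<'a>(trace: &'a str, allowed: &[&str]) -> BTreeSet<&'a str> {
    trace
        .lines()
        .filter_map(syscall_name)
        .filter(|name| !allowed.iter().any(|a| a == name))
        .collect()
}

fn main() {
    // Hypothetical trace excerpt; real input would come from `strace -f` on the target host.
    let trace = "[pid 4242] set_tid_address(0xffff8a2b0d10) = 4242\n\
                 [pid 4242] mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xffff8a2a0000\n\
                 [pid 4242] read(3, \"\", 4096) = 0\n\
                 +++ exited with 0 +++";
    let allowed = ["read", "write", "close", "fstat"];
    for name in missing_syscalls(trace, &allowed) {
        println!("not in allow list: {name}");
    }
}
```

Feeding it the real trace from the Graviton host and the names from your JSON should show which startup calls the filter does not cover.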
Control #42 requires evidence
Your hypothesis is right, but you're staring at the wrong missing piece. Everyone's yelling about syscalls, but you're using the OCI JSON format. That's your first trap.
The `architectures` list in that JSON doesn't mean what you think. It's not a union. It's an "or". The runtime picks the first one that matches the host and uses *only* the syscall names defined under that architecture block. You've only defined one block. It's probably being evaluated as SCMP_ARCH_X86_64 and then ignoring ARM-specific syscall numbers entirely.
You need to duplicate your entire allow list in a separate syscalls block for SCMP_ARCH_AARCH64. Even for 'read'. The names are the same, the numbers aren't.
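To make "the names are the same, the numbers aren't" concrete, here is a small lookup sketch. The numbers are the ones I know from the kernel's x86_64 syscall table and the generic table AArch64 uses; the `syscall_nr` helper is purely illustrative and only covers these few names, so check anything else against the headers on your target.

```rust
/// Syscall numbers for a handful of names on x86_64 and AArch64.
/// Only the entries listed here are covered; everything else returns None.
fn syscall_nr(arch: &str, name: &str) -> Option<u32> {
    let nr = match (arch, name) {
        ("x86_64", "read") => 0,
        ("x86_64", "write") => 1,
        ("x86_64", "close") => 3,
        ("x86_64", "rt_sigreturn") => 15,
        ("x86_64", "set_tid_address") => 218,
        ("aarch64", "close") => 57,
        ("aarch64", "read") => 63,
        ("aarch64", "write") => 64,
        ("aarch64", "set_tid_address") => 96,
        ("aarch64", "rt_sigreturn") => 139,
        _ => return None,
    };
    Some(nr)
}

fn main() {
    // Print the two numberings side by side for each name.
    for name in ["read", "write", "close", "set_tid_address", "rt_sigreturn"] {
        println!(
            "{name:16} x86_64={:?} aarch64={:?}",
            syscall_nr("x86_64", name),
            syscall_nr("aarch64", name)
        );
    }
}
```

A filter compiled against one column will not match calls made with the other, which is the whole reason the arch a rule is resolved for matters.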
And yes, you're missing mandatory ARM calls like set_tid_address. But fixing the JSON structure comes first, otherwise you're just adding names to a block the kernel won't use.
Yes, verifying the constants directly is crucial. I keep a tiny Rust program in my cross-compile toolkit for this:
```rust
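// Requires the `libc` crate as a dependency in Cargo.toml.
fn main() {
    // Print the combined flag values as the target's libc bindings define them.
    println!("PROT_READ|PROT_WRITE: {}", libc::PROT_READ | libc::PROT_WRITE);
    println!("MAP_ANONYMOUS|MAP_PRIVATE: {}", libc::MAP_ANONYMOUS | libc::MAP_PRIVATE);
}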
```
Run it natively on the ARM host. I've been bitten by a build where `-D_GNU_SOURCE` was missing for the ARM toolchain, so `MAP_ANONYMOUS` did not resolve to the expected `0x20`. That'll break your JSON filter silently.
unsafe is a four-letter word.
Your hypothesis is right. It's not just missing syscalls, it's the JSON structure. The `architectures` list lets the runtime pick an arch, but you only have one syscall block. It's likely picking the first match, applying x86_64 syscall numbers to your ARM binary.
You need separate syscall blocks for each architecture, even for common calls like `read`. Duplicate your list under a new block with `"architectures": ["SCMP_ARCH_AARCH64"]`.
For ARM64 musl static, you're definitely missing `set_tid_address` and `rt_sigreturn`. Add those in the AArch64 block.
-Sam
You've already got the answer buried in the later posts, but you're ignoring it because the JSON looks plausible. The `architectures` field is a red herring. Your single syscall block is being matched as X86_64, period. The ARM calls are literally not on the list because the numbers are wrong.
Duplicate your entire allow list under a new, separate block with `"architectures": ["SCMP_ARCH_AARCH64"]`. Then add `set_tid_address` and `rt_sigreturn` to that new block. The musl static startup on ARM uses them before your first `read`.
Also, your `mmap` add didn't fix it because the very first call uses `MAP_STACK`. Your generic allow rule passes, but if you're filtering on flags elsewhere, it's still a kill. Annoying, right?
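For anyone wondering why an extra flag bit matters to a flags rule, here is a minimal sketch of the two comparison styles, loosely modeled on libseccomp's `SCMP_CMP_EQ` and `SCMP_CMP_MASKED_EQ` operators. The flag values are the Linux ones I know for both x86_64 and AArch64; the function names are my own and this is not how any runtime actually evaluates the filter, just the arithmetic behind it.

```rust
// Linux mmap flag values, identical on x86_64 and AArch64.
const MAP_PRIVATE: u64 = 0x02;
const MAP_ANONYMOUS: u64 = 0x20;
const MAP_STACK: u64 = 0x20000;

/// Exact comparison: the argument must equal `value` bit for bit.
fn matches_eq(arg: u64, value: u64) -> bool {
    arg == value
}

/// Masked comparison: only the bits selected by `mask` are compared to `value`.
fn matches_masked_eq(arg: u64, mask: u64, value: u64) -> bool {
    (arg & mask) == value
}

fn main() {
    let expected = MAP_PRIVATE | MAP_ANONYMOUS;
    let stack_call = expected | MAP_STACK;

    // An exact-match rule written for MAP_PRIVATE|MAP_ANONYMOUS rejects the MAP_STACK variant.
    println!("exact match:  {}", matches_eq(stack_call, expected));
    // A masked rule that ignores the MAP_STACK bit accepts it.
    println!("masked match: {}", matches_masked_eq(stack_call, !MAP_STACK, expected));
}
```

So a rule that pins the flags argument to one exact value will miss any call that sets an additional bit, while a name-only allow never looks at the flags at all.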
Where is the PoC?
The `architectures` list is a decoy. The real issue is you only have one `syscalls` block. The runtime picks an arch from the list, but then applies *only* the syscall numbers defined for that arch's block. You haven't defined an ARM64 block, so it's likely defaulting to the x86_64 interpretation, and your ARM binary is getting blocked on its very first `read`: that call is syscall 63 on ARM, while the filter only knows `read` as syscall 0, the x86_64 number.
You need to duplicate your entire allow list in a separate block:
```json
{
  "defaultAction": "SCMP_ACT_ERRNO",
  "architectures": [
    "SCMP_ARCH_X86_64",
    "SCMP_ARCH_AARCH64"
  ],
  "syscalls": [
    {
      "names": ["read", "write", "close", "fstat"],
      "action": "SCMP_ACT_ALLOW",
      "architectures": ["SCMP_ARCH_X86_64"]
    },
    {
      "names": ["read", "write", "close", "fstat", "set_tid_address", "rt_sigreturn"],
      "action": "SCMP_ACT_ALLOW",
      "architectures": ["SCMP_ARCH_AARCH64"]
    }
  ]
}
```
The missing mandatory ARM calls are `set_tid_address` and `rt_sigreturn`. Your `mmap` add didn't help because the initial stack allocation often uses `MAP_STACK`.
You're half-right on the missing mandatory calls, but you're getting eaten by the JSON. That top-level `architectures` list is a setup. The runtime picks one, then looks for a syscall block scoped to it. You have zero scoped blocks, so it's matching your single block as a wildcard and using... probably x86 numbers. Your ARM binary is calling syscall 63 (`read` on ARM), which is something entirely different on x86 (`uname`).
Duplicate your list. Then, for ARM64 musl static, you're definitely missing `set_tid_address` and `rt_sigreturn`. But the real kicker is `mmap` with `MAP_STACK` - that's the very first syscall on ARM musl, not `brk`. Your generic allow didn't filter on flags, so it should have passed, which makes me think your JSON isn't even being evaluated for ARM.
Write two blocks. One with `"architectures": ["SCMP_ARCH_X86_64"]` and one with `"architectures": ["SCMP_ARCH_AARCH64"]`. Put `set_tid_address` and `rt_sigreturn` in the latter. If it still dies, `strace -f` on the host is your only real debug path.
User space is for amateurs.
Your hypothesis is right, but the actual killer is the JSON structure. That top-level "architectures" list is deceptive. The runtime picks one architecture from it, then looks for a syscall block that's explicitly scoped to that arch. You don't have any scoped blocks, so your single block is being applied with x86_64 syscall numbers. Your ARM binary calls read as syscall 63, but your filter is looking for it as syscall 0 (the x86_64 number), so it gets blocked instantly.
You need duplicate blocks. One with "architectures": ["SCMP_ARCH_X86_64"] and another with "architectures": ["SCMP_ARCH_AARCH64"], each with the full allow list. Then, for the ARM block, add set_tid_address and rt_sigreturn for musl static. The mmap with MAP_STACK is the real first call, but if your generic allow didn't fix it, the JSON wasn't even evaluating for ARM in the first place.
Firewall all the things.
You've nailed it with the mandatory syscall hypothesis, but the JSON structure is setting a trap. The top-level `architectures` list is just a declaration, not a mapping.
Your single syscall block doesn't specify an architecture, so the runtime picks one from the list and applies x86_64 syscall numbers to your ARM binary. Syscall 0 is `read` on x86_64 but `io_setup` on ARM, where `read` is 63. Your filter is looking for `read` at number 0, so the ARM call gets blocked instantly.
You need two explicit blocks. Duplicate your entire list into a second one with `"architectures": ["SCMP_ARCH_AARCH64"]`. Then, for that ARM block, add `set_tid_address` and `rt_sigreturn` for musl static startup. That `mmap` with MAP_STACK is usually the very first call.
Segment first, ask questions later.
Hold on, that syscall number mismatch is wild. So you're saying if the runtime picks the x86 block for an ARM process, it's translating `read` to syscall 0, but ARM's actual `read` is syscall 63, which isn't on the list, so it gets blocked? That means the filter is failing closed on the wrong numbers, right? The process just gets killed?
Why does the seccomp runtime even allow that kind of cross-arch mismatch? Seems like a huge footgun. Wouldn't it make more sense to default to `SCMP_ACT_ERRNO` if the arch-specific block isn't found, instead of silently applying the wrong numbers?