The prevailing wisdom is that compiling agent tools to WebAssembly and running them in a WASM runtime (e.g., Wasmtime, Wasmer) provides meaningful isolation from the host. I contend the opposite is true for most agent deployments: the cumulative attack surface of the WASM toolchain, runtime, and host integrations frequently exceeds that of a carefully constrained native subprocess.
Consider the data flow. An agent invoking a WASM tool must marshal inputs and outputs across the host-WASM boundary. This requires a host-side runtime, its bindings, and a custom ABI. Each of these layers is a new parsing surface, often more complex than simple stdio.
```rust
// Illustrative host-side "safe" invocation. The names are schematic,
// not a specific runtime's API.
let results = wasm_instance
    .exports
    .get_function("tool_entry_point")?
    // The JSON parser behind this conversion is now in the TCB.
    .call(&[WasmVal::from_json_string(input)])?;
```
The runtime itself is a large codebase. A vulnerability in Wasmtime's component model implementation, or in its WASI host layer, which implements the system interface on the module's behalf, can be a direct path to host compromise. Contrast this with a native tool run under a strict seccomp-bpf profile, minimal namespaces, and a tight capability set. The kernel's isolation mechanisms have arguably seen more adversarial scrutiny than any WASM runtime's.
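For concreteness, here is a minimal sketch of the native side of that comparison, using only Rust's standard library. It shows only the stdio plumbing and environment scrubbing. The seccomp-bpf filter, namespace setup, and capability drop are deliberately not shown, since they need `libc` or a helper crate and would typically be installed in the child before exec. `run_tool` and the tool path are hypothetical names for illustration.

```rust
use std::io::{self, Write};
use std::process::{Command, Stdio};

/// Runs a native tool with an empty environment and piped stdio.
/// `tool_path` is a hypothetical absolute path to a vetted tool binary.
/// Assumption: requests are small. Writing all input before reading output
/// can deadlock on large payloads if the tool fills its stdout pipe first.
fn run_tool(tool_path: &str, input: &[u8]) -> io::Result<Vec<u8>> {
    let mut child = Command::new(tool_path)
        .env_clear() // drop all inherited environment variables
        .current_dir("/") // do not inherit the caller's working directory
        .stdin(Stdio::piped())
        .stdout(Stdio::piped())
        .stderr(Stdio::null())
        .spawn()?;

    // Write the request, then drop the handle so the tool sees EOF.
    {
        let mut stdin = child.stdin.take().expect("stdin was requested as piped");
        stdin.write_all(input)?;
    }

    // Collect stdout and wait for the tool to exit.
    let output = child.wait_with_output()?;
    if !output.status.success() {
        return Err(io::Error::new(
            io::ErrorKind::Other,
            "tool exited with a failure status",
        ));
    }
    Ok(output.stdout)
}
```

Called as `run_tool("/bin/cat", b"hello")`, this should hand back the bytes the tool wrote to stdout. The host-side boundary here is a byte pipe and an exit status. Whatever parses those bytes afterwards is still part of the TCB.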
Furthermore, the WASM compilation toolchain (e.g., Emscripten, wasi-sdk) often pulls in vulnerable libc shims or polyfills for POSIX calls that were intentionally omitted from WASI. The resulting module can contain the very memory-unsafe C code that the move to WASM was supposed to eliminate, now running inside a newer, less-audited sandbox.
The genuine utility of WASM is in cross-platform distribution of untrusted code, such as user-provided plugins from disparate sources. For an organization deploying its own vetted tools, the isolation is largely theater. You've traded a known, reducible, kernel-mediated attack surface for a sprawling, less-understood userland runtime surface. The risk shifts from the tool's logic to the sandbox's implementation, and that is rarely a net win.
You're correct about the runtime's attack surface, but I think your comparison undersells the kernel's own complexity. A strict seccomp-bpf profile and minimal namespaces still leave the kernel in your trusted computing base, with every permitted syscall as reachable attack surface. A vulnerability in `seccomp` itself, or in a syscall you've permitted, is a host compromise.
The more interesting trade-off is in the consistency of the security boundary. A WASM runtime's attack surface is finite and theoretically auditable; its component model is a single, defined interface. A native subprocess's allowed syscalls can have unpredictable side-effects depending on kernel state, filesystem layout, or even glibc version. The "carefully constrained native subprocess" is often less careful in practice due to ambient authority.
Your point about the JSON parser in the TCB is critical, though. It reveals the real problem: we're often just moving the unsanitized parsing surface from one place to another, not eliminating it. The isolation benefit of WASM is negated if the host-side bindings are complex and buggy.
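One way to shrink that surface, sketched here under my own assumptions rather than taken from any particular runtime, is a cheap bounded check before the bytes ever reach the real parser. `precheck`, `Reject`, and the limits are hypothetical names and values. This is not a JSON validator, and the parser stays in the TCB. It only caps the size and nesting depth the parser will be asked to handle.

```rust
/// Reasons a request is refused before parsing (hypothetical type).
#[derive(Debug, PartialEq)]
enum Reject {
    TooLarge,
    NotUtf8,
    TooDeep,
}

/// Rejects inputs that exceed `max_len` bytes, are not valid UTF-8,
/// or nest brackets/braces deeper than `max_depth` outside of strings.
fn precheck(input: &[u8], max_len: usize, max_depth: usize) -> Result<(), Reject> {
    if input.len() > max_len {
        return Err(Reject::TooLarge);
    }
    let text = std::str::from_utf8(input).map_err(|_| Reject::NotUtf8)?;

    let mut depth = 0usize;
    let mut in_string = false;
    let mut escaped = false;
    for ch in text.chars() {
        if in_string {
            // Inside a string literal, only track escapes and the closing quote.
            if escaped {
                escaped = false;
            } else if ch == '\\' {
                escaped = true;
            } else if ch == '"' {
                in_string = false;
            }
            continue;
        }
        match ch {
            '"' => in_string = true,
            '{' | '[' => {
                depth += 1;
                if depth > max_depth {
                    return Err(Reject::TooDeep);
                }
            }
            // Unbalanced closers are left for the real parser to reject.
            '}' | ']' => depth = depth.saturating_sub(1),
            _ => {}
        }
    }
    Ok(())
}
```

The same check applies whether the payload crosses a host-WASM boundary or a stdio pipe, which is rather the point: the bound lives on the host side either way.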
Show me the threat model.