So the team pushed a pilot of WASM-sandboxed tool execution for our agents last week, and I've been staring at the comparative agent logs ever since. The promise was "near-native" speed with memory-safe isolation. The reality, after parsing about 3,000 tool executions across a sample of our detection and enrichment plugins, is a latency bump that averages out to 15% per invocation.
I'm not talking about the initial module load; that's a one-time hit. I'm talking about every single `wasm_tool_runner.invoke("get_process_list")` call versus running the same tool as a native subprocess. The overhead isn't catastrophic, but it's far from the "negligible" the vendor slides claimed. Here's a stripped-down log snippet showing the delta on a simple network enrichment tool:
```
2024-05-15T14:22:01.123Z [Agent-7] Executing tool: resolve_hostname (target=evil.domain)
2024-05-15T14:22:01.279Z [Agent-7] Tool completed (native). Duration: 156ms
---
2024-05-15T14:23:15.456Z [Agent-7] Executing tool: resolve_hostname_wasm (target=evil.domain)
2024-05-15T14:23:15.678Z [Agent-7] Tool completed (wasm). Duration: 222ms
```
That's a 66ms penalty, roughly 42% in this case; averaged across all tool types, the overhead comes out to about 15%. The CPU and memory isolation is real, I'll give them that. The agent logs show the runtime's memory bounds holding steady even when we deliberately fed it a buggy, memory-gobbling test module. No escapes in our basic testing, either.
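For anyone who wants to check the arithmetic against their own logs, here's a minimal sketch that pulls the durations out of lines shaped like the snippet above and computes the delta. The regex and the helper names (`durations_by_runtime`, `overhead`) are mine, not part of our agent tooling, and it assumes your completion lines use the same `Tool completed (native|wasm). Duration: Nms` format.

```python
import re
from statistics import mean

# The two completed executions from the log snippet above.
LOG = """\
2024-05-15T14:22:01.123Z [Agent-7] Executing tool: resolve_hostname (target=evil.domain)
2024-05-15T14:22:01.279Z [Agent-7] Tool completed (native). Duration: 156ms
2024-05-15T14:23:15.456Z [Agent-7] Executing tool: resolve_hostname_wasm (target=evil.domain)
2024-05-15T14:23:15.678Z [Agent-7] Tool completed (wasm). Duration: 222ms
"""

# Assumed line format: "Tool completed (<runtime>). Duration: <N>ms"
COMPLETED = re.compile(r"Tool completed \((native|wasm)\)\. Duration: (\d+)ms")


def durations_by_runtime(log_text):
    """Group reported durations (in ms) by runtime label."""
    out = {"native": [], "wasm": []}
    for match in COMPLETED.finditer(log_text):
        out[match.group(1)].append(int(match.group(2)))
    return out


def overhead(native_ms, wasm_ms):
    """Return (absolute delta in ms, delta as a percentage of native)."""
    delta = wasm_ms - native_ms
    return delta, 100.0 * delta / native_ms


durations = durations_by_runtime(LOG)
delta_ms, pct = overhead(mean(durations["native"]), mean(durations["wasm"]))
print(f"{delta_ms:.0f}ms slower ({pct:.1f}%)")
```

One caveat if you run this over a full sample: a mean of per-tool percentages and a ratio of total durations can give different "average overhead" numbers, so state which one you're quoting.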
But here's my contrarian take: this is solving a problem most of us already solved with stricter subprocess permissions and resource limits. The real threat model for most agent tooling isn't a rogue tool exploiting a memory corruption bug in our own, vetted code—it's the tool being tricked into performing a harmful *authorized* action. WASM doesn't stop a tool from making a legitimate, but maliciously-induced, API call. That's still on our policy engine.
So we're trading a known, manageable risk (subprocess isolation) for a 15% latency tax and a whole new heap of complexity in our toolchain. I'm not convinced the math adds up for most generic agent tasks. Where it *might*? Running truly untrusted code from third-party "app stores" in your SOAR. But let's be honest, how many of us are doing that? This feels like a solution in search of a problem, sold to managers who see "sandbox" and think "magic security bubble."