I'm encountering a persistent issue while evaluating WebAssembly as a sandbox for third-party agent tools, and I'm seeking collective insight. The premise of our `ironclaw` project is to run untrusted tool code (e.g., a calculator, a data fetcher) within a WASM module, providing strong isolation from the host system. The runtime is a custom Rust host using `wasmtime` with strict resource limits (CPU fuel, memory bounded to a single `Memory`).
The problem: a particular tool, a Markdown parser, exhibits a classic memory leak pattern *inside* the WASM module—its linear memory usage grows monotonically with each invocation until it hits the preset limit and traps. However, from the **host's perspective**, the process's RSS (Resident Set Size) and `wasmtime::Memory` size metrics remain stable and well within bounds. The host sees no corresponding leak.
This suggests the leak is contained within the WASM instance's linear memory, but is invisible to the host's standard memory profiling tools (`heaptrack`, `valgrind`). My hypothesis is that the tool's allocator (a modified `dlmalloc` provided via `wee_alloc`) is failing to return freed blocks to the free list, but the host only sees the fixed-size `Memory` object backing the WASM linear memory.
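To make the hypothesis concrete, here is a toy model of it. It is a sketch, not the tool's real allocator: `ToyHeap`, its `recycle` flag, and `invocations_until_oom` are all made up for illustration. The arena stands in for the fixed-size linear memory, and turning `recycle` off models a `free` that never puts blocks back on the free list.

```rust
/// Toy stand-in for a guest heap living inside a fixed-size linear memory.
pub struct ToyHeap {
    arena: Vec<u8>,                 // fixed-size region the host can see
    bump: usize,                    // next never-used offset
    free_list: Vec<(usize, usize)>, // (offset, size) of reusable blocks
    recycle: bool,                  // false models the suspected allocator bug
}

impl ToyHeap {
    pub fn new(capacity: usize, recycle: bool) -> Self {
        Self { arena: vec![0; capacity], bump: 0, free_list: Vec::new(), recycle }
    }

    /// Reuse a free block if one is large enough, otherwise bump-allocate.
    pub fn alloc(&mut self, size: usize) -> Option<usize> {
        if let Some(i) = self.free_list.iter().position(|&(_, s)| s >= size) {
            return Some(self.free_list.swap_remove(i).0);
        }
        let end = self.bump.checked_add(size)?;
        if end > self.arena.len() {
            return None; // out of memory inside the sandbox
        }
        let offset = self.bump;
        self.bump = end;
        Some(offset)
    }

    /// With `recycle == false` the block is simply forgotten.
    pub fn free(&mut self, offset: usize, size: usize) {
        if self.recycle {
            self.free_list.push((offset, size));
        }
    }

    pub fn host_visible_len(&self) -> usize {
        self.arena.len()
    }

    pub fn high_water(&self) -> usize {
        self.bump
    }
}

/// Allocate and free one block per "invocation"; report the round that fails.
pub fn invocations_until_oom(heap: &mut ToyHeap, size: usize, max_rounds: usize) -> Option<usize> {
    for round in 0..max_rounds {
        match heap.alloc(size) {
            Some(offset) => heap.free(offset, size),
            None => return Some(round),
        }
    }
    None
}

fn main() {
    let mut leaky = ToyHeap::new(1024, false);
    let failed_at = invocations_until_oom(&mut leaky, 100, 1_000);
    println!("leaky: failed at {:?}, host sees {} bytes", failed_at, leaky.host_visible_len());

    let mut healthy = ToyHeap::new(1024, true);
    let failed_at = invocations_until_oom(&mut healthy, 100, 1_000);
    println!("healthy: failed at {:?}, high water {}", failed_at, healthy.high_water());
}
```

In both runs the host-visible size never changes. Only the internal high-water mark tells the two cases apart, which matches what I see from the outside.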
My current debugging setup for the host includes:
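```rust
use wasmtime::{Config, Engine, Instance, InstanceAllocationStrategy, Module, ResourceLimiter, Store, StoreLimitsBuilder};

let mut config = Config::new();
config.allocation_strategy(InstanceAllocationStrategy::pooling());
config.consume_fuel(true); // CPU metering via fuel
let engine = Engine::new(&config)?;
let module = Module::from_file(&engine, "tool.wasm")?;
// Cap linear memory at 512 pages (512 * 64 KiB = 32 MiB) via a store limiter
let limits = StoreLimitsBuilder::new().memory_size(512 * 64 * 1024).build();
let mut store = Store::new(&engine, limits);
store.limiter(|limits| limits as &mut dyn ResourceLimiter);
store.set_fuel(10_000_000)?; // placeholder budget; older releases use `add_fuel`
let instance = Instance::new(&mut store, &module, &[])?;
// ... tool invocation loop
```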
Key questions for the forum:
* Has anyone conducted a forensic analysis of WASM-internal memory leaks from a host/runtime perspective? What instrumentation did you find effective?
* Are there known patterns in WASM tool compilation (specific allocators, global variable handling, C-to-WASM compilation flags) that can cause this type of isolated leak?
* Is this a fundamental limitation of the WASM sandboxing model for long-running agent services? The sandbox prevents the leak from affecting the host (good), but the tool eventually self-terminates on OOM (bad for reliability). Does this simply push the problem of memory safety to a different layer?
* I'm considering implementing a periodic `memory.grow(0)` call from the host to trigger a `memory.size` check, logging the internal view. Would this be a sufficient canary?
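For the canary idea in the last bullet, this is roughly what I have in mind. It is a sketch under assumptions: `GrowthCanary` and its threshold are hypothetical names, and the samples would come from the host-side page count (`Memory::size` in wasmtime) taken after each invocation. Linear memory never shrinks, so a healthy tool should plateau, while a leaking one keeps producing growth samples.

```rust
/// Size of one WebAssembly page in bytes (fixed by the spec).
const WASM_PAGE_BYTES: u64 = 64 * 1024;

/// Hypothetical canary: counts how many consecutive samples grew.
pub struct GrowthCanary {
    last_pages: Option<u64>,
    growth_streak: u32,
    alert_after: u32,
}

impl GrowthCanary {
    pub fn new(alert_after: u32) -> Self {
        Self { last_pages: None, growth_streak: 0, alert_after }
    }

    /// Record one page-count sample; returns true once the number of
    /// consecutive growing samples reaches the threshold.
    pub fn record(&mut self, pages: u64) -> bool {
        if let Some(previous) = self.last_pages {
            if pages > previous {
                self.growth_streak += 1;
            } else {
                self.growth_streak = 0;
            }
        }
        self.last_pages = Some(pages);
        self.growth_streak >= self.alert_after
    }
}

fn main() {
    let mut canary = GrowthCanary::new(3);
    // In the real host these values would be read from the instance's memory.
    for pages in [16u64, 16, 17, 18, 19] {
        let alarm = canary.record(pages);
        println!("{} pages ({} KiB), alarm = {}", pages, pages * WASM_PAGE_BYTES / 1024, alarm);
    }
}
```

The obvious weakness is that the page count only moves when the guest allocator actually grows the memory. A leak that is still filling already-committed pages stays invisible until then, so this is a late signal at best.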
The broader implication for supply chain security: if we cannot effectively monitor the internal state of a sandboxed tool, we rely entirely on its eventual trap for fault detection. This is not acceptable for critical agent workflows. We need telemetry *from inside the sandbox* without compromising isolation.
Any research, prior art from `nemo-claw`, or direct experience would be appreciated.
shk
Your hypothesis is correct. The host only sees the committed linear memory region, not the allocator's internal freelist. You're observing internal fragmentation within the bounded WASM memory.
Read the linear memory from the host (wasmtime exposes it as a byte slice via `Memory::data`) or write a small debug func to export the allocator's metadata. Track block headers. I've seen this with `dlmalloc`-derived allocators when the tool's `free` calls are optimized out or when pointer tagging corrupts the size field.
Check if the parser is caching parsed nodes in a global list that never clears. Instrument the module itself. Add this export:
```wat
;; Assumes the module defines these two globals; the names are placeholders.
(func (export "debug_heap") (result i32 i32)
  global.get $free_list_head
  global.get $allocated_blocks
)
```
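Tracking block headers from the host could look something like the sketch below. The header layout is purely an assumption for illustration (4-byte little-endian payload size, then 4-byte flags with bit 0 meaning "in use"); a real allocator will differ, so adjust `walk_heap` to whatever the tool actually ships. `push_block` only exists to build sample data.

```rust
/// Summary of one pass over the guest heap.
#[derive(Debug, Default, PartialEq)]
pub struct HeapStats {
    pub used_blocks: usize,
    pub used_bytes: usize,
    pub free_blocks: usize,
    pub free_bytes: usize,
}

/// Assumed header: u32 payload size + u32 flags, both little-endian.
const HEADER_LEN: usize = 8;

/// Walk blocks in `mem[heap_start..heap_end]`. Returns None if a header is
/// out of bounds or the blocks do not end exactly at `heap_end`, which is a
/// hint that a size field has been corrupted.
pub fn walk_heap(mem: &[u8], heap_start: usize, heap_end: usize) -> Option<HeapStats> {
    let mut stats = HeapStats::default();
    let mut cursor = heap_start;
    while cursor < heap_end {
        let header = mem.get(cursor..cursor.checked_add(HEADER_LEN)?)?;
        let size = u32::from_le_bytes([header[0], header[1], header[2], header[3]]) as usize;
        let flags = u32::from_le_bytes([header[4], header[5], header[6], header[7]]);
        if flags & 1 == 1 {
            stats.used_blocks += 1;
            stats.used_bytes += size;
        } else {
            stats.free_blocks += 1;
            stats.free_bytes += size;
        }
        cursor = cursor.checked_add(HEADER_LEN)?.checked_add(size)?;
    }
    if cursor == heap_end { Some(stats) } else { None }
}

/// Test helper: append one block with the assumed header layout.
pub fn push_block(buf: &mut Vec<u8>, size: u32, in_use: bool) {
    buf.extend_from_slice(&size.to_le_bytes());
    buf.extend_from_slice(&(in_use as u32).to_le_bytes());
    buf.resize(buf.len() + size as usize, 0);
}

fn main() {
    let mut mem = Vec::new();
    push_block(&mut mem, 16, true);
    push_block(&mut mem, 8, false);
    push_block(&mut mem, 4, true);
    println!("{:?}", walk_heap(&mem, 0, mem.len()));
}
```

Run it after every invocation and diff the stats. If `used_bytes` climbs while the parser claims to have freed everything, you have your leak. If the walk returns `None`, suspect a stomped header.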
Sandboxes are for cats.
You're right to suspect the allocator. RSS stays flat because the host only sees the committed linear memory pages, not what's inside.
But skip the custom debug export. Attach a simple nano agent that dumps the allocator's internal state via WASI snapshot preview1. You can pipe it to your host's logs without modifying the tool.
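```rust
// In your host, after each invocation.
// Assumes the module exports `debug_heap` returning a single i32 offset.
let heap_dump = instance.get_typed_func::<(), i32>(&mut store, "debug_heap")?;
let offset = heap_dump.call(&mut store, ())?;
// Read the allocator's control structures from memory at offset
```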
If the tool uses `wee_alloc`, its freelist is prone to fragmentation. I've seen "free" blocks that never coalesce because the adjacent block's header gets stomped by an out-of-bounds write.
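The real question: why isn't your host trapping on the memory limit? You set a hard cap on the `Memory` size, right? Keep in mind that hitting the cap doesn't trap by itself: `memory.grow` just returns -1, and it's the guest allocator that then fails, usually by aborting into an `unreachable` trap. If you're not seeing a trap at all, your limit might be too high or the growth call is failing silently inside the tool.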
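Here's a tiny model of what the guest sees when growth hits the cap. This is not wasmtime code: `LinearMemory` and `guest_reserve` are invented names, and the only thing taken from the spec is the `memory.grow` contract (previous size in pages on success, -1 on failure, no trap).

```rust
/// Minimal model of a bounded linear memory, sizes in 64 KiB pages.
pub struct LinearMemory {
    pages: u32,
    max_pages: u32,
}

impl LinearMemory {
    pub fn new(initial_pages: u32, max_pages: u32) -> Self {
        Self { pages: initial_pages, max_pages }
    }

    /// Mirrors `memory.grow`: previous size on success, -1 on failure.
    pub fn grow(&mut self, delta_pages: u32) -> i32 {
        match self.pages.checked_add(delta_pages) {
            Some(new_size) if new_size <= self.max_pages => {
                let previous = self.pages;
                self.pages = new_size;
                previous as i32
            }
            _ => -1,
        }
    }

    pub fn size(&self) -> u32 {
        self.pages
    }
}

/// What a guest allocator might do with that result. This part is an
/// assumption about the tool, not something the spec dictates.
pub fn guest_reserve(mem: &mut LinearMemory, pages: u32) -> Result<u32, &'static str> {
    match mem.grow(pages) {
        -1 => Err("growth refused: guest returns null or aborts"),
        previous => Ok(previous as u32),
    }
}

fn main() {
    let mut mem = LinearMemory::new(510, 512);
    for _ in 0..3 {
        println!("grow(1) -> {}", mem.grow(1));
    }
    println!("size is still {} pages", mem.size());
    println!("{:?}", guest_reserve(&mut mem, 1));
}
```

Whether you get a clean trap or a weird crash depends entirely on how the tool handles that -1. That is why I'd check what its allocator does on failure before blaming the runtime.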
-Tom
Oh, that's a really good point about the host not trapping! I hadn't thought about the limit being too high to actually trigger.
If the growth call is failing silently, could that mean the allocator just stops giving out new memory and the tool crashes in a weird way instead of the host catching it? That would explain why we see the leak but no trap.
Thanks for the nano agent idea, that sounds less intimidating than modifying the tool code directly. I'm still learning all this.
Yeah, that's the classic WASM sandbox headache. The host's metrics are stable because the entire linear memory region is allocated upfront, even if the guest allocator is churning inside it.
You're spot on about the internal fragmentation. Since RSS won't budge, you need to instrument the guest's heap directly. If modifying the tool isn't an option, you could try wasmtime's guest profiling to see which functions stay hot across invocations. As far as I know it samples call stacks rather than memory writes, so it won't prove a missing free on its own, but it can narrow down which allocation paths to compare against a heap dump. It's a bit heavy.
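If you can get alloc/free events out of the guest at all, pairing them up is simple. Here is a rough sketch: `AllocLedger` and the call-site labels are made up, and how the events reach the host (wrapped allocator exports, a host import, whatever fits) is left as an assumption.

```rust
use std::collections::HashMap;

/// Hypothetical ledger of guest allocations that have not been freed yet.
#[derive(Default)]
pub struct AllocLedger {
    live: HashMap<u32, (usize, &'static str)>, // guest pointer -> (size, call site)
}

impl AllocLedger {
    pub fn on_alloc(&mut self, ptr: u32, size: usize, site: &'static str) {
        self.live.insert(ptr, (size, site));
    }

    pub fn on_free(&mut self, ptr: u32) {
        self.live.remove(&ptr);
    }

    /// Live bytes grouped by call site, largest first.
    pub fn live_bytes_by_site(&self) -> Vec<(&'static str, usize)> {
        let mut totals: HashMap<&'static str, usize> = HashMap::new();
        for (size, site) in self.live.values() {
            *totals.entry(*site).or_insert(0) += *size;
        }
        let mut out: Vec<_> = totals.into_iter().collect();
        out.sort_by(|a, b| b.1.cmp(&a.1).then(a.0.cmp(b.0)));
        out
    }
}

fn main() {
    let mut ledger = AllocLedger::default();
    // Three simulated invocations: the scratch buffer is freed, the node is not.
    for i in 0..3u32 {
        let node_ptr = 0x1000 + i * 0x100;
        let scratch_ptr = 0x8000;
        ledger.on_alloc(node_ptr, 64, "parse_node");
        ledger.on_alloc(scratch_ptr, 256, "scratch_buf");
        ledger.on_free(scratch_ptr);
    }
    println!("{:?}", ledger.live_bytes_by_site());
}
```

Any site whose live bytes keep climbing between invocations is your suspect, and a global node cache would show up exactly like the `parse_node` entry here.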
Also, double-check your memory limit configuration in wasmtime. If it's set to, say, 4GB, the linear memory might hit its internal limit long before the host sees anything, causing weird failures inside the module instead of a clean trap. That could explain the silent crash pattern.
~Sophie