I've been working on a problem many of you will recognize: allowing an agent to call an untrusted, user-supplied tool (like a custom calculator or a data fetcher) without letting it hammer an API or consume unlimited resources. My go-to solution has been to run these tools in a WebAssembly sandbox, and I just finished a core piece—a rate-limiting plugin implemented *inside* the WASM module itself.
The plugin is a simple token bucket, but the key is that it's compiled into the same WASM module as the tool logic. The host (our agent runtime) provides the system time and a few functions for persistent storage via WASM imports. The bucket state is maintained within the module's linear memory. This means the rate-limiting logic is inseparable from the tool; you can't call the tool without also invoking the limiter.
Here's the core of the `can_request` function from the plugin, written in Rust for `wasm32-wasi`:
```rust
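use wasm_bindgen::prelude::*;

/// Token-bucket state, stored in the module's linear memory.
#[wasm_bindgen]
pub struct TokenBucket {
    tokens: i32,
    capacity: i32,
    last_refill: u64, // timestamp of the last refill, in milliseconds
    refill_rate_per_sec: i32,
}

/// Returns 1 if the request is allowed, 0 if it is rate limited.
#[wasm_bindgen]
pub fn can_request(bucket_ptr: *mut TokenBucket, now_ms: u64) -> i32 {
    // A null pointer can never refer to a valid bucket, so deny the request.
    if bucket_ptr.is_null() {
        return 0;
    }
    // SAFETY: the caller must pass a pointer to a live, properly aligned `TokenBucket`.
    let bucket = unsafe { &mut *bucket_ptr };

    // Whole tokens earned since the last refill.
    let time_passed = now_ms.saturating_sub(bucket.last_refill);
    let refill_amount =
        (time_passed as f64 * (bucket.refill_rate_per_sec as f64 / 1000.0)) as i32;
    if refill_amount > 0 {
        // saturating_add: a long idle period must not overflow the counter.
        bucket.tokens = bucket.tokens.saturating_add(refill_amount).min(bucket.capacity);
        bucket.last_refill = now_ms;
    }

    if bucket.tokens > 0 {
        bucket.tokens -= 1;
        1 // true - request allowed
    } else {
        0 // false - rate limited
    }
}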
```
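The host calls this, passing the current timestamp. It treats the bucket as an opaque pointer and never needs to interpret its layout, although a host can always read and write a guest's linear memory, so this is a convenience and not a protection against the host.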
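To make the refill arithmetic concrete, here is a minimal native sketch of the same token-bucket logic, using a safe reference instead of a raw pointer so it runs outside WASM. The `trace` helper and the chosen capacity, rate, and timestamps are illustrative assumptions, not part of the plugin.

```rust
/// Same fields as the plugin's bucket, built natively for illustration.
struct TokenBucket {
    tokens: i32,
    capacity: i32,
    last_refill: u64,
    refill_rate_per_sec: i32,
}

/// Safe-reference variant of `can_request`; same arithmetic as the plugin.
fn can_request(bucket: &mut TokenBucket, now_ms: u64) -> bool {
    let time_passed = now_ms.saturating_sub(bucket.last_refill);
    let refill_amount =
        (time_passed as f64 * (bucket.refill_rate_per_sec as f64 / 1000.0)) as i32;
    if refill_amount > 0 {
        bucket.tokens = bucket.tokens.saturating_add(refill_amount).min(bucket.capacity);
        bucket.last_refill = now_ms;
    }
    if bucket.tokens > 0 {
        bucket.tokens -= 1;
        true
    } else {
        false
    }
}

/// Hypothetical helper: replay timestamps against a fresh bucket
/// (capacity 2, refilling 1 token per second) and record each decision.
fn trace(timestamps_ms: &[u64]) -> Vec<bool> {
    let mut bucket = TokenBucket {
        tokens: 2,
        capacity: 2,
        last_refill: 0,
        refill_rate_per_sec: 1,
    };
    timestamps_ms
        .iter()
        .map(|&t| can_request(&mut bucket, t))
        .collect()
}
```

Calling `trace(&[0, 100, 200, 1200, 1300])` yields `[true, true, false, true, false]`: the first two calls drain the initial burst, the third is rejected, the fourth succeeds because 1.2 seconds have refilled one token, and the fifth is rejected again.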
My question to the community is about the boundaries of this model. This is great for controlling calls *from* the sandboxed tool to the outside world (via host calls). But what about the other direction? If the host is compromised, or if there's a flaw in the host's implementation of the imports, the sandbox's integrity can break from the outside-in. I'm curious about real-world escape research you've seen, especially concerning WASI or custom import spaces. Where are we genuinely improving security with this pattern, and where might we just be adding complexity without a true isolation benefit?
- Tina
Stay sharp.
Interesting approach. However, embedding the rate limiter state within the guest's linear memory means the host must implicitly trust the module to correctly maintain and report its own state. A determined module could corrupt the `TokenBucket` struct in memory to bypass the limit.
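To illustrate the concern, here is a minimal native sketch with a simplified bucket (no refill). The `tamper` function is hypothetical: it stands in for any code inside the module that can write to the same linear memory as the limiter.

```rust
/// Simplified bucket for the demo: no time-based refill.
struct TokenBucket {
    tokens: i32,
    capacity: i32,
}

/// The limiter as the module author intended it to be used.
fn can_request(bucket: &mut TokenBucket) -> bool {
    if bucket.tokens > 0 {
        bucket.tokens -= 1;
        true
    } else {
        false
    }
}

/// Hypothetical malicious tool code: it shares memory with the limiter,
/// so it can simply overwrite the counter.
fn tamper(bucket_ptr: *mut TokenBucket) {
    // SAFETY (for this demo): the pointer comes from a live `&mut TokenBucket`.
    unsafe {
        (*bucket_ptr).tokens = (*bucket_ptr).capacity;
    }
}

/// Returns the limiter's decisions before and after tampering.
fn demo_bypass() -> (bool, bool, bool) {
    let mut bucket = TokenBucket { tokens: 1, capacity: 1 };
    let first = can_request(&mut bucket); // uses the only token
    let second = can_request(&mut bucket); // bucket is empty
    tamper(&mut bucket); // guest refills its own bucket
    let third = can_request(&mut bucket); // allowed again
    (first, second, third)
}
```

`demo_bypass()` returns `(true, false, true)`: the limiter correctly rejects the second call, but after the in-memory overwrite the third call goes through.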
The host-side imports for time and storage help, but ultimate enforcement still depends on guest integrity. For a true adversarial model, consider moving the bucket state to the host side and having the guest call a host-provided "consume_token" import. The guest logic then becomes a conditional wrapper around that host-controlled primitive.
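A minimal sketch of that split, modeled natively so it runs without a WASM runtime. All names here (`HostBucket`, `guarded_tool`, `demo_host_enforced`) are hypothetical, and the closure passed to the guest stands in for a real host import; the wiring in an actual runtime will look different.

```rust
/// Host-side state: lives outside the guest's linear memory.
struct HostBucket {
    tokens: u32,
    capacity: u32,
    refill_per_sec: u32,
    last_refill_ms: u64,
}

impl HostBucket {
    /// Starts with a full bucket.
    fn new(capacity: u32, refill_per_sec: u32, now_ms: u64) -> Self {
        HostBucket { tokens: capacity, capacity, refill_per_sec, last_refill_ms: now_ms }
    }

    /// What the host would run when the guest invokes the import.
    /// `now_ms` comes from the host's own clock, never from the guest.
    fn consume_token(&mut self, now_ms: u64) -> bool {
        let elapsed = now_ms.saturating_sub(self.last_refill_ms);
        let refill = elapsed.saturating_mul(self.refill_per_sec as u64) / 1000;
        if refill > 0 {
            let refilled = (self.tokens as u64)
                .saturating_add(refill)
                .min(self.capacity as u64);
            self.tokens = refilled as u32;
            self.last_refill_ms = now_ms;
        }
        if self.tokens > 0 {
            self.tokens -= 1;
            true
        } else {
            false
        }
    }
}

/// Guest-side logic: a conditional wrapper around the host primitive.
/// The "tool" here just doubles its input.
fn guarded_tool(consume_token: &mut dyn FnMut() -> bool, input: i32) -> Option<i32> {
    if consume_token() {
        Some(input * 2)
    } else {
        None
    }
}

/// Drives three guest calls at host-chosen timestamps.
fn demo_host_enforced() -> Vec<Option<i32>> {
    let mut bucket = HostBucket::new(1, 1, 0);
    let mut results = Vec::new();
    for (now_ms, input) in [(0u64, 10), (100, 20), (1500, 30)] {
        let mut import = || bucket.consume_token(now_ms);
        results.push(guarded_tool(&mut import, input));
    }
    results
}
```

`demo_host_enforced()` returns `[Some(20), None, Some(60)]`. The guest never sees or touches the counter, so it cannot refill it. A guest that skips the wrapper only stops limiting itself, so in practice the host would likely apply the same check inside each privileged import it exposes.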
--Ray