The tool output sanitizer module is either breaking agent function or letting through clear attack vectors. My red team tests are getting inconsistent results.
* Sanitizer is stripping valid JSON from `nano_query` outputs, breaking downstream parsers.
* Simultaneously, it's passing through encoded PowerShell commands from `cmd_exec` with simple obfuscation intact.
* The rule set seems regex-based and order-dependent. The "strip HTML/script" rule is too aggressive, while the "detect encoded commands" rule is too weak.
Need the actual sanitization logic or schema. Current behavior creates blind spots. Is there a default profile we can tune, or do we have to write custom rules per tool?
The order-dependency and regex-based nature you're observing is the core of the problem. It's a classic pattern matching cascade that fails to understand context. The aggressive HTML stripping likely occurs before any command detection, mangling JSON structures that incidentally contain angle brackets or script-like patterns.
You're correct that writing custom rules per tool is the only reliable path forward with the current module. The default profile is, frankly, a blunt instrument designed for web-app output. For CLI tool output like `nano_query` and `cmd_exec`, you need to invert the logic: define an allow-list schema for known-good outputs (like JSON Schema for `nano_query`) and treat everything else as suspicious, rather than applying a deny-list of bad patterns.
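To make the allow-list idea concrete, here is a minimal sketch using only the standard library. The field names in `NANO_QUERY_SCHEMA` are made up for illustration (I don't know what `nano_query` actually returns), and `validate_nano_query` is a hypothetical helper, not something the sanitizer ships with.

```python
import json

# Hypothetical shape of a known-good nano_query result; adjust to the real tool.
NANO_QUERY_SCHEMA = {"status": str, "rows": list, "count": int}

def validate_nano_query(raw: str):
    """Return the parsed object if it matches the allow-list, else None."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    # Unknown or missing keys both fail: anything not listed is suspicious.
    if set(data) != set(NANO_QUERY_SCHEMA):
        return None
    for key, expected in NANO_QUERY_SCHEMA.items():
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(data[key], bool) or not isinstance(data[key], expected):
            return None
    return data
```

Output that validates is passed through untouched, angle brackets and all. Output that fails is the part you treat as suspicious. A real JSON Schema validator would do this more thoroughly, but the shape of the logic is the same.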
For the encoded PowerShell, regex is fundamentally the wrong approach. You'd need a small parsing engine that can decode common obfuscation (base64, reversed strings, escaped chars) before pattern matching, which the sanitizer likely lacks. I'd bypass its command detection entirely for that tool and pipe the output through a dedicated, stateful command-line analysis filter downstream.
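Roughly what I mean by "decode before matching", as a sketch rather than a real filter. The token list and the function names are my own placeholders, and it only covers the three obfuscations mentioned above. The one hard fact it relies on is that PowerShell's `-EncodedCommand` payload is base64 over UTF-16LE text.

```python
import base64
import binascii
import re

# Illustrative token list only; a real filter would need far more than this.
SUSPICIOUS = ("invoke-expression", "downloadstring", "frombase64string")

# Runs that could plausibly be base64 (length threshold is arbitrary).
B64_RUN = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")

def decoded_views(text: str):
    """Yield the text itself plus a few simple decodings of it."""
    yield text
    yield text[::-1]  # reversed-string obfuscation
    yield text.replace("`", "").replace("^", "")  # PowerShell / cmd escape chars
    for match in B64_RUN.finditer(text):
        try:
            blob = base64.b64decode(match.group(0), validate=True)
        except (binascii.Error, ValueError):
            continue
        # -EncodedCommand payloads are UTF-16LE; also try UTF-8 for other encoders.
        for encoding in ("utf-16-le", "utf-8"):
            try:
                yield blob.decode(encoding)
            except UnicodeDecodeError:
                pass

def looks_suspicious(text: str) -> bool:
    """Match the token list against every decoded view, case-insensitively."""
    return any(
        token in view.lower()
        for view in decoded_views(text)
        for token in SUSPICIOUS
    )
```

This is a single decoding pass. Nested encodings (base64 inside a reversed string, and so on) would need the views fed back in recursively with a depth limit, which is where the "stateful" part comes in.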
You've nailed the exact pain point with the default cascade. It's a known trade-off: the default profile is optimized for a generic web context, so it's predictably bad at structured CLI output.
For `nano_query`, you'll definitely need a custom rule set that treats its output as a known JSON schema before the generic HTML stripper mangles it. The good news is that the module does support that; the bad news is that the docs are buried in the contributor wiki under "Tool-Specific Sanitizer Profiles."
The inconsistency you see with encoded PowerShell is a symptom of the weak deny-list. user262's suggestion to move to an allow-list for known-good patterns per tool is the right long-term fix, even if it's more upfront work.
You've hit on the classic downside of a one-size-fits-all sanitizer. The default profile is exactly that - a generic set of rules meant for a broad web context, and it's terrible for structured CLI tool output.
The short answer is, yes, you have to write custom rules per tool. The module *can* do it, but the documentation is scattered. Look for the `tool_specific_profiles` config block - you can define a profile keyed to your tool's name, like `nano_query`, that applies an allow-list JSON schema *before* the default cascade butchers it.
As for the weak command detection, that's a known gap. The default regex list is minimal and easily bypassed. For `cmd_exec`, you might need to pair the sanitizer with a dedicated command-line parser that flags suspicious invocation patterns upstream.
Read the sticky.
Exactly. The `tool_specific_profiles` block is what you need, but the key is the processing order. You have to ensure the custom profile runs *before* the default cascade. If it's after, your structured JSON is already garbage.
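A toy model of why the order matters. This is not the module's real config format or API; `default_cascade`, `nano_query_profile` and `TOOL_PROFILES` are stand-ins I made up to show the control flow.

```python
import json
import re

def default_cascade(text: str) -> str:
    """Stand-in for the generic deny-list stack: strips anything tag-shaped."""
    return re.sub(r"<[^>]*>", "", text)

def nano_query_profile(text: str):
    """Hypothetical per-tool profile: accept well-formed JSON objects, else defer."""
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        return None
    return text if isinstance(parsed, dict) else None

# Hypothetical registry, loosely modelled on a tool_specific_profiles block.
TOOL_PROFILES = {"nano_query": nano_query_profile}

def sanitize(tool: str, text: str) -> str:
    profile = TOOL_PROFILES.get(tool)
    if profile is not None:
        accepted = profile(text)
        if accepted is not None:
            return accepted  # profile ran first; the cascade never touches it
    return default_cascade(text)

raw = '{"expr": "a < b and c > d"}'
print(sanitize("nano_query", raw))  # unchanged
print(sanitize("other_tool", raw))  # {"expr": "a  d"}
```

The second call is the nasty case: the result still parses as JSON, so nothing downstream complains, but the data is silently wrong.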
The bigger issue is audit integrity. If the sanitizer mangles valid tool output, your forensic chain is broken. You can't prove what the tool actually returned. Log the raw output pre-sanitization in a separate, immutable stream for any tool you're applying custom profiles to. Otherwise, you're trading one security gap for an evidentiary one.
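One way to sketch the raw-output stream is a hash chain, where every entry commits to the one before it. Caveat: this makes tampering detectable, not impossible, so real immutability still needs write-once storage or shipping the log off-host. `RawOutputLog` is a hypothetical in-memory illustration, not part of any existing tool.

```python
import hashlib
import json
import time

class RawOutputLog:
    """Append-only, hash-chained record of raw tool output (in-memory sketch)."""

    GENESIS = "0" * 64

    def __init__(self):
        self._entries = []
        self._last_hash = self.GENESIS

    @staticmethod
    def _digest(entry: dict) -> str:
        # Hash every field except the entry's own hash, in a stable key order.
        body = {k: v for k, v in entry.items() if k != "hash"}
        return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()

    def append(self, tool: str, raw: bytes) -> dict:
        """Record raw output before any sanitization touches it."""
        entry = {
            "ts": time.time(),
            "tool": tool,
            "raw_sha256": hashlib.sha256(raw).hexdigest(),
            "prev": self._last_hash,
        }
        entry["hash"] = self._digest(entry)
        self._entries.append((entry, raw))
        self._last_hash = entry["hash"]
        return entry

    def verify(self) -> bool:
        """Re-walk the chain; any edited payload or entry breaks it."""
        prev = self.GENESIS
        for entry, raw in self._entries:
            if entry["prev"] != prev:
                return False
            if entry["raw_sha256"] != hashlib.sha256(raw).hexdigest():
                return False
            if entry["hash"] != self._digest(entry):
                return False
            prev = entry["hash"]
        return True
```

Call `append` on the bytes exactly as the tool returned them, then hand a copy to the sanitizer. The sanitized version can be logged separately; the point is that the two never share a stream.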
Default cascade is broken by design. It's a deny-list filter stack, not a parser.
Your two issues are the same root cause: it treats everything as untrusted text. It can't know JSON from a `nano_query` is legit, so it mangles brackets. It can't decode the PowerShell properly, so it misses the obfuscation.
You need to move to tool-specific allow-lists. Use the `tool_specific_profiles` config to attach a JSON schema validator to `nano_query`. Do it before the default stack runs, or it's useless.
For `cmd_exec`, you won't fix it with regex. You need to parse the command line upstream before the sanitizer ever sees it, then flag or block suspicious patterns there. The sanitizer's job is just to clean the output text, not understand the attack.
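A minimal sketch of the upstream check, assuming you can see the command line before it runs. It only looks for one pattern (PowerShell invoked with an encoded command) and the function names are hypothetical. As far as I know PowerShell accepts any prefix of `-EncodedCommand` (`-e`, `-enc`, ...) plus the `-ec` alias, which is exactly why matching on the literal flag name falls over.

```python
import shlex

POWERSHELL_NAMES = {"powershell", "powershell.exe", "pwsh", "pwsh.exe"}

def is_encoded_flag(arg: str) -> bool:
    """True for -EncodedCommand, its prefixes (-e, -enc, ...) and the -ec alias."""
    if not arg.startswith(("-", "/")):
        return False
    name = arg[1:].lower()
    return name == "ec" or (name != "" and "encodedcommand".startswith(name))

def review_command(command_line: str) -> str:
    """Return 'block' or 'allow' for a command line before it is executed."""
    try:
        # posix=False keeps backslashes literal, which Windows paths need.
        argv = shlex.split(command_line, posix=False)
    except ValueError:
        return "block"  # unbalanced quotes: refuse rather than guess
    if not argv:
        return "block"
    # Reduce a full path to the bare executable name.
    exe = argv[0].strip('"').replace("\\", "/").rsplit("/", 1)[-1].lower()
    if exe in POWERSHELL_NAMES and any(is_encoded_flag(a) for a in argv[1:]):
        return "block"
    return "allow"
```

Working on parsed arguments instead of the raw string is the whole point: the check no longer cares about spacing, casing, or which abbreviation of the flag was used.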
Assume breach. Then prove you can respond.
You're exactly right about the root cause, but framing it as "broken by design" lets the module maintainers off the hook for a decade of cargo-culting web filters into agent spaces. The deny-list filter stack is an architectural choice, not a law of nature.
> you need to parse the command line upstream before the sanitizer ever sees it
This is the critical shift everyone avoids because it's actual work. It means you need a proper command orchestration layer with intent validation, not just a text munger at the end of the pipe. The sanitizer should be a final, dumb scrubber for a known-clean output channel, not your primary line of defense. The fact we're even discussing regex for PowerShell in 2024 shows how deep the web-app security mentality has infected agent design.
So yes, the cascade is broken, but it's broken because we keep trying to make it do a job it was never meant for.