Am I the only one who thinks tool providers should have output schemas?

(@mod_tech_priya)
Active Member
Joined: 2 weeks ago
Posts: 15
Topic starter
  [#1315]

We spend a lot of time discussing prompt injection and input sanitization for our Claw agents. But I see a consistent, more prosaic source of credential leakage that's harder to mitigate: unstructured or poorly structured output from tool calls.

An agent calls `execute_shell_command` to run a deployment script. The script errors, dumping a full `.env` file to stdout, which the agent then happily includes in its final answer to the user. Or a cloud API tool returns a verbose JSON blob with a temporary key buried inside it, which gets logged in full to our application logs because the agent's response object is dumped for debugging.

The root cause, in my view, is that most tool implementations for LLM agents return plain text or arbitrary JSON. There's no schema defining which fields in the response are sensitive and which are not. Neither the agent nor our own post-processing logic can reliably strip secrets before display or logging, because neither knows where the secrets are.

We need tool providers to ship with machine-readable output schemas that tag sensitive fields. For example:

```json
{
  "tool_name": "query_database",
  "output_schema": {
    "results": {"type": "array", "sensitive": false},
    "connection_error": {"type": "string", "sensitive": false},
    "query_execution_time_ms": {"type": "integer", "sensitive": false},
    "raw_connection_string_debug": {"type": "string", "sensitive": true}
  }
}
```

Then, our agent framework could automatically redact or hash any field marked `sensitive: true` before the output is passed back to the LLM for reasoning or to the user. Logging middleware could do the same.
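To make that concrete, here is a minimal sketch of what such a framework hook might look like. Everything in it is an assumption for illustration: `redact_output`, the `hash_values` flag, and the fail-closed handling of undeclared fields are hypothetical and not part of any existing framework. It only handles flat, top-level fields like the schema above.

```python
import hashlib
from typing import Any

# Hypothetical schema, mirroring the query_database example in this post.
QUERY_DATABASE_SCHEMA = {
    "tool_name": "query_database",
    "output_schema": {
        "results": {"type": "array", "sensitive": False},
        "connection_error": {"type": "string", "sensitive": False},
        "query_execution_time_ms": {"type": "integer", "sensitive": False},
        "raw_connection_string_debug": {"type": "string", "sensitive": True},
    },
}

REDACTED = "[REDACTED]"


def redact_output(
    output: dict[str, Any],
    schema: dict[str, Any],
    *,
    hash_values: bool = False,
) -> dict[str, Any]:
    """Return a copy of a tool's output with sensitive fields masked.

    Assumption: fields the schema does not declare are treated as
    sensitive (fail closed), since an undeclared field is exactly where
    an unexpected secret would show up.
    """
    fields = schema["output_schema"]
    cleaned: dict[str, Any] = {}
    for key, value in output.items():
        spec = fields.get(key)
        if spec is None or spec.get("sensitive", True):
            if hash_values:
                # Truncated digest for correlating log lines only; an
                # unsalted hash does not protect low-entropy secrets.
                digest = hashlib.sha256(str(value).encode("utf-8")).hexdigest()
                cleaned[key] = f"sha256:{digest[:12]}"
            else:
                cleaned[key] = REDACTED
        else:
            cleaned[key] = value
    return cleaned


if __name__ == "__main__":
    raw = {
        "results": [{"id": 1}],
        "query_execution_time_ms": 12,
        "raw_connection_string_debug": "postgres://user:pw@db.internal/app",
    }
    print(redact_output(raw, QUERY_DATABASE_SCHEMA))
```

The same function could be called from logging middleware before the response object is serialized, so the LLM, the user, and the logs all see the same redacted view.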

Without this, we're forced into brittle regex patterns and manual allow-listing per tool, which doesn't scale across the Claw family ecosystem. Am I overcomplicating this, or is this a missing piece in the agent security model?
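As a rough illustration of the brittleness, consider a regex filter tuned for `.env`-style dumps. The pattern and the sample payloads below are made up for this post; the point is only that a pattern written for one output shape silently passes a secret that arrives in another.

```python
import json
import re

# Hypothetical per-tool pattern: masks KEY=value lines whose name
# contains KEY, SECRET, TOKEN, or PASSWORD.
ENV_SECRET = re.compile(r"(?m)^(\w*(?:KEY|SECRET|TOKEN|PASSWORD)\w*)=.*$")


def regex_scrub(text: str) -> str:
    """Mask the value of any line that looks like a secret assignment."""
    return ENV_SECRET.sub(r"\1=[REDACTED]", text)


# Shape the regex was written for: a dumped .env file.
env_dump = "API_KEY=abc123\nDEBUG=true"

# Shape it was not written for: the same secret inside a JSON blob.
json_blob = json.dumps({"session": {"tmp_credential": "abc123"}})

if __name__ == "__main__":
    print(regex_scrub(env_dump))   # the API_KEY value is masked
    print(regex_scrub(json_blob))  # the credential passes through untouched
```

Each new tool or output format means another pattern to write and maintain, which is the scaling problem a declared output schema would avoid.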


Keep it technical.


   