<?xml version="1.0" encoding="UTF-8"?>        <rss version="2.0"
             xmlns:atom="http://www.w3.org/2005/Atom"
             xmlns:dc="http://purl.org/dc/elements/1.1/"
             xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
             xmlns:admin="http://webns.net/mvcb/"
             xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:content="http://purl.org/rss/1.0/modules/content/">
        <channel>
            <title>
									Anthropic Agent SDK Security Surface - openclawsecurity.net Forum				            </title>
            <link>https://openclawsecurity.net/community/nanoclaw-anthropic-sdk-security/</link>
            <description>openclawsecurity.net Discussion Board</description>
            <language>en-US</language>
            <lastBuildDate>Tue, 30 Jun 2026 12:10:42 +0000</lastBuildDate>
            <generator>wpForo</generator>
            <ttl>60</ttl>
							                    <item>
                        <title>Showcase: My custom permission layer that sits between the SDK and my tools.</title>
                        <link>https://openclawsecurity.net/community/nanoclaw-anthropic-sdk-security/showcase-my-custom-permission-layer-that-sits-between-the-sdk-and-my-tools/</link>
                        <pubDate>Sun, 28 Jun 2026 12:00:23 +0000</pubDate>
                        <description><![CDATA[I&#039;ve been working with the Anthropic Agent SDK for a few weeks now, and while I appreciate the tool-calling abstraction, I wanted more granular control over what my local tools can be asked ...]]></description>
                        <content:encoded><![CDATA[I've been working with the Anthropic Agent SDK for a few weeks now, and while I appreciate the tool-calling abstraction, I wanted more granular control over what my local tools can be asked to do. The SDK's permission system is a good start, but I found myself wanting a declarative, context-aware layer that sits between the agent's request and the actual tool execution.

My solution is a lightweight `PermissionGate` class that wraps my tool functions. It checks the tool name, parameters, and even the conversation context against a set of rules defined in a simple YAML configuration. This means I can have a single "file_write" tool, but the gate can allow or deny based on the target path, file extension, or whether the user session has been elevated.

Here's the core of it:

```python
class PermissionGate:
    def __init__(self, rules_path):
        self.rules = self._load_rules(rules_path)

    def __call__(self, tool_name, tool_args, user_context):
        for rule in self.rules.get(tool_name, []):
            if self._evaluate_rule(rule, tool_args, user_context):
                return rule.get('action', 'allow') == 'allow'
        # Default deny
        return False

    def wrap_tool(self, func):
        def wrapped(**kwargs):
            # user_context derived from Flask's g or session
            if not self(func.__name__, kwargs, g.user_context):
                raise PermissionError(f"Operation not permitted for {func.__name__}")
            return func(**kwargs)
        return wrapped
```

I then decorate my tools with `@gate.wrap_tool`. The rules file lets me specify things like: "allow database_query only if the query is read-only and the user is authenticated" or "allow send_email only to specific domains and with a rate limit". The key for me is that this logic is entirely local, outside the SDK's and Anthropic's view. The agent just sees a tool call succeeded or failed.

I'm curious how others are handling this. Are you relying solely on the SDK's built-in permissions, or have you built additional layers? For those with more complex local toolkits, how are you managing the authorization flow?]]></content:encoded>
						                            <category domain="https://openclawsecurity.net/community/nanoclaw-anthropic-sdk-security/">Anthropic Agent SDK Security Surface</category>                        <dc:creator>Sophie B.</dc:creator>
                        <guid isPermaLink="true">https://openclawsecurity.net/community/nanoclaw-anthropic-sdk-security/showcase-my-custom-permission-layer-that-sits-between-the-sdk-and-my-tools/</guid>
                    </item>
				                    <item>
                        <title>What tools are you absolutely *not* exposing to the agent, and why?</title>
                        <link>https://openclawsecurity.net/community/nanoclaw-anthropic-sdk-security/what-tools-are-you-absolutely-not-exposing-to-the-agent-and-why/</link>
                        <pubDate>Sun, 28 Jun 2026 05:00:17 +0000</pubDate>
                        <description><![CDATA[The Anthropic Agent SDK fundamentally re-frames the traditional API security model by introducing an autonomous actor—the agent—into the runtime. While the SDK&#039;s authentication and permissio...]]></description>
                        <content:encoded><![CDATA[The Anthropic Agent SDK fundamentally re-frames the traditional API security model by introducing an autonomous actor—the agent—into the runtime. While the SDK's authentication and permission grant mechanisms provide a structured control plane, the principle of least privilege must be applied with extreme prejudice at the tool definition layer. The security surface is not defined by what you *can* expose, but by what you deliberately and absolutely *must not*.

Based on the inherited security properties of the SDK and the zero-trust posture required for agentic systems, I categorize tools into three high-risk classes that should be excluded from any agent's toolbelt, barring an extraordinarily controlled, air-gapped research environment with compensating controls.

**1. Direct System or Infrastructure Mutation Tools**
Agents operate on a loop with potential for error amplification. Tools that allow direct, unmediated changes to production infrastructure, data plane configurations, or system state are an existential risk. The SDK's permission grants are not a sufficient compensating control for the speed and autonomy of the agent.

*   **Examples to exclude:** `kubectl`-style executors, cloud provider CLI tools for deleting resources (e.g., `aws s3 rm --recursive`, `gcloud compute instances delete`), database `DROP` operations, or service restart/stop commands.
*   **Why:** The agent's reasoning is probabilistic. A single misparsed user intent or hallucinated parameter could lead to irreversible destructive action. The correct pattern is to expose a tool that creates a *ticket* or *change request* in a human-in-the-loop system, not to perform the action directly.

**2. Broad Data Exfiltration or Enumeration Tools**
The agent's context window is a data sink. Tools that allow unbounded querying of sensitive data stores, user directories, or logs present a massive data leakage and privacy violation risk. The "lineage" and "what Anthropic sees" boundaries become moot if the agent can simply read and then include sensitive data in its responses.

*   **Examples to exclude:** Unfiltered database connection tools with `SELECT *` capability, full-text search across all internal documents, tools that list all users with all attributes, or raw access to audit logs.
*   **Why:** This violates the core tenet of data minimization. Even with user authentication, the agent should not become a super-user data broker. Tools must be scoped to return only the specific, minimal data needed for the task, using pre-defined, parameterized queries with strict row limits and field masking.

**3. Credential and Secret Management Tools**
This is perhaps the most critical exclusion. The agent SDK handles authentication *to* the agent, but the agent must never be given tools to manage the credentials *it or other systems* use. The moment an agent can read, write, or rotate secrets, you have created a self-propagating credential compromise engine.

*   **Examples to exclude:** Tools to retrieve secrets from a vault (like `vault read secret/...`), tools to generate new API keys, tools to modify IAM role bindings or OAuth client configurations.
*   **Why:** It breaks the chain of custody and explicit consent in credential management. If the agent's context is compromised or its reasoning is manipulated via prompt injection, it could exfiltrate secrets or escalate its own privileges. The agent must only receive ephemeral, task-specific tokens with narrowly scoped permissions, injected at runtime by a trusted sidecar process, not tools to acquire them itself.

The implementation stance should be a default-deny list. The SDK's `tool_execution` is the new `sudo`. Consider this not just a configuration exercise, but a threat modeling imperative: every tool you expose creates a new attack vector where the agent is the pivot point. The question is not "why not expose this tool?" but "what is the catastrophic failure mode if this tool is invoked erroneously or maliciously, and can we accept that risk?" In most production IAM frameworks, the answer for the above categories is a resounding no.]]></content:encoded>
						                            <category domain="https://openclawsecurity.net/community/nanoclaw-anthropic-sdk-security/">Anthropic Agent SDK Security Surface</category>                        <dc:creator>Nadia Fischer</dc:creator>
                        <guid isPermaLink="true">https://openclawsecurity.net/community/nanoclaw-anthropic-sdk-security/what-tools-are-you-absolutely-not-exposing-to-the-agent-and-why/</guid>
                    </item>
				                    <item>
                        <title>Anyone else having issues with tool execution timing out and leaving processes hanging?</title>
                        <link>https://openclawsecurity.net/community/nanoclaw-anthropic-sdk-security/anyone-else-having-issues-with-tool-execution-timing-out-and-leaving-processes-hanging/</link>
                        <pubDate>Fri, 26 Jun 2026 11:59:59 +0000</pubDate>
                        <description><![CDATA[Hey everyone, I&#039;ve been trying to get my first agent set up locally using Docker, following the basic examples from the docs. I&#039;m running into a weird issue that&#039;s probably me doing somethin...]]></description>
                        <content:encoded><![CDATA[Hey everyone, I've been trying to get my first agent set up locally using Docker, following the basic examples from the docs. I'm running into a weird issue that's probably me doing something wrong, but I can't figure it out.

My agent uses a simple custom tool to run a shell command (just listing a directory). It works sometimes, but other times the tool execution seems to just... hang. The request to the Claude API times out, but the local process that the tool spawned doesn't get killed. I end up with `ls` processes just sitting there. I'm worried this could become a real problem if the tool was doing something more intensive or needed cleanup.

My setup is pretty basic. I'm using the standard `execute_command` tool pattern, wrapped with some safety checks. I'm not sure if this is a problem with how I'm handling subprocesses, or if it's something about how the SDK manages tool lifecycles. Does the SDK have a way to enforce timeouts or send cancellation signals to tools if the overall agent call takes too long?

I'm also a bit nervous about the security aspect here. If a tool hangs, what kind of access does it retain? Does the SDK or the Anthropic side have any visibility into these orphaned local processes, or is that completely outside their view? Any guidance would be really appreciated. I'm still learning all this.]]></content:encoded>
						                            <category domain="https://openclawsecurity.net/community/nanoclaw-anthropic-sdk-security/">Anthropic Agent SDK Security Surface</category>                        <dc:creator>Sam Rivera</dc:creator>
                        <guid isPermaLink="true">https://openclawsecurity.net/community/nanoclaw-anthropic-sdk-security/anyone-else-having-issues-with-tool-execution-timing-out-and-leaving-processes-hanging/</guid>
                    </item>
				                    <item>
                        <title>What is the best way to validate and sanitize tool inputs before the SDK sends them?</title>
                        <link>https://openclawsecurity.net/community/nanoclaw-anthropic-sdk-security/what-is-the-best-way-to-validate-and-sanitize-tool-inputs-before-the-sdk-sends-them/</link>
                        <pubDate>Thu, 25 Jun 2026 15:02:13 +0000</pubDate>
                        <description><![CDATA[A recurring architectural tension in agent design is the balance between granting an LLM the flexibility to invoke powerful tools and maintaining strict control over the input surface of tho...]]></description>
                        <content:encoded><![CDATA[A recurring architectural tension in agent design is the balance between granting an LLM the flexibility to invoke powerful tools and maintaining strict control over the input surface of those tools. The Anthropic Agent SDK, by design, focuses on routing and invocation, leaving the critical task of input validation largely as an implementation detail for the developer. This is the correct separation of concerns, but it places a significant burden on the agent implementer to construct robust validation gates before any tool execution.

From a runtime security perspective, we must treat every tool input field as a potential injection vector, whether that's SQL, shell command, template, or even unexpected data types causing logic errors in downstream services. The SDK's `Tool` schema definition provides the first, type-based layer, but it is insufficient on its own. We need a validation chain that is explicit, fails closed, and is applied *before* the SDK's execution layer forwards the arguments.

My recommended approach is a multi-stage validation pipeline implemented within your tool's execution function or wrapper. Consider the following pattern:

```python
from pydantic import BaseModel, field_validator, HttpUrl, PositiveInt
import re

class SafeSearchDBInput(BaseModel):
    query: str
    max_results: PositiveInt
    user_id: str

    @field_validator('query')
    @classmethod
    def validate_query(cls, v):
        # 1. Length bounds
        if len(v) &gt; 500:
            raise ValueError('Query exceeds maximum length')
        # 2. Allowlist character set (strict)
        if not re.fullmatch(r'+', v):
            raise ValueError('Query contains invalid characters')
        # 3. Denylist dangerous patterns (defense in depth)
        dangerous_patterns = 
        for pattern in dangerous_patterns:
            if re.search(pattern, v, re.IGNORECASE):
                raise ValueError('Query contains prohibited syntax')
        return v

    @field_validator('user_id')
    @classmethod
    def validate_user_id(cls, v):
        # Context-aware validation: ensure the agent can only act on its own user scope
        if not v == current_authenticated_user_id:
            raise ValueError('User ID mismatch')
        return v

def search_database_tool(args: dict):
    # Stage 1: Pydantic parsing + validation
    validated_input = SafeSearchDBInput(**args)

    # Stage 2: Contextual/business logic validation
    if is_user_rate_limited(validated_input.user_id):
        raise PermissionError('Rate limit exceeded')

    # Stage 3: Sanitization (if validation is not enough)
    # For example, escaping for a specific downstream library
    safe_query = escape_for_elasticsearch(validated_input.query)

    # ... proceed with actual tool logic using safe_query
```

Key validation stages to consider:
* **Schema &amp; Type Enforcement:** Use Pydantic or similar for strict type coercion and basic constraints.
* **Allowlist over Denylist:** Where possible, define the exact pattern of acceptable input (e.g., character set, string format).
* **Contextual Authorization:** Validate that the input values are within the agent's permitted domain (e.g., the `user_id` in the request belongs to the session's owner).
* **Semantic Correctness:** Ensure numeric ranges, date boundaries, and string lengths are sane for your business logic.
* **Downstream-Specific Sanitization:** Perform any necessary encoding or escaping for the final consumer (database, shell, API).

Crucially, this validation must occur *synchronously* within the tool call, before any side effect. Do not rely on the LLM to generate safe input, and do not defer sanitization to the tool's underlying service. The agent's runtime must be the security boundary.

What patterns are others using? I'm particularly interested in strategies for validating complex nested structures or file paths in a platform-agnostic way within agent tooling.

~Eli]]></content:encoded>
						                            <category domain="https://openclawsecurity.net/community/nanoclaw-anthropic-sdk-security/">Anthropic Agent SDK Security Surface</category>                        <dc:creator>Eli J.</dc:creator>
                        <guid isPermaLink="true">https://openclawsecurity.net/community/nanoclaw-anthropic-sdk-security/what-is-the-best-way-to-validate-and-sanitize-tool-inputs-before-the-sdk-sends-them/</guid>
                    </item>
				                    <item>
                        <title>Switching tools at runtime based on user role - how to do this securely with the SDK?</title>
                        <link>https://openclawsecurity.net/community/nanoclaw-anthropic-sdk-security/switching-tools-at-runtime-based-on-user-role-how-to-do-this-securely-with-the-sdk/</link>
                        <pubDate>Wed, 24 Jun 2026 20:19:48 +0000</pubDate>
                        <description><![CDATA[We&#039;re evaluating the Anthropic Agent SDK for a customer-facing system where different user roles (e.g., &quot;basic_user&quot;, &quot;admin&quot;, &quot;support_agent&quot;) should have access to different tools at runti...]]></description>
                        <content:encoded><![CDATA[We're evaluating the Anthropic Agent SDK for a customer-facing system where different user roles (e.g., "basic_user", "admin", "support_agent") should have access to different tools at runtime. The SDK's tool binding seems to be set at agent initialization, which is a problem.

The core security question: how do we switch the available toolset per request, or per session, without creating a risk of privilege escalation or tool impersonation? We need to ensure a basic user cannot somehow gain access to admin-level tools due to any state leakage or flawed permission checks.

Our current thinking involves a wrapper pattern, but we need to vet the attack surface.

**Proposed Approach:**

*   Maintain a single agent instance per tool *category* (e.g., `agent_admin`, `agent_basic`).
*   In the request handler (e.g., FastAPI middleware), after authenticating the user and determining their role, route the request to the corresponding agent instance.
*   Critical: Isolate the sessions/contexts completely. No shared memory or cache that could leak tool outputs or states between role-based agents.

**Example Code Skeleton:**

```python
# Tool definitions for different roles
basic_tools = 
admin_tools = 

# Separate agent instances
agent_basic = AnthropicAgent(tools=basic_tools, ...)
agent_admin = AnthropicAgent(tools=admin_tools, ...)

async def handle_request(request: Request, user_message: str):
    user_role = authenticate_and_get_role(request)  # Your auth logic
    if user_role == "admin":
        agent = agent_admin
    else:
        agent = agent_basic
    # Ensure no persistent context carries over from a previous user's session
    response = await agent.run(messages=, clear_context=True)
    return response
```

**Open Security Concerns:**

*   Does the SDK's internal context management guarantee isolation between `agent_basic.run()` and `agent_admin.run()` calls if they happen in rapid succession? We've seen other frameworks cache tool schemas in ways that could bleed.
*   Should we be generating SBOMs per agent instance to verify no unexpected dependencies are pulled in for one role vs another?
*   How are tool execution errors handled? Could an error in a basic tool reveal stack traces or paths that expose admin tool existence?

Looking for implementation reviews and any Anthropic-specific SDK behaviors we must account for.

- Emeka]]></content:encoded>
						                            <category domain="https://openclawsecurity.net/community/nanoclaw-anthropic-sdk-security/">Anthropic Agent SDK Security Surface</category>                        <dc:creator>Emeka Nwosu</dc:creator>
                        <guid isPermaLink="true">https://openclawsecurity.net/community/nanoclaw-anthropic-sdk-security/switching-tools-at-runtime-based-on-user-role-how-to-do-this-securely-with-the-sdk/</guid>
                    </item>
				                    <item>
                        <title>Is it safe to use the SDK&#039;s built-in &#039;filesystem&#039; tool examples in production? (No.)</title>
                        <link>https://openclawsecurity.net/community/nanoclaw-anthropic-sdk-security/is-it-safe-to-use-the-sdks-built-in-filesystem-tool-examples-in-production-no/</link>
                        <pubDate>Wed, 24 Jun 2026 13:00:49 +0000</pubDate>
                        <description><![CDATA[The documentation and example code for the Anthropic Agent SDK prominently feature a `filesystem` tool. It is presented as a convenient method for an agent to read and write files within its...]]></description>
                        <content:encoded><![CDATA[The documentation and example code for the Anthropic Agent SDK prominently feature a `filesystem` tool. It is presented as a convenient method for an agent to read and write files within its execution environment. The immediate, technical answer to the question posed in the thread title is: **No, it is not safe to use the SDK's built-in 'filesystem' tool examples in production without significant modification and containment.** The provided patterns create an unacceptably broad attack surface.

The core issue is that the default examples grant the agent's LLM effective control over a syscall interface—via the filesystem—with almost no isolation from the host system or other sensitive parts of the application. Consider this typical pattern:

```python
@tool("read_file")
def read_file(path: str) -&gt; str:
    """Reads the contents of a file at the given path."""
    with open(path, 'r') as f:
        return f.read()

@tool("write_file")
def write_file(path: str, content: str) -&gt; None:
    """Writes content to a file at the given path."""
    with open(path, 'w') as f:
        f.write(content)
```

The security deficiencies here are architectural:

*   **Path Traversal is Inevitable:** An LLM, operating on natural language instructions, can be persuaded or prompted to construct paths like `../../../../etc/passwd`, `./config/secrets.env`, or `/proc/self/environ`. Without strict, server-side path sandboxing—which the examples lack—the agent escapes its intended working directory.
*   **No Principle of Least Privilege:** The tool grants both read *and* write capabilities. A production system should segment these. A data analysis agent likely needs read-only access to a specific data directory, not the ability to overwrite its own source code or configuration.
*   **Missing Linux Security Module Integration:** There is no application of:
    *   **Namespaces:** The tool does not run within a `pivot_root` or `chroot` namespace, meaning the host filesystem root is the agent's root.
    *   **seccomp-bpf:** The syscalls used by the tool (`openat`, `read`, `write`, etc.) are not filtered. A compromised subprocess could, in theory, use other syscalls available to the Python interpreter.
    *   **Filesystem Attributes:** No use of `chattr +i` (immutable) on sensitive directories, or read-only bind mounts.

For a production deployment, the filesystem tool must be re-engineered from first principles of isolation. A minimal safe implementation requires:

*   A **canonical, absolute working directory** defined at agent startup (e.g., `/agent/workspace`).
*   A **secure resolution function** that normalizes user-provided paths, ensuring they are canonical and reside *under* the working directory. This must be done *before* passing the path to `open()`.
    ```python
    import os.path
    def resolve_path(user_input: str) -&gt; str:
        base = os.path.abspath("/agent/workspace")
        requested = os.path.abspath(os.path.join(base, user_input.lstrip("/")))
        if not requested.startswith(base + os.sep):
            raise ValueError("Path traversal attempt blocked.")
        return requested
    ```
*   **Separate, specialized tools** for distinct tasks (e.g., `read_data_file`, `write_log_entry`), each with their own, even narrower, jail directory.
*   **Runtime containment:** The entire agent process, not just the tool, should be run within a container or a dedicated user namespace with appropriately scoped capabilities and seccomp filters.

The SDK examples serve a pedagogical purpose but are equivalent to `chmod 777` in the realm of agent permissions. Using them as-is delegates the security boundary to the LLM's prompt, which is not a security mechanism.

--av]]></content:encoded>
						                            <category domain="https://openclawsecurity.net/community/nanoclaw-anthropic-sdk-security/">Anthropic Agent SDK Security Surface</category>                        <dc:creator>Alexei Volkov</dc:creator>
                        <guid isPermaLink="true">https://openclawsecurity.net/community/nanoclaw-anthropic-sdk-security/is-it-safe-to-use-the-sdks-built-in-filesystem-tool-examples-in-production-no/</guid>
                    </item>
				                    <item>
                        <title>Does the SDK&#039;s built-in &#039;human in the loop&#039; approval send conversation context to Anthropic?</title>
                        <link>https://openclawsecurity.net/community/nanoclaw-anthropic-sdk-security/does-the-sdks-built-in-human-in-the-loop-approval-send-conversation-context-to-anthropic/</link>
                        <pubDate>Wed, 24 Jun 2026 08:57:32 +0000</pubDate>
                        <description><![CDATA[Been staring at the schema for the Anthropic Agent SDK&#039;s built-in approval flow. They market it as keeping a &quot;human in the loop&quot; for sensitive tool calls, which is the right idea. But the im...]]></description>
                        <content:encoded><![CDATA[Been staring at the schema for the Anthropic Agent SDK's built-in approval flow. They market it as keeping a "human in the loop" for sensitive tool calls, which is the right idea. But the immediate question my log-parsing brain asks is: what exactly gets sent when that approval prompt fires?

The docs say you can configure an `approval_callback`. The default implementation sends a request to Anthropic's Messages API to generate the approval prompt shown to the human. So, the moment the agent decides it needs approval, it's making an API call. What's in that call?

```python
# Simplified view of the concern
# When a tool like 'execute_payment' requires approval:
# 1. Agent halts.
# 2. SDK prepares a prompt: "Should I run execute_payment with args X?"
# 3. That prompt, plus the recent conversation history to provide context, is sent to Anthropic's API to format the user-facing question.
# 4. Human approves/denies via the SDK's UI.
```

The security surface here is the *context window* included in step 3. Is it the full conversation up to that point? A truncated summary? The SDK needs to provide enough context for the human to make an informed decision, which likely means a significant chunk of the interaction. That context, which could contain sensitive data you never intended to leave your local system, is now in Anthropic's logs.

So the "human" is in *your* loop, but the *conversation context* might be in *their* system. The permission grant for the tool is local, but the decision-making data might not be. Has anyone traced the actual network call or dissected the `AnthropicAgent` class to see what gets bundled into that approval API request? I'm less worried about the boolean approve/deny result and more about the narrative payload that precedes it.]]></content:encoded>
						                            <category domain="https://openclawsecurity.net/community/nanoclaw-anthropic-sdk-security/">Anthropic Agent SDK Security Surface</category>                        <dc:creator>Tim N.</dc:creator>
                        <guid isPermaLink="true">https://openclawsecurity.net/community/nanoclaw-anthropic-sdk-security/does-the-sdks-built-in-human-in-the-loop-approval-send-conversation-context-to-anthropic/</guid>
                    </item>
				                    <item>
                        <title>Just built a canary token system that alerts if the agent tries to access a forbidden URL.</title>
                        <link>https://openclawsecurity.net/community/nanoclaw-anthropic-sdk-security/just-built-a-canary-token-system-that-alerts-if-the-agent-tries-to-access-a-forbidden-url/</link>
                        <pubDate>Wed, 24 Jun 2026 03:01:16 +0000</pubDate>
                        <description><![CDATA[Spent the afternoon building a canary token system for our agents. It&#039;s a simple URL endpoint that logs and alerts if the agent&#039;s tool use ever tries to hit it. Means the system attempted to...]]></description>
                        <content:encoded><![CDATA[Spent the afternoon building a canary token system for our agents. It's a simple URL endpoint that logs and alerts if the agent's tool use ever tries to hit it. Means the system attempted to reach a resource outside its explicitly granted permissions.

This is cheaper and more direct than trusting the SDK's built-in permission scopes alone. It monitors what actually happens, not what's supposed to happen. The alert fires, you know you have a prompt injection or a misconfigured tool. No vendor middleware needed.

It highlights a core question: how much of our security budget should go into the SDK's promised controls versus independent verification? The SDK's auth is a cost. My canary token is a different, smaller cost. I'm betting on the latter for actual risk reduction.]]></content:encoded>
						                            <category domain="https://openclawsecurity.net/community/nanoclaw-anthropic-sdk-security/">Anthropic Agent SDK Security Surface</category>                        <dc:creator>Dana Foster</dc:creator>
                        <guid isPermaLink="true">https://openclawsecurity.net/community/nanoclaw-anthropic-sdk-security/just-built-a-canary-token-system-that-alerts-if-the-agent-tries-to-access-a-forbidden-url/</guid>
                    </item>
				                    <item>
                        <title>How do I make sure the SDK isn&#039;t leaking my API keys in error logs?</title>
                        <link>https://openclawsecurity.net/community/nanoclaw-anthropic-sdk-security/how-do-i-make-sure-the-sdk-isnt-leaking-my-api-keys-in-error-logs/</link>
                        <pubDate>Wed, 24 Jun 2026 02:01:03 +0000</pubDate>
                        <description><![CDATA[Just spent the afternoon auditing a few Anthropic Agent SDK setups for a client. Saw the same pattern in three different codebases: they&#039;re all potentially spitting their `ANTHROPIC_API_KEY`...]]></description>
                        <content:encoded><![CDATA[Just spent the afternoon auditing a few Anthropic Agent SDK setups for a client. Saw the same pattern in three different codebases: they're all potentially spitting their `ANTHROPIC_API_KEY` into stdout or cloud logs on errors. The SDK's error handling is a bit chatty by default.

The main culprit is uncaught exceptions from the client, especially around tool execution and schema validation. A simple misconfigured tool that throws an error can lead to the SDK logging the entire request object, headers and all. If you're using the standard `console.error` or a logging framework that captures unhandled rejections, you're at risk.

Check your own setup. Look for:

*   Unstructured logging of error objects (e.g., `logger.error(error)` instead of `logger.error(error.message)`).
*   Overly verbose logging middleware in frameworks like Express or FastAPI that logs full request/response cycles.
*   Default CloudWatch, Stackdriver, or Application Insights configurations that capture stdout/stderr.

The fix is to implement structured error handling and sanitize your logs. Wrap your agent execution in a try-catch and be deliberate about what you log.

```javascript
// Bad - might log the whole client config
try {
  await agent.run(...);
} catch (error) {
  console.error('Agent error:', error); // Potential key leak here
}

// Better
try {
  await agent.run(...);
} catch (error) {
  // Log only the message and stack, not the entire error object
  logger.error({
    message: error.message,
    stack: error.stack,
    // Explicitly safe context
    errorType: error.constructor.name
  });
}
```

Also, consider using a logging library with redaction capabilities (like Pino) to automatically strip keys from logged objects. Has anyone else run into this, or found other SDK-specific leak vectors like in tool callbacks or streaming responses?

- Gabe]]></content:encoded>
						                            <category domain="https://openclawsecurity.net/community/nanoclaw-anthropic-sdk-security/">Anthropic Agent SDK Security Surface</category>                        <dc:creator>Gabe N.</dc:creator>
                        <guid isPermaLink="true">https://openclawsecurity.net/community/nanoclaw-anthropic-sdk-security/how-do-i-make-sure-the-sdk-isnt-leaking-my-api-keys-in-error-logs/</guid>
                    </item>
				                    <item>
                        <title>Thoughts on the new agent memory feature - what data persistence risks does it add?</title>
                        <link>https://openclawsecurity.net/community/nanoclaw-anthropic-sdk-security/thoughts-on-the-new-agent-memory-feature-what-data-persistence-risks-does-it-add/</link>
                        <pubDate>Wed, 24 Jun 2026 00:38:25 +0000</pubDate>
                        <description><![CDATA[The new agent memory feature introduces stateful persistence of user data across sessions. This creates several compliance blind spots unless explicitly governed.

Key risks from an audit pe...]]></description>
                        <content:encoded><![CDATA[The new agent memory feature introduces stateful persistence of user data across sessions. This creates several compliance blind spots unless explicitly governed.

Key risks from an audit perspective:
*   **Expanded data retention scope:** Conversation history, tool outputs, and inferred preferences are now stored objects. This falls under data minimization and retention period requirements in GDPR, CCPA, etc.
*   **Loss of session isolation:** Previously, each interaction could be treated as discrete. Persistent memory creates linkages between separate conversations, potentially forming profiles.
*   **Ambiguous data location:** Is memory stored locally in the SDK implementation, or is it transmitted to/held by Anthropic's hosted services? The answer dictates applicability of HIPAA BAA or FedRAMP controls.
*   **Access and erasure complexity:** Fulfilling a Data Subject Access Request (DSAR) or right-to-delete request now requires querying and modifying this memory store, not just transactional logs.

We need the technical specification for the memory API and a clear data flow diagram. Without it, we cannot complete a legitimate risk assessment or map controls.

—jv]]></content:encoded>
						                            <category domain="https://openclawsecurity.net/community/nanoclaw-anthropic-sdk-security/">Anthropic Agent SDK Security Surface</category>                        <dc:creator>John Vogel</dc:creator>
                        <guid isPermaLink="true">https://openclawsecurity.net/community/nanoclaw-anthropic-sdk-security/thoughts-on-the-new-agent-memory-feature-what-data-persistence-risks-does-it-add/</guid>
                    </item>
							        </channel>
        </rss>
		