We're evaluating the Anthropic Agent SDK for a customer-facing system where different user roles (e.g., "basic_user", "admin", "support_agent") should have access to different tools at runtime. The SDK's tool binding seems to be set at agent initialization, which is a problem.
The core security question: how do we switch the available toolset per request, or per session, without creating a risk of privilege escalation or tool impersonation? We need to ensure a basic user cannot somehow gain access to admin-level tools due to any state leakage or flawed permission checks.
Our current thinking involves a wrapper pattern, but we need to vet the attack surface.
**Proposed Approach:**
* Maintain a single agent instance per tool *category* (e.g., `agent_admin`, `agent_basic`).
* In the request handler (e.g., FastAPI middleware), after authenticating the user and determining their role, route the request to the corresponding agent instance.
* Critical: Isolate the sessions/contexts completely. No shared memory or cache that could leak tool outputs or states between role-based agents.
**Example Code Skeleton:**
```python
# Tool definitions for different roles
basic_tools = [query_knowledge_base, submit_ticket]
admin_tools = [query_knowledge_base, submit_ticket, list_all_tickets, delete_user]

# Separate agent instances
agent_basic = AnthropicAgent(tools=basic_tools, ...)
agent_admin = AnthropicAgent(tools=admin_tools, ...)


async def handle_request(request: Request, user_message: str):
    user_role = authenticate_and_get_role(request)  # Your auth logic
    if user_role == "admin":
        agent = agent_admin
    else:
        agent = agent_basic
    # Ensure no persistent context carries over from a previous user's session
    response = await agent.run(messages=[...], clear_context=True)
    return response
```
**Open Security Concerns:**
* Does the SDK's internal context management guarantee isolation between `agent_basic.run()` and `agent_admin.run()` calls if they happen in rapid succession? We've seen other frameworks cache tool schemas in ways that could bleed.
* Should we be generating SBOMs per agent instance to verify no unexpected dependencies are pulled in for one role vs another?
* How are tool execution errors handled? Could an error in a basic tool reveal stack traces or paths that expose admin tool existence?
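On the error-handling concern, one option that does not depend on SDK behavior is to catch exceptions at the tool boundary ourselves. This is a rough sketch only: `run_tool_safely` and the result dictionary shape are made-up names for illustration, and it assumes we control the code that actually invokes each tool.

```python
import logging
import uuid

log = logging.getLogger("tools")


def run_tool_safely(tool, **kwargs):
    """Call a tool and return a sanitized result (hypothetical helper)."""
    try:
        return {"ok": True, "result": tool(**kwargs)}
    except Exception:
        # Full traceback goes to server-side logs only.
        error_id = uuid.uuid4().hex
        log.exception(
            "tool %s failed (error_id=%s)",
            getattr(tool, "__name__", "unknown"),
            error_id,
        )
        # The caller (and the model) sees only an opaque reference.
        return {"ok": False, "error": f"Tool call failed (ref {error_id})"}
```

The reference ID lets support staff find the real traceback in the logs without any path, module name, or tool name reaching the user.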
Looking for implementation reviews and any Anthropic-specific SDK behaviors we must account for.
- Emeka
Trust but verify every package.
Yeah, the multi-instance approach is the right starting point. The big gotcha is cost and latency - you're spinning up N agents and keeping them warm.
But your real attack surface is in the routing logic. If you're using a single endpoint, your middleware has to be bulletproof. A simple bug where you fetch the user role from a JWT but then accidentally pass the request to a shared session pool? Game over.
I'd add a runtime check inside each tool's execution layer, before the actual operation. Even if the agent thinks it can call `delete_database`, the function should verify the originating session's role. Belt and suspenders.
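Roughly what I mean, as a sketch rather than anything SDK-specific. `AuthContext` and `require_role` are names I'm making up here, and the `delete_user` body is a stand-in; the only assumption is that your own code calls the tool function and can hand it the session's role.

```python
from dataclasses import dataclass
from functools import wraps


@dataclass(frozen=True)
class AuthContext:
    """Hypothetical container for the authenticated session."""
    user_id: str
    role: str


def require_role(*allowed_roles):
    """Decorator that checks the caller's role before the tool body runs."""
    def decorator(func):
        @wraps(func)
        def wrapper(auth_context, *args, **kwargs):
            if auth_context.role not in allowed_roles:
                raise PermissionError(
                    f"role {auth_context.role!r} may not call {func.__name__}"
                )
            return func(auth_context, *args, **kwargs)
        return wrapper
    return decorator


@require_role("admin")
def delete_user(auth_context, user_id):
    # Stand-in for the real operation.
    return f"deleted {user_id}"
```

With this in place, a misrouted basic session that reaches `delete_user` gets a `PermissionError` before anything is touched.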
Also, watch out for tool naming collisions. If `basic_tools` and `admin_tools` both have a `get_user_info` tool, but with different implementations, make sure there's no state bleed between your agent instances. Separate API keys or project IDs can help with that isolation.
if it moves, fuzz it
The wrapper pattern is a solid foundation, but your isolation plan needs to be concrete. You can't just rely on separate agent instances.
The real vulnerability is often in the tool implementations themselves, not the routing. Every single tool function must receive and validate the authenticated user context, not just trust it was routed correctly. The agent instance is just a dispatcher.
Your skeleton code should show the signature of a tool. Something like this:
```python
def delete_user_data(user_id: str, auth_context: AuthContext) -> str:
    if auth_context.role != "admin":
        raise PermissionError("...")
    # ... proceed
```
The auth context must be injected by your middleware into every call as a mandatory argument of the tool function, and it must never be a value the model can fill in itself. If you rely on a global request object or thread-local storage instead, you're adding a hidden dependency inside your own codebase, one that can silently break or be mocked.
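One way to do the injection without touching the SDK, assuming your own code is what finally calls the tool function: bind the context per request in a closure and reject any attempt by the caller to pass it. `bind_tool` and the `AuthContext` fields below are illustrative names, not part of any SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuthContext:
    """Hypothetical shape; a real one might carry a full permissions object."""
    user_id: str
    role: str


def delete_user_data(user_id: str, *, auth_context: AuthContext) -> str:
    if auth_context.role != "admin":
        raise PermissionError("admin role required")
    return f"deleted data for {user_id}"


def bind_tool(tool, auth_context: AuthContext):
    """Return a per-request callable with auth_context fixed server-side."""
    def bound(**model_args):
        # Arguments produced by the model must never carry their own context.
        if "auth_context" in model_args:
            raise PermissionError("auth_context may not be supplied by the caller")
        return tool(**model_args, auth_context=auth_context)
    bound.__name__ = tool.__name__
    return bound
```

I went with a closure rather than `functools.partial` on purpose: a keyword passed at call time overrides the one stored in a `partial`, which is exactly the override you want to rule out.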
Trust but verify every package.
So the auth context needs to be a mandatory tool argument. That makes sense. But how do you get it there? Are you modifying the SDK's tool calling logic, or is there a cleaner way to inject it automatically for every tool call? Seems like a lot of boilerplate otherwise.
Also, what's in the AuthContext? Just the role, or a full permissions object?
Your point about separate API keys or project IDs for each agent instance is a critical one for true runtime isolation. However, that introduces a dependency management problem. You now have multiple SDK client libraries, potentially pinned to different versions, each with their own software bill of materials. If a vulnerability is discovered in one, you need a clear, auditable process to patch them all simultaneously.
I would refine your advice: the runtime check inside the tool should not just verify the role from the immediate session, but should validate against a centralized policy decision point. That check itself is a piece of security-critical code, and its version and dependencies must be pinned and identical across all your agent instances to avoid inconsistent enforcement.
sbom verify --attestation
Separate API keys and SDK versions are overcomplicating it. Just use a single, well-maintained SDK instance.
The real dependency problem is your policy check function. That's the single point of failure you need to pin and audit. Stick it in a shared internal library that every tool imports. All agent instances use the same function from the same package version.
That way you get centralized policy enforcement without the dependency mess of multiple SDK clients.
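Something like this is all the shared library needs to export. Treat it as a sketch: the `POLICY` table and `check_permission` are names I picked, the tool names come from the skeleton at the top of the thread, and any role missing from the table gets nothing.

```python
# Hypothetical contents of a shared internal policy module.
POLICY = {
    "basic_user": frozenset({"query_knowledge_base", "submit_ticket"}),
    "admin": frozenset(
        {"query_knowledge_base", "submit_ticket", "list_all_tickets", "delete_user"}
    ),
}


def check_permission(role: str, tool_name: str) -> None:
    """Raise PermissionError unless the role is explicitly granted the tool."""
    allowed = POLICY.get(role, frozenset())  # unknown roles are granted nothing
    if tool_name not in allowed:
        raise PermissionError(f"role {role!r} is not allowed to call {tool_name!r}")
```

Every tool calls `check_permission` first, so there is one function and one package version to pin and audit.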
The multi-instance pattern is good, but watch for timing side channels. If you're routing based on role, make sure the routing logic itself doesn't leak info. I've seen logs where the wrong agent instance was selected, but the request still passed through because the user had a valid token. The execution time difference between agent types can also be detectable.
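For the timing part, a cheap partial mitigation is to pad every response to a common floor. This is a sketch with a made-up helper name (`with_min_duration`); it only hides differences below the floor, so requests that run longer than that still leak their duration.

```python
import asyncio
import time


async def with_min_duration(coro, min_seconds: float):
    """Await coro, then sleep so the total time is at least min_seconds."""
    start = time.monotonic()
    try:
        return await coro
    finally:
        # Pad on both success and failure so errors are not faster either.
        remaining = min_seconds - (time.monotonic() - start)
        if remaining > 0:
            await asyncio.sleep(remaining)
```

In a handler you would wrap the agent call, e.g. `await with_min_duration(agent.run(...), 2.0)`, with the same floor for every role.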
Your isolation plan needs to include runtime monitoring for cross-context calls. Sysdig works at the syscall level, so it can catch a process from the basic user pool opening files or sockets that belong to the admin agent's runtime, or attempting cross-process memory access (e.g. via `ptrace` or `process_vm_readv`), even if your app logic says it shouldn't. This is the real "belt and suspenders" layer.
watch and learn
Good catch on the timing side channels. That's a layer beyond just getting the routing logic right. I've seen the same thing where a misrouted request still succeeds, which means your audit logs are now lying to you.
Runtime monitoring for cross-context calls is smart, but for most teams, I'd start with deterministic, logged failures. If your routing picks the wrong agent instance, the request should *fail* because every tool in that toolset requires an `auth_context` that the user's session can't satisfy. The failure mode should be a clear permission error, not a silent success or a subtle timing difference.
Sysdig or eBPF is fantastic for paranoia mode, but you need the logical checks to fail closed first.
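A minimal sketch of fail-closed routing, with assumptions flagged: `RoleAgent` is a hypothetical wrapper that tags each agent with the role it was built for, and `select_agent` is a made-up helper. There is no default branch, and the tag is re-checked after lookup.

```python
import logging
from dataclasses import dataclass

log = logging.getLogger("routing")


@dataclass(frozen=True)
class RoleAgent:
    """Hypothetical wrapper pairing an agent with the role it serves."""
    role: str
    tool_names: frozenset


AGENTS = {
    "basic_user": RoleAgent("basic_user", frozenset({"query_knowledge_base", "submit_ticket"})),
    "admin": RoleAgent("admin", frozenset({"list_all_tickets", "delete_user"})),
}


def select_agent(user_role: str) -> RoleAgent:
    agent = AGENTS.get(user_role)
    # No fallback agent: an unknown role or a mislabeled entry is an error.
    if agent is None or agent.role != user_role:
        log.error("routing denied for role %r", user_role)
        raise PermissionError(f"no agent available for role {user_role!r}")
    return agent
```

The point is that a routing bug produces a logged `PermissionError` instead of quietly landing on some other instance.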
Stay sharp.
Hey, good outline. The multi-instance approach is exactly where I'd start too.
My caveat: watch your container or process isolation. If you're running all these agent instances in the same Docker container or pod, you've gotta be extra careful about shared filesystems or environment variables that could leak. I'd run each role's agent in its own lightweight container, connected to your API middleware over a well-defined interface. That gives you a real OS-level boundary, not just a logical one.
Also, don't forget about the audit trail. Each agent instance should log to a separate, role-specific stream. If something goes sideways, you need to be able to trace which physical process actually handled a request, not just which logical agent you *thought* you routed to.
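Here's a small sketch of the logging side using only the standard library. `make_audit_logger` is a name I made up, and in a real setup the handler would point at a role-specific file or stream instead of stderr. The `%(process)d` field is what ties a log line to the process that actually handled the request.

```python
import logging


def make_audit_logger(role: str) -> logging.Logger:
    """Return a logger dedicated to one role, tagged with the process ID."""
    logger = logging.getLogger(f"audit.{role}")
    if not logger.handlers:  # avoid stacking handlers on repeated calls
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(name)s pid=%(process)d %(message)s")
        )
        logger.addHandler(handler)
        logger.setLevel(logging.INFO)
        logger.propagate = False  # keep role streams out of the root logger
    return logger
```

Usage would look like `make_audit_logger("admin").info("handled request %s", request_id)`, where `request_id` is whatever identifier your middleware assigns.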
Segment first, ask questions later.
Multi-instance is fine but incomplete. Your wrapper pattern just moves the problem. The real question is business risk: what's the blast radius if an admin tool *does* leak to a basic user session? Start threat modeling from there, not from the SDK's limitations.
How much does each tool category cost to run? If admin tools are expensive, a misrouted request could rack up bills before anyone notices. Are you tracking per-agent-instance spend?
Also, you're focusing on runtime. How often do you redeploy these separate agent instances? If you patch a vuln in the shared SDK, you now have N independent deploy pipelines to update. That's operational debt.
Show me the numbers.
Your skeleton code cuts off, but I can already see you're headed toward a purely logical separation. That's insufficient, bordering on negligent.
If you're serious about isolation, those separate agent instances need to be separate *processes*, ideally in separate namespaces. A shared Python interpreter means shared memory, shared Python module state, shared everything. A clever tool could monkey-patch the `auth_context` validation function for *all* roles, not just its own. Your "isolated" instances are sharing the same heap.
You can't just route requests in middleware and call it a day. The kernel has to be part of your security model. Run each agent category in its own container with minimal capabilities, user namespaces, and a seccomp filter. Otherwise, you're building a sandcastle on the runtime's memory map.
The SDK's limitation is forcing you into a better architectural pattern. Embrace it.
User space is for amateurs.
You're right about the logs lying, that's a nasty one. I caught a similar issue in my homelab where a request was hitting the wrong Podman container but succeeding because the shared volume mount had overly permissive POSIX bits. The audit trail showed the user container's ID, but the actual file operations were from the admin container's effective UID. Total mess.
The side channel bit is real too. If your admin tools use a different vector DB or have more complex chaining, the response time delta is a clear signal. Might as well hang a sign.
Sysdig's great, but for a quicker win, I'd set up separate cgroups for each agent type with memory and CPU limits. That'll give you kernel-enforced resource limits (not a security boundary on its own) and make any cross-talk way more obvious in your metrics. Plus, you can spot resource contention between roles.
stay containerized
That homelab example is a perfect real world catch. It's exactly the kind of leak that seems impossible until you're staring at it in the logs. 😬
Your cgroups idea is a solid middle ground, way easier to implement than full Sysdig for most of us. It makes the invisible visible in your monitoring graphs.
Question: when you set up those cgroup limits, did you also isolate the network? Like, giving each role's agent its own loopback interface or tiny subnet to prevent even localhost cross talk?
~zoe
You're right to shift the focus to blast radius and operational debt. The financial risk of misrouted, high-cost tools is a concrete threat model often overlooked in favor of abstract data leakage. It's not just about data integrity, but literal cost attribution.
Your point about patching vulnerabilities across N independent deploy pipelines is the critical operational flaw in the multi-instance model. It creates a distributed versioning problem where you can't guarantee all instances are on a patched SDK version simultaneously, leaving a window where a basic user's agent might be secure while an admin instance is exploitable. This negates the isolation's purpose.
A counterpoint: the operational debt might be justified if the cost of a leak (financial or data exfiltration) vastly exceeds the cost of maintaining N pipelines. But you still need a way to coordinate security patches atomically, perhaps through a shared base image or a package manager lockfile that's enforced across all deployments.
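To make the enforcement part concrete, each instance could run a startup check that compares installed package versions against the pins. This is a sketch under assumptions: `find_version_drift` is a hypothetical helper, and the `pinned` mapping would come from whatever lockfile you use.

```python
from importlib import metadata


def find_version_drift(pinned: dict) -> dict:
    """Return {package: (expected, installed)} for every mismatch.

    `installed` is None when the package is not installed at all.
    """
    drift = {}
    for name, expected in pinned.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != expected:
            drift[name] = (expected, installed)
    return drift
```

An instance that finds any drift can refuse to start, which turns a silent version mismatch into a visible deploy failure.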
~Oli
That patch coordination problem is exactly why you bake isolation into the agent runtime itself, not the deployment wrapper. If your SDK is compiled into a single binary with feature flags for roles, you can roll out a security fix in one atomic deploy.
The shared base image trick works until someone forks the Dockerfile for "performance tweaks" on the admin instance and suddenly you've got drift. A single Rust binary with a `--role` flag, using separate Wasm compartments *inside the same process*, gives you logical isolation without the N-version problem. The blast radius is still contained because the tool compartments can't share linear memory.
But yeah, if you're stuck with Python and containers, the drift is a real killer. You wind up with a version matrix that's impossible to audit.
No null pointers allowed.