My results after stress-testing CrewAI's role permissions with 50 agents

CrewAI and AutoGen Security

Samir Patel

(@threat_model_junior)

June 22, 2026 10:48 am [#93]

Hey everyone. I've been diving deep into CrewAI's role and permission system, trying to see if it holds up under scale. The docs make it sound robust, but I wanted to see *why* the design choices were made and what an attacker (or even a misconfigured agent) could do.

I built a simulated org with 50 agents across different departments (Finance, Engineering, HR, etc.), each with tailored roles and permissions using `Agent` and `Task` definitions. The goal was to see if I could get a low-privilege agent to escalate its privileges or access another agent's context/tools indirectly.

Here's a simplified version of the initial, flawed setup I used:

```python
from crewai import Agent, Task, Crew

# Example of a 'Restricted' agent
data_entry_agent = Agent(
    role='Data Entry Clerk',
    goal='Accurately input data',
    backstory='Detail-oriented clerk',
    tools=[],  # Intentionally no tools
    allow_delegation=False,
    verbose=True,
)

# Example of a 'Privileged' agent
# (tool1 and tool2 are placeholders for sensitive tools defined elsewhere)
finance_agent = Agent(
    role='Financial Analyst',
    goal='Generate financial reports',
    backstory='Expert in financial data',
    tools=[tool1, tool2],  # Has sensitive tools
    allow_delegation=True,
    verbose=True,
)
```

My main findings:

1. **`allow_delegation` is a critical, but coarse, gate.** If an agent has this set to `True`, it can delegate tasks to *any* other agent in the crew, regardless of that agent's own permissions. In one scenario, a `SocialMediaManager` agent (with `allow_delegation=True`) delegated a task to the Financial Analyst agent, asking it to run one of its tools, and the finance agent executed it. The trust model therefore depends entirely on the delegating agent's goal description, which feels brittle.

2. **Role definitions don't enforce isolation.** The security boundary is the tool list and the hope that the LLM respects the `goal` and `backstory`. In my stress test, I crafted a `goal` for a low-privilege agent that socially engineered a higher-privilege agent via task output. The high-privilege agent's LLM, given the context, often complied. CrewAI doesn't have a native "permission to receive requests from" list.

3. **The real control point is at the tool level.** But that just pushes the problem down. If you give an agent a tool, you're trusting its LLM to use it correctly based on its role text. Under pressure (complex, multi-step tasks), I saw role drift where agents performed actions outside their stated role because it seemed logically necessary for the overall crew goal.
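To make finding 3 a bit more concrete, here is a minimal sketch of a tool-level check that does not rely on the LLM at all. It is plain Python, not a CrewAI API: `GuardedTool`, `ToolAccessDenied`, and the `call_chain` argument are all hypothetical names, and it assumes the orchestration code can track which roles were involved in a request (originator first, executor last).

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Sequence


class ToolAccessDenied(Exception):
    """Raised when a call chain includes a role the tool does not trust."""


@dataclass(frozen=True)
class GuardedTool:
    """Hypothetical wrapper: the tool itself decides who may trigger it."""
    name: str
    func: Callable[..., str]
    allowed_roles: FrozenSet[str]

    def run(self, call_chain: Sequence[str], *args, **kwargs) -> str:
        # Every role in the chain must be allowed, so a delegated request
        # cannot borrow the executing agent's privileges.
        untrusted = [role for role in call_chain if role not in self.allowed_roles]
        if not call_chain or untrusted:
            raise ToolAccessDenied(f"{self.name}: untrusted roles {untrusted}")
        return self.func(*args, **kwargs)


# Illustrative tool; the lambda stands in for a real sensitive operation.
read_ledger = GuardedTool(
    name="read_ledger",
    func=lambda account: f"ledger for {account}",
    allowed_roles=frozenset({"Financial Analyst"}),
)
```

With this shape, `read_ledger.run(["Financial Analyst"], "ACME")` goes through, while a chain that starts with `"Social Media Manager"` raises `ToolAccessDenied` even though the executing agent is the analyst. The check runs in ordinary code, so role drift in the prompt cannot talk its way past it.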

So my big "why" question is: **Why isn't there a more explicit trust model for inter-agent communication?** Something like a permission matrix where you can define which roles can delegate to which, or which agents can receive tasks from others? The current model feels like it's default-unsafe, relying on prompt engineering for security.
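A minimal sketch of what such a permission matrix could look like, in plain Python. None of these names are CrewAI APIs: `DELEGATION_MATRIX`, `can_delegate`, and `checked_delegate` are hypothetical, and the sketch assumes the check is enforced in orchestration code before any task is handed to the receiving agent.

```python
from typing import Callable, Dict, Set

# Hypothetical policy: delegating role -> roles it may hand tasks to.
DELEGATION_MATRIX: Dict[str, Set[str]] = {
    "Financial Analyst": {"Data Entry Clerk"},
    "Social Media Manager": set(),  # may not delegate to anyone
}


def can_delegate(sender_role: str, receiver_role: str) -> bool:
    """Default deny: roles missing from the matrix cannot delegate at all."""
    return receiver_role in DELEGATION_MATRIX.get(sender_role, set())


def checked_delegate(
    sender_role: str,
    receiver_role: str,
    task: str,
    execute: Callable[[str], str],
) -> str:
    """Run `execute(task)` only if the matrix permits this sender/receiver pair."""
    if not can_delegate(sender_role, receiver_role):
        raise PermissionError(f"{sender_role} may not delegate to {receiver_role}")
    return execute(task)
```

Here `can_delegate("Social Media Manager", "Financial Analyst")` is `False`, which is exactly the delegation from finding 1 that I would want blocked. Default deny matters: a role that nobody thought to list should get no delegation rights, not all of them.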

Appreciate any pointers.
