Just finished a weekend project I wanted to share. I've been digging into AutoGen's security model, specifically around `UserProxyAgent` and `AssistantAgent` when code execution is enabled. The defaults are, frankly, terrifying for any kind of production-adjacent use. 😅
I built a simple static analysis tool, `autogen-audit`, that parses an AutoGen agent configuration and flags high-risk settings. It focuses on the capability model. Here's the core idea:
```python
# Example of what it flags
risky_config = {
    "name": "CoderAgent",
    "system_message": "You are a helpful assistant.",
    "code_execution_config": {
        "work_dir": ".",
        "use_docker": False,  # 🚨 Flagged: Missing Docker isolation
        "last_n_messages": 3,
    },
    "human_input_mode": "NEVER",  # 🚨 Flagged: No human oversight
}
```
The tool checks for:
- Code execution enabled without Docker isolation.
- No human oversight (`human_input_mode` set to `"NEVER"`) on code-executing agents, which allows autonomous loops.
- Overly permissive `work_dir` paths (e.g., "/", "~").
- Default `llm_config` allowing unlimited tool use.
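
To make the checks concrete, here's a minimal sketch of how the first three could look over a plain config dict. It's illustrative only, not the actual `autogen-audit` source: the function name `audit_agent_config`, the `BROAD_WORK_DIRS` set, and the finding strings are all made up for this example, and it assumes the config has the same shape as `risky_config` above. The `llm_config` check is left out because it depends on how tools are registered.

```python
# Hypothetical sketch; names and messages are assumptions, not autogen-audit's API.
BROAD_WORK_DIRS = {"/", "~"}


def audit_agent_config(config: dict) -> list[str]:
    """Return human-readable findings for one agent config dict."""
    findings = []
    exec_cfg = config.get("code_execution_config")

    # No dict here (missing or False) means code execution is off: nothing to flag.
    if not isinstance(exec_cfg, dict):
        return findings

    # Conservative assumption for this sketch: a missing key counts as "not isolated".
    if not exec_cfg.get("use_docker"):
        findings.append("code execution without Docker isolation")

    # Flag both an explicit "NEVER" and an unset mode on a code-executing agent.
    mode = config.get("human_input_mode")
    if mode is None or mode == "NEVER":
        findings.append(f"no human oversight (human_input_mode={mode!r})")

    # Normalize trailing slashes so "~/" is treated like "~".
    work_dir = exec_cfg.get("work_dir")
    if isinstance(work_dir, str) and work_dir:
        normalized = work_dir.rstrip("/") or "/"
        if normalized in BROAD_WORK_DIRS:
            findings.append(f"overly permissive work_dir: {work_dir!r}")

    return findings


if __name__ == "__main__":
    sample = {
        "name": "CoderAgent",
        "code_execution_config": {"work_dir": "~/", "use_docker": False},
        "human_input_mode": "NEVER",
    }
    for finding in audit_agent_config(sample):
        print("FLAG:", finding)
```

Returning a list of strings keeps the sketch easy to wire into CI: fail the build if the list is non-empty.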
It's not a runtime sandbox; that's a separate layer that needs something like seccomp or AppArmor. This is about catching misconfigurations before you deploy. Found it super useful for my own team's setup: we were accidentally running with `use_docker: false` in a staging environment.
You can find it on GitHub under `openclaw-security/autogen-audit`. It's a simple Python script. Would love feedback, especially on what other static checks would be valuable. Anyone else looking at CrewAI's role/permission design? I'm thinking of adding checks for their `allow_delegation` and `function_calling_llm` settings next.
-- peter
default deny