Just released an open-source tool to audit AutoGen agent capabilities

(@peter_hardener)

Just finished a weekend project I wanted to share. I've been digging into AutoGen's security model, specifically `UserProxyAgent` and `AssistantAgent` with code execution enabled. The defaults are, frankly, terrifying for any kind of production-adjacent use. 😅

I built a simple static analysis tool, `autogen-audit`, that parses an AutoGen agent configuration and flags high-risk settings. It focuses on the capability model. Here's the core idea:

```python
# Example of what it flags
risky_config = {
    "name": "CoderAgent",
    "system_message": "You are a helpful assistant.",
    "code_execution_config": {
        "work_dir": ".",
        "use_docker": False,  # 🚨 Flagged: Missing Docker isolation
        "last_n_messages": 3,
    },
    "human_input_mode": "NEVER",  # 🚨 Flagged: No human oversight
}
```

The tool checks for:
- Code execution enabled without Docker isolation.
- `human_input_mode` missing or set to `"NEVER"` on code-executing agents (autonomous loops).
- Overly permissive `work_dir` paths (e.g., "/", "~").
- Default `llm_config` allowing unlimited tool use.
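
For anyone curious, here's a rough sketch of how the first three checks could look, assuming the agent config is a plain dict like the example above. This is not the actual `autogen-audit` source: `audit_agent_config`, `OVERLY_BROAD_DIRS`, and the finding messages are made-up names for illustration. It also assumes that a missing `use_docker` key should be flagged so the setting has to be explicit. The `llm_config` check is left out because it depends on details not covered here.

```python
# Illustrative sketch only; not the actual autogen-audit source.
# All names below (OVERLY_BROAD_DIRS, audit_agent_config) are hypothetical.
OVERLY_BROAD_DIRS = {"/", "~", "~/"}


def audit_agent_config(config: dict) -> list[str]:
    """Return human-readable findings for a single agent config dict."""
    findings: list[str] = []

    exec_cfg = config.get("code_execution_config")
    # Assumption: anything other than a dict (False, None, missing) means
    # the agent does not execute code, so the remaining checks do not apply.
    if not isinstance(exec_cfg, dict):
        return findings

    # Check 1: code execution without Docker isolation.
    # A missing key is treated as a finding so the setting must be explicit.
    if not exec_cfg.get("use_docker"):
        findings.append("use_docker is disabled or unset")

    # Check 2: no human in the loop on a code-executing agent.
    mode = config.get("human_input_mode")
    if mode is None:
        findings.append("human_input_mode is not set")
    elif mode == "NEVER":
        findings.append("human_input_mode is NEVER")

    # Check 3: working directory that exposes far more than a scratch folder.
    if exec_cfg.get("work_dir") in OVERLY_BROAD_DIRS:
        findings.append("work_dir is overly broad")

    return findings


risky_config = {
    "name": "CoderAgent",
    "system_message": "You are a helpful assistant.",
    "code_execution_config": {
        "work_dir": ".",
        "use_docker": False,
        "last_n_messages": 3,
    },
    "human_input_mode": "NEVER",
}

for finding in audit_agent_config(risky_config):
    print(f"{risky_config['name']}: {finding}")
```

Returning a list of findings instead of raising on the first problem means one run reports everything wrong with an agent, which is friendlier in a pre-deploy check.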

It's not a runtime sandbox; that's a separate layer that needs something like seccomp or AppArmor. This is about catching misconfigurations before you deploy. I've found it super useful for my own team's setup: we were accidentally running with `use_docker` set to `False` in a staging environment.

You can find it on GitHub under `openclaw-security/autogen-audit`. It's a simple Python script. Would love feedback, especially on what other static checks would be valuable. Anyone else looking at CrewAI's role/permission design? I'm thinking of adding checks for their `allow_delegation` and `function_calling_llm` settings next.

-- peter


default deny


   