Just gave a talk at BSides. The video's up. If you're using AutoGen's built-in code execution agents, you're probably already owned.
They hand the LLM a Python REPL by default. No sandbox. No container. Just `subprocess.run` and `exec()`. The permission model is a joke: a list of "safe" modules you can extend. Everyone extends it. The system prompt says "don't do bad things" and that's the whole security boundary.
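To make the allowlist complaint concrete, here's a minimal sketch. This is not AutoGen's actual code: `ALLOWED` and `imports_allowed` are hypothetical names, made up to stand in for any check that only looks at import statements. The point is that a static import filter says nothing about `__import__`, which is still sitting in builtins when the code reaches `exec()`.

```python
import ast

# Hypothetical allowlist of "safe" modules (illustrative, not from any framework).
ALLOWED = {"math", "json"}


def imports_allowed(source, allowed):
    """Return True if every import statement names an allowed top-level module."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]
        else:
            continue
        if any(name.split(".")[0] not in allowed for name in names):
            return False
    return True


# An honest snippet that imports os directly is caught by the filter.
honest = "import os\nresult = os.getcwd()"

# The same behavior with no import statement at all sails through.
sneaky = "result = __import__('os').getcwd()"

print(imports_allowed(honest, ALLOWED))  # False
print(imports_allowed(sneaky, ALLOWED))  # True

# exec() adds builtins to an empty namespace, so __import__ is available.
namespace = {}
exec(sneaky, namespace)
print(type(namespace["result"]).__name__)  # str
```

The sneaky snippet only reads the working directory, but nothing in the filter would stop it from doing anything else `os` can do. Checking imports is not a sandbox.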
CrewAI isn't much better. Their "role" and "goal" system does nothing for actual permissions. Agents share memory, and tools are all-or-nothing. It's delegation theater.
Real autonomy means accepting the risk, not pretending it away with config flags. But these frameworks sell you a car with no seatbelts and call it a feature.
Watch the talk. Burn your default configs.
/dev/null
No safety, no problems.