Anyone else seeing memory leaks in AutoGen when running multiple code executors?

CrewAI and AutoGen Security

Last Post by Tom Mod 2 hours ago

1 Posts

1 Users

0 Reactions

0 Views

RSS

Tom Mod

(@mod_tom)

Eminent Member

Joined: 2 weeks ago

Posts: 20

Topic starter

Translate ▼

July 3, 2026 3:00 pm [#1337]

Hey folks,

I’ve been deep in the weeds this week stress-testing some AutoGen group chats that involve multiple `AssistantAgent` instances with code execution enabled (via `code_execution_config`). I’m running a fairly complex simulation with a planner, a coder, and a verifier agent, all needing to run Python snippets. After a few hours and several hundred inter-agent messages, I'm observing what looks like a significant memory leak. The Python process just slowly balloons until it either hits my resource limits or performance degrades to a crawl.

This isn't just a "my machine" thing—I've replicated it on two different setups (one local, one cloud). It seems particularly tied to the code execution flow. If I run similar workloads with code execution disabled, the memory usage is stable. The moment I let those agents run `exec()` or spin up Docker containers (depending on config), the leak starts.

Here’s a simplified version of the setup I'm using:

```python
from autogen import AssistantAgent, UserProxyAgent, GroupChat, GroupChatManager

code_execution_config = {
"work_dir": "coding",
"use_docker": False, # Also happens with Docker, sometimes worse
}

planner = AssistantAgent(
name="planner",
llm_config={"config_list": [...]},
code_execution_config=code_execution_config,
)
coder = AssistantAgent(
name="coder",
llm_config={"config_list": [...]},
code_execution_config=code_execution_config,
)

# ... GroupChat setup and initiation
```

My current hypothesis is that the code execution outputs, or perhaps the artifacts generated (files in `work_dir`), aren't being cleaned up properly between rounds. It might also be something lingering in the agent's internal message history, though I've tried clearing that manually without full resolution.

**What I've checked so far:**
* It's not the LLM client's cache (tried with different backends).
* The `work_dir` files are being written, but even manual deletion during runtime doesn't stop the leak.
* Monitoring shows Python's `memory_profiler` points to steady growth in objects related to the agent conversation loops.

Is anyone else running into this? Specifically with **multiple** code-executing agents? I'm curious if:
1. You've seen similar behavior.
2. You've found any workarounds—like periodically restarting certain agents, or a specific config flag I've missed.
3. You have theories on whether it's in the message history handling, the subprocess management for code exec, or something else.

This feels like a critical issue for any long-running, automated multi-agent system. If it's a known pattern, we should document a mitigation strategy. I'll be digging into the AutoGen source next week, but community intel would be invaluable.

- Tom (mod)

Quote

Topic Tags

80 Forums
1,339 Topics
7,843 Posts
10 Online
508 Members

Forum Icons: Forum contains no unread posts Forum contains unread posts

Topic Icons: Not Replied Replied Active Hot Sticky Unapproved Solved Private Closed