Anyone else seeing memory leaks in AutoGen when running multiple code executors?

(@mod_tom)
Eminent Member
Joined: 2 weeks ago
Posts: 20
Topic starter
  [#1337]

Hey folks,

I’ve been deep in the weeds this week stress-testing some AutoGen group chats that involve multiple `AssistantAgent` instances with code execution enabled (via `code_execution_config`). I’m running a fairly complex simulation with a planner, a coder, and a verifier agent, all needing to run Python snippets. After a few hours and several hundred inter-agent messages, I'm observing what looks like a significant memory leak. The Python process just slowly balloons until it either hits my resource limits or performance degrades to a crawl.

This isn't just a "my machine" thing—I've replicated it on two different setups (one local, one cloud). It seems particularly tied to the code execution flow. If I run similar workloads with code execution disabled, the memory usage is stable. The moment I let those agents run `exec()` or spin up Docker containers (depending on config), the leak starts.

Here’s a simplified version of the setup I'm using:

```python
from autogen import AssistantAgent, UserProxyAgent, GroupChat, GroupChatManager

code_execution_config = {
    "work_dir": "coding",
    "use_docker": False,  # Also happens with Docker, sometimes worse
}

planner = AssistantAgent(
    name="planner",
    llm_config={"config_list": [...]},
    code_execution_config=code_execution_config,
)
coder = AssistantAgent(
    name="coder",
    llm_config={"config_list": [...]},
    code_execution_config=code_execution_config,
)

# ... GroupChat setup and initiation
```

My current hypothesis is that the code execution outputs, or perhaps the artifacts generated (files in `work_dir`), aren't being cleaned up properly between rounds. It might also be something lingering in the agent's internal message history, though I've tried clearing that manually without full resolution.
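
For anyone who wants to test the message-history theory on their own setup, here is a rough sketch of how the history size could be tracked between rounds. It assumes each agent exposes a `chat_messages` mapping of peer to a list of message dicts with a `content` key; `history_footprint` is a hypothetical helper of mine, not an AutoGen function, and the demo uses a stand-in object rather than a real agent.

```python
from types import SimpleNamespace


def history_footprint(agent):
    """Return (message_count, total_content_chars) for one agent.

    Assumption: the agent has a `chat_messages` attribute shaped like
    {peer: [{"content": "..."}, ...]}. Anything else counts as empty.
    """
    histories = getattr(agent, "chat_messages", None) or {}
    count = 0
    chars = 0
    for messages in histories.values():
        for message in messages:
            count += 1
            content = message.get("content")
            if isinstance(content, str):  # content may be None for tool calls
                chars += len(content)
    return count, chars


# Stand-in for a real agent, only to show the expected shape.
fake_agent = SimpleNamespace(
    chat_messages={"coder": [{"content": "print(1)"}, {"content": None}]}
)
print(history_footprint(fake_agent))
```

Logging this every N rounds next to the process memory should show whether the two curves move together. If memory keeps climbing while the footprint stays flat after clearing, the history is probably not the main culprit.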

**What I've checked so far:**
* It's not the LLM client's cache (tried with different backends).
* The `work_dir` files are being written, but even manual deletion during runtime doesn't stop the leak.
* Profiling with `memory_profiler` shows steady memory growth attributed to the agent conversation loops.
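
To narrow the growth down to specific allocation sites, a snapshot comparison with the standard library's `tracemalloc` is one option. This is a minimal sketch, not what I ran above: the loop that fills `retained` is only a stand-in for a batch of group-chat rounds, and `top_growth` is a hypothetical helper name.

```python
import tracemalloc


def top_growth(before, after, limit=5):
    """Return (location, bytes_grown) for the allocation sites that grew most."""
    diffs = after.compare_to(before, "lineno")
    return [(str(d.traceback), d.size_diff) for d in diffs[:limit] if d.size_diff > 0]


tracemalloc.start()
before = tracemalloc.take_snapshot()

# Stand-in workload: replace with a batch of agent rounds.
# It deliberately retains roughly 1 MB of strings.
retained = []
for i in range(1000):
    retained.append("x" * 1000 + str(i))

after = tracemalloc.take_snapshot()
tracemalloc.stop()

growth = top_growth(before, after)
for location, grown in growth:
    print(f"{location}: +{grown / 1024:.0f} KiB")
```

Taking a snapshot every few hundred messages and diffing against the first one should point at the file and line that keeps allocating, which would help separate history growth from anything held by the execution path. Note that `tracemalloc` only sees Python-level allocations, so memory held by child processes or containers will not show up here.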

Is anyone else running into this? Specifically with **multiple** code-executing agents? I'm curious if:
1. You've seen similar behavior.
2. You've found any workarounds—like periodically restarting certain agents, or a specific config flag I've missed.
3. You have theories on whether it's in the message history handling, the subprocess management for code exec, or something else.
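
On the "periodically restarting agents" idea, this is the kind of pattern I have in mind, sketched without any AutoGen calls. `Recycler` and `build_team` are hypothetical names; in a real setup the factory would construct the agents (and the group chat that references them) from scratch.

```python
import gc
from typing import Callable, Generic, TypeVar

T = TypeVar("T")


class Recycler(Generic[T]):
    """Rebuild an object from its factory after a fixed number of uses."""

    def __init__(self, factory: Callable[[], T], max_uses: int) -> None:
        if max_uses < 1:
            raise ValueError("max_uses must be at least 1")
        self._factory = factory
        self._max_uses = max_uses
        self._uses = 0
        self._current = factory()
        self.rebuilds = 0

    def get(self) -> T:
        """Return the current object, replacing it first if it is worn out."""
        if self._uses >= self._max_uses:
            self._current = self._factory()  # drop our reference to the old one
            self._uses = 0
            self.rebuilds += 1
            gc.collect()  # also collect reference cycles left by the old object
        self._uses += 1
        return self._current


def build_team():
    # Stand-in for constructing planner/coder/verifier agents.
    return {"planner": object(), "coder": object()}


team = Recycler(build_team, max_uses=3)
for _ in range(7):
    agents = team.get()
print(team.rebuilds)
```

This only frees memory if nothing else still points at the old agents, so anything that holds the agent list would have to be rebuilt in the same factory. It is a mitigation rather than a fix, and it would not help if the growth lives outside the Python objects being replaced.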

This feels like a critical issue for any long-running, automated multi-agent system. If it's a known pattern, we should document a mitigation strategy. I'll be digging into the AutoGen source next week, but community intel would be invaluable.

- Tom (mod)



   