AutoGen’s two MCP integration paths both have blocking failures, and the framework is in maintenance mode
What you expect
AutoGen (microsoft/autogen, 55K stars) is a multi-agent framework where you install the package, connect MCP servers using the documented integration API, and build multi-agent workflows. Microsoft’s documentation presents it as the current recommended framework for building production agentic systems.
What actually happens
AutoGen is an ecosystem fragmented four ways, with active package-name collisions, two incompatible MCP integration surfaces that each have a blocking failure, and a maintenance-mode announcement that many builders appear to have missed.
The package you install is not the package you expect
pip install autogen installs the AG2 community fork (ag2-ai/ag2, 4.2K stars), not Microsoft’s AutoGen 0.4. Microsoft’s current version requires pip install autogen-agentchat.
The four surfaces in the ecosystem:
| Surface | Status | Install command |
|---|---|---|
| AutoGen 0.4 | Current Microsoft version | pip install autogen-agentchat |
| AutoGen 0.2 legacy | Superseded | pip install pyautogen (now Microsoft’s — reclaimed July 2025) |
| AG2 fork | Community fork, active | pip install autogen or pip install ag2 |
| Semantic Kernel | Microsoft enterprise path | Via SK packages |
The pyautogen name was reclaimed by Microsoft in July 2025 — it now installs autogen-agentchat, not AG2. Any codebase that pinned pyautogen for AG2 before July 2025 will silently pull Microsoft’s incompatible package on a fresh install. The autogen name remains the AG2 collision point.
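Because the import name alone does not say which distribution supplied it, one way to check an environment is to query the installed package metadata directly. The following is a minimal sketch using the standard library's importlib.metadata; the distribution names come from the table above, and the helper name installed_distributions is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names taken from the table above.
CANDIDATES = ("autogen-agentchat", "pyautogen", "autogen", "ag2")


def installed_distributions(names=CANDIDATES):
    """Return {distribution name: version} for each listed name that is installed."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # Not installed in this environment; skip it.
            pass
    return found


if __name__ == "__main__":
    for name, ver in installed_distributions().items():
        print(f"{name}=={ver}")
```

If more than one of these names shows up in the same environment, it is worth resolving that before debugging anything else.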
Microsoft placed AutoGen in maintenance mode (October 2025)
Microsoft’s migration guide confirms it: AutoGen 0.4 received its last release (v0.7.5) in September 2025 and now gets bug fixes and security patches only, with no new features. Microsoft recommends transitioning to Microsoft Agent Framework within 6-12 months. The repository had 637 open issues as of March 2026.
MCP integration has two surfaces, both broken in different ways
AutoGen offers two MCP integration paths:
| Surface | Schema handling | Windows/Jupyter |
|---|---|---|
| mcp_server_tools() | Crashes on $ref/$defs schemas (Issue #7129) | Works |
| McpWorkbench | Handles $ref/$defs correctly | Infinite loop (Issue #6534) |
$ref/$defs patterns appear in any MCP tool schema with nested or recursive types — they are not edge cases. mcp_server_tools() crashes as soon as you connect a non-trivial MCP server. Switching to McpWorkbench fixes schema handling but breaks Windows/Jupyter environments due to asyncio’s missing _make_subprocess_transport. There is no single path that works across all inputs and all platforms.
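To find out whether a given server's tool schemas contain these keywords, a recursive walk over the parsed schema is enough. This is a minimal sketch that assumes the schema is already loaded as Python dicts and lists; the function name uses_refs and both example schemas are illustrative, not taken from AutoGen or any MCP server.

```python
def uses_refs(schema):
    """Return True if a parsed JSON Schema contains a "$ref" or "$defs" key at any depth."""
    if isinstance(schema, dict):
        if "$ref" in schema or "$defs" in schema:
            return True
        return any(uses_refs(value) for value in schema.values())
    if isinstance(schema, list):
        return any(uses_refs(item) for item in schema)
    return False


# Hypothetical flat schema: no references.
flat = {"type": "object", "properties": {"query": {"type": "string"}}}

# Hypothetical recursive schema: a tree node that refers to itself.
nested = {
    "type": "object",
    "properties": {"root": {"$ref": "#/$defs/Node"}},
    "$defs": {
        "Node": {
            "type": "object",
            "properties": {
                "children": {"type": "array", "items": {"$ref": "#/$defs/Node"}}
            },
        }
    },
}

print(uses_refs(flat))    # False
print(uses_refs(nested))  # True
```

The check is deliberately coarse: it only reports that the keywords occur, which is the condition the issue describes, and does not try to resolve the references.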
Speaker selection is non-deterministic in production
speaker_selection_method="auto" is unstable under real conditions. A documented production case: GroupChatManager skipped the critic agent across multiple runs, then looped back to the researcher agent three consecutive times without deterministic cause (Issue #7275).
Switching to round_robin eliminates the instability but removes the LLM-based coordination that is AutoGen’s core value proposition.
No contract tests exist for termination behavior — it varies with timing and tool-response ordering.
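In the absence of contract tests, a rough substitute is to run the same workflow many times and count how many distinct speaker orders come out. The sketch below assumes a hypothetical run_once callable that executes one workflow run and returns the ordered list of agent names that spoke; it does not call any AutoGen API, so wiring it to a real group chat is left to the reader.

```python
from collections import Counter


def speaker_order_histogram(run_once, runs=50):
    """Run a workflow repeatedly and count each distinct speaker order observed.

    run_once is a hypothetical zero-argument callable that performs one run
    and returns the sequence of agent names in the order they spoke.
    """
    counts = Counter()
    for _ in range(runs):
        counts[tuple(run_once())] += 1
    return counts


def is_stable(histogram):
    """A workflow is treated as stable here if every run produced the same order."""
    return len(histogram) == 1
```

More than one key in the histogram means the order varied between runs; the keys themselves show which agents were skipped or repeated.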
Observability gap: no per-call traces without monkey-patching
AutoGen emits only top-level OTel spans. During multi-step tool loops, there is no per-call visibility.
Getting per-call traces requires monkey-patching three levels into private internals (confirmed via Langfuse Issue #11505). You see that a workflow started and finished; you cannot see what happened between those points via standard observability tooling.
Security defaults are permissive
LocalCommandLineCodeExecutor runs code directly on the host without sandboxing. AutoGen v0.7.5 (September 2025) added explicit warnings to it and made DockerCommandLineCodeExecutor the documented recommended default.
The MCP integration has no fail-closed mode for untrusted servers (Issue #7266). Malformed or malicious tool responses are processed without validation.
What this means for you
If you are evaluating AutoGen today: you are evaluating a framework that will get no new features. Microsoft’s own migration window is 6-12 months, and migrating a multi-agent system means adopting a new orchestration model in the target framework, not just refactoring.
If you are already using AutoGen with MCP: your integration path has a blocking failure that depends on your server’s schemas and your platform. No upstream fix is confirmed for either issue, and maintenance mode limits upstream work to bug fixes and security patches.
If you installed autogen from PyPI: you have the AG2 community fork, which has its own breaking changes (temperature and top_p cannot be set simultaneously, breaking existing llm_config objects — not documented in release notes) and separate issues from Microsoft’s version.
The observability gap means debugging multi-agent failures requires accepting partial visibility. If your multi-agent workflow produces wrong results, you cannot trace the cause through standard monitoring without invasive monkey-patching.
What to do
- Verify your installed package. Run python -c "import autogen; print(autogen.__version__, autogen.__file__)". If the path points to an ag2 directory, you have the community fork, not Microsoft’s.
- For new projects: evaluate LangGraph or Microsoft Agent Framework instead of AutoGen 0.4. AutoGen 0.4’s maintenance mode means MCP spec evolution (post-Linux Foundation move) will not be reflected in the framework.
- For existing AutoGen MCP integrations:
  - Test your MCP server schemas for $ref/$defs patterns before choosing between mcp_server_tools() and McpWorkbench.
  - If on Windows/Jupyter: mcp_server_tools() is the only viable path, with the schema limitation.
  - If on Linux/macOS with non-trivial schemas: McpWorkbench is required.
- For speaker selection: use round_robin for any workflow where agent execution order is meaningful. Do not use auto in production unless you have tested termination behavior across 50+ runs with your specific tool configuration.
- Replace LocalCommandLineCodeExecutor with DockerCommandLineCodeExecutor in all existing deployments that run user-controlled or LLM-generated code.
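The MCP path guidance in this list can be written down as a small decision function. This is a sketch that only encodes the two issues cited in this finding; the function name and its string return values are hypothetical, and both inputs are plain booleans the caller has to determine (for example by scanning tool schemas and checking the runtime environment).

```python
def choose_mcp_surface(schema_has_refs, windows_or_jupyter):
    """Map the two failure conditions described above to an integration path.

    Returns "mcp_server_tools", "McpWorkbench", "either", or "none".
    """
    if schema_has_refs and windows_or_jupyter:
        # $ref/$defs breaks mcp_server_tools(); Windows/Jupyter breaks McpWorkbench.
        return "none"
    if windows_or_jupyter:
        return "mcp_server_tools"
    if schema_has_refs:
        return "McpWorkbench"
    # Neither failure condition applies.
    return "either"


print(choose_mcp_surface(schema_has_refs=True, windows_or_jupyter=False))  # McpWorkbench
print(choose_mcp_surface(schema_has_refs=True, windows_or_jupyter=True))   # none
```

The "none" case is the combination with no working path: a server with $ref/$defs schemas used from Windows or Jupyter.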
Falsification criterion: This finding would be disproved by a new AutoGen release (>v0.7.5) that exits maintenance mode, patches both MCP integration surfaces (schema handling and Windows asyncio), and ships deterministic termination contract tests.
Evidence
| Tool | Version | Evidence | Result |
|---|---|---|---|
| microsoft/autogen | v0.7.5 (Sept 2025) | source-reviewed | Maintenance mode confirmed from Microsoft migration guide; last release Sept 2025, 637 open issues |
| AutoGen Issue #7129 | v0.7.5 | source-reviewed | mcp_server_tools() crashes on MCP tool schemas with $ref/$defs |
| AutoGen Issue #6534 | v0.7.5 | source-reviewed | McpWorkbench infinite loop on Windows/Jupyter (asyncio missing _make_subprocess_transport) |
| AutoGen Issue #7275 | v0.7.5 | source-reviewed | Termination non-determinism; no contract tests; speaker_selection_method=auto skips/repeats agents |
| AutoGen Issue #7266 | v0.7.5 | source-reviewed | Permissive MCP security defaults; no fail-closed mode for untrusted servers |
| PyPI autogen | v0.12.1 (Apr 2026) | independently-confirmed | pip install autogen installs AG2 fork (ag2-ai/ag2), not microsoft/autogen |
| PyPI pyautogen | reclaimed July 2025 | independently-confirmed | Microsoft reclaimed pyautogen; now installs autogen-agentchat |
| Microsoft migration guide | Oct 2025 | source-reviewed | AutoGen in maintenance mode; 6-12 month migration window recommended |
| Langfuse Issue #11505 | — | source-reviewed | Per-call OTel traces require monkey-patching 3 levels into private AutoGen internals |
Confidence: empirical — 9 sources reviewed. PyPI autogen and PyPI pyautogen independently confirm the naming collision; Microsoft’s migration guide independently confirms maintenance mode.
Strongest case against: AutoGen 0.4 has 55K stars and production deployments at scale. Magentic-UI actively builds on the 0.4 architecture. The MCP issue tracker bugs are open but not confirmed as blockers for all server types — builders whose MCP servers do not use $ref/$defs will not hit Issue #7129. The maintenance mode announcement is from October 2025; continued security patch releases mean it remains deployable for security-sensitive use cases. Microsoft Agent Framework is less mature and less documented than AutoGen 0.4, so the migration path carries its own risk.
Open questions: Whether the $ref/$defs crash is present in all AutoGen 0.4 versions or was introduced at a specific patch level. Whether the Windows asyncio issue in McpWorkbench was present in all 0.4 releases or is a regression. Whether Microsoft Agent Framework has reached feature parity with AutoGen’s GroupChat pattern as of May 2026.