Mastering AutoGen Group Chat for Collaborative AI Workflows is one of the most powerful techniques you can adopt when building multi-agent systems that require coordination, debate, and shared decision-making. Unlike simple two-agent pipelines, AutoGen’s group chat architecture lets multiple specialized agents collaborate within a shared conversation, delegating tasks dynamically and producing results no single agent could achieve alone. This guide walks you through everything you need to know — from the core architecture to advanced speaker selection and nested chat patterns.
Understanding GroupChatManager and Agents
At the heart of AutoGen's multi-agent collaboration sit two key components: the `GroupChat` object and the `GroupChatManager`.
`GroupChat` is a data container. It holds the list of participating agents, the conversation history, the speaker-selection strategy, and configuration such as the maximum number of rounds. `GroupChatManager` is itself a `ConversableAgent` that acts as the conversation orchestrator: it reads the current chat state, applies the speaker-selection logic, and routes messages to the next agent.
Every agent in the group is a standard ConversableAgent (or a subclass like AssistantAgent or UserProxyAgent). Each agent has its own system prompt defining its role, its own LLM configuration, and optionally its own tool set.
```
User ──► GroupChatManager
               │
     ┌─────────┼─────────┐
     ▼         ▼         ▼
  Planner    Coder    Reviewer
```
The manager does not answer questions itself — it only decides who speaks next. This separation of concerns is what makes the architecture composable and scalable.
If you are coming from LangChain or CrewAI, the mental model is similar to a supervisor agent, but AutoGen makes the routing logic explicit and overridable. For a comparison of agent orchestration approaches, see Getting Started with CrewAI: Multi-Agent Workflows in Python.
Building Your First Multi-Agent Chat
The following example creates a three-agent group: a Planner, a Coder, and a Critic. Install dependencies first:
```bash
pip install pyautogen openai
```
Set your API key:
```bash
export OPENAI_API_KEY="sk-..."
```
Now build the group chat:
```python
import autogen

# Shared LLM config
llm_config = {
    "model": "gpt-4o",
    "api_key": "YOUR_API_KEY",  # or use os.environ["OPENAI_API_KEY"]
    "temperature": 0.1,
}

# Define agents
planner = autogen.AssistantAgent(
    name="Planner",
    system_message=(
        "You are a planning expert. Break down the user's request into "
        "clear, numbered steps. Do not write code; that is the Coder's job."
    ),
    llm_config=llm_config,
)

coder = autogen.AssistantAgent(
    name="Coder",
    system_message=(
        "You are a Python expert. Implement the plan provided by the Planner. "
        "Always wrap code in ```python blocks. "
        "Make code complete and runnable."
    ),
    llm_config=llm_config,
)

critic = autogen.AssistantAgent(
    name="Critic",
    system_message=(
        "You are a code reviewer. Review the Coder's output for bugs, "
        "edge cases, and clarity. Suggest specific improvements."
    ),
    llm_config=llm_config,
)

# Human proxy to initiate the conversation
user_proxy = autogen.UserProxyAgent(
    name="User",
    human_input_mode="NEVER",  # fully automated
    code_execution_config=False,
    max_consecutive_auto_reply=0,
)

# Wire up the group chat
group_chat = autogen.GroupChat(
    agents=[user_proxy, planner, coder, critic],
    messages=[],
    max_round=12,
    speaker_selection_method="auto",  # LLM-driven selection
)

manager = autogen.GroupChatManager(
    groupchat=group_chat,
    llm_config=llm_config,
)

# Kick off the conversation
user_proxy.initiate_chat(
    manager,
    message=(
        "Build a Python function that fetches the top 5 Hacker News "
        "stories using the public API."
    ),
)
```
When you run this, the manager reads the conversation history and decides that the Planner should speak first, then the Coder, then the Critic — cycling until max_round is reached or an agent replies with TERMINATE.
The TERMINATE signal is the standard AutoGen exit convention. Any agent can end the loop by including TERMINATE in its response. Add this to the Critic’s prompt to stop after approval:
```python
critic = autogen.AssistantAgent(
    name="Critic",
    system_message=(
        "Review the Coder's output. If it looks correct and complete, "
        "reply with: 'Looks good. TERMINATE'. "
        "Otherwise, list specific improvements."
    ),
    llm_config=llm_config,
)
```
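You can also give the initiating proxy an explicit termination check via the `is_termination_msg` parameter. Below is a minimal, library-free sketch of such a predicate; it mirrors AutoGen's default behavior of stopping on content that ends with TERMINATE, and the dict shape follows AutoGen's message format:

```python
# A predicate matching the shape of AutoGen's `is_termination_msg` hook:
# it receives a message dict and returns True when the run should stop.
def is_termination_msg(message: dict) -> bool:
    content = message.get("content") or ""
    return content.rstrip().endswith("TERMINATE")
```

Pass it when constructing the proxy, e.g. `UserProxyAgent(..., is_termination_msg=is_termination_msg)`, so the loop ends as soon as any agent signals completion.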
Customizing Agent Speaker Selection
The default "auto" speaker selection asks the manager’s LLM to choose who speaks next based on context. This works well for small groups but can become unpredictable with 5+ agents. AutoGen offers four strategies:
| Strategy | Behavior |
|---|---|
| `"auto"` | LLM decides based on conversation context |
| `"round_robin"` | Strict rotation through the agent list |
| `"random"` | Random selection each round |
| Custom callable | Your Python function decides |
Round-Robin for Predictable Pipelines
When your workflow has a fixed sequence (plan → code → review → deploy), round-robin is more reliable than LLM selection:
```python
group_chat = autogen.GroupChat(
    agents=[user_proxy, planner, coder, critic],
    messages=[],
    max_round=9,
    speaker_selection_method="round_robin",
)
```
Agents will speak in the order they appear in the agents list, cycling back after the last agent.
Custom Selection with a Callable
For fine-grained control, pass a Python function. The function receives the last speaker and the current GroupChat instance, and returns the next agent:
```python
def custom_selector(
    last_speaker: autogen.Agent, groupchat: autogen.GroupChat
) -> autogen.Agent:
    """Route based on the identity of the last speaker."""
    agents_by_name = {a.name: a for a in groupchat.agents}
    if last_speaker.name == "User":
        return agents_by_name["Planner"]
    elif last_speaker.name == "Planner":
        return agents_by_name["Coder"]
    elif last_speaker.name == "Coder":
        return agents_by_name["Critic"]
    else:
        # Critic loops back to Coder unless TERMINATE was sent
        last_msg = groupchat.messages[-1]["content"]
        if "TERMINATE" in last_msg:
            return agents_by_name["User"]  # User won't auto-reply, ending the run
        return agents_by_name["Coder"]

group_chat = autogen.GroupChat(
    agents=[user_proxy, planner, coder, critic],
    messages=[],
    max_round=15,
    speaker_selection_method=custom_selector,
)
```
Custom selection is particularly useful when you want to gate progress — for example, only moving from Coder to Critic once the code block contains no syntax errors.
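As a sketch of such a gate, the helper below (the function name is ours, not part of AutoGen) extracts fenced Python blocks from a message and checks that they parse with the standard library's `ast` module:

```python
import ast
import re

def code_is_syntactically_valid(message_content: str) -> bool:
    """Return True when every fenced ```python block in a message parses cleanly."""
    blocks = re.findall(r"```python\n(.*?)```", message_content, re.DOTALL)
    if not blocks:
        return False  # no code yet, keep the Coder speaking
    try:
        for block in blocks:
            ast.parse(block)
    except SyntaxError:
        return False
    return True
```

Inside a custom selector, you would route from Coder to Critic only when this returns True for the Coder's last message, and otherwise send the turn back to the Coder.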
Allowed Speaker Transitions
AutoGen's GroupChat supports allowed_or_disallowed_speaker_transitions, a dictionary mapping each agent to the list of agents it can hand off to:
```python
transitions = {
    user_proxy: [planner],
    planner: [coder],
    coder: [critic],
    critic: [coder, user_proxy],  # can send back to coder or end
}

group_chat = autogen.GroupChat(
    agents=[user_proxy, planner, coder, critic],
    messages=[],
    max_round=15,
    speaker_selection_method="auto",
    allowed_or_disallowed_speaker_transitions=transitions,
    speaker_transitions_type="allowed",
)
```
This enforces your workflow topology at the framework level, catching routing errors before they reach the LLM.
Managing Complex Workflows with Nested Chats
When a single group chat grows too large (many agents, long histories, high token cost), nested chats let you decompose the work into sub-conversations. An outer group chat handles high-level coordination, while inner chats tackle focused subtasks.
AutoGen supports nested chats through register_nested_chats, which binds a trigger condition to a sub-workflow:
```python
import autogen

llm_config = {"model": "gpt-4o", "api_key": "YOUR_API_KEY"}

# --- Inner chat: specialized code-and-test loop ---
inner_coder = autogen.AssistantAgent(
    name="InnerCoder",
    system_message="Write clean Python code for the given task.",
    llm_config=llm_config,
)

inner_tester = autogen.AssistantAgent(
    name="InnerTester",
    system_message=(
        "Write pytest unit tests for the code. "
        "If tests pass conceptually, reply TERMINATE."
    ),
    llm_config=llm_config,
)

inner_proxy = autogen.UserProxyAgent(
    name="InnerProxy",
    human_input_mode="NEVER",
    code_execution_config={"work_dir": "coding", "use_docker": False},
    max_consecutive_auto_reply=5,
)

inner_group = autogen.GroupChat(
    agents=[inner_proxy, inner_coder, inner_tester],
    messages=[],
    max_round=8,
)

inner_manager = autogen.GroupChatManager(
    groupchat=inner_group,
    llm_config=llm_config,
)

# --- Outer chat: high-level coordination ---
coordinator = autogen.AssistantAgent(
    name="Coordinator",
    system_message=(
        "You coordinate the project. When you need code written, "
        "output exactly: 'DELEGATE_CODE: <task description>'. "
        "Otherwise synthesize results and reply TERMINATE when done."
    ),
    llm_config=llm_config,
)

outer_proxy = autogen.UserProxyAgent(
    name="OuterUser",
    human_input_mode="NEVER",
    code_execution_config=False,
    max_consecutive_auto_reply=3,
)

# Register the nested chat on the agent that *receives* the delegation
# request. Trigger callables receive the sender agent; here the nested chat
# fires when the Coordinator's latest message contains the marker.
def trigger_nested(sender):
    messages = outer_proxy.chat_messages.get(sender, [])
    return bool(messages) and "DELEGATE_CODE:" in (messages[-1].get("content") or "")

outer_proxy.register_nested_chats(
    [
        {
            "recipient": inner_manager,
            # message callables receive (recipient, messages, sender, config);
            # forward only the task description after the marker
            "message": lambda recipient, messages, sender, config: (
                messages[-1]["content"].split("DELEGATE_CODE:", 1)[-1].strip()
            ),
            "summary_method": "last_msg",
            "max_turns": 1,
        }
    ],
    trigger=trigger_nested,
)

# Start the outer workflow
outer_proxy.initiate_chat(
    coordinator,
    message="Build and test a function that converts Celsius to Fahrenheit.",
)
```
Key concepts here:
- `register_nested_chats` accepts a list of chat configs and a trigger: an agent or a callable that returns True when the nested chat should fire.
- `summary_method` controls how the nested chat's result is injected back into the outer conversation. `"last_msg"` injects the final message; `"reflection_with_llm"` uses an LLM to summarize.
- `carryover` passes context from the outer chat to the inner chat automatically.
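As a sketch of the LLM-based summary option (the prompt text is ours; `summary_args` and `summary_prompt` follow pyautogen's initiate-chat conventions), a nested chat config could look like:

```python
# Config fragment for a nested chat that summarizes with an LLM instead of
# forwarding the raw final message. `inner_manager` is the GroupChatManager
# from the example above.
nested_chat_config = {
    "recipient": inner_manager,
    "summary_method": "reflection_with_llm",
    "summary_args": {
        "summary_prompt": (
            "Summarize the final code and test outcome in two sentences."
        )
    },
    "max_turns": 1,
}
```

The summarizer uses the recipient's LLM config, so keep the prompt short and specific to control token cost.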
Nested chats are ideal for workflows where the outer agents make strategic decisions and the inner agents handle implementation details — keeping context windows manageable and token usage predictable.
For broader comparisons of frameworks that support agentic orchestration, check out LangChain vs LlamaIndex: Which Should You Use in 2026?, which covers how different ecosystems handle multi-step agent flows.
Production Tips
- Set max_round conservatively. Start with 10–12 and increase only if needed. Runaway loops are expensive.
- Log all messages. Keep a reference to the GroupChat and read group_chat.messages after the run to audit what happened.
- Use cache_seed in llm_config during development to avoid duplicate API calls: "cache_seed": 42.
- Token budgeting. Each agent in a group receives the full conversation history. With 5 agents and 20 rounds, you can quickly exhaust a context window. Use nested chats to limit scope.
```python
# Enable caching for dev
llm_config = {
    "model": "gpt-4o",
    "api_key": "YOUR_API_KEY",
    "cache_seed": 42,  # deterministic for debugging
}
```
Frequently Asked Questions
How is GroupChatManager different from a regular AssistantAgent?
GroupChatManager is a specialized ConversableAgent whose only job is orchestration — it reads the group’s conversation history and decides which agent speaks next. It does not contribute domain knowledge or write code. A regular AssistantAgent has a system prompt that makes it answer questions or perform tasks. You should never give GroupChatManager a custom system prompt, as this can corrupt its routing behavior.
Can I mix agents with different LLM backends in the same group?
Yes. Each agent has its own llm_config, so one agent can be backed by GPT-4o and another by an Anthropic Claude model within the same GroupChat. The GroupChatManager itself needs an LLM config only when using speaker_selection_method="auto". When using "round_robin" or a custom callable, the manager needs no LLM.
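As an illustrative sketch (the model identifiers, key placeholders, and the `api_type` field are assumptions based on pyautogen's multi-provider `config_list` support), per-agent configs might look like:

```python
# Two independent LLM configs; each is passed to a different agent.
openai_config = {
    "config_list": [{"model": "gpt-4o", "api_key": "OPENAI_KEY_HERE"}],
    "temperature": 0.1,
}

anthropic_config = {
    "config_list": [
        {
            "model": "claude-3-5-sonnet-20240620",
            "api_type": "anthropic",
            "api_key": "ANTHROPIC_KEY_HERE",
        }
    ],
}

# planner = autogen.AssistantAgent(name="Planner", llm_config=openai_config)
# coder = autogen.AssistantAgent(name="Coder", llm_config=anthropic_config)
```

Mixing backends this way lets you put an expensive model on reasoning-heavy roles and a cheaper one on mechanical roles.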
What happens when max_round is reached before TERMINATE is sent?
AutoGen simply stops the conversation and returns control to the initiating agent. No exception is raised. You can check group_chat.messages to inspect the final state. If work is incomplete, increase max_round or add explicit TERMINATE logic to your agent prompts.
How do I give a specific agent access to tools (function calling) in a group chat?
Register tools on the individual agent using register_for_llm and register_for_execution, exactly as you would in a two-agent setup. The group chat mechanics are transparent to tool registration — the agent will call its tools when it speaks, and results are injected back into the shared conversation:
```python
@coder.register_for_llm(description="Run a shell command and return output.")
@user_proxy.register_for_execution()
def run_shell(command: str) -> str:
    import subprocess
    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    return result.stdout or result.stderr
```
Is AutoGen’s group chat suitable for production workloads?
AutoGen group chat works well for medium-complexity pipelines where agents have clear roles and the number of rounds is bounded. For long-running production workloads, consider adding persistence (save/resume group_chat.messages), structured logging, and retry logic around LLM calls. The nested chat pattern described above also helps by keeping individual sub-conversations short and focused, reducing the risk of context overflow in high-throughput scenarios.
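A minimal persistence sketch, assuming only that `group_chat` is any object with a `messages` list (like the GroupChat instances built earlier) and that the file path is illustrative:

```python
import json

def save_history(group_chat, path: str) -> None:
    """Dump the shared conversation history to disk as JSON."""
    with open(path, "w") as f:
        json.dump(group_chat.messages, f, indent=2)

def load_history(path: str) -> list:
    """Reload a saved history, e.g. to seed a new run."""
    with open(path) as f:
        return json.load(f)
```

A later run can then be seeded with the saved state, e.g. `autogen.GroupChat(agents=[...], messages=load_history("run.json"), ...)`, so agents resume with the prior context already in place.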