Getting started with AutoGen means picking up one of the most powerful open-source frameworks for orchestrating collaborative AI agents. Microsoft AutoGen lets you define multiple agents with distinct roles, have them converse with each other, and collectively solve tasks that a single LLM call simply cannot handle. In this guide you’ll go from zero to a working multi-agent system in under an hour.
What is Microsoft AutoGen?
Microsoft AutoGen is an open-source framework that enables you to build applications where multiple AI agents collaborate through structured conversations. Released by Microsoft Research, AutoGen sits in a different category from single-agent tools: instead of one assistant answering one question, you orchestrate a team of agents that plan, critique, code, and verify each other’s outputs.
The key insight behind AutoGen is that complex tasks benefit from role specialization. A dedicated “Planner” agent can break a goal into sub-tasks, a “Coder” agent can write Python, an “Executor” agent can run it, and a “Critic” agent can review the result — all inside a single automated conversation loop.
If you’ve explored autonomous agent projects before, you may have already read about What Is AutoGPT? The Autonomous AI Agent Explained. AutoGen takes a more structured, conversation-centric approach compared to AutoGPT’s goal-pursuit loop, making it particularly well-suited for engineering and data workflows where auditability matters.
Why use AutoGen?
- Agents can be powered by any OpenAI-compatible model (GPT-4o, Claude, local Ollama endpoints).
- Human-in-the-loop is a first-class feature — you can pause automation at any point and ask for approval.
- It ships with built-in code execution: agents can write Python and run it in a sandboxed subprocess automatically.
- Group chat support lets you scale beyond two agents to an entire “team”.
Setting Up Your Development Environment
AutoGen requires Python 3.10 or higher. Create an isolated virtual environment before installing anything.
# Create and activate a virtual environment
python -m venv autogen-env
source autogen-env/bin/activate # macOS / Linux
# autogen-env\Scripts\activate # Windows
# Install AutoGen with the optional extras we'll need
pip install "pyautogen[openai]"
Verify the installation:
python -c "import autogen; print(autogen.__version__)"
Next, store your API key. AutoGen reads it from an environment variable or a config file. The config-file approach is safer for sharing code:
# llm_config.py — never commit this file
import os

llm_config = {
    "model": "gpt-4o",
    "api_key": os.environ.get("OPENAI_API_KEY"),
    "temperature": 0.2,
}
Export your key in the shell:
export OPENAI_API_KEY="sk-..."
Understanding AutoGen’s Core Concepts
Before writing code, it pays to understand the three building blocks AutoGen provides.
ConversableAgent
ConversableAgent is the base class. Every agent in AutoGen inherits from it. You configure each agent with:
- name — a unique identifier used in chat logs
- system_message — the role prompt that shapes behavior
- llm_config — which model to use (or False to skip the LLM entirely)
- human_input_mode — "NEVER", "TERMINATE", or "ALWAYS"
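As a sketch, here is a bare ConversableAgent wired with these four options (the agent name and system message are illustrative; with llm_config=False it never calls a model, which is handy for testing conversation plumbing cheaply):

```python
import autogen

# An LLM-free placeholder agent — useful for verifying wiring
# before spending tokens on real model calls.
echo_agent = autogen.ConversableAgent(
    name="EchoBot",                    # unique identifier in chat logs
    system_message="You are a placeholder agent used for wiring tests.",
    llm_config=False,                  # skip the LLM entirely
    human_input_mode="NEVER",          # never pause for a human
)
```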
AssistantAgent and UserProxyAgent
AutoGen ships two ready-to-use subclasses:
- AssistantAgent — an LLM-backed agent that generates text and code. human_input_mode defaults to "NEVER" so it runs fully automated.
- UserProxyAgent — represents the human side of a conversation. It can execute code on your machine and optionally ask for real human input. Set human_input_mode="NEVER" to let it run autonomously.
Termination Conditions
Conversations end when an agent’s reply contains a termination string. The default is "TERMINATE" — any message that includes that word stops the chat. You can customize this with the is_termination_msg parameter.
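For example, you can swap the default for any predicate you like. The sketch below looks for a DONE sentinel instead (the sentinel and function name are illustrative choices, not part of AutoGen's API):

```python
def ends_with_done(msg: dict) -> bool:
    """Treat any reply whose content ends with 'DONE' as a termination signal.

    AutoGen passes each message as a dict. 'content' can be None
    (e.g. for tool-call messages), so guard before string operations.
    """
    content = (msg.get("content") or "").strip()
    return content.endswith("DONE")

# Passed to an agent as: is_termination_msg=ends_with_done
```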
Building Your First Multi-Agent System
Let’s build a two-agent system where an AssistantAgent writes Python code and a UserProxyAgent executes it automatically.
Goal: Ask the pair to calculate the first 20 Fibonacci numbers and plot them.
# first_agents.py
import autogen
import os

# --- LLM configuration ---
llm_config = {
    "model": "gpt-4o",
    "api_key": os.environ.get("OPENAI_API_KEY"),
    "temperature": 0.1,
}

# --- Define the assistant ---
assistant = autogen.AssistantAgent(
    name="Coder",
    system_message=(
        "You are a helpful Python programmer. "
        "Write clean, complete, and runnable Python scripts. "
        "When the task is done, end your reply with TERMINATE."
    ),
    llm_config=llm_config,
)

# --- Define the user proxy (code executor) ---
user_proxy = autogen.UserProxyAgent(
    name="Executor",
    human_input_mode="NEVER",        # fully automated
    max_consecutive_auto_reply=5,    # safety limit
    # 'content' can be None for tool-call messages, so guard against it
    is_termination_msg=lambda msg: "TERMINATE" in (msg.get("content") or ""),
    code_execution_config={
        "work_dir": "coding_workspace",  # code runs here
        "use_docker": False,             # set True for sandboxed execution
    },
)

# --- Start the conversation ---
user_proxy.initiate_chat(
    assistant,
    message=(
        "Calculate the first 20 Fibonacci numbers and save a bar chart "
        "of them to 'fibonacci.png'. Print the list of numbers too."
    ),
)
Run it:
python first_agents.py
What happens under the hood:
- Executor sends the task to Coder.
- Coder replies with a Python script inside a code block.
- Executor detects the code block, runs it in coding_workspace/, and sends the stdout/stderr back.
- If there’s an error, Coder debugs and retries automatically.
- Once the chart is saved, Coder replies with TERMINATE and the chat ends.
Enhancing Agents with Tools and Skills
A two-agent pair is powerful, but AutoGen really shines when you add a group chat with specialized roles. Let’s extend the example to a three-agent system that plans, codes, and reviews.
# multi_agent_team.py
import autogen
import os

llm_config = {
    "model": "gpt-4o",
    "api_key": os.environ.get("OPENAI_API_KEY"),
    "temperature": 0.2,
}

# --- Specialized agents ---
planner = autogen.AssistantAgent(
    name="Planner",
    system_message=(
        "You are a project planner. Break the user's task into clear steps. "
        "Do not write code yourself. Hand off each step to the Coder. "
        "When all steps are complete, summarize and say TERMINATE."
    ),
    llm_config=llm_config,
)

coder = autogen.AssistantAgent(
    name="Coder",
    system_message=(
        "You are a Python expert. Implement exactly what the Planner requests. "
        "Always include complete, self-contained code blocks."
    ),
    llm_config=llm_config,
)

critic = autogen.AssistantAgent(
    name="Critic",
    system_message=(
        "You are a code reviewer. After the Coder writes code, check for bugs, "
        "edge cases, and style issues. Approve good code or request fixes."
    ),
    llm_config=llm_config,
)

executor = autogen.UserProxyAgent(
    name="Executor",
    human_input_mode="NEVER",
    max_consecutive_auto_reply=8,
    # guard against None content (e.g. tool-call messages)
    is_termination_msg=lambda msg: "TERMINATE" in (msg.get("content") or ""),
    code_execution_config={
        "work_dir": "team_workspace",
        "use_docker": False,
    },
)

# --- Group chat wires them together ---
group_chat = autogen.GroupChat(
    agents=[executor, planner, coder, critic],
    messages=[],
    max_round=20,
    speaker_selection_method="auto",  # AutoGen decides who speaks next
)

manager = autogen.GroupChatManager(
    groupchat=group_chat,
    llm_config=llm_config,
)

# --- Kick off the team ---
executor.initiate_chat(
    manager,
    message=(
        "Build a Python CLI tool that fetches the current Bitcoin price "
        "from a public API and displays it with color-coded output "
        "(green if up from yesterday, red if down)."
    ),
)
Registering Custom Tools
You can give any agent a callable tool using the @register_for_llm / @register_for_execution decorators, or the register_function helper, which applies both in a single call:
from autogen import AssistantAgent, UserProxyAgent, register_function
import requests
import os

llm_config = {
    "model": "gpt-4o",
    "api_key": os.environ.get("OPENAI_API_KEY"),
}

def get_bitcoin_price() -> dict:
    """Fetch the current BTC/USD price from CoinGecko."""
    url = "https://api.coingecko.com/api/v3/simple/price"
    params = {"ids": "bitcoin", "vs_currencies": "usd", "include_24hr_change": "true"}
    response = requests.get(url, params=params, timeout=10)
    response.raise_for_status()
    return response.json()["bitcoin"]

assistant = AssistantAgent(
    name="CryptoAssistant",
    system_message="You help users track cryptocurrency prices. Use the provided tool.",
    llm_config=llm_config,
)

user_proxy = UserProxyAgent(
    name="User",
    human_input_mode="NEVER",
    max_consecutive_auto_reply=3,
    code_execution_config=False,  # no code execution needed here
)

# Register the function so the LLM can call it and the proxy can execute it
register_function(
    get_bitcoin_price,
    caller=assistant,
    executor=user_proxy,
    name="get_bitcoin_price",
    description="Returns the current Bitcoin price in USD and its 24-hour change percentage.",
)

user_proxy.initiate_chat(
    assistant,
    message="What is Bitcoin's current price and how has it moved in the last 24 hours?",
)
This pattern keeps tool execution safely in the UserProxyAgent while the AssistantAgent decides when and how to call it — a clean separation of reasoning from side effects.
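The same separation can be sketched in plain Python, independent of AutoGen: the "reasoning" side only ever proposes a call as data, and the "execution" side is the one place real callables run. The registry and call format below are illustrative, not AutoGen internals:

```python
# A toy registry: the executor side owns the only references to real callables.
TOOLS = {}

def register_tool(func):
    """Make func available for execution by name."""
    TOOLS[func.__name__] = func
    return func

@register_tool
def add(a: int, b: int) -> int:
    """A trivial example tool."""
    return a + b

def execute_tool_call(call: dict):
    """Run a proposed call. The reasoning side only ever produces dicts
    like {"name": ..., "args": ...}; it never touches the callables."""
    return TOOLS[call["name"]](**call["args"])

# The assistant side proposes; the executor side runs.
proposed = {"name": "add", "args": {"a": 2, "b": 3}}
result = execute_tool_call(proposed)  # → 5
```

In AutoGen the caller/executor split of register_function plays exactly these two roles, with the LLM producing the proposal.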
Comparing AutoGen to Other Frameworks
If you’re evaluating frameworks, AutoGen’s conversation-centric model contrasts with Getting Started with CrewAI: Multi-Agent Workflows in Python. CrewAI structures agents around “tasks” and “crews” with a more declarative API, while AutoGen gives you finer control over the conversation flow — a difference that matters on complex, conditional pipelines.
For persistence and retrieval, you can pair AutoGen agents with a vector store. The patterns covered in Build a RAG Pipeline with LangChain and Pinecone apply equally well when you want agents to query a knowledge base before answering.
Frequently Asked Questions
Does AutoGen require OpenAI? Can I use local models?
No, AutoGen is not locked to OpenAI. Any OpenAI-compatible endpoint works. To use a local model via Ollama, set "base_url": "http://localhost:11434/v1" and "api_key": "ollama" in your llm_config. Supported local models include LLaMA 3, Mistral, and Phi-3, though code-generation quality varies by model size.
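A minimal llm_config for a local Ollama endpoint might look like this (the model name is an example; use whatever you have pulled locally):

```python
llm_config = {
    "model": "llama3",                        # any model you've pulled with `ollama pull`
    "base_url": "http://localhost:11434/v1",  # Ollama's OpenAI-compatible endpoint
    "api_key": "ollama",                      # placeholder; Ollama ignores the value
}
```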
Is it safe to let agents execute code automatically?
AutoGen’s UserProxyAgent executes code in a subprocess on your machine by default. For production use, always set "use_docker": True in code_execution_config to sandbox execution inside a container. With Docker disabled, never combine human_input_mode="NEVER" with tasks that could trigger destructive file operations — review the generated code before it runs.
How do I prevent agents from running indefinitely?
Set max_consecutive_auto_reply on UserProxyAgent to cap how many automated turns can occur without human input, and always define is_termination_msg so the chat stops cleanly. The max_round parameter in GroupChat provides a hard ceiling for multi-agent conversations.
Can I add memory so agents remember previous conversations?
Out of the box, each initiate_chat call starts with a fresh message history. For persistent memory, you can prepend a summary to the initial message, use AutoGen’s ConversableAgent.chat_messages to extract history, or integrate a vector store so agents retrieve relevant past context before each session.
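Here is a minimal sketch of the "prepend a summary" approach, assuming you have already extracted a list of message dicts from a previous session (the helper below is illustrative, not an AutoGen API):

```python
def build_context_preamble(history: list, max_turns: int = 3) -> str:
    """Condense the last few messages into a preamble for the next session."""
    recent = history[-max_turns:]
    lines = [f"{m.get('name', m.get('role', 'agent'))}: {m['content']}" for m in recent]
    return "Context from the previous session:\n" + "\n".join(lines)

history = [
    {"name": "Executor", "content": "Task: plot Fibonacci numbers."},
    {"name": "Coder", "content": "Saved chart to fibonacci.png. TERMINATE"},
]
preamble = build_context_preamble(history)
# Then seed the next session with it:
# user_proxy.initiate_chat(assistant, message=preamble + "\n\nNew task: ...")
```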
What’s the difference between GroupChat and a simple two-agent chat?
A two-agent chat (agent_a.initiate_chat(agent_b, ...)) is a direct, alternating back-and-forth. GroupChat adds a GroupChatManager that uses an LLM to decide which agent speaks next, enabling non-linear, role-specialized conversations. Use two-agent chats for simple task delegation; use GroupChat when you need a Planner → Coder → Reviewer pipeline or any workflow with more than two roles.