Multi-agent systems

As LLM workflows grow, a single agent that has accumulated 30+ tools, a bloated system prompt, and multiple responsibilities starts to degrade: it calls the wrong tools, hallucinates, and exhausts its context window. The fix is to split the work across several agents that each specialize in one area, coordinated either by a “manager” agent or by peer-to-peer messaging.

Do you need one?

If your workflow lives in one domain with a focused set of tools (fewer than roughly 10 to 15), better prompting and context engineering will almost always beat the complexity of managing multiple agents. Try these first:

Structured outputs to reduce ambiguity
Better retrieval (improved chunking, more relevant datasets)
Tighter constraints and clearer role definitions

Multi-agent systems pay off when you have:

Agents needing 20+ tools that start calling the wrong ones
Cross-domain coordination (e.g. documentation writing + product building)
Context-window exhaustion you can’t solve with retrieval alone
Clear role-delegation boundaries

If you can’t clearly articulate why you need multiple agents, keep it simple with one. To measure and improve a system you do build, see Observability; to route work across providers or models, see Model routing.

Manager-worker pattern

Rig supports the manager-worker pattern out of the box: because an Agent implements Tool, you can add one agent to another as a tool. The manager then delegates subtasks to workers and aggregates their results.

graph TD
    A[User Request] --> B[Manager Agent]
    B --> C{Task Planning}
    C --> D[Decompose into Subtasks]
    D --> E[Worker 1]
    D --> F[Worker 2]
    D --> G[Worker 3]
    E --> K[Manager Agent<br/>Aggregation]
    F --> K
    G --> K
    K --> M[Final Response]
    style B fill:#E85102,color:#000A07
    style E fill:#27EAA6,color:#000A07
    style F fill:#27EAA6,color:#000A07
    style G fill:#27EAA6,color:#000A07
    style K fill:#E85102,color:#000A07

Here Alice manages Bob. Alice is built with .tool(bob), so the model can call Bob as a tool:

use rig::{
    client::{CompletionClient, ProviderClient},
    completion::Prompt,
    providers::openai,
};

/// Manager-worker pattern with two agents, Alice (manager) and Bob (worker).
/// This awaits multiple LLM responses sequentially, so it may take a while.
async fn manager_worker_agent() -> Result<(), Box<dyn std::error::Error>> {
    let openai_client = openai::Client::from_env()?;

    let bob = openai_client
        .agent("gpt-5.5")
        .name("Bob")
        .description("An employee who works in admin at FooBar Inc.")
        .preamble(
            "You are Bob, an employee in admin at FooBar Inc. Alice, your manager, \
             may ask you to do things. You need to do them.",
        )
        .build();

    let alice = openai_client
        .agent("gpt-5.5")
        .name("Alice")
        .description("A manager at FooBar Inc.")
        .preamble("You are a manager in the admin department at FooBar Inc. You manage Bob.")
        .tool(bob)
        .build();

    let res = alice
        .prompt("Ask Bob to write an email for you and let me know what he has written.")
        .await?;

    println!("Response: {res}");
    Ok(())
}

Under the hood, Alice's model returns a tool call that targets Bob and carries a prompt the model wrote itself. Rig prompts Bob with it, feeds his response back to Alice as the tool result, and Alice returns the final answer.
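The delegation loop described above can be sketched without any LLM or Rig dependency. This is an illustration of the control flow only, not Rig's implementation: the `Worker` trait, `StubWorker`, `ModelTurn`, and `Manager` are hypothetical names, and the "model" is a stub that always delegates to Bob once and then wraps up his result.

```rust
use std::collections::HashMap;

/// Anything the manager can call by name. Hypothetical trait, not Rig's `Tool`.
trait Worker {
    fn name(&self) -> &str;
    fn handle(&self, prompt: &str) -> String;
}

/// Stand-in for Bob. A real worker would prompt an LLM here.
struct StubWorker {
    name: String,
}

impl Worker for StubWorker {
    fn name(&self) -> &str {
        &self.name
    }

    fn handle(&self, prompt: &str) -> String {
        format!("[{}] done: {}", self.name, prompt)
    }
}

/// What the manager's model can return on each turn.
enum ModelTurn {
    /// The model wants a worker to run, with a prompt the model wrote itself.
    ToolCall { worker: String, prompt: String },
    /// The model is finished and has a final answer.
    Final(String),
}

struct Manager {
    workers: HashMap<String, Box<dyn Worker>>,
}

impl Manager {
    fn new(workers: Vec<Box<dyn Worker>>) -> Self {
        let workers = workers
            .into_iter()
            .map(|w| (w.name().to_string(), w))
            .collect();
        Manager { workers }
    }

    /// Stubbed model (assumption): delegate to "Bob" once, then summarize.
    fn model_turn(&self, user_prompt: &str, tool_result: Option<&str>) -> ModelTurn {
        match tool_result {
            None => ModelTurn::ToolCall {
                worker: "Bob".to_string(),
                prompt: format!("Please handle: {user_prompt}"),
            },
            Some(result) => ModelTurn::Final(format!("Bob reported: {result}")),
        }
    }

    /// Loop until the model produces a final answer, running the requested
    /// worker in between and feeding its output back as the tool result.
    fn run(&self, user_prompt: &str) -> Result<String, String> {
        let mut tool_result: Option<String> = None;
        loop {
            match self.model_turn(user_prompt, tool_result.as_deref()) {
                ModelTurn::Final(answer) => return Ok(answer),
                ModelTurn::ToolCall { worker, prompt } => {
                    let w = self
                        .workers
                        .get(&worker)
                        .ok_or_else(|| format!("unknown worker: {worker}"))?;
                    tool_result = Some(w.handle(&prompt));
                }
            }
        }
    }
}
```

Calling `Manager::new(vec![bob]).run("write an email")` with a boxed `StubWorker` named Bob walks through exactly one delegation round trip. If the model names a worker that was never registered, `run` returns an error instead of panicking, which is the failure mode worth handling first in a real manager.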

Swarm behaviour with the actor pattern

Rig doesn’t ship a swarm primitive, but you can build one with the actor pattern: each agent runs its own loop, receives messages from peers over a channel, and can be nudged by an external trigger. Only the Rig-specific pieces are shown below — wiring up the channels and run loop is standard Tokio.

Define the messages exchanged between agents and the actor that owns a Rig client:

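use rig::client::CompletionClient; // brings `.agent(...)` into scope, as in the manager-worker example
use rig::completion::{Prompt, PromptError};
use rig::providers::openai;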
use std::sync::Arc;
use tokio::sync::{mpsc, RwLock};

/// Messages exchanged between agents.
#[derive(Debug, Clone)]
enum AgentMessage {
    Task(String),
    Response(String, String), // (from_agent_id, content)
    Trigger(String),
    Shutdown,
}

struct AgentState {
    conversation_history: Vec<String>,
}

/// An actor-based autonomous agent. Each holds its own Rig client, an inbox,
/// and channels to its peers.
struct AutonomousAgent {
    id: String,
    client: openai::Client,
    state: Arc<RwLock<AgentState>>,
    inbox: mpsc::Receiver<AgentMessage>,
    peer_channels: Arc<RwLock<Vec<mpsc::Sender<AgentMessage>>>>,
}

The one method that actually talks to an LLM builds a Rig agent on demand and prompts it. This is where you’d attach tools, RAG context, or memory to each swarm participant:

impl AutonomousAgent {
    /// Process a task with an LLM. Add `.tool(...)` here to give the agent capabilities.
    async fn process_autonomous_task(&self, task: &str) -> Result<String, PromptError> {
        let agent = self
            .client
            .agent("gpt-5.5")
            .preamble(&format!(
                "Your name is {}. Process tasks autonomously and coordinate with other agents.",
                self.id
            ))
            .build();

        agent.prompt(task).await
    }

    /// Broadcast a message to every registered peer.
    async fn broadcast_to_peers(&self, message: AgentMessage) {
        for peer in self.peer_channels.read().await.iter() {
            let _ = peer.send(message.clone()).await;
        }
    }
}

The run loop uses tokio::select! to react to either an inbound message or a periodic self-check timer — whichever fires first. On a Task, the agent calls process_autonomous_task, records the result, and broadcasts it to peers; on Shutdown it breaks the loop:

use tokio::time::{interval, Duration};

impl AutonomousAgent {
    async fn run(mut self) {
        let mut tick = interval(Duration::from_secs(10));
        loop {
            tokio::select! {
                // `recv` yields `None` once every sender is dropped; exit then,
                // otherwise the loop would idle on the timer forever.
                maybe_msg = self.inbox.recv() => {
                    let Some(msg) = maybe_msg else { break };
                    match msg {
                        AgentMessage::Shutdown => break,
                        AgentMessage::Task(task) => {
                            if let Ok(result) = self.process_autonomous_task(&task).await {
                                self.state.write().await.conversation_history.push(result.clone());
                                self.broadcast_to_peers(
                                    AgentMessage::Response(self.id.clone(), result),
                                ).await;
                            }
                        }
                        AgentMessage::Trigger(msg) => {
                            let _ = self.process_autonomous_task(&msg).await;
                        }
                        AgentMessage::Response(from, content) => {
                            self.state.write().await
                                .conversation_history.push(format!("From {from}: {content}"));
                        }
                    }
                },
                // Periodic self-check, driven by the agent's own timer rather than a message
                _ = tick.tick() => {
                    // e.g. summarize progress, enqueue follow-up work, etc.
                }
            }
        }
    }
}

To wire up a swarm, create one mpsc::channel per agent, register each agent’s sender with its peers, tokio::spawn every run() future, then seed the system with a Task and finish with Shutdown messages — all ordinary Tokio orchestration.
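The wiring steps above can be sketched as follows. To keep the example dependency-free, it swaps Tokio for `std::thread` and `std::sync::mpsc`, replaces the LLM call with a hypothetical `process_stub`, and omits the `Trigger` variant and the timer; the topology (one channel per agent, each agent holding its peers' senders, seed with a `Task`, finish with `Shutdown`) is the same. `run_agent` and `run_swarm` are illustrative names, not part of Rig.

```rust
use std::sync::mpsc::{self, Receiver, Sender};
use std::thread;

#[derive(Debug, Clone)]
enum AgentMessage {
    Task(String),
    Response(String, String), // (from_agent_id, content)
    Shutdown,
}

/// Stand-in for the LLM call (assumption: a real agent would prompt a model).
fn process_stub(id: &str, task: &str) -> String {
    format!("{id} handled '{task}'")
}

/// One agent's loop. Returns its conversation history when it stops.
fn run_agent(
    id: String,
    inbox: Receiver<AgentMessage>,
    peers: Vec<Sender<AgentMessage>>,
) -> Vec<String> {
    let mut history = Vec::new();
    // `recv` returns `Err` once every sender is dropped, which also ends the loop.
    while let Ok(msg) = inbox.recv() {
        match msg {
            AgentMessage::Shutdown => break,
            AgentMessage::Task(task) => {
                let result = process_stub(&id, &task);
                history.push(result.clone());
                for peer in &peers {
                    let _ = peer.send(AgentMessage::Response(id.clone(), result.clone()));
                }
            }
            AgentMessage::Response(from, content) => {
                history.push(format!("From {from}: {content}"));
            }
        }
    }
    history
}

/// Wire up `n` agents, seed agent 0 with `task`, shut everything down,
/// and return each agent's history in agent order.
fn run_swarm(n: usize, task: &str) -> Vec<Vec<String>> {
    // 1. One channel per agent.
    let (senders, receivers): (Vec<Sender<AgentMessage>>, Vec<Receiver<AgentMessage>>) =
        (0..n).map(|_| mpsc::channel()).unzip();

    // 2. Give each agent the senders of every other agent, then spawn it.
    let mut handles = Vec::new();
    for (i, inbox) in receivers.into_iter().enumerate() {
        let peers: Vec<Sender<AgentMessage>> = senders
            .iter()
            .enumerate()
            .filter(|(j, _)| *j != i)
            .map(|(_, s)| s.clone())
            .collect();
        handles.push(thread::spawn(move || run_agent(format!("agent-{i}"), inbox, peers)));
    }

    // 3. Seed the system with a single task.
    if let Some(first) = senders.first() {
        let _ = first.send(AgentMessage::Task(task.to_string()));
    }

    // 4. Shut agents down one at a time, joining each before stopping the next,
    //    so agent 0's broadcast is queued ahead of its peers' Shutdown messages.
    let mut histories = Vec::new();
    for (sender, handle) in senders.iter().zip(handles) {
        let _ = sender.send(AgentMessage::Shutdown);
        histories.push(handle.join().expect("agent thread panicked"));
    }
    histories
}
```

The staged shutdown in step 4 is a choice made for this sketch so the result is deterministic: sending `Shutdown` to every agent at once could stop a peer before it reads the seeded agent's `Response`. In a Tokio version the same ordering concern applies to `tokio::spawn` tasks and their `JoinHandle`s.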