Agents

An Agent is Rig’s primary building block for working with LLMs. It bundles a completion model together with a system prompt, optional context documents, and a set of tools, then runs the agent loop for you: it sends your prompt to the model, executes any tools the model calls, feeds the results back, and repeats until the model produces an answer. Reach for an agent whenever you want to prompt a model without writing that loop by hand — from a simple chatbot to a tool-using assistant or a RAG system.

A minimal agent

use rig::client::{CompletionClient, ProviderClient};
use rig::completion::Prompt;
use rig::providers::openai;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let openai = openai::Client::from_env()?;

    let agent = openai
        .agent("gpt-5.5")
        .preamble("You are a helpful assistant.")
        .temperature(0.7)
        .build();

    let response = agent.prompt("Hello!").await?;
    println!("{response}");

    Ok(())
}

Hello! How can I help you today?

You build an agent with client.agent(model).preamble(...).build() and run it with agent.prompt(...).await?. Everything else on this page is optional configuration layered on top.

What an agent is made of

Base configuration — the completion model, the system prompt (preamble), and parameters like temperature and max_tokens.
Context — documents appended to the request. Static context is always sent; dynamic context is retrieved from a vector store per request.
Tools — capabilities the model can call. Static tools are always offered; dynamic tools are retrieved from a vector store per request.
Conversation memory (optional) — a backend that loads prior history before each prompt and saves the new turn after it. See Conversations and memory.

How the agent loop works

Everything an agent does is one loop. When you call .prompt(...), Rig:

1. Builds a completion request from the preamble, static context, any dynamic context retrieved from a vector store, the conversation history, and the definitions of every tool on offer.
2. Sends it to the model. One request/response round trip is a turn.
3. Inspects the response.
   - If the model answered with text, the loop ends and that text is your result.
   - If the model requested tool calls, Rig executes each one (parsing the JSON arguments into your Args type, running your call implementation, and appending the result to the conversation as a tool-result message). A model may request several tool calls in a single turn; Rig runs them and returns all the results together.
4. Repeats from step 2 with the updated history, so the model can use the tool results, until it produces a text answer or the turn budget runs out.

flowchart TD
    P["agent.prompt(...)"] --> M["Send request to model<br/>(preamble + context + history + tools)"]
    M --> D{"Model response"}
    D -- "text" --> F["Return the answer"]
    D -- "tool calls" --> T["Rig executes the tools,<br/>appends results to history"]
    T --> B{"Turn budget left?"}
    B -- "yes" --> M
    B -- "no" --> E["Err(MaxTurnsError)"]

These docs call this the agent loop; you may also see “tool-calling loop” elsewhere — same thing.
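
The steps above can be sketched in plain Rust without any Rig types. This is a toy model under stated assumptions, not Rig's implementation: `ModelResponse`, `ToolCall`, `run_loop`, `scripted_model`, and `add_one` are hypothetical names, history is just a list of strings, and `max_requests` simply caps the number of model calls rather than reproducing how Rig counts `max_turns`.

```rust
// Toy model of the agent loop. None of these types are Rig's.

/// What the (fake) model returns for one request.
enum ModelResponse {
    Text(String),
    ToolCalls(Vec<ToolCall>),
}

struct ToolCall {
    name: String,
    args: i64,
}

#[derive(Debug, PartialEq)]
enum LoopError {
    /// The request budget ran out before the model produced text.
    BudgetExhausted { history: Vec<String> },
}

/// Calls the model, executes requested tools, appends their results, and
/// repeats until text arrives or `max_requests` model calls have been made.
fn run_loop(
    prompt: &str,
    max_requests: usize,
    mut model: impl FnMut(&[String]) -> ModelResponse,
    tool: impl Fn(&ToolCall) -> Result<String, String>,
) -> Result<String, LoopError> {
    let mut history = vec![format!("user: {prompt}")];

    for _ in 0..max_requests {
        match model(&history) {
            ModelResponse::Text(answer) => return Ok(answer),
            ModelResponse::ToolCalls(calls) => {
                for call in &calls {
                    // A tool error is not fatal here: its text becomes the tool result.
                    let result = tool(call).unwrap_or_else(|e| format!("error: {e}"));
                    history.push(format!("tool {}: {result}", call.name));
                }
            }
        }
    }

    Err(LoopError::BudgetExhausted { history })
}

/// Scripted stand-in for a model: requests `add_one` until the history holds
/// `rounds` tool results, then answers with text.
fn scripted_model(rounds: usize) -> impl FnMut(&[String]) -> ModelResponse {
    move |history: &[String]| {
        let tool_results = history.iter().filter(|m| m.starts_with("tool ")).count();
        if tool_results < rounds {
            ModelResponse::ToolCalls(vec![ToolCall {
                name: "add_one".to_string(),
                args: tool_results as i64,
            }])
        } else {
            ModelResponse::Text(format!("done after {tool_results} tool calls"))
        }
    }
}

/// The only tool this sketch knows about.
fn add_one(call: &ToolCall) -> Result<String, String> {
    if call.name == "add_one" {
        Ok((call.args + 1).to_string())
    } else {
        Err(format!("unknown tool `{}`", call.name))
    }
}
```

In this sketch, `run_loop("hi", 3, scripted_model(2), add_one)` returns the text answer on the third model call, while a budget of 2 against `scripted_model(5)` returns `BudgetExhausted` carrying the user message and the two tool results gathered so far.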

Turns and `max_turns`

The turn budget is the agent’s guard against runaway loops: every turn costs latency and tokens, so the loop refuses to run forever. By default (max_turns of 0) an agent can make its initial request plus one follow-up after tool execution — enough for a prompt that triggers a single round of tool calls. If the model keeps chaining tools past the budget, the prompt fails with PromptError::MaxTurnsError, which carries the history accumulated so far so you can inspect what happened.

Give a prompt more headroom with .max_turns(n):

let res = tool_agent
    .prompt("Please calculate 2 + 5, then multiply the result by 3")
    .max_turns(5) // allow several rounds of tool calls
    .await?;

println!("{res}");

2 + 5 = 7, and 7 × 3 = 21.

You can pass any limit — Rig imposes no upper bound — but pick a number that matches the longest tool chain you actually expect, so a confused model fails fast instead of burning tokens. To set a default for every prompt instead of per request, use AgentBuilder::default_max_turns(n). For handling the error itself, see Error Handling.

When a tool call goes wrong

Tool calls can fail in two distinct ways, and Rig handles each differently:

Your tool returns an error. The error’s string representation is sent back to the model as the tool result, and the loop continues — the model sees what went wrong and can retry with different arguments or explain the failure. Make your tool errors descriptive; the model reads them.
The model emits an invalid call — a tool name that doesn’t exist, or one that isn’t allowed. By default this fails the prompt immediately. To recover instead, implement on_invalid_tool_call on a prompt hook and return a Retry (re-ask the model with corrective feedback), Repair (fix the tool name), or Skip action; .max_invalid_tool_call_retries(n) on the prompt request bounds how many Retry rounds are allowed, and each retry consumes turn budget.
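
Because the model reads the error's string form, it helps to write `Display` messages that say what went wrong and what to change. The sketch below uses only the standard library; `DivideError`, `divide`, and `tool_result_text` are hypothetical names, and `tool_result_text` only imitates the "error text becomes the tool result" behavior described above rather than calling into Rig.

```rust
use std::fmt;

/// Hypothetical error type for a division tool.
#[derive(Debug)]
enum DivideError {
    DivisionByZero { numerator: f64 },
    NotFinite { value: f64 },
}

impl fmt::Display for DivideError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            // Say what failed and how the caller can fix it.
            DivideError::DivisionByZero { numerator } => write!(
                f,
                "cannot divide {numerator} by zero; pass a non-zero `divisor`"
            ),
            DivideError::NotFinite { value } => write!(
                f,
                "argument {value} is not a finite number; pass a finite value"
            ),
        }
    }
}

impl std::error::Error for DivideError {}

fn divide(numerator: f64, divisor: f64) -> Result<f64, DivideError> {
    for value in [numerator, divisor] {
        if !value.is_finite() {
            return Err(DivideError::NotFinite { value });
        }
    }
    if divisor == 0.0 {
        return Err(DivideError::DivisionByZero { numerator });
    }
    Ok(numerator / divisor)
}

/// What the model would read in this toy model: the value on success,
/// the error's string form on failure.
fn tool_result_text(result: Result<f64, DivideError>) -> String {
    match result {
        Ok(value) => value.to_string(),
        Err(err) => err.to_string(),
    }
}
```

A message like "cannot divide 1 by zero; pass a non-zero `divisor`" gives the model enough to retry with different arguments, which a bare "invalid input" would not.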

To observe or veto individual tool calls yourself — for logging, approval flows, or guardrails — use a prompt hook.

Context

Agents can pull context from more than one source and append it to every provider request automatically:

Static context (AgentBuilder::context()) is appended to every request — good for a small, fixed set of reference material.
Dynamic context (AgentBuilder::dynamic_context()) fetches up to a set number of results from a vector store and adds them to the request — the basis of RAG.

let store = InMemoryVectorStore::from_documents(embeddings);
let index = store.index(embedding_model);

let agent = openai
    .agent("gpt-5.5")
    .preamble("You are a knowledge base assistant.")
    .dynamic_context(3, index) // retrieve the 3 most relevant documents per request
    .temperature(0.3)
    .build();

See Vector Stores & RAG for how the retrieval side works.
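
To make "retrieve the most relevant documents per request" concrete, here is a minimal, dependency-free sketch of top-n selection by cosine similarity. It is an illustration only: `cosine` and `top_n` are hypothetical helpers, the embeddings are hand-written two-dimensional vectors, and a real vector store does the embedding and ranking for you, possibly with a different similarity measure.

```rust
/// Cosine similarity between two equal-length vectors (0.0 if either is all zeros).
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm = |v: &[f32]| v.iter().map(|x| x * x).sum::<f32>().sqrt();
    let denom = norm(a) * norm(b);
    if denom == 0.0 {
        0.0
    } else {
        dot / denom
    }
}

/// Returns up to `n` document texts, most similar to `query` first.
fn top_n<'a>(query: &[f32], docs: &'a [(Vec<f32>, &'a str)], n: usize) -> Vec<&'a str> {
    let mut scored: Vec<(f32, &str)> = docs
        .iter()
        .map(|(embedding, text)| (cosine(query, embedding), *text))
        .collect();
    // Highest score first; total_cmp gives a total order over f32.
    scored.sort_by(|a, b| b.0.total_cmp(&a.0));
    scored.into_iter().take(n).map(|(_, text)| text).collect()
}
```

The `n` argument plays the role of the first parameter to `dynamic_context`: it is an upper bound, so a store holding fewer documents simply returns fewer.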

Tools

Agents resolve tool calls, execute the tools, and feed results back to the model automatically, as described in the agent loop. There are a few tool-related concepts to know:

ToolSet — the repository where tools live: a map of tool names to their implementations.
static tools — always presented to the model in the tool-definition list.
dynamic tools — retrieved from a vector store at prompt time (via AgentBuilder::dynamic_tools), so only the relevant tools are offered on a given request.

let agent = openai
    .agent("gpt-5.5")
    .preamble("You are a tool-using assistant.")
    .tool(calculator)                       // static tool
    .tool(web_search)                       // static tool
    .dynamic_tools(2, tool_store, toolset)  // dynamic tools
    .temperature(0.5)
    .build();

For how to write tools and wire up dynamic retrieval, see Tools.

Steering tool use with `ToolChoice`

By default the model decides freely whether to call a tool or answer directly. When you need to constrain that, set a ToolChoice on the builder:

use rig::message::ToolChoice;

let agent = openai
    .agent("gpt-5.5")
    .preamble("You are a calculator. Always compute with tools, never in your head.")
    .tool_choice(ToolChoice::Required) // the model must call a tool before answering
    .build();

ToolChoice::Auto — the default: the model may call tools or answer directly.
ToolChoice::None — tools are visible but must not be called.
ToolChoice::Required — the model must call at least one tool.
ToolChoice::Specific { function_names } — the model must call one of the named tools.

Required and Specific are useful for forcing a deterministic first step (always search before answering, always classify first); None is useful when you want the model to reason about tools without executing them.
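
One way to read the four options is as a predicate over the tools a response calls. The sketch below encodes that reading with a local stand-in enum; `Choice` and `satisfies` are hypothetical and are not Rig's `ToolChoice`, and how strictly a given provider honors each option is outside what this page states.

```rust
/// Local stand-in for the four options listed above; not Rig's type.
enum Choice {
    Auto,
    None,
    Required,
    Specific { function_names: Vec<String> },
}

/// Would a response that calls the tools in `called` satisfy `choice`?
fn satisfies(choice: &Choice, called: &[&str]) -> bool {
    match choice {
        // Anything goes: tools or a direct answer.
        Choice::Auto => true,
        // Tools are visible but must not be called.
        Choice::None => called.is_empty(),
        // At least one tool call is required.
        Choice::Required => !called.is_empty(),
        // At least one call, and every call must name a listed tool.
        Choice::Specific { function_names } => {
            !called.is_empty()
                && called
                    .iter()
                    .all(|name| function_names.iter().any(|f| f == name))
        }
    }
}
```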

Agents as tools (manager-worker)

Because an agent implements the tool interface, you can hand one agent to another as a tool. This is the building block for the manager-worker pattern: a coordinating “manager” agent delegates subtasks to specialised “worker” agents by calling them like any other tool.

Give each worker a name and description — both are used when the agent is exposed as a tool — then attach it to the manager with .tool(...):

use rig::client::{CompletionClient, ProviderClient};
use rig::completion::Prompt;
use rig::providers::openai;

let openai = openai::Client::from_env()?;

// The worker: a specialised sub-agent.
let bob = openai
    .agent("gpt-5.5")
    .name("Bob")
    .description("An employee who handles admin tasks at FooBar Inc.")
    .preamble("You are Bob, an admin employee. Your manager Alice may ask you to do things.")
    .build();

// The manager: prompts Bob by calling it as a tool.
let alice = openai
    .agent("gpt-5.5")
    .name("Alice")
    .description("A manager at FooBar Inc.")
    .preamble("You are Alice, a manager in the admin department. You manage Bob.")
    .tool(bob)
    .build();

let res = alice
    .prompt("Ask Bob to draft a welcome email and tell me what he wrote.")
    .max_turns(5)
    .await?;

println!("{res}");

I asked Bob to draft the welcome email. Here's what he wrote:

Subject: Welcome to FooBar Inc.!

Hi there, welcome aboard! We're thrilled to have you join the team. Your
first-day details are attached — reach out if you need anything.

Best,
Bob

Under the hood the manager returns a tool call targeting the worker; Rig prompts the worker, returns its answer to the manager, and the manager folds it into the final response. For swarm-style and actor-based architectures, see Multi-agent systems.

Conversations and memory

An agent is stateless: each .prompt(...) starts from scratch unless you supply the earlier turns. There are three ways to do that, from most manual to most automatic:

Pass history explicitly with .with_history(...) on the prompt request. You own the Vec<Message> and must record each new turn yourself after the call returns — Rig does not append to it:

let reply = agent.prompt("What's my name?").with_history(history.iter()).await?;

history.push(Message::user("What's my name?"));
history.push(Message::assistant(&reply));

Use chat for the common text-in/text-out case. It takes the history as a mutable reference and appends the new turn for you — the user message, any tool calls and results, and the assistant’s reply — so don’t push them again yourself:

let chat_agent = openai
    .agent("gpt-5.5")
    .preamble("You are a conversational assistant.")
    .build();

let response = chat_agent.chat("Hello!", &mut previous_messages).await?;

Attach conversation memory to make history handling automatic. With a memory backend and a conversation id, Rig loads the stored history before each prompt and appends the new turn (including any tool calls and results) after it:

use rig::memory::InMemoryConversationMemory;

let agent = openai
    .agent("gpt-5.5")
    .preamble("You are a helpful assistant.")
    .memory(InMemoryConversationMemory::new())
    .build();

// Each conversation id keeps its own history across prompts.
let _ = agent.prompt("My name is Ada.").conversation("user-42").await?;
let reply = agent.prompt("What's my name?").conversation("user-42").await?;

The Memory page covers the full picture: the bypass rules (explicit history skips memory for that request), durable backends, bounding history growth, and long-term memory. For a ready-made conversational REPL, see Build a CLI chatbot.
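
The load-then-append cycle that conversation memory performs can be modeled in a few lines. This is a conceptual sketch, not Rig's memory trait: `ToyMemory` and `prompt_with_memory` are hypothetical, messages are plain strings, and the "model" is any closure that receives the loaded history and the new prompt.

```rust
use std::collections::HashMap;

/// Toy in-process memory: one message list per conversation id.
#[derive(Default)]
struct ToyMemory {
    conversations: HashMap<String, Vec<String>>,
}

impl ToyMemory {
    /// History stored for `id`; empty if the conversation is new.
    fn load(&self, id: &str) -> Vec<String> {
        self.conversations.get(id).cloned().unwrap_or_default()
    }

    /// Appends one completed turn to the conversation.
    fn append(&mut self, id: &str, user: &str, assistant: &str) {
        let history = self.conversations.entry(id.to_string()).or_default();
        history.push(format!("user: {user}"));
        history.push(format!("assistant: {assistant}"));
    }
}

/// One prompt with memory attached: load, "call the model", save.
fn prompt_with_memory(
    memory: &mut ToyMemory,
    id: &str,
    prompt: &str,
    model: impl Fn(&[String], &str) -> String,
) -> String {
    let history = memory.load(id);
    let reply = model(&history, prompt);
    memory.append(id, prompt, &reply);
    reply
}
```

Two prompts under the same id see each other's turns, while a different id starts with an empty history, which is the isolation the conversation id provides in the example above.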

Token usage & run details

.prompt(...) returns just the answer text. When you need to know what a run cost — for budgets, dashboards, or per-user accounting — add .extended_details() to get a PromptResponse instead:

let response = agent
    .prompt("What is 2 + 2?")
    .max_turns(3)
    .extended_details()
    .await?;

println!("answer: {}", response.output);
println!(
    "tokens: {} in / {} out across {} model requests",
    response.usage.input_tokens,
    response.usage.output_tokens,
    response.requests(),
);

answer: 2 + 2 = 4.
tokens: 28 in / 12 out across 1 model requests

usage aggregates token counts across every turn of the loop; completion_calls breaks them down per model request (the last entry tells you how large the final request’s context was); messages holds the full message history the run produced. Zero-valued usage means the provider didn’t report metrics. The same numbers are also recorded on tracing spans — see Observability.
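
The relationship between the aggregate and the per-request breakdown is plain addition, sketched here with a local `Usage` struct. The struct and both helpers are hypothetical stand-ins (only the field names `input_tokens` and `output_tokens` come from the example above), and the numbers in the usage note are made up for illustration.

```rust
/// Toy per-request token counts; not Rig's usage type.
#[derive(Debug, Clone, Copy, Default, PartialEq)]
struct Usage {
    input_tokens: u64,
    output_tokens: u64,
}

/// Sums per-request usage into a run total.
fn aggregate(calls: &[Usage]) -> Usage {
    calls.iter().fold(Usage::default(), |total, call| Usage {
        input_tokens: total.input_tokens + call.input_tokens,
        output_tokens: total.output_tokens + call.output_tokens,
    })
}

/// Input size of the final request: a proxy for how large the context had grown.
fn final_context_tokens(calls: &[Usage]) -> Option<u64> {
    calls.last().map(|call| call.input_tokens)
}
```

For a hypothetical two-request run of 28 in / 9 out followed by 52 in / 12 out, the aggregate is 80 in / 21 out while the final context is 52 tokens. The gap arises because each follow-up request resends the history, so input tokens are counted again on every turn.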

Additional parameters

Some providers accept parameters Rig doesn’t model directly (reasoning settings, provider-specific knobs). Pass them through with AgentBuilder::additional_params() and they’ll be merged into the completion request:

let agent = openai
    .agent("gpt-5.5")
    .preamble("You are a helpful agent.")
    .additional_params(serde_json::json!({
        "foo": "bar"
    }))
    .build();

Prompt hooks

Prompt hooks let you observe and steer the agent loop from your own code. The PromptHook trait provides a default implementation for every method, so you only implement the ones you need.

Observe request prompts, completion responses, tool calls, and tool responses — the raw material for logging, metrics, or audit trails.
Steer the loop through each method’s return value: return ToolCallHookAction::Skip { reason } from on_tool_call to block a tool call (the model receives your reason as the tool result), or a Terminate { reason } action from any hook to cancel the whole run, which surfaces as a PromptError::PromptCancelled error carrying the internal chat history. This is the primitive for approval flows and guardrails: inspect a proposed tool call and refuse it before it executes.
Recover from invalid tool calls in on_invalid_tool_call — retry with feedback, repair the tool name, or skip — instead of the fail-fast default.
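
The decision a guardrail hook makes can be separated from the hook plumbing and tested on its own. The sketch below shows only that decision logic: `Verdict` is a local stand-in for the continue, skip, and terminate outcomes described above (not Rig's `ToolCallHookAction`), and the tool names and the credential check in `review_tool_call` are a hypothetical policy.

```rust
/// Local stand-in for a hook's verdict on a proposed tool call; not Rig's type.
#[derive(Debug, PartialEq)]
enum Verdict {
    /// Let the tool run.
    Continue,
    /// Block this call; the reason is handed to the model as the tool result.
    Skip { reason: String },
    /// Cancel the whole run.
    Terminate { reason: String },
}

/// Hypothetical policy: read-only tools pass, writes pass only when approved,
/// everything else is skipped, and credentials in the arguments stop the run.
fn review_tool_call(tool_name: &str, args: &str, approved: bool) -> Verdict {
    if args.contains("password") || args.contains("api_key") {
        return Verdict::Terminate {
            reason: "tool arguments contained a credential".to_string(),
        };
    }
    match tool_name {
        "search" | "read_file" => Verdict::Continue,
        "write_file" | "send_email" if approved => Verdict::Continue,
        other => Verdict::Skip {
            reason: format!("`{other}` is not permitted by the current policy"),
        },
    }
}
```

Keeping the policy a pure function means the hook method itself only has to translate the verdict into the library's action type.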

Hooks are awaited inline at each step of the loop, so keep them lightweight — a slow hook delays the agent’s next turn. Offload heavy work (network writes, disk I/O) to a background task.
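
One way to keep a hook lightweight is to have it enqueue an event and return, leaving the slow write to a worker. The sketch below uses a standard-library channel and thread so it stays dependency-free. In an async application a task and an async channel from your runtime would be the more natural fit. `spawn_audit_worker` and `on_tool_call` are hypothetical free functions, not the `PromptHook` trait's methods.

```rust
use std::sync::mpsc::{self, Sender};
use std::thread::{self, JoinHandle};

/// Spawns a worker that drains audit events. Returns the sending half for the
/// hook and a handle that yields everything the worker processed.
fn spawn_audit_worker() -> (Sender<String>, JoinHandle<Vec<String>>) {
    let (tx, rx) = mpsc::channel::<String>();
    let handle = thread::spawn(move || {
        let mut written = Vec::new();
        // The loop ends once every sender has been dropped.
        for event in rx {
            // Slow work (disk, network) would happen here, off the agent's path.
            written.push(event);
        }
        written
    });
    (tx, handle)
}

/// What a hook body would do: enqueue the event and return immediately.
fn on_tool_call(audit: &Sender<String>, tool_name: &str, args: &str) {
    // A send error only means the worker has shut down; drop the event.
    let _ = audit.send(format!("{tool_name}({args})"));
}
```

Dropping the sender and joining the handle at shutdown lets the worker flush whatever is still queued.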

Agent or hand-written workflow?

An agent lets the model decide the control flow — the right tool for open-ended tasks where the steps can’t be predicted. When you already know the steps, plain Rust calling agents in sequence is simpler, cheaper, and deterministic: see Agent or workflow? on the Workflows page for how to choose.

Next steps

Tools: Write callable functions and wire up static and dynamic tool retrieval for your agent.

Memory: Bound conversation history with policies and persist it across sessions.

Streaming: Stream an agent's responses token by token instead of waiting for the full reply.

Vector Stores & RAG: Back dynamic_context with a vector store to build a retrieval-augmented agent.
