Agents
An Agent is Rig’s primary building block for working with LLMs. It bundles a completion model together
with a system prompt, optional context documents, and a set of tools, then runs the agent loop for
you: it sends your prompt to the model, executes any tools the model calls, feeds the results back, and
repeats until the model produces an answer. Reach for an agent whenever you want to prompt a model
without writing that loop by hand — from a simple chatbot to a tool-using assistant or a RAG system.
A minimal agent
    use rig::client::{CompletionClient, ProviderClient};
    use rig::completion::Prompt;
    use rig::providers::openai;

    #[tokio::main]
    async fn main() -> Result<(), Box<dyn std::error::Error>> {
        let openai = openai::Client::from_env()?;

        let agent = openai
            .agent("gpt-5.5")
            .preamble("You are a helpful assistant.")
            .temperature(0.7)
            .build();

        let response = agent.prompt("Hello!").await?;
        println!("{response}");

        Ok(())
    }

Output:

    Hello! How can I help you today?

You build an agent with client.agent(model).preamble(...).build() and run it with
agent.prompt(...).await?. Everything else on this page is optional configuration layered on top.
What an agent is made of
- Base configuration: the completion model, the system prompt (preamble), and parameters like temperature and max_tokens.
- Context: documents appended to the request. Static context is always sent; dynamic context is retrieved from a vector store per request.
- Tools: capabilities the model can call. Static tools are always offered; dynamic tools are retrieved from a vector store per request.
- Conversation memory (optional): a backend that loads prior history before each prompt and saves the new turn after it. See Conversations and memory.
How the agent loop works
Everything an agent does is one loop. When you call .prompt(...), Rig:
1. Builds a completion request from the preamble, static context, any dynamic context retrieved from a vector store, the conversation history, and the definitions of every tool on offer.
2. Sends it to the model. One request/response round trip is a turn.
3. Inspects the response.
   - If the model answered with text, the loop ends and that text is your result.
   - If the model requested tool calls, Rig executes each one: it parses the JSON arguments into your Args type, runs your call implementation, and appends the result to the conversation as a tool-result message. A model may request several tool calls in a single turn; Rig runs them and returns all the results together.
4. Repeats from step 2 with the updated history, so the model can use the tool results, until it produces a text answer or the turn budget runs out.
    flowchart TD
        P["agent.prompt(...)"] --> M["Send request to model<br/>(preamble + context + history + tools)"]
        M --> D{"Model response"}
        D -- "text" --> F["Return the answer"]
        D -- "tool calls" --> T["Rig executes the tools,<br/>appends results to history"]
        T --> B{"Turn budget left?"}
        B -- "yes" --> M
        B -- "no" --> E["Err(MaxTurnsError)"]

These docs call this the agent loop; you may also see “tool-calling loop” elsewhere, which is the same thing.
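The loop can be sketched in plain Rust without any provider. This is a toy model of the control flow only, not Rig's implementation: `Reply`, `RunError`, `ToolFn`, `run_loop`, `scripted_model`, and `demo_tools` are all hypothetical names, the "model" is a scripted function, and the budget is counted in model requests as a simplification (Rig's own max_turns accounting may differ).

```rust
use std::collections::HashMap;

/// A toy tool: takes two integers, returns one (hypothetical, not Rig's tool trait).
type ToolFn = fn(i64, i64) -> i64;

/// What the simulated model can reply with on one turn.
enum Reply {
    Text(String),
    ToolCalls(Vec<(&'static str, i64, i64)>), // (tool name, lhs, rhs)
}

#[derive(Debug, PartialEq)]
enum RunError {
    /// The model asked for a tool that is not registered.
    UnknownTool(String),
    /// The request budget ran out; carries the history gathered so far.
    BudgetExhausted(Vec<String>),
}

fn run_loop(
    prompt: &str,
    tools: &HashMap<&str, ToolFn>,
    max_requests: usize,
    model: impl Fn(&[String]) -> Reply,
) -> Result<String, RunError> {
    let mut history = vec![format!("user: {prompt}")];
    for _ in 0..max_requests {
        match model(&history) {
            // A text reply ends the loop.
            Reply::Text(answer) => return Ok(answer),
            // Tool calls are executed and their results appended before the next request.
            Reply::ToolCalls(calls) => {
                for (name, a, b) in calls {
                    let tool = tools
                        .get(name)
                        .ok_or_else(|| RunError::UnknownTool(name.to_string()))?;
                    history.push(format!("tool {name}: {}", tool(a, b)));
                }
            }
        }
    }
    Err(RunError::BudgetExhausted(history))
}

/// A scripted stand-in for the model: add, then multiply, then answer.
fn scripted_model(history: &[String]) -> Reply {
    match history.len() {
        1 => Reply::ToolCalls(vec![("add", 2, 5)]),
        2 => Reply::ToolCalls(vec![("multiply", 7, 3)]),
        _ => Reply::Text("2 + 5 = 7, and 7 * 3 = 21.".to_string()),
    }
}

fn add(a: i64, b: i64) -> i64 {
    a + b
}

fn multiply(a: i64, b: i64) -> i64 {
    a * b
}

fn demo_tools() -> HashMap<&'static str, ToolFn> {
    let mut tools: HashMap<&'static str, ToolFn> = HashMap::new();
    tools.insert("add", add);
    tools.insert("multiply", multiply);
    tools
}
```

With a budget of three requests, `run_loop("calc", &demo_tools(), 3, scripted_model)` reaches the text answer; with a budget of two, it stops after the second tool round and returns the accumulated history inside the error.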
Turns and max_turns
The turn budget is the agent’s guard against runaway loops: every turn costs latency and tokens, so
the loop refuses to run forever. By default (max_turns of 0) an agent can make its initial request
plus one follow-up after tool execution — enough for a prompt that triggers a single round of tool
calls. If the model keeps chaining tools past the budget, the prompt fails with
PromptError::MaxTurnsError, which carries the history accumulated so far so you can inspect what
happened.
Give a prompt more headroom with .max_turns(n):
    let res = tool_agent
        .prompt("Please calculate 2 + 5, then multiply the result by 3")
        .max_turns(5) // allow several rounds of tool calls
        .await?;

    println!("{res}");

Output:

    2 + 5 = 7, and 7 × 3 = 21.

You can pass any limit (Rig imposes no upper bound), but pick a number that matches the longest tool
chain you actually expect, so a confused model fails fast instead of burning tokens. To set a default
for every prompt instead of per request, use AgentBuilder::default_max_turns(n). For handling the
error itself, see Error Handling.
When a tool call goes wrong
Two distinct failure modes, with different behavior:
- Your tool returns an error. The error’s string representation is sent back to the model as the tool result, and the loop continues: the model sees what went wrong and can retry with different arguments or explain the failure. Make your tool errors descriptive; the model reads them.
- The model emits an invalid call: a tool name that doesn’t exist, or one that isn’t allowed. By default this fails the prompt immediately. To recover instead, implement on_invalid_tool_call on a prompt hook and return a Retry (re-ask the model with corrective feedback), Repair (fix the tool name), or Skip action; .max_invalid_tool_call_retries(n) on the prompt request bounds how many Retry rounds are allowed, and each retry consumes turn budget.
To observe or veto individual tool calls yourself — for logging, approval flows, or guardrails — use a prompt hook.
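Because the model reads the error's string form, the wording of a tool error matters. The sketch below shows one way to write a descriptive error with std only. `DivideError`, `divide`, and `tool_result_text` are hypothetical names for illustration, and `tool_result_text` merely imitates the "error string becomes the tool result" behavior described above; it is not a Rig function.

```rust
use std::fmt;

/// Hypothetical error type for a division tool.
#[derive(Debug)]
enum DivideError {
    DivisionByZero { numerator: f64 },
}

impl fmt::Display for DivideError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            // Say what went wrong and how the caller (the model) can fix it.
            DivideError::DivisionByZero { numerator } => write!(
                f,
                "cannot divide {numerator} by zero; call again with a non-zero divisor"
            ),
        }
    }
}

impl std::error::Error for DivideError {}

fn divide(numerator: f64, divisor: f64) -> Result<f64, DivideError> {
    if divisor == 0.0 {
        Err(DivideError::DivisionByZero { numerator })
    } else {
        Ok(numerator / divisor)
    }
}

/// Turns a tool outcome into the text a model would read as the tool result.
fn tool_result_text<T: fmt::Display, E: fmt::Display>(outcome: Result<T, E>) -> String {
    match outcome {
        Ok(value) => value.to_string(),
        Err(error) => error.to_string(),
    }
}
```

A message like this gives the model enough to correct its arguments on the next turn, whereas a bare "error" string would not.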
Context
Agents can pull context from more than one source and append it to every provider request automatically:
- Static context (AgentBuilder::context()) is appended to every request, which suits a small, fixed set of reference material.
- Dynamic context (AgentBuilder::dynamic_context()) fetches up to a set number of results from a vector store and adds them to the request. This is the basis of RAG.

    let store = InMemoryVectorStore::from_documents(embeddings);
    let index = store.index(embedding_model);

    let agent = openai
        .agent("gpt-5.5")
        .preamble("You are a knowledge base assistant.")
        .dynamic_context(3, index) // retrieve the 3 most relevant documents per request
        .temperature(0.3)
        .build();

See Vector Stores & RAG for how the retrieval side works.
Tools
Agents resolve tool calls, execute the tools, and feed results back to the model automatically, as described in the agent loop. There are a few tool-related concepts to know:
- ToolSet: the repository where tools live, a map of tool names to their implementations.
- static tools: always presented to the model in the tool-definition list.
- dynamic tools: retrieved from a vector store at prompt time (via AgentBuilder::dynamic_tools), so only the relevant tools are offered on a given request.

    let agent = openai
        .agent("gpt-5.5")
        .preamble("You are a tool-using assistant.")
        .tool(calculator) // static tool
        .tool(web_search) // static tool
        .dynamic_tools(2, tool_store, toolset) // dynamic tools
        .temperature(0.5)
        .build();

For how to write tools and wire up dynamic retrieval, see Tools.
Steering tool use with ToolChoice
By default the model decides freely whether to call a tool or answer directly. When you need to
constrain that, set a ToolChoice on the builder:
    use rig::message::ToolChoice;

    let agent = openai
        .agent("gpt-5.5")
        .preamble("You are a calculator. Always compute with tools, never in your head.")
        .tool_choice(ToolChoice::Required) // the model must call a tool before answering
        .build();

- ToolChoice::Auto: the default; the model may call tools or answer directly.
- ToolChoice::None: tools are visible but must not be called.
- ToolChoice::Required: the model must call at least one tool.
- ToolChoice::Specific { function_names }: the model must call one of the named tools.
Required and Specific are useful for forcing a deterministic first step (always search before
answering, always classify first); None is useful when you want the model to reason about tools
without executing them.
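The four modes can be restated as a predicate over the tools a model called on one turn. This is only a way to read the rules: in practice the provider enforces the choice, not client code, and the `Choice` enum and `respects` function below are hypothetical stand-ins rather than Rig's `ToolChoice`.

```rust
/// Hypothetical mirror of the four modes described above (not Rig's own enum).
enum Choice {
    Auto,
    None,
    Required,
    Specific(Vec<String>),
}

/// Checks whether the tools a model called on one turn respect the given mode.
fn respects(choice: &Choice, called: &[&str]) -> bool {
    match choice {
        // Anything goes: call tools or answer directly.
        Choice::Auto => true,
        // Tools may be discussed but not called.
        Choice::None => called.is_empty(),
        // At least one call is mandatory.
        Choice::Required => !called.is_empty(),
        // At least one call, and every call must target a named tool.
        Choice::Specific(names) => {
            !called.is_empty()
                && called.iter().all(|c| names.iter().any(|n| n.as_str() == *c))
        }
    }
}
```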
Agents as tools (manager-worker)
Because an agent implements the tool interface, you can hand one agent to another as a tool. This is the building block for the manager-worker pattern: a coordinating “manager” agent delegates subtasks to specialised “worker” agents by calling them like any other tool.
Give each worker a name and description — both are used when the agent is exposed as a tool — then
attach it to the manager with .tool(...):
    use rig::client::{CompletionClient, ProviderClient};
    use rig::completion::Prompt;
    use rig::providers::openai;

    #[tokio::main]
    async fn main() -> Result<(), Box<dyn std::error::Error>> {
        let openai = openai::Client::from_env()?;

        // The worker: a specialised sub-agent.
        let bob = openai
            .agent("gpt-5.5")
            .name("Bob")
            .description("An employee who handles admin tasks at FooBar Inc.")
            .preamble("You are Bob, an admin employee. Your manager Alice may ask you to do things.")
            .build();

        // The manager: prompts Bob by calling it as a tool.
        let alice = openai
            .agent("gpt-5.5")
            .name("Alice")
            .description("A manager at FooBar Inc.")
            .preamble("You are Alice, a manager in the admin department. You manage Bob.")
            .tool(bob)
            .build();

        let res = alice
            .prompt("Ask Bob to draft a welcome email and tell me what he wrote.")
            .max_turns(5)
            .await?;

        println!("{res}");

        Ok(())
    }

Output:

    I asked Bob to draft the welcome email. Here's what he wrote:

    Subject: Welcome to FooBar Inc.!

    Hi there, welcome aboard! We're thrilled to have you join the team. Your
    first-day details are attached; reach out if you need anything.

    Best,
    Bob

Under the hood the manager returns a tool call targeting the worker; Rig prompts the worker, returns its answer to the manager, and the manager folds it into the final response. For swarm-style and actor-based architectures, see Multi-agent systems.
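The delegation can be pictured with a small trait-object sketch. It assumes a deliberately simplified, synchronous `Tool` trait invented for this example (Rig's real tool interface is richer), and `Worker`, `Manager`, `tool_definitions`, and `delegate` are hypothetical names. The point is only that a worker's name and description are what the manager sees, and that calling the worker looks like any other tool call.

```rust
/// Hypothetical, simplified tool interface (not Rig's actual trait).
trait Tool {
    fn name(&self) -> &str;
    fn description(&self) -> &str;
    fn call(&self, input: &str) -> String;
}

struct Worker {
    name: String,
    description: String,
}

impl Tool for Worker {
    fn name(&self) -> &str {
        &self.name
    }

    fn description(&self) -> &str {
        &self.description
    }

    fn call(&self, input: &str) -> String {
        // A real worker agent would prompt its own model here.
        format!("{} handled: {input}", self.name)
    }
}

struct Manager {
    tools: Vec<Box<dyn Tool>>,
}

impl Manager {
    /// Lists what the manager's model would be offered as tool definitions.
    fn tool_definitions(&self) -> Vec<String> {
        self.tools
            .iter()
            .map(|t| format!("{}: {}", t.name(), t.description()))
            .collect()
    }

    /// Routes a task to the named worker, if one is attached.
    fn delegate(&self, worker: &str, task: &str) -> Option<String> {
        self.tools
            .iter()
            .find(|t| t.name() == worker)
            .map(|t| t.call(task))
    }
}
```

In a real run the manager's model chooses the worker and writes the task text; here the caller does both so the routing stays visible.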
Conversations and memory
An agent is stateless: each .prompt(...) starts from scratch unless you supply the earlier turns.
There are three ways to do that, from most manual to most automatic:
Pass history explicitly with .with_history(...) on the prompt request. You own the
Vec<Message> and must record each new turn yourself after the call returns — Rig does not append
to it:
    let reply = agent.prompt("What's my name?").with_history(history.iter()).await?;

    history.push(Message::user("What's my name?"));
    history.push(Message::assistant(&reply));

Use chat for the common text-in/text-out case. It takes the history as a mutable reference and
appends the new turn for you — the user message, any tool calls and results, and the assistant’s
reply — so don’t push them again yourself:
    let chat_agent = openai
        .agent("gpt-5.5")
        .preamble("You are a conversational assistant.")
        .build();

    let response = chat_agent.chat("Hello!", &mut previous_messages).await?;

Attach conversation memory to make history handling automatic. With a memory backend and a conversation id, Rig loads the stored history before each prompt and appends the new turn (including any tool calls and results) after it:
    use rig::memory::InMemoryConversationMemory;

    let agent = openai
        .agent("gpt-5.5")
        .preamble("You are a helpful assistant.")
        .memory(InMemoryConversationMemory::new())
        .build();

    // Each conversation id keeps its own history across prompts.
    let _ = agent.prompt("My name is Ada.").conversation("user-42").await?;
    let reply = agent.prompt("What's my name?").conversation("user-42").await?;

The Memory page covers the full picture: the bypass rules (explicit history skips memory for that request), durable backends, bounding history growth, and long-term memory. For a ready-made conversational REPL, see Build a CLI chatbot.
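The load-before, save-after behavior of conversation memory can be sketched with a `HashMap` keyed by conversation id. This is a toy under stated assumptions, not Rig's memory backend: `ToyMemory`, `prompt_with_memory`, and `counting_model` are hypothetical, messages are plain strings, and the "model" only reports how much history it was given.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory store: one message list per conversation id.
#[derive(Default)]
struct ToyMemory {
    conversations: HashMap<String, Vec<String>>,
}

impl ToyMemory {
    /// Returns the stored history, or an empty list for a new conversation.
    fn load(&self, id: &str) -> Vec<String> {
        self.conversations.get(id).cloned().unwrap_or_default()
    }

    /// Appends one completed turn (user message, assistant reply).
    fn append(&mut self, id: &str, turn: [String; 2]) {
        self.conversations
            .entry(id.to_string())
            .or_default()
            .extend(turn);
    }
}

/// One prompt: load the history, call the model, then save the new turn.
fn prompt_with_memory(
    memory: &mut ToyMemory,
    id: &str,
    prompt: &str,
    model: impl Fn(&[String], &str) -> String,
) -> String {
    let history = memory.load(id);
    let reply = model(&history, prompt);
    memory.append(id, [format!("user: {prompt}"), format!("assistant: {reply}")]);
    reply
}

/// Scripted stand-in for a model: reports how many earlier messages it saw.
fn counting_model(history: &[String], _prompt: &str) -> String {
    format!("earlier messages: {}", history.len())
}
```

Two prompts under the same id see a growing history, while a different id starts empty, which mirrors the per-conversation isolation described above.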
Token usage & run details
.prompt(...) returns just the answer text. When you need to know what a run cost (for budgets,
dashboards, or per-user accounting), add .extended_details() to get a PromptResponse instead:
    let response = agent
        .prompt("What is 2 + 2?")
        .max_turns(3)
        .extended_details()
        .await?;

    println!("answer: {}", response.output);
    println!(
        "tokens: {} in / {} out across {} model requests",
        response.usage.input_tokens,
        response.usage.output_tokens,
        response.requests(),
    );

Output:

    answer: 2 + 2 = 4.
    tokens: 28 in / 12 out across 1 model requests

usage aggregates token counts across every turn of the loop; completion_calls breaks them down
per model request (the last entry tells you how large the final request’s context was); messages
holds the full message history the run produced. Zero-valued usage means the provider didn’t report
metrics. The same numbers are also recorded on tracing spans — see
Observability.
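The relationship between the aggregate and the per-request breakdown is plain addition, as the following sketch shows. `ToyUsage`, `aggregate`, and `final_context_size` are hypothetical names, and the token counts in the usage note are made up for illustration; the real types live on the PromptResponse.

```rust
/// Hypothetical per-request token counts.
#[derive(Debug, Default, Clone, Copy, PartialEq)]
struct ToyUsage {
    input_tokens: u64,
    output_tokens: u64,
}

/// Sums the per-request counts into one run-level total.
fn aggregate(per_request: &[ToyUsage]) -> ToyUsage {
    per_request.iter().fold(ToyUsage::default(), |total, u| ToyUsage {
        input_tokens: total.input_tokens + u.input_tokens,
        output_tokens: total.output_tokens + u.output_tokens,
    })
}

/// Input size of the last request, i.e. how large the final context was.
fn final_context_size(per_request: &[ToyUsage]) -> Option<u64> {
    per_request.last().map(|u| u.input_tokens)
}
```

For a run with two requests of 28/12 and 61/9 tokens (in/out), the total is 89 in and 21 out, and the final context size is 61. Note that the total counts the shared prefix of the conversation once per request, so it grows faster than the final context does.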
Additional parameters
Some providers accept parameters Rig doesn’t model directly (reasoning settings, provider-specific knobs).
Pass them through with AgentBuilder::additional_params() and they’ll be merged into the completion
request:
    let agent = openai_client
        .agent("gpt-5.5")
        .preamble("You are a helpful agent.")
        .additional_params(serde_json::json!({ "foo": "bar" }))
        .build();

Prompt hooks
Prompt hooks let you observe and steer the agent loop from your own code. The
PromptHook trait provides a default
implementation for every method, so you only implement the ones you need.
- Observe request prompts, completion responses, tool calls, and tool responses: the raw material for logging, metrics, or audit trails.
- Steer the loop through each method’s return value: return ToolCallHookAction::Skip { reason } from on_tool_call to block a tool call (the model receives your reason as the tool result), or a Terminate { reason } action from any hook to cancel the whole run, which surfaces as a PromptError::PromptCancelled error carrying the internal chat history. This is the primitive for approval flows and guardrails: inspect a proposed tool call and refuse it before it executes.
- Recover from invalid tool calls in on_invalid_tool_call (retry with feedback, repair the tool name, or skip) instead of the fail-fast default.
Hooks are awaited inline at each step of the loop, so keep them lightweight — a slow hook delays the agent’s next turn. Offload heavy work (network writes, disk I/O) to a background task.
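The "default method for everything, override what you need" design can be illustrated with a small guardrail. The sketch is synchronous and uses invented types: `Action`, `Hook`, and `DenyList` are hypothetical and only loosely modelled on the Skip and Terminate actions described above, so do not read the signatures as those of PromptHook.

```rust
/// Hypothetical action type, loosely modelled on the actions described above.
#[derive(Debug, PartialEq)]
enum Action {
    Continue,
    Skip { reason: String },
    Terminate { reason: String },
}

/// Hypothetical hook trait: every method has a default, so an implementor
/// overrides only the steps it cares about.
trait Hook {
    fn on_tool_call(&self, _tool: &str, _args: &str) -> Action {
        Action::Continue
    }

    fn on_tool_result(&self, _tool: &str, _result: &str) -> Action {
        Action::Continue
    }
}

/// A guardrail that refuses calls to listed tools before they execute.
struct DenyList {
    blocked: Vec<String>,
}

impl Hook for DenyList {
    fn on_tool_call(&self, tool: &str, _args: &str) -> Action {
        if self.blocked.iter().any(|b| b.as_str() == tool) {
            // The reason would be what the model reads in place of a tool result.
            Action::Skip {
                reason: format!("tool `{tool}` requires human approval"),
            }
        } else {
            Action::Continue
        }
    }
}
```

`DenyList` overrides only `on_tool_call`; `on_tool_result` falls back to the default and lets every result through.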
Agent or hand-written workflow?
An agent lets the model decide the control flow, which makes it the right tool for open-ended tasks where the steps can’t be predicted. When you already know the steps, plain Rust calling agents in sequence is simpler, cheaper, and deterministic: see Agent or workflow? on the Workflows page for how to choose.
See also
- Completions: the model layer beneath agents
- Tools: extend an agent with callable functions
- Memory: conversation history, compaction, and long-term memory
- Streaming: stream an agent’s responses token by token
- Error Handling: handle MaxTurnsError and transient failures
- Build a RAG system: a full retrieval-augmented agent
