Model routing

Model routing dynamically selects which agent should handle an incoming request. It lets you put several specialized agents behind a single chat interface — for example a coding expert, a maths expert, and a general assistant — and send each query to the agent best suited to answer it.

Routing serves several purposes: it can act as a guardrail (responding only to certain topics), support multi-layered systems that hand off between models, and send each request to a cheaper or a more expensive model depending on what it needs.

There are two common approaches. An LLM decision layer requires no extra infrastructure but is non-deterministic; semantic routing with embeddings is more reliable but needs a little more setup.

An LLM router uses a small, cheap model to classify the request, then dispatches to a specialized agent:

flowchart TD
User[User] -->|Query| Decision[LLM Decision Layer]
Decision -->|Category A| AgentA[Agent A]
Decision -->|Category B| AgentB[Agent B]
AgentB --> Response
AgentA --> Response

The router agent is instructed to return a single word from a fixed list. We then prompt the matching specialist agent (or return an error if nothing matched):

use rig::completion::Prompt;
use rig::providers::openai;

/// A minimal end-to-end LLM-based router in a single function.
/// In production you would abstract the routes behind the type system rather than
/// hard-coding them as below.
pub async fn llm_based_router() -> Result<(), Box<dyn std::error::Error>> {
    let openai_client = openai::Client::from_env()?;

    // Specialized agents ("routes")
    let coding_agent = openai_client
        .agent("gpt-5.5")
        .preamble("You are an expert coding assistant specializing in Rust programming.")
        .build();
    let math_agent = openai_client
        .agent("gpt-5.5")
        .preamble("You are a mathematics expert who excels at solving complex problems.")
        .build();

    // Decision layer: a cheaper model is fine here, since the work is trivial
    let router = openai_client
        .agent("gpt-5-mini")
        .preamble(
            "Return a single word from the allowed options, depending on which one the \
             user's question is more closely related to. Skip all prose.\n\
             Options: ['rust', 'maths']",
        )
        .build();

    let prompt = "How do I use async with Rust?";
    // Normalize the reply so that "Rust" or " maths\n" still match below.
    let topic = router.prompt(prompt).await?.trim().to_lowercase();
    println!("Topic selected: {topic}");

    // "math" also matches "maths", the spelling used in the router preamble.
    let res = if topic.contains("math") {
        math_agent.prompt(prompt).await?
    } else if topic.contains("rust") {
        coding_agent.prompt(prompt).await?
    } else {
        return Err(format!("No route found in text: {topic}").into());
    };
    println!("Response: {res}");
    Ok(())
}

This works, but is fragile: the router can return unexpected text, and the routes are baked into one function. The next two sections make it more robust with typed routes and embeddings.
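One way to contain the "unexpected text" problem is to parse the router's reply into a closed set of values before dispatching. The following is a minimal sketch using only the standard library; the `Route` enum and `parse_route` function are hypothetical names introduced for illustration and are not part of Rig.

```rust
/// Hypothetical closed set of routes (illustrative; not a Rig type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Route {
    Rust,
    Maths,
}

/// Normalize the router's raw reply and map it onto a known route.
/// Returns `None` for anything that is not exactly one allowed word.
pub fn parse_route(raw: &str) -> Option<Route> {
    // Strip surrounding whitespace, quotes and punctuation, then lowercase.
    let word = raw
        .trim_matches(|c: char| !c.is_alphanumeric())
        .to_lowercase();

    match word.as_str() {
        "rust" => Some(Route::Rust),
        // Accept both spellings, since models may answer with either.
        "math" | "maths" => Some(Route::Maths),
        _ => None,
    }
}
```

Unlike a substring check, this rejects a chatty reply such as "I think this is about rust", so the caller can retry or fall back explicitly. Whether that strictness is desirable depends on how well the router model follows its instructions.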

Instead of hard-coding routes, store them in a registry keyed by a string identifier. Because a single Agent<M> is parameterized by its completion model, a HashMap of agents constrains you to one provider — a reasonable trade for simpler code:

use std::collections::HashMap;

use rig::agent::Agent;
use rig::providers::openai::responses_api::ResponsesCompletionModel;

/// An agent backed by the OpenAI Responses API.
type OpenAIAgent = Agent<ResponsesCompletionModel>;

/// Holds any number of named `OpenAIAgent` routes.
struct TypedRouter {
    routes: HashMap<String, OpenAIAgent>,
}

impl TypedRouter {
    pub fn new() -> Self {
        Self { routes: HashMap::new() }
    }

    pub fn add_route(mut self, name: &str, agent: OpenAIAgent) -> Self {
        self.routes.insert(name.to_string(), agent);
        self
    }

    pub fn fetch_agent(&self, route: &str) -> Option<&OpenAIAgent> {
        self.routes.get(route)
    }
}

This is ideal for simple use cases, such as routing simple classification queries to gpt-5-mini and full responses to gpt-5.5. If you need to mix providers in one router, use enum dispatch instead (see Dynamic model creation).
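To illustrate what enum dispatch looks like in this setting, here is a minimal sketch that wraps two differently typed agents in one enum so they can share a `HashMap`. It uses only the standard library: `OpenAiAgent` and `OtherProviderAgent` are hypothetical stand-ins for `Agent<M>` with two different completion models, their `prompt` methods are synchronous placeholders, and `AnyAgent` and `MixedRouter` are names invented for this example.

```rust
use std::collections::HashMap;

/// Stand-in for an agent from one provider (hypothetical; not a Rig type).
pub struct OpenAiAgent {
    pub model: String,
}

/// Stand-in for an agent from a second provider (hypothetical; not a Rig type).
pub struct OtherProviderAgent {
    pub model: String,
}

impl OpenAiAgent {
    /// Placeholder for a real completion call.
    pub fn prompt(&self, input: &str) -> String {
        format!("[openai:{}] {}", self.model, input)
    }
}

impl OtherProviderAgent {
    /// Placeholder for a real completion call.
    pub fn prompt(&self, input: &str) -> String {
        format!("[other:{}] {}", self.model, input)
    }
}

/// One variant per concrete agent type, so a single map can hold both.
pub enum AnyAgent {
    OpenAi(OpenAiAgent),
    Other(OtherProviderAgent),
}

impl AnyAgent {
    /// Forward the call to whichever concrete agent this variant wraps.
    pub fn prompt(&self, input: &str) -> String {
        match self {
            AnyAgent::OpenAi(agent) => agent.prompt(input),
            AnyAgent::Other(agent) => agent.prompt(input),
        }
    }
}

/// Same shape as `TypedRouter`, but no longer tied to one provider.
pub struct MixedRouter {
    routes: HashMap<String, AnyAgent>,
}

impl MixedRouter {
    pub fn new() -> Self {
        Self { routes: HashMap::new() }
    }

    pub fn add_route(mut self, name: &str, agent: AnyAgent) -> Self {
        self.routes.insert(name.to_string(), agent);
        self
    }

    pub fn fetch_agent(&self, route: &str) -> Option<&AnyAgent> {
        self.routes.get(route)
    }
}
```

The cost of this approach is one extra `match` arm per provider; the benefit is that callers keep a single `fetch_agent(...).prompt(...)` call path regardless of which provider backs a route.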

For more reliable matching, embed each route’s name, description, and examples, then find the closest route to an incoming query by vector similarity. This avoids the “wildly wrong word” failure mode of the LLM decision layer. For how embeddings and similarity search work under the hood, see Embeddings.
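To make "closest route by vector similarity" concrete, the sketch below shows the comparison itself on hand-written vectors, using cosine similarity and only the standard library. The function names and the tiny three-dimensional vectors are assumptions made for illustration; real embeddings have hundreds or thousands of dimensions, and a vector store performs this search for you.

```rust
/// Cosine similarity between two vectors.
/// Returns 0.0 if the lengths differ or either vector has zero magnitude.
pub fn cosine_similarity(a: &[f64], b: &[f64]) -> f64 {
    if a.len() != b.len() {
        return 0.0;
    }
    let dot: f64 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f64>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f64>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return 0.0;
    }
    dot / (norm_a * norm_b)
}

/// Return the name and score of the route whose vector is most similar to
/// `query`, or `None` if there are no routes.
pub fn closest_route<'a>(
    query: &[f64],
    routes: &'a [(&'a str, Vec<f64>)],
) -> Option<(&'a str, f64)> {
    routes
        .iter()
        .map(|(name, vector)| (*name, cosine_similarity(query, vector)))
        .max_by(|a, b| a.1.total_cmp(&b.1))
}
```

Note that `closest_route` always returns some route when the list is non-empty, even if the best score is low; the search has no notion of "no match" unless a minimum score is enforced separately.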

use rig::embeddings::EmbeddingModel;
use rig::providers::openai;
use rig::vector_store::in_memory_store::InMemoryVectorStore;
use rig::vector_store::request::VectorSearchRequest;
use rig::vector_store::VectorStoreIndex;
use rig::OneOrMany;
use serde::{Deserialize, Serialize};

/// A route definition. Name, description and examples are concatenated before
/// embedding to give the vector more meaning.
#[derive(Clone, Default, Serialize, Deserialize, Eq, PartialEq)]
struct RouteDefinition {
    name: String,
    description: String,
    examples: Vec<String>,
}

/// Build an in-memory vector store of routes.
async fn create_semantic_router(
    openai_client: &openai::Client,
) -> Result<InMemoryVectorStore<RouteDefinition>, Box<dyn std::error::Error>> {
    let routes = vec![
        RouteDefinition {
            name: "rust".to_string(),
            description: "Programming, code, and software development in the Rust programming language".to_string(),
            examples: vec![
                "How do I write an async function in Rust?".to_string(),
                "Debug this code".to_string(),
                "Implement a sorting algorithm in Rust".to_string(),
            ],
        },
        RouteDefinition {
            name: "math".to_string(),
            description: "Mathematics, calculations, and equations".to_string(),
            examples: vec![
                "Solve this equation".to_string(),
                "Calculate the derivative".to_string(),
                "What is 15% of 200?".to_string(),
            ],
        },
    ];

    let embedding_model = openai_client.embedding_model("text-embedding-3-small");
    let mut vector_store = InMemoryVectorStore::default();

    for route in routes {
        let embedding_text = format!(
            "{}: {}. Examples: {}",
            route.name,
            route.description,
            route.examples.join(", ")
        );
        let embedding = embedding_model.embed_text(&embedding_text).await?;
        vector_store.add_documents(vec![(route, OneOrMany::one(embedding))]);
    }

    Ok(vector_store)
}

/// Return the name of the route closest to `query`.
async fn semantic_route_query(
    query: &str,
    router: &InMemoryVectorStore<RouteDefinition>,
    openai_client: &openai::Client,
) -> Result<String, Box<dyn std::error::Error>> {
    let embedding_model = openai_client.embedding_model("text-embedding-3-small");
    let index = router.clone().index(embedding_model);

    let req = VectorSearchRequest::builder()
        .query(query)
        .samples(1)
        .build();
    let results = index.top_n::<RouteDefinition>(req).await?;

    let route_name = results
        .first()
        .map(|(_, _, route_def)| route_def.name.as_str())
        .unwrap_or("general");
    Ok(route_name.to_string())
}

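Note the default "general" fallback. As written, it only applies when the search returns no results at all; a nearest-neighbour search over a non-empty store always returns some route, however poor the match. To handle queries that match nothing, also compare the returned similarity score against a minimum, and make sure a "general" route is actually registered with the router that owns the agents.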
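A score cutoff of this kind can be sketched as a small pure function applied to the search results. The code below uses only the standard library and assumes that a higher score means a closer match, as with cosine similarity; the function name `route_or_fallback`, the simplified `(score, route name)` result shape, and any threshold value you pass in are assumptions for illustration and would need tuning against real queries.

```rust
/// Pick the best-scoring route name, or `fallback` when no result scores at
/// least `min_score`. Assumes higher scores mean closer matches.
pub fn route_or_fallback<'a>(
    results: &'a [(f64, String)],
    min_score: f64,
    fallback: &'a str,
) -> &'a str {
    results
        .iter()
        // Discard weak matches (and NaN scores, which fail the comparison).
        .filter(|(score, _)| *score >= min_score)
        // Of the remaining candidates, keep the highest score.
        .max_by(|a, b| a.0.total_cmp(&b.0))
        .map(|(_, name)| name.as_str())
        .unwrap_or(fallback)
}
```

With this in place, an off-topic query that only weakly resembles every route is sent to the fallback rather than to whichever specialist happens to be nearest.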

Combine the typed router (which owns the agents) with semantic routing (which picks the name):

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let openai_client = openai::Client::from_env()?;

    let coding_agent = openai_client
        .agent("gpt-5.5")
        .preamble("You are an expert coding assistant specializing in Rust programming.")
        .build();
    let math_agent = openai_client
        .agent("gpt-5.5")
        .preamble("You are a mathematics expert who excels at solving complex problems.")
        .build();
    // Catch-all agent, so the "general" fallback returned by
    // `semantic_route_query` resolves to a real route.
    let general_agent = openai_client
        .agent("gpt-5-mini")
        .preamble("You are a helpful general-purpose assistant.")
        .build();

    let router = TypedRouter::new()
        .add_route("rust", coding_agent)
        .add_route("math", math_agent)
        .add_route("general", general_agent);

    let semantic_router = create_semantic_router(&openai_client).await?;

    let prompt = "How do I use async with Rust?";
    let route_name = semantic_route_query(prompt, &semantic_router, &openai_client).await?;
    println!("Route selected: {route_name}");

    let response = router
        .fetch_agent(&route_name)
        .ok_or("no agent for route")?
        .prompt(prompt)
        .await?;
    println!("Response: {response}");
    Ok(())
}