Structured Output
An Extractor turns unstructured text into a strongly-typed Rust value. You give it a target type, and Rig drives an LLM to parse text into that type with type-safe deserialization and almost no boilerplate — useful for pulling entities, fields, or records out of free-form input.
Minimal example
Any target type must derive serde::Deserialize, serde::Serialize, and schemars::JsonSchema. Build an extractor for that type from a client, then call extract.
    use rig::client::ProviderClient;
    use rig::providers::openai;

    // Define the target structure
    #[derive(serde::Deserialize, serde::Serialize, rig::schemars::JsonSchema)]
    struct Person {
        name: Option<String>,
        age: Option<u8>,
        profession: Option<String>,
    }

    #[tokio::main]
    async fn main() -> Result<(), Box<dyn std::error::Error>> {
        let openai = openai::Client::from_env()?;
        let extractor = openai.extractor::<Person>("gpt-5.5").build();

        let person = extractor
            .extract("John Doe is a 30 year old doctor.")
            .await?;

        println!(
            "{} is a {}",
            person.name.unwrap_or_default(),
            person.profession.unwrap_or_default()
        );
        Ok(())
    }

Output:

    John Doe is a doctor

How it works
Under the hood, an extractor combines an Agent with a private “submit” tool whose arguments are your target type. Rig generates a JSON schema from your struct (via schemars), the model calls the submit tool with data matching that schema, and Rig deserializes the tool arguments back into your type. Because the schema is generated from the type definition by a derive macro, you never write it by hand, and the value you get back is an ordinary Rust type that the compiler checks wherever you use it.
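The flow described above can be sketched without any LLM or Rig dependency. The block below is a simplified model, not Rig's implementation: `ModelOutput`, `SketchError`, `person_from_args`, and `extract_person` are hypothetical names, and tool arguments are modeled as a plain string map rather than JSON validated against a schema.

```rust
use std::collections::BTreeMap;

/// Target type, a trimmed-down version of the `Person` struct above.
#[derive(Debug, PartialEq)]
struct Person {
    name: Option<String>,
    age: Option<u8>,
}

/// One piece of a model response: plain text or a call to a named tool.
/// (Hypothetical stand-in for a real completion response.)
enum ModelOutput {
    Text(String),
    ToolCall { name: String, args: BTreeMap<String, String> },
}

/// Failure modes of this toy extractor (not Rig's error type).
#[derive(Debug, PartialEq)]
enum SketchError {
    /// The model never called the submit tool.
    NoData,
    /// The submitted arguments did not fit the target type.
    BadArguments(String),
}

/// Convert submitted arguments into the target type, rejecting bad values.
fn person_from_args(args: &BTreeMap<String, String>) -> Result<Person, SketchError> {
    let age = match args.get("age") {
        Some(raw) => Some(
            raw.parse::<u8>()
                .map_err(|e| SketchError::BadArguments(format!("age: {e}")))?,
        ),
        None => None,
    };
    Ok(Person { name: args.get("name").cloned(), age })
}

/// Find the first call to the `submit` tool and build a `Person` from it.
fn extract_person(outputs: &[ModelOutput]) -> Result<Person, SketchError> {
    for output in outputs {
        if let ModelOutput::ToolCall { name, args } = output {
            if name == "submit" {
                return person_from_args(args);
            }
        }
    }
    Err(SketchError::NoData)
}
```

In this sketch, a response with no `submit` call yields `NoData`, and a call whose arguments cannot be converted yields `BadArguments`. These loosely correspond to the cases where the model skips the tool or submits data that does not match the type.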
Adding context and instructions
The extractor builder lets you steer the model with a custom preamble and extra context before building:
    let extractor = openai
        .extractor::<Person>("gpt-5.5")
        .preamble("Extract person details with high precision.")
        .context("Ages are given in years; ignore honorifics like 'Dr.'")
        .build();

Error handling
extract returns a Result whose error type, ExtractionError, distinguishes the failure modes you’ll want to handle:
- NoData: the model never called the submit tool, so nothing was extracted.
- DeserializationError: the submitted JSON didn’t match your type.
- PromptError: the underlying completion request failed.
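One possible way to act on these failure modes is a bounded retry that repeats only when no data came back. The sketch below is synchronous and self-contained. `ExtractionFailure` is a local stand-in that mirrors the three variants listed above, not Rig's `ExtractionError`, and `extract_with_retry` is a hypothetical helper. Which failures are worth retrying is a judgment call for your application.

```rust
/// Local stand-in mirroring the three failure modes listed above
/// (hypothetical; not Rig's `ExtractionError` itself).
#[derive(Debug, PartialEq)]
enum ExtractionFailure {
    NoData,
    Deserialization(String),
    Prompt(String),
}

/// Run `attempt` up to `max_attempts` times, retrying only on `NoData`.
/// A success or any other failure is returned immediately.
fn extract_with_retry<T>(
    max_attempts: usize,
    mut attempt: impl FnMut() -> Result<T, ExtractionFailure>,
) -> Result<T, ExtractionFailure> {
    for _ in 0..max_attempts {
        match attempt() {
            // The model skipped the submit tool; another try may succeed.
            Err(ExtractionFailure::NoData) => continue,
            // Success, or a failure that retrying is unlikely to fix.
            other => return other,
        }
    }
    // Every attempt (possibly zero of them) ended without data.
    Err(ExtractionFailure::NoData)
}
```

In real code the closure would wrap an asynchronous extract call and map its error into whatever type you retry on. The synchronous closure here only keeps the control flow visible.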
    use rig::extractor::ExtractionError;

    match extractor.extract("...").await {
        Ok(person) => { /* use person */ }
        Err(ExtractionError::NoData) => {
            eprintln!("Model did not produce structured data");
        }
        Err(err) => return Err(err.into()),
    }

Batch processing
Extractors are cheap to reuse across many inputs, so build one and call extract in a loop:
    use rig::completion::CompletionModel;
    use rig::extractor::{Extractor, ExtractionError};

    async fn process_documents<M: CompletionModel, T>(
        extractor: &Extractor<M, T>,
        docs: Vec<String>,
    ) -> Vec<Result<T, ExtractionError>>
    where
        T: serde::de::DeserializeOwned + serde::Serialize + rig::schemars::JsonSchema + Send + Sync,
    {
        let mut results = Vec::new();
        for doc in docs {
            results.push(extractor.extract(&doc).await);
        }
        results
    }

You can also feed extractors from document loaders, reading files and extracting structured records from each one:
    use rig::loaders::FileLoader;

    let docs = FileLoader::with_glob("*.txt")?.read().ignore_errors();
    let extractor = openai.extractor::<Person>("gpt-5.5").build();

    for doc in docs {
        let structured = extractor.extract(&doc).await?;
        // process structured
    }

Extractor vs. TypedPrompt
An Extractor wraps an agent and a submit tool specifically for parsing text into a type. If you already have an Agent and just want a single structured response, the TypedPrompt trait gives you the same typed-output behavior directly on the agent. Reach for extractors when structured extraction is the whole job; reach for TypedPrompt when structured output is one step in a broader agent workflow.
See also
- Completions: the TypedPrompt trait for structured agent responses
- Agents: the abstraction extractors are built on
- Tools: how the internal submit tool works
- Loaders: feed documents into an extractor
