Loaders

Loaders read files from disk (or from raw bytes) and turn them into text you can pass to an agent as context or embed into a vector store. They handle glob matching, directory traversal, and per-file errors, so an ingestion pipeline can skip a bad file instead of failing outright. Rig ships loaders for:

Any text file — FileLoader
PDFs — PdfFileLoader (behind the pdf feature)
ePub files — EpubFileLoader (behind the epub feature)

FileLoader

FileLoader handles generic text files. Point it at a glob, a directory, or raw bytes, then read the contents — ignore_errors() skips files that fail to load instead of aborting the whole batch:

use rig::loaders::FileLoader;

// Glob: load every Rust file in a directory, keeping each file's path.
let examples = FileLoader::with_glob("examples/*.rs")?
    .read_with_path()   // yields (PathBuf, String)
    .ignore_errors()
    .into_iter();

// Directory: load every file under a folder.
let dir_files = FileLoader::with_dir("data/")?
    .read()             // yields String
    .ignore_errors();

// Bytes: load content from any source, e.g. a download.
// The bytes should decode to text, since FileLoader produces strings.
let bytes: Vec<u8> = b"Hello from a download".to_vec();
let from_bytes = FileLoader::from_bytes(bytes);

Use read() when you only need the content and read_with_path() when you also want each file’s path (handy for labeling context).
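To make the skip-on-failure behavior concrete, here is a minimal std-only sketch of the same idea. It does not use Rig and is not Rig's implementation: the free functions read_with_path and ignore_errors below are hypothetical stand-ins that borrow the method names, and each item is modeled as an io::Result for illustration.

```rust
use std::fs;
use std::io;
use std::path::PathBuf;

/// Hypothetical stand-in: read every path and keep the path next to its
/// content. Each item is a Result, so one unreadable file does not stop
/// the rest from being read.
fn read_with_path(paths: &[PathBuf]) -> Vec<io::Result<(PathBuf, String)>> {
    paths
        .iter()
        .map(|p| fs::read_to_string(p).map(|content| (p.clone(), content)))
        .collect()
}

/// Hypothetical stand-in: drop the failed items and unwrap the successful
/// ones, which is the "skip instead of abort" behavior described above.
fn ignore_errors<T>(results: Vec<io::Result<T>>) -> Vec<T> {
    results.into_iter().filter_map(Result::ok).collect()
}
```

Calling ignore_errors(read_with_path(&paths)) on a list that contains one readable file and one missing file returns a single (path, content) pair. The trade-off is that failures disappear silently, so keep the Result values when you need to log or count the files that failed.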

Loading agent context

A common use is folding loaded files into an agent’s context:

use rig::loaders::FileLoader;
use rig::agent::AgentBuilder;

let examples = FileLoader::with_glob("examples/*.rs")?
    .read_with_path()
    .ignore_errors()
    .into_iter();

// `model` is a completion model created elsewhere, e.g. from a provider client.
let agent = examples
    .fold(AgentBuilder::new(model), |builder, (path, content)| {
        builder.context(format!("Rust Example {path:?}:\n{content}").as_str())
    })
    .build();
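The fold works because each call to context() consumes the builder and hands it back, so the builder can serve as the accumulator. The std-only sketch below shows that shape in isolation. SketchBuilder and build_contexts are hypothetical names used only for illustration and are not part of Rig.

```rust
use std::path::PathBuf;

/// Hypothetical stand-in for an agent builder. `context` takes `self` by
/// value and returns it, which is what lets `fold` thread it through.
#[derive(Default)]
struct SketchBuilder {
    contexts: Vec<String>,
}

impl SketchBuilder {
    fn context(mut self, doc: &str) -> Self {
        self.contexts.push(doc.to_string());
        self
    }
}

/// Fold (path, content) pairs into the builder, labeling each context
/// document with the path of the file it came from.
fn build_contexts(files: Vec<(PathBuf, String)>) -> Vec<String> {
    files
        .into_iter()
        .fold(SketchBuilder::default(), |builder, (path, content)| {
            builder.context(&format!("Rust Example {path:?}:\n{content}"))
        })
        .contexts
}
```

Labeling each document with its path lets the model tell the files apart and refer back to a specific one in its answer.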

PdfFileLoader

PdfFileLoader mirrors the FileLoader API but adds PDF-specific handling such as page-by-page extraction. Use by_page() to iterate individual pages rather than whole documents:

use rig::loaders::PdfFileLoader;

let pages = PdfFileLoader::with_glob("docs/*.pdf")?
    .load_with_path()   // yields (PathBuf, Document)
    .ignore_errors()
    .by_page()          // group each document's pages under its path
    .ignore_errors()    // yields (PathBuf, Vec<(usize, String)>)
    .into_iter();

for (path, doc_pages) in pages {
    for (page_no, text) in doc_pages {
        // embed or index each page
    }
}

PDF loading requires the pdf cargo feature. Use load() when you don’t need paths and load_with_path() when you do.
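Once pages arrive grouped by path, a common next step is to flatten them into one record per page before embedding or indexing. The sketch below is std-only and takes the (PathBuf, Vec<(usize, String)>) shape shown above as its input. PageChunk, flatten_pages, and the "path#page=N" id format are hypothetical choices for illustration, and skipping whitespace-only pages is an assumption rather than something Rig does for you.

```rust
use std::path::PathBuf;

/// Hypothetical record for one page of one document, ready to embed or index.
#[derive(Debug, PartialEq)]
struct PageChunk {
    id: String,
    text: String,
}

/// Flatten per-document page lists into one chunk per page.
/// Page numbers are used exactly as given, and pages that contain only
/// whitespace are skipped (an assumption made for this sketch).
fn flatten_pages(docs: Vec<(PathBuf, Vec<(usize, String)>)>) -> Vec<PageChunk> {
    docs.into_iter()
        .flat_map(|(path, pages)| {
            pages.into_iter().filter_map(move |(page_no, text)| {
                let text = text.trim();
                if text.is_empty() {
                    return None;
                }
                Some(PageChunk {
                    id: format!("{}#page={}", path.display(), page_no),
                    text: text.to_string(),
                })
            })
        })
        .collect()
}
```

Keeping the path and page number in the id means a retrieved chunk can be traced back to the exact page it came from.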

Next steps

Vector Stores & RAG: Embed and index the documents you just loaded, then retrieve them at query time.

Embeddings: Turn loaded file content into vectors before indexing it.

Structured Output: Extract typed data from the raw text your loaders produce.

Agents: Fold loaded files into an agent's context, as shown above.
