Loaders

Loaders read files from disk (or from raw bytes) and turn them into text you can pass to an agent as context or embed into a vector store. They handle glob matching, directory traversal, and per-file errors, so ingestion pipelines stay fault-tolerant. Rig ships loaders for:

  • Any text file — FileLoader
  • PDFs — PdfFileLoader (behind the pdf feature)
  • ePub files — EpubFileLoader (behind the epub feature)

FileLoader handles generic text files. Point it at a glob, a directory, or raw bytes, then read the contents. Calling ignore_errors() skips files that fail to load instead of aborting the whole batch:

use rig::loaders::FileLoader;

// Glob: load every Rust file in a directory, keeping each file's path.
let examples = FileLoader::with_glob("examples/*.rs")?
    .read_with_path() // yields (PathBuf, String)
    .ignore_errors()
    .into_iter();

// Directory: load every file under a folder.
let dir_files = FileLoader::with_dir("data/")?
    .read() // yields String
    .ignore_errors();

// Bytes: load content from any source, e.g. a download.
let bytes: Vec<u8> = vec![1, 2, 3, 4];
let from_bytes = FileLoader::from_bytes(bytes);

Use read() when you only need the content and read_with_path() when you also want each file’s path (handy for labeling context).
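To make the skip-versus-abort distinction concrete, here is a minimal std-only sketch. It does not use Rig: keep_loaded and load_all_or_fail are hypothetical helpers that work on a plain Vec of Result values, and they only approximate the behavior described for ignore_errors().

```rust
use std::io;
use std::path::PathBuf;

/// Hypothetical helper (not part of Rig): keep the entries that loaded and
/// drop per-file errors, similar in spirit to `ignore_errors()`.
fn keep_loaded(
    results: Vec<Result<(PathBuf, String), io::Error>>,
) -> Vec<(PathBuf, String)> {
    results.into_iter().filter_map(Result::ok).collect()
}

/// Hypothetical helper: the strict alternative, where the first error
/// aborts the whole batch.
fn load_all_or_fail(
    results: Vec<Result<(PathBuf, String), io::Error>>,
) -> Result<Vec<(PathBuf, String)>, io::Error> {
    results.into_iter().collect()
}
```

With one unreadable file among three, keep_loaded returns the two good entries, while load_all_or_fail returns the error. Skipping suits bulk ingestion; the strict form suits cases where a missing file should stop the pipeline.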

A common use is folding loaded files into an agent’s context:

use rig::loaders::FileLoader;
use rig::agent::AgentBuilder;

let examples = FileLoader::with_glob("examples/*.rs")?
    .read_with_path()
    .ignore_errors()
    .into_iter();

// `model` is assumed to be a completion model created elsewhere.
let agent = examples
    .fold(AgentBuilder::new(model), |builder, (path, content)| {
        // Label each document with its path so the agent can tell files apart.
        builder.context(format!("Rust Example {path:?}:\n{content}").as_str())
    })
    .build();
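The fold pattern above can be tried without a model or the Rig crate. The sketch below is an illustration only: ContextBuilder is a hypothetical stand-in that merely collects context strings, and with_labeled_context is a made-up helper name.

```rust
use std::path::PathBuf;

/// Hypothetical stand-in for an agent builder: it only collects context documents.
#[derive(Default)]
struct ContextBuilder {
    documents: Vec<String>,
}

impl ContextBuilder {
    /// Consumes and returns the builder so calls can be chained or folded.
    fn context(mut self, doc: &str) -> Self {
        self.documents.push(doc.to_string());
        self
    }
}

/// Folds (path, content) pairs into the builder, labeling each document
/// with the path it was loaded from.
fn with_labeled_context(
    files: impl IntoIterator<Item = (PathBuf, String)>,
) -> ContextBuilder {
    files
        .into_iter()
        .fold(ContextBuilder::default(), |builder, (path, content)| {
            builder.context(&format!("Rust Example {path:?}:\n{content}"))
        })
}
```

Because context takes self by value and returns it, the builder threads through fold as the accumulator, so no mutable variable is needed outside the closure.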

PdfFileLoader mirrors the FileLoader API but adds PDF-specific handling such as page-by-page extraction. Use by_page() to iterate individual pages rather than whole documents:

use rig::loaders::PdfFileLoader;

let pages = PdfFileLoader::with_glob("docs/*.pdf")?
    .load_with_path() // yields (PathBuf, Document)
    .ignore_errors()
    .by_page() // group each document's pages under its path
    .ignore_errors() // yields (PathBuf, Vec<(usize, String)>)
    .into_iter();

for (path, doc_pages) in pages {
    for (page_no, text) in doc_pages {
        // embed or index each page
    }
}

PDF loading requires the pdf cargo feature. Use load() when you don’t need paths and load_with_path() when you do.
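One possible way to fill in the "embed or index each page" step is to flatten the (PathBuf, Vec<(usize, String)>) items into one record per page. This is a std-only sketch under assumptions: PageChunk, flatten_pages, and the "path#page=N" id format are hypothetical and not part of Rig.

```rust
use std::path::PathBuf;

/// Hypothetical record type for one page that is ready to embed or index.
#[derive(Debug, PartialEq)]
struct PageChunk {
    id: String,
    text: String,
}

/// Flattens per-document page lists into one chunk per non-blank page.
fn flatten_pages(docs: Vec<(PathBuf, Vec<(usize, String)>)>) -> Vec<PageChunk> {
    docs.into_iter()
        .flat_map(|(path, pages)| {
            pages.into_iter().filter_map(move |(page_no, text)| {
                let trimmed = text.trim();
                if trimmed.is_empty() {
                    return None; // skip pages with no extractable text
                }
                Some(PageChunk {
                    // Assumed id scheme: source path plus the page number as given.
                    id: format!("{}#page={}", path.display(), page_no),
                    text: trimmed.to_string(),
                })
            })
        })
        .collect()
}
```

Keeping the path and page number in the id lets a retrieval result point back to the exact page it came from.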