

RAG starts with documents — the unit of retrieval. A Document is text plus metadata. Before you can embed and store them, long documents need to be split into chunks small enough for the embedder’s context window and small enough that retrieval surfaces relevant pieces, not whole files.

What a document is

```rust
use cognis_rag::Document;

let doc = Document::new("Cognis is a Rust-native LLM framework.")
    .with_id("intro")
    .with_metadata("source", "readme")
    .with_metadata("section", "intro");
```
| Field | Type | Purpose |
| --- | --- | --- |
| content | String | The text. |
| id | Option<String> | Stable identifier — set this for incremental indexing. |
| metadata | HashMap<String, Value> | Arbitrary k/v pairs you control. Filter retrieval by these. |
Document loaders produce them; splitters transform them.
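To make the field table concrete, here is a dependency-free sketch of that shape. It is a hypothetical stand-in, not the real cognis_rag::Document: metadata values are plain Strings here for simplicity, where the real type uses a JSON-like Value.

```rust
use std::collections::HashMap;

// Illustrative stand-in mirroring the field table: content, optional id,
// and a string-keyed metadata map, with the same builder-style setters.
#[derive(Clone, Debug)]
struct Doc {
    content: String,
    id: Option<String>,
    metadata: HashMap<String, String>,
}

impl Doc {
    fn new(content: &str) -> Self {
        Doc { content: content.to_string(), id: None, metadata: HashMap::new() }
    }

    // Consume and return self so calls chain, as in the example above.
    fn with_id(mut self, id: &str) -> Self {
        self.id = Some(id.to_string());
        self
    }

    fn with_metadata(mut self, key: &str, value: &str) -> Self {
        self.metadata.insert(key.to_string(), value.to_string());
        self
    }
}
```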

Loading documents

The DocumentLoader trait is one async method that returns a stream of Document. Built-in loaders are feature-gated:
| Source | Feature flag |
| --- | --- |
| Plain text / files in a directory | always available |
| CSV | cognis-rag/csv-loader |
| HTML | cognis-rag/html-loader |
| YAML | cognis-rag/yaml-loader |
| TOML | cognis-rag/toml-loader |
| Web fetch | cognis-rag/web-loader |
| PDF | cognis-rag/pdf-loader |
Implement DocumentLoader for sources that aren't covered out of the box (databases, APIs, S3 buckets):
```rust
use async_trait::async_trait;
use cognis_rag::loaders::{DocumentLoader, DocumentStream};
use cognis_rag::Document;
use futures::stream;

struct MyLoader;

#[async_trait]
impl DocumentLoader for MyLoader {
    async fn load(&self) -> cognis::Result<DocumentStream> {
        let docs = vec![
            Document::new("…").with_id("a"),
            Document::new("…").with_id("b"),
        ];
        Ok(Box::pin(stream::iter(docs.into_iter().map(Ok))))
    }
}
```
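Because DocumentStream yields Result<Document>, draining it is fail-fast collection: the first error aborts the load. In async code that is futures::TryStreamExt::try_collect; the dependency-free sketch below models the same pattern with a plain iterator of Results (the drain name and String payloads are illustrative, not part of the library).

```rust
// Collecting an iterator of Result<T, E> into Result<Vec<T>, E> stops at
// the first Err and returns it, mirroring try_collect on an async stream.
fn drain<I>(items: I) -> Result<Vec<String>, String>
where
    I: IntoIterator<Item = Result<String, String>>,
{
    items.into_iter().collect()
}
```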

Splitters

All splitters implement TextSplitter:
```rust
pub trait TextSplitter: Send + Sync {
    fn split(&self, doc: &Document) -> Vec<Document>;
    fn split_all(&self, docs: &[Document]) -> Vec<Document> { /* default */ }
}
```
split returns the chunks for one document; split_all is a convenience wrapper over many.
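A toy implementation shows the shape of the trait. This sketch swaps Document for plain String to stay dependency-free, and its split_all mirrors what the provided default plausibly does: flat-map split across the slice.

```rust
// Simplified analogue of the TextSplitter trait above, over plain Strings.
trait TextSplitter: Send + Sync {
    fn split(&self, doc: &str) -> Vec<String>;

    // Default method: split every doc and flatten the results.
    fn split_all(&self, docs: &[String]) -> Vec<String> {
        docs.iter().flat_map(|d| self.split(d)).collect()
    }
}

// Smallest possible splitter: hard cuts every chunk_size bytes.
struct FixedSizeSplitter {
    chunk_size: usize,
}

impl TextSplitter for FixedSizeSplitter {
    fn split(&self, doc: &str) -> Vec<String> {
        doc.as_bytes()
            .chunks(self.chunk_size)
            .map(|c| String::from_utf8_lossy(c).into_owned())
            .collect()
    }
}
```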

Pick a splitter

| Splitter | When to use |
| --- | --- |
| RecursiveCharSplitter | Default. Tries paragraph → line → sentence → char. Good for prose. |
| CharacterSplitter | Simple split on a single separator. Predictable; fast. |
| TokenAwareSplitter | Chunk by token count using a Tokenizer. Most accurate budgeting for embedders. |
| MarkdownSplitter | Header-aware Markdown chunking — preserves H1/H2/H3 structure. |
| SentenceSplitter | Sentence boundaries. Good for short-form text. |
| CodeSplitter | Language-tuned separators (fn, class, etc.). |
| HtmlSplitter | DOM-aware HTML chunking. |
| JsonSplitter | Structured-data chunking. |
```rust
use cognis_rag::{RecursiveCharSplitter, TextSplitter};

let splitter = RecursiveCharSplitter::new()
    .with_chunk_size(1000)
    .with_overlap(100);

let chunks = splitter.split_all(&docs);
```
Builder knobs vary per splitter — most accept chunk_size, overlap, and a separators list.
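The recursive strategy behind a splitter like RecursiveCharSplitter (try coarse separators first, fall back to a hard cut only when a piece still exceeds the budget) can be sketched in a few lines. This is an illustrative simplification, not the library's implementation: byte-based sizes, no overlap, and empty pieces are dropped.

```rust
// Split text under max_len by trying separators in order: first split on
// seps[0]; any piece still too long is re-split with the remaining
// separators; with no separators left, hard-cut by bytes.
fn recursive_split(text: &str, max_len: usize, seps: &[&str]) -> Vec<String> {
    if text.len() <= max_len {
        return if text.is_empty() { vec![] } else { vec![text.to_string()] };
    }
    if let Some((sep, rest)) = seps.split_first() {
        return text
            .split(sep)
            .flat_map(|piece| recursive_split(piece, max_len, rest))
            .collect();
    }
    // No separators left: hard cut by bytes.
    text.as_bytes()
        .chunks(max_len)
        .map(|c| String::from_utf8_lossy(c).into_owned())
        .collect()
}
```

Note how a paragraph that already fits is returned whole, so coarse structure is preserved wherever the budget allows.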

How chunks land in retrieval

Each chunk inherits its parent’s metadata, plus a position-in-parent marker. So when you retrieve a chunk, you also know:
  • Which document it came from (via metadata, especially if you set with_id).
  • Roughly where in that document it sits.
  • Any custom metadata you attached.
Retrievers can filter on metadata — see Retrievers.
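The inheritance rule can be sketched as a pure function. The key names used here (parent_id, chunk_index) are illustrative assumptions, not necessarily the exact keys cognis_rag emits.

```rust
use std::collections::HashMap;

// Build a chunk's metadata: copy everything from the parent, then add a
// parent pointer and a position-in-parent marker. Key names are
// hypothetical, chosen for the sketch.
fn chunk_metadata(
    parent_id: &str,
    parent_meta: &HashMap<String, String>,
    chunk_index: usize,
) -> HashMap<String, String> {
    let mut meta = parent_meta.clone();
    meta.insert("parent_id".to_string(), parent_id.to_string());
    meta.insert("chunk_index".to_string(), chunk_index.to_string());
    meta
}
```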

Tuning chunk size

Two competing forces:
  • Smaller chunks = sharper retrieval (the right idea, not surrounding noise) but less context for the LLM to reason from.
  • Larger chunks = more context but worse signal-to-noise — the embedding averages over a lot of unrelated text.
Common starting points:
| Use case | Chunk size | Overlap |
| --- | --- | --- |
| Conversational FAQ | 200–500 chars | 50 |
| Long-form prose | 800–1200 chars | 100–200 |
| Source code | 500–1000 chars (line-aligned) | 50 |
| Markdown docs | 1000–2000 chars (header-bounded) | 0 |
Validate with retrieval evals on your own corpus — there’s no universal right answer.
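A quick way to budget: consecutive chunks advance by chunk_size minus overlap, so an N-char document yields roughly ceil((N - overlap) / (chunk_size - overlap)) chunks. A small helper (not part of cognis_rag) makes the arithmetic checkable:

```rust
// Estimate how many chunks a document of doc_len characters produces at a
// given chunk size and overlap. Each chunk after the first advances by
// stride = chunk_size - overlap.
fn approx_chunk_count(doc_len: usize, chunk_size: usize, overlap: usize) -> usize {
    assert!(overlap < chunk_size, "overlap must be smaller than chunk_size");
    if doc_len <= chunk_size {
        return if doc_len == 0 { 0 } else { 1 };
    }
    let stride = chunk_size - overlap;
    // Ceiling division of (doc_len - overlap) by stride.
    (doc_len - overlap + stride - 1) / stride
}
```

So at the long-form defaults above (1000 chars, 100 overlap), a 10,000-char document lands around 11 chunks rather than 10, because each overlap re-spends part of the budget.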

How it works

  • Splitting is lossless. No characters disappear; overlap means neighboring chunks share a window.
  • Document id is preserved through splits. Each chunk gets its own derived id (so you can de-duplicate later) but knows its parent.
  • Splitters don’t know about embedders. That decoupling means you can switch embedders without re-splitting.
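The lossless claim is mechanically checkable: with overlapping fixed-stride chunks, the original text is exactly the first chunk plus the non-overlapping tail of each later chunk. A byte-based sketch (the real splitters cut on separators, not fixed strides, but the invariant is the same):

```rust
// Fixed-stride splitter: each chunk starts overlap bytes before the
// previous one ended, so neighbors share an overlap-sized window.
fn split_with_overlap(text: &str, chunk_size: usize, overlap: usize) -> Vec<String> {
    assert!(overlap < chunk_size);
    let bytes = text.as_bytes();
    let stride = chunk_size - overlap;
    let mut chunks = Vec::new();
    let mut start = 0;
    while start < bytes.len() {
        let end = (start + chunk_size).min(bytes.len());
        chunks.push(String::from_utf8_lossy(&bytes[start..end]).into_owned());
        if end == bytes.len() {
            break;
        }
        start += stride;
    }
    chunks
}

// Rebuild the original: first chunk whole, then drop each later chunk's
// leading overlap bytes (valid here because the input is ASCII).
fn reassemble(chunks: &[String], overlap: usize) -> String {
    let mut out = String::new();
    for (i, chunk) in chunks.iter().enumerate() {
        if i == 0 {
            out.push_str(chunk);
        } else {
            out.push_str(&chunk[overlap..]);
        }
    }
    out
}
```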

See also

Embeddings and vector stores

Turn chunks into vectors, store, search.

Indexing pipeline

Keep your store in sync with the source of truth.

Patterns → Code Q&A

A worked end-to-end RAG over a Rust codebase.