cognis-rag is the building-block crate for retrieval-augmented generation. It provides eight splitters, four embedding providers (plus fake, cached, batched, and router wrappers), six vector stores, eight retrievers, and the indexing pipeline that ties them together.

Crate metadata

| Field | Value |
| --- | --- |
| Latest version | 0.3 |
| docs.rs | docs.rs/cognis-rag |
| Repo path | `crates/cognis-rag` |
| Default features | `openai`, `ollama` |

Modules at a glance

| Module | What |
| --- | --- |
| `document` | `Document`, `with_id`, `with_metadata`. |
| `splitters` | `TextSplitter` trait + 8 impls. |
| `embeddings` | `Embeddings` trait + 8 impls (real, fake, cached, batched, router). |
| `vectorstore` | `VectorStore` trait + 6 impls. |
| `retrievers` | 8 retriever impls implementing `Runnable<String, Vec<Document>>`. |
| `loaders` | `DocumentLoader` trait + format-specific loaders. |
| `indexing` | `IndexingPipeline`, `IncrementalReport`. |
| `record_manager` | `RecordManager` trait + `InMemoryRecordManager`. |
| `transformers` | `LongContextReorder`, `MetadataTransformer`. |
| `cross_encoder` | `CrossEncoder`, `CrossEncoderReranker`, `FnCrossEncoder`. |
| `docstore` | `Docstore`, `InMemoryDocstore` for parent-document patterns. |
| `multi_vector` | `MultiVectorIndexer` for summary+chunk indexing. |
| `distance` | `Distance::{Cosine, Euclidean, DotProduct}`. |
| `filter` | `Filter::{eq, ne, in_, not_in, and, or}`. |
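
To make the three `distance` variants concrete, here is a minimal std-only sketch of how such metrics are commonly computed. The enum below is a hypothetical stand-in, not the crate's own type: only the variant names come from this page. The `score` method and its "larger means more similar" convention are assumptions of the sketch.

```rust
/// Hypothetical stand-in for a distance enum; only the three variant
/// names are taken from this page.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Distance {
    Cosine,
    Euclidean,
    DotProduct,
}

impl Distance {
    /// Returns a score where larger means "more similar" for every variant.
    /// That convention is an assumption of this sketch: the Euclidean
    /// distance is negated so all three variants sort the same way.
    pub fn score(self, a: &[f32], b: &[f32]) -> f32 {
        assert_eq!(a.len(), b.len(), "vectors must have equal length");
        let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
        match self {
            Distance::DotProduct => dot,
            Distance::Cosine => {
                let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
                let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
                // Define the similarity of a zero vector as 0 to avoid NaN.
                if norm_a == 0.0 || norm_b == 0.0 {
                    0.0
                } else {
                    dot / (norm_a * norm_b)
                }
            }
            Distance::Euclidean => {
                let squared: f32 = a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum();
                -squared.sqrt()
            }
        }
    }
}
```

Cosine ignores vector length, dot product rewards it, and Euclidean measures straight-line distance, so the right choice usually depends on whether the embedding model produces normalized vectors.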

Splitters

| Splitter | Constructor |
| --- | --- |
| `RecursiveCharSplitter` | `::new().with_chunk_size(n).with_overlap(n).with_separators(...)` |
| `CharacterSplitter` | `::new().with_chunk_size(n)` |
| `TokenAwareSplitter` | `::new(tokenizer).with_chunk_tokens(n)` |
| `MarkdownSplitter` | `::new()` |
| `SentenceSplitter` | `::new()` |
| `CodeSplitter` | `::new(Language::Rust)` |
| `HtmlSplitter` | `::new()` |
| `JsonSplitter` | `::new()` |

All implement `TextSplitter`: `split(&Document) -> Vec<Document>`, `split_all(&[Document]) -> Vec<Document>`.
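
The `with_chunk_size(n)` and `with_overlap(n)` options suggest a sliding-window scheme. The sketch below illustrates that idea on plain strings with std only. It is not the crate's implementation: the function name `split_with_overlap` is hypothetical, and counting in characters rather than bytes or tokens is an assumption.

```rust
/// Split `text` into windows of at most `chunk_size` characters. Each window
/// starts `chunk_size - overlap` characters after the previous one, so
/// neighbouring chunks share `overlap` characters.
/// (Hypothetical helper; sizes are counted in `char`s, not bytes or tokens.)
pub fn split_with_overlap(text: &str, chunk_size: usize, overlap: usize) -> Vec<String> {
    assert!(chunk_size > 0, "chunk_size must be positive");
    assert!(overlap < chunk_size, "overlap must be smaller than chunk_size");

    let chars: Vec<char> = text.chars().collect();
    let step = chunk_size - overlap;
    let mut chunks: Vec<String> = Vec::new();
    let mut start = 0;

    while start < chars.len() {
        let end = (start + chunk_size).min(chars.len());
        chunks.push(chars[start..end].iter().collect());
        if end == chars.len() {
            break; // the last window reached the end of the text
        }
        start += step;
    }
    chunks
}
```

For example, `split_with_overlap("abcdefghij", 4, 1)` yields `"abcd"`, `"defg"`, `"ghij"`: each chunk repeats the final character of the one before it. A recursive splitter would additionally prefer to cut at separators before falling back to fixed windows.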

Embeddings

| Type | Feature | Constructor |
| --- | --- | --- |
| `OpenAIEmbeddings` | `openai` | `::new(api_key)` or `::builder()` |
| `GoogleEmbeddings` | `google` | `::new(api_key)` or `::builder()` |
| `OllamaEmbeddings` | `ollama` | `::new(model_name)` |
| `VoyageEmbeddings` | `voyage` | `::new(api_key)` or `::builder()` |
| `FakeEmbeddings` | always | `::new(dimensions)` |
| `CachedEmbeddings` | always | `::new(inner)` |
| `BatchedEmbeddings` | always | `::new(inner).with_batch_size(n)` |
| `EmbeddingsRouter` (`FnRouter`, `LengthRouter`) | always | `::new()` builder |

All implement `Embeddings`: `embed_documents`, `embed_query`, `dimensions`, `model`.
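
The wrapper types in the table (a fake embedder, a cache around an inner embedder) follow a decorator pattern. The following std-only sketch shows that pattern under loud assumptions: the `Embedder` trait here is a simplified, synchronous stand-in, and `FakeEmbedder` and `CachedEmbedder` are hypothetical names, so none of this reflects the crate's actual signatures.

```rust
use std::cell::RefCell;
use std::collections::HashMap;

/// Simplified, synchronous stand-in for an embeddings trait (hypothetical).
pub trait Embedder {
    fn embed_query(&self, text: &str) -> Vec<f32>;
    fn dimensions(&self) -> usize;
    /// Default: embed each document independently.
    fn embed_documents(&self, texts: &[&str]) -> Vec<Vec<f32>> {
        texts.iter().map(|t| self.embed_query(t)).collect()
    }
}

/// Deterministic fake: folds the input bytes into a fixed-size vector and
/// counts how often it was called, which is handy in tests.
pub struct FakeEmbedder {
    dims: usize,
    calls: RefCell<usize>,
}

impl FakeEmbedder {
    pub fn new(dims: usize) -> Self {
        assert!(dims > 0, "dims must be positive");
        Self { dims, calls: RefCell::new(0) }
    }
    pub fn calls(&self) -> usize {
        *self.calls.borrow()
    }
}

impl Embedder for FakeEmbedder {
    fn embed_query(&self, text: &str) -> Vec<f32> {
        *self.calls.borrow_mut() += 1;
        let mut v = vec![0.0_f32; self.dims];
        for (i, byte) in text.bytes().enumerate() {
            v[i % self.dims] += byte as f32;
        }
        v
    }
    fn dimensions(&self) -> usize {
        self.dims
    }
}

/// Cache keyed by the exact input text; misses are forwarded to `inner`.
pub struct CachedEmbedder<E: Embedder> {
    inner: E,
    cache: RefCell<HashMap<String, Vec<f32>>>,
}

impl<E: Embedder> CachedEmbedder<E> {
    pub fn new(inner: E) -> Self {
        Self { inner, cache: RefCell::new(HashMap::new()) }
    }
    pub fn inner(&self) -> &E {
        &self.inner
    }
}

impl<E: Embedder> Embedder for CachedEmbedder<E> {
    fn embed_query(&self, text: &str) -> Vec<f32> {
        let cached = self.cache.borrow().get(text).cloned();
        if let Some(hit) = cached {
            return hit;
        }
        let vector = self.inner.embed_query(text);
        self.cache.borrow_mut().insert(text.to_string(), vector.clone());
        vector
    }
    fn dimensions(&self) -> usize {
        self.inner.dimensions()
    }
}
```

Because the cache implements the same trait as the embedder it wraps, it can be dropped in anywhere an embedder is expected; embedding the same text twice only reaches the inner embedder once.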

Vector stores

| Store | Feature | Notes |
| --- | --- | --- |
| `InMemoryVectorStore` | always | `::new(emb)`, `::with_distance(emb, dist)` |
| `FaissVectorStore` | `vectorstore-faiss` | Local, on-disk |
| `ChromaVectorStore` (`ChromaBuilder`) | `vectorstore-chroma` | Self-hosted Chroma |
| `QdrantVectorStore` (`QdrantBuilder`) | `vectorstore-qdrant` | Production-grade |
| `PineconeVectorStore` (`PineconeBuilder`) | `vectorstore-pinecone` | Managed cloud |
| `WeaviateVectorStore` (`WeaviateBuilder`) | `vectorstore-weaviate` | Hybrid search |

All implement `VectorStore`:
```rust
pub trait VectorStore: Send + Sync {
    async fn add_texts(&mut self, texts: Vec<String>, metadata: Option<Vec<HashMap<String, Value>>>) -> Result<Vec<String>>;
    async fn add_vectors(&mut self, vectors: Vec<Vec<f32>>, texts: Vec<String>, metadata: Option<Vec<HashMap<String, Value>>>) -> Result<Vec<String>>;
    async fn similarity_search(&self, query: &str, k: usize) -> Result<Vec<SearchResult>>;
    async fn similarity_search_by_vector(&self, query_vector: Vec<f32>, k: usize) -> Result<Vec<SearchResult>>;
    async fn similarity_search_with_filter(&self, query: &str, k: usize, filter: &Filter) -> Result<Vec<SearchResult>>;
    async fn delete(&mut self, ids: Vec<String>) -> Result<()>;
    fn len(&self) -> usize;
    fn is_empty(&self) -> bool { self.len() == 0 }
}
```
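
To show what a store behind this trait has to do, here is a minimal synchronous sketch of an in-memory store with brute-force top-k search. It is an illustration only: `TinyStore`, `Hit`, the `doc-N` id scheme, and the fixed cosine scoring are all hypothetical, and metadata, filters, and async are left out.

```rust
/// One search result (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
pub struct Hit {
    pub id: String,
    pub text: String,
    pub score: f32,
}

/// Brute-force in-memory store: every query scans all rows.
#[derive(Default)]
pub struct TinyStore {
    rows: Vec<(String, Vec<f32>, String)>, // (id, vector, text)
    next_id: usize,
}

fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

impl TinyStore {
    /// Stores pre-computed vectors with their texts and returns the new ids.
    pub fn add_vectors(&mut self, vectors: Vec<Vec<f32>>, texts: Vec<String>) -> Vec<String> {
        assert_eq!(vectors.len(), texts.len(), "one text per vector");
        let mut ids = Vec::new();
        for (vector, text) in vectors.into_iter().zip(texts) {
            let id = format!("doc-{}", self.next_id);
            self.next_id += 1;
            self.rows.push((id.clone(), vector, text));
            ids.push(id);
        }
        ids
    }

    /// Scores every row against `query` and keeps the `k` best, best first.
    pub fn similarity_search_by_vector(&self, query: &[f32], k: usize) -> Vec<Hit> {
        let mut hits: Vec<Hit> = self
            .rows
            .iter()
            .map(|(id, vector, text)| Hit {
                id: id.clone(),
                text: text.clone(),
                score: cosine(query, vector),
            })
            .collect();
        hits.sort_by(|a, b| b.score.total_cmp(&a.score));
        hits.truncate(k);
        hits
    }

    pub fn delete(&mut self, ids: &[String]) {
        self.rows.retain(|(id, _, _)| !ids.contains(id));
    }

    pub fn len(&self) -> usize {
        self.rows.len()
    }

    pub fn is_empty(&self) -> bool {
        self.len() == 0
    }
}
```

A linear scan like this is fine for tests and small corpora; index-backed stores exist precisely because it stops scaling once the row count grows.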

Retrievers

| Retriever | Notes |
| --- | --- |
| `VectorRetriever` | `::new(store).with_top_k(n).with_filter(filter)` |
| `BM25Retriever` | `::from_documents(docs).with_top_k(n)` |
| `EnsembleRetriever` | `::new().add(retriever, weight)` |
| `MultiVectorRetriever` | Summary embedding routes to chunk retrieval |
| `ParentDocumentRetriever` | Sharp chunk match → enclosing parent |
| `QueryTranslatorRetriever` | LLM rewrites the query |
| `CompressorPipeline` | Chain of transformers |
| `CachingRetriever` | Hash-keyed wrapper |
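
An ensemble that takes `(retriever, weight)` pairs has to merge several ranked lists into one. This page does not say which fusion rule `EnsembleRetriever` uses, so the sketch below swaps in weighted reciprocal rank fusion (RRF), a common choice, purely as an illustration. The function name `weighted_rrf` and the constant 60 are assumptions.

```rust
use std::collections::HashMap;

/// Fuse ranked id lists with weighted reciprocal rank fusion: a document at
/// 1-based rank `r` in a list with weight `w` contributes `w / (60 + r)`.
/// (Illustrative only; not necessarily how any particular ensemble works.)
pub fn weighted_rrf(rankings: &[(Vec<&str>, f32)], top_k: usize) -> Vec<(String, f32)> {
    const RRF_K: f32 = 60.0;
    let mut scores: HashMap<&str, f32> = HashMap::new();

    for (ranking, weight) in rankings {
        for (rank, id) in ranking.iter().enumerate() {
            // `rank` is 0-based, hence the `+ 1.0`.
            *scores.entry(*id).or_insert(0.0) += weight / (RRF_K + rank as f32 + 1.0);
        }
    }

    let mut fused: Vec<(String, f32)> = scores
        .into_iter()
        .map(|(id, score)| (id.to_string(), score))
        .collect();
    // Highest score first; ties are broken by id so the order is deterministic.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    fused.truncate(top_k);
    fused
}
```

With a vector ranking `["a", "b", "c"]` and a keyword ranking `["b", "d"]` at equal weight, `b` comes out on top because it appears in both lists, which is the behaviour that makes rank fusion useful for hybrid retrieval.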

IndexingPipeline

```rust
pub struct IndexingPipeline<L, T> { /* … */ }

impl<L: DocumentLoader, T: TextSplitter> IndexingPipeline<L, T> {
    pub fn new(loader: L, splitter: T, store: Arc<RwLock<dyn VectorStore>>) -> Self;
    pub async fn run(&self) -> Result<usize>;
    pub async fn run_incremental(
        &self,
        record_manager: &dyn RecordManager,
        group: &str,
        key_fn: impl Fn(&Document) -> Option<String>,
    ) -> Result<IncrementalReport>;
}
```

`run_incremental` returns an `IncrementalReport { added, changed, unchanged, deleted }`.
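
The four report counters suggest a diff between what was indexed last time and what the loader produces now. Here is a minimal std-only sketch of such a diff, assuming records are tracked as a map from document key to content hash. The names `Report` and `sync`, the use of `DefaultHasher`, and the rule that keys missing from the current run count as deleted are all assumptions, not the crate's behaviour.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical counterpart of an incremental report.
#[derive(Debug, Default, PartialEq)]
pub struct Report {
    pub added: usize,
    pub changed: usize,
    pub unchanged: usize,
    pub deleted: usize,
}

fn content_hash(text: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    text.hash(&mut hasher);
    hasher.finish()
}

/// Compare the current `(key, text)` documents with `records` (key -> hash
/// from the previous run), then replace `records` with the current state.
/// Keys are assumed to be unique within `docs`.
pub fn sync(records: &mut HashMap<String, u64>, docs: &[(&str, &str)]) -> Report {
    let mut report = Report::default();
    let mut seen: HashMap<String, u64> = HashMap::new();

    for (key, text) in docs {
        let hash = content_hash(text);
        match records.get(*key) {
            None => report.added += 1,
            Some(old) if *old != hash => report.changed += 1,
            Some(_) => report.unchanged += 1,
        }
        seen.insert(key.to_string(), hash);
    }

    // Anything recorded earlier but absent now is treated as deleted.
    report.deleted = records.keys().filter(|k| !seen.contains_key(*k)).count();
    *records = seen;
    report
}
```

In a real pipeline, only the added and changed documents would need to be split, embedded, and written to the store, and deleted keys would be removed from it; unchanged documents are skipped, which is where the savings come from.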

Feature flags

| Feature | Pulls in |
| --- | --- |
| `openai` | OpenAI embeddings (default). |
| `google` | Google embeddings. |
| `voyage` | Voyage embeddings. |
| `ollama` | Ollama embeddings (default). |
| `csv-loader`, `html-loader`, `yaml-loader`, `toml-loader`, `web-loader`, `pdf-loader` | Format-specific loaders. |
| `all-loaders` | All loaders. |
| `vectorstore-faiss` | FAISS local store. |
| `vectorstore-chroma`, `vectorstore-qdrant`, `vectorstore-pinecone`, `vectorstore-weaviate` | Hosted stores. |
| `all-vectorstores` | All vector stores. |

See also

- Documents and splitters: User guide for splitters.
- Embeddings: Embedders and stores.
- Retrievers: Retrieval shapes.
- Indexing: Incremental updates.