What an embedder does
Pick an embedder
- OpenAI
- Google
- Voyage
- Ollama (local)
cognis-rag/openai.
Wrappers
You’ll usually wrap your real embedder once or twice:

| Wrapper | Why |
|---|---|
| CachedEmbeddings | Hash-keyed cache so re-indexing the same chunk is free. |
| BatchedEmbeddings | Batch many calls into provider-friendly windows. Honors rate limits. |
| EmbeddingsRouter (FnRouter, LengthRouter) | Route different docs to different embedders, e.g. short queries to a fast model, long passages to a quality model. |
| FakeEmbeddings::new(dim) | Deterministic vectors for tests. No network. |
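
To make the wrapper idea concrete, here is a minimal sketch of a hash-keyed cache around a deterministic test embedder. It uses only the standard library. The `Embedder` trait, `FakeEmbedder`, `Cached`, and `hash_of` are hypothetical names invented for this illustration, not the cognis-rag API, and the real CachedEmbeddings and FakeEmbeddings may be implemented quite differently.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for an embedder interface (not the library's trait).
pub trait Embedder {
    fn embed(&mut self, text: &str) -> Vec<f32>;
}

/// Hash any hashable value with a fixed-key hasher, so results are repeatable.
pub fn hash_of<T: Hash>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

/// Deterministic test embedder: the same text always yields the same vector.
pub struct FakeEmbedder {
    pub dim: usize,
    /// Counts how often the "model" was actually invoked.
    pub calls: usize,
}

impl Embedder for FakeEmbedder {
    fn embed(&mut self, text: &str) -> Vec<f32> {
        self.calls += 1;
        (0..self.dim)
            // Derive each component from (text, index) and map it into [0, 1).
            .map(|i| (hash_of(&(text, i)) % 10_000) as f32 / 10_000.0)
            .collect()
    }
}

/// Cache keyed by a hash of the input text.
/// Assumption: hash collisions are ignored for brevity.
pub struct Cached<E: Embedder> {
    pub inner: E,
    cache: HashMap<u64, Vec<f32>>,
}

impl<E: Embedder> Cached<E> {
    pub fn new(inner: E) -> Self {
        Self { inner, cache: HashMap::new() }
    }
}

impl<E: Embedder> Embedder for Cached<E> {
    fn embed(&mut self, text: &str) -> Vec<f32> {
        let key = hash_of(&text);
        if let Some(hit) = self.cache.get(&key) {
            return hit.clone(); // cache hit: the inner embedder is not called
        }
        let vector = self.inner.embed(text);
        self.cache.insert(key, vector.clone());
        vector
    }
}
```

Embedding the same chunk twice through `Cached` calls the inner embedder only once, which is the property that makes re-indexing unchanged content cheap.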
Pick a vector store
| Store | Hosted? | Feature | When to use |
|---|---|---|---|
| InMemoryVectorStore | no | always | Tests, prototypes, ≤ 100k chunks. |
| FaissVectorStore | local | vectorstore-faiss | Local production, no external service. |
| ChromaVectorStore | hosted | vectorstore-chroma | Self-hosted Chroma server. |
| QdrantVectorStore | hosted | vectorstore-qdrant | Production-grade, fast. |
| PineconeVectorStore | hosted | vectorstore-pinecone | Managed cloud. |
| WeaviateVectorStore | hosted | vectorstore-weaviate | When you also need symbolic / hybrid search. |
VectorStore:
Quick example — in-memory
SearchResult carries { id, text, score, metadata }. Score is provider-specific (cosine similarity for most).
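
As an illustration of what an in-memory store does under the hood, the sketch below ranks stored vectors by cosine similarity and returns results shaped like the `{ id, text, score, metadata }` record described above. `TinyStore`, `Entry`, `cosine`, and the `add` and `search` signatures are assumptions made for this example. They are not the actual InMemoryVectorStore API.

```rust
use std::collections::HashMap;

/// Mirrors the fields named in the text; the real type may differ.
#[derive(Debug, Clone)]
pub struct SearchResult {
    pub id: String,
    pub text: String,
    pub score: f32,
    pub metadata: HashMap<String, String>,
}

struct Entry {
    id: String,
    text: String,
    vector: Vec<f32>,
    metadata: HashMap<String, String>,
}

/// Hypothetical brute-force store: every search scans all entries.
#[derive(Default)]
pub struct TinyStore {
    entries: Vec<Entry>,
}

/// Cosine similarity; returns 0.0 if either vector has zero length.
pub fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

impl TinyStore {
    /// Mutation takes `&mut self` because the store owns its state.
    pub fn add(&mut self, id: &str, text: &str, vector: Vec<f32>, metadata: HashMap<String, String>) {
        self.entries.push(Entry {
            id: id.to_string(),
            text: text.to_string(),
            vector,
            metadata,
        });
    }

    /// Return the `k` entries most similar to `query`, best first.
    pub fn search(&self, query: &[f32], k: usize) -> Vec<SearchResult> {
        let mut hits: Vec<SearchResult> = self
            .entries
            .iter()
            .map(|e| SearchResult {
                id: e.id.clone(),
                text: e.text.clone(),
                score: cosine(query, &e.vector),
                metadata: e.metadata.clone(),
            })
            .collect();
        hits.sort_by(|a, b| b.score.total_cmp(&a.score));
        hits.truncate(k);
        hits
    }
}
```

A linear scan like this is only reasonable for small collections, which is consistent with reserving the in-memory store for tests and prototypes.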
Filtered search
Vector stores support metadata filters when their backend does.
Filter is a small DSL: eq, ne, in, not_in, plus and/or combinators. The store translates it to its native query language.
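
One way to picture such a DSL is as a small enum that is evaluated against a chunk's metadata. The sketch below is a hypothetical model, not the library's Filter type: the variant names, the string-only `Metadata` map, the `meta` helper, and the choice that `Ne` and `NotIn` match when a key is absent are all assumptions. A real store would translate the filter into a backend query rather than evaluate it in process.

```rust
use std::collections::HashMap;

/// Assumed metadata shape for this sketch: string keys and string values.
pub type Metadata = HashMap<String, String>;

/// Hypothetical filter tree with the operators named in the text.
#[derive(Debug, Clone)]
pub enum Filter {
    Eq(String, String),
    Ne(String, String),
    In(String, Vec<String>),
    NotIn(String, Vec<String>),
    And(Vec<Filter>),
    Or(Vec<Filter>),
}

impl Filter {
    /// Evaluate the filter against one document's metadata.
    pub fn matches(&self, meta: &Metadata) -> bool {
        match self {
            Filter::Eq(key, value) => meta.get(key) == Some(value),
            // Assumption: a missing key counts as "not equal".
            Filter::Ne(key, value) => meta.get(key) != Some(value),
            Filter::In(key, values) => meta.get(key).map_or(false, |m| values.contains(m)),
            // Assumption: a missing key counts as "not in the set".
            Filter::NotIn(key, values) => meta.get(key).map_or(true, |m| !values.contains(m)),
            Filter::And(filters) => filters.iter().all(|f| f.matches(meta)),
            Filter::Or(filters) => filters.iter().any(|f| f.matches(meta)),
        }
    }
}

/// Small helper for building metadata in examples and tests.
pub fn meta(pairs: &[(&str, &str)]) -> Metadata {
    pairs
        .iter()
        .map(|(k, v)| (k.to_string(), v.to_string()))
        .collect()
}
```

For example, `Filter::And(vec![Filter::Eq("lang".into(), "rust".into()), Filter::In("kind".into(), vec!["guide".into()])])` matches a chunk whose metadata has `lang = rust` and `kind = guide`.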
How it works
- Embedders are stateless. They turn text into floats. Wrappers add memory, batching, routing.
- Vector stores own state. That’s why they take &mut self for mutations. Wrap with Arc<RwLock<…>> to share between tasks (the indexing pipeline does this).
- Embeddings are not interchangeable. A query embedded with model A can’t be searched against vectors stored with model B. Always re-index when you change embedders.
- Dimension mismatches are caught at runtime. dimensions() lets you cross-check before persistence layouts go wrong.
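
The sharing and dimension-check points above can be sketched with the standard library alone. `DimStore` and `index_in_parallel` are hypothetical names for this illustration, and the sketch uses OS threads with `std::sync::RwLock`. An async pipeline would more likely use an async-aware lock, and the library's own error types will differ from the plain `String` used here.

```rust
use std::sync::{Arc, RwLock};
use std::thread;

/// Hypothetical store that fixes its vector dimension at construction.
pub struct DimStore {
    dim: usize,
    vectors: Vec<Vec<f32>>,
}

impl DimStore {
    pub fn new(dim: usize) -> Self {
        Self { dim, vectors: Vec::new() }
    }

    /// Lets callers cross-check an embedder's output size before writing.
    pub fn dimensions(&self) -> usize {
        self.dim
    }

    pub fn len(&self) -> usize {
        self.vectors.len()
    }

    /// Mutation needs `&mut self`; a wrong-sized vector is rejected at runtime.
    pub fn add(&mut self, vector: Vec<f32>) -> Result<(), String> {
        if vector.len() != self.dim {
            return Err(format!(
                "dimension mismatch: store expects {}, got {}",
                self.dim,
                vector.len()
            ));
        }
        self.vectors.push(vector);
        Ok(())
    }
}

/// Spawn `workers` threads that each insert one vector into the shared store.
/// Returns the number of stored vectors once all workers have finished.
pub fn index_in_parallel(store: Arc<RwLock<DimStore>>, workers: usize) -> usize {
    let handles: Vec<_> = (0..workers)
        .map(|w| {
            let store = Arc::clone(&store);
            thread::spawn(move || {
                // Read lock: many workers can check the dimension at once.
                let dim = store.read().unwrap().dimensions();
                // Write lock: exclusive access for the mutation.
                store.write().unwrap().add(vec![w as f32; dim]).unwrap();
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let stored = store.read().unwrap().len();
    stored
}
```

Readers take the shared lock and writers take the exclusive one, so concurrent searches do not block each other while indexing stays serialized.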
See also
Retrievers
Turn a vector store into a query interface.
Indexing pipeline
Keep stores in sync with sources.
Reference → cognis-rag
Full method signatures and feature flags.