You have a codebase. You want a chat assistant that can answer questions about it (“where do we authenticate users?”, “show me how Foo is constructed in tests”) with file-and-line citations. This pattern walks the full path: index the repo, retrieve, answer.
What you’ll build
A binary that takes a question, retrieves the top-K relevant code chunks, hands them to the model with the question, and prints an answer with path:line citations.
How it works
- Walk the repo with a custom DocumentLoader. Every file becomes a Document; the path goes in metadata.
- Split with CodeSplitter: language-tuned separators keep functions and types together.
- Embed and store: OpenAIEmbeddings (or OllamaEmbeddings for local) into an InMemoryVectorStore. Use IndexingPipeline if you’ll re-index regularly.
- Retrieve with a VectorRetriever, optionally re-rank with a cross-encoder.
- Answer by stuffing the retrieved chunks into a prompt with the question.
Step 1 — Walk the repo
walkdir is a small dependency that many Rust projects already pull in; swap it for git2 if the walk should honor .gitignore.
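For readers who want to see the shape of the walk without any dependencies, here is a minimal sketch using only std::fs. It is not the library's DocumentLoader: `SourceDoc`, `load_repo`, the "path" metadata key, and the hard-coded skip list (`.git`, `target`) are all assumptions made for illustration.

```rust
use std::collections::HashMap;
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical stand-in for a document: text plus string metadata.
#[derive(Debug, Clone)]
pub struct SourceDoc {
    pub content: String,
    pub metadata: HashMap<String, String>,
}

/// Recursively collect files under `root` whose extension is listed in `exts`.
/// The file path is stored in metadata under "path", not in the content.
pub fn load_repo(root: &Path, exts: &[&str]) -> io::Result<Vec<SourceDoc>> {
    let mut docs = Vec::new();
    let mut stack = vec![root.to_path_buf()];
    while let Some(dir) = stack.pop() {
        for entry in fs::read_dir(&dir)? {
            let path = entry?.path();
            if path.is_dir() {
                // Skip VCS data and build output. This is a fixed list,
                // not real .gitignore handling.
                let skip = matches!(
                    path.file_name().and_then(|n| n.to_str()),
                    Some(".git") | Some("target")
                );
                if !skip {
                    stack.push(path);
                }
                continue;
            }
            let ext = path.extension().and_then(|e| e.to_str()).unwrap_or("");
            if !exts.contains(&ext) {
                continue;
            }
            // Skip unreadable or non-UTF-8 files instead of failing the walk.
            let Ok(content) = fs::read_to_string(&path) else { continue };
            let mut metadata = HashMap::new();
            metadata.insert("path".to_string(), path.display().to_string());
            docs.push(SourceDoc { content, metadata });
        }
    }
    // Sort so the result does not depend on directory iteration order.
    docs.sort_by(|a, b| a.metadata["path"].cmp(&b.metadata["path"]));
    Ok(docs)
}
```

Calling `load_repo(Path::new("."), &["rs", "toml"])` would return one `SourceDoc` per matching file. Because the path lives in metadata, any later splitting step can copy it onto each chunk unchanged.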
Step 2 — Index
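The splitting half of indexing can be illustrated without the library. The block below is not CodeSplitter; it is a rough stand-in that assumes Rust source, a hypothetical `Chunk` type, and a short illustrative list of top-level item prefixes. What it shows is the idea that matters for citations: each chunk carries the file path and the line it starts on.

```rust
/// Hypothetical chunk type: text plus where it came from.
#[derive(Debug, Clone, PartialEq)]
pub struct Chunk {
    pub path: String,
    pub start_line: usize, // 1-based line number of the chunk's first line
    pub text: String,
}

/// Line prefixes treated as the start of a new top-level item.
/// Illustrative only; a language-tuned splitter would use a richer set.
const BOUNDARIES: &[&str] = &[
    "fn ", "pub fn ", "struct ", "pub struct ", "enum ", "pub enum ",
    "impl ", "trait ", "pub trait ",
];

/// Split `source` at top-level item boundaries. Every chunk records `path`
/// and its starting line so an answer can cite `path:line`.
pub fn split_code(path: &str, source: &str) -> Vec<Chunk> {
    let mut chunks = Vec::new();
    let mut current = String::new();
    let mut start_line = 1;
    for (i, line) in source.lines().enumerate() {
        if BOUNDARIES.iter().any(|b| line.starts_with(*b)) {
            if current.trim().is_empty() {
                // Only blank lines so far: drop them rather than emit a chunk.
                current.clear();
            } else {
                chunks.push(Chunk {
                    path: path.to_string(),
                    start_line,
                    text: std::mem::take(&mut current),
                });
            }
            start_line = i + 1;
        }
        current.push_str(line);
        current.push('\n');
    }
    if !current.trim().is_empty() {
        chunks.push(Chunk { path: path.to_string(), start_line, text: current });
    }
    chunks
}
```

One known limitation of this sketch: doc comments and attributes written above an item stay with the previous chunk, because only the item's own first line counts as a boundary.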
Step 3 — Retrieve and answer
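Retrieval and prompt assembly reduce to two small operations: rank stored chunks by similarity to the query embedding, then number the winners so the model can cite them. The sketch below assumes embeddings have already been computed elsewhere; `Entry`, `cosine`, `top_k`, `build_prompt`, and the prompt wording are hypothetical names and choices, not the library's VectorRetriever or its prompt template.

```rust
/// Hypothetical stored entry: an embedding plus the chunk it describes.
#[derive(Debug, Clone)]
pub struct Entry {
    pub embedding: Vec<f32>,
    pub path: String,
    pub start_line: usize,
    pub text: String,
}

/// Cosine similarity; returns 0.0 when either vector has zero norm.
pub fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

/// Return the `k` entries most similar to `query`, best first.
pub fn top_k<'a>(store: &'a [Entry], query: &[f32], k: usize) -> Vec<&'a Entry> {
    let mut scored: Vec<(f32, &'a Entry)> = store
        .iter()
        .map(|e| (cosine(&e.embedding, query), e))
        .collect();
    // Descending by score; total_cmp gives a total order over f32.
    scored.sort_by(|a, b| b.0.total_cmp(&a.0));
    scored.into_iter().take(k).map(|(_, e)| e).collect()
}

/// Number each retrieved chunk and ask for `[N]` citations explicitly.
pub fn build_prompt(question: &str, hits: &[&Entry]) -> String {
    let mut prompt = String::from(
        "Answer using only the sources below. \
         Cite each claim with its source number, like [1].\n\n",
    );
    for (i, hit) in hits.iter().enumerate() {
        prompt.push_str(&format!(
            "[{}] {}:{}\n{}\n\n",
            i + 1,
            hit.path,
            hit.start_line,
            hit.text
        ));
    }
    prompt.push_str(&format!("Question: {question}\n"));
    prompt
}
```

The resulting string would be sent to the chat model as the user message. Printing the same `[N] path:line` headers next to the answer lets a reader resolve each citation by eye.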
How it works
- Path goes in metadata, not content. Keeping the file path as a metadata field means it survives splitting: every chunk knows where it came from.
- CodeSplitter respects function and type boundaries. Better than character splitting because it keeps coherent units together; the embedder produces sharper signals.
- InMemoryRecordManager fingerprints by file id. Edit a file, rerun the pipeline, only that file re-embeds.
- The prompt asks for citations explicitly. Models that aren’t told to cite usually don’t. Inline [N] references map back to the file paths you printed alongside the chunks.
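The fingerprinting idea behind incremental re-indexing can be sketched in a few lines. This is not InMemoryRecordManager itself; `Fingerprints` and `changed` are hypothetical names, and hashing the full file content with std's `DefaultHasher` is an assumption about how such a fingerprint might be computed.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical in-memory record manager: file id -> content fingerprint.
#[derive(Default)]
pub struct Fingerprints {
    seen: HashMap<String, u64>,
}

impl Fingerprints {
    /// Record each file's current fingerprint and return the ids whose
    /// content is new or has changed since the previous call.
    pub fn changed(&mut self, files: &[(&str, &str)]) -> Vec<String> {
        let mut out = Vec::new();
        for (id, content) in files {
            let mut hasher = DefaultHasher::new();
            content.hash(&mut hasher);
            let fingerprint = hasher.finish();
            // `insert` returns the previous value, if any; re-embed only
            // when it differs from the new fingerprint.
            if self.seen.insert(id.to_string(), fingerprint) != Some(fingerprint) {
                out.push(id.to_string());
            }
        }
        out
    }
}
```

Only the ids returned by `changed` need to be split and embedded again. `DefaultHasher` output is not guaranteed to be stable across Rust releases, so a fingerprint that is persisted between runs would want a fixed, documented hash function instead.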
Make it better
| Improvement | What to add |
|---|---|
| Hybrid retrieval | Combine VectorRetriever with BM25Retriever via EnsembleRetriever — exact-name matches improve a lot. |
| Reranking | Add a CrossEncoderReranker to trim the top-K to the truly relevant. |
| Persistent store | Swap InMemoryVectorStore for FAISS (vectorstore-faiss feature) so an index survives restarts. |
| Persistent record manager | Implement RecordManager over SQLite so incremental works across processes. |
| Streaming | client.stream(messages) and print tokens as they arrive — see Streaming. |
| Eval | Build a small set of known questions with expected source files; score retrieval recall@K. |
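The Eval row can be made concrete with a small scoring function. This is a sketch under stated assumptions: `EvalCase` and `recall_at_k` are hypothetical names, retrieval results are reduced to file paths, and recall is averaged per question (one of several reasonable definitions).

```rust
/// One eval case: the files retrieval returned (best first) and the files
/// a correct answer is expected to draw on.
pub struct EvalCase<'a> {
    pub retrieved: Vec<&'a str>,
    pub expected: Vec<&'a str>,
}

/// For each case, the fraction of expected files found in the top `k`
/// retrieved files; the result is the mean over cases.
/// Cases with no expected files are skipped.
pub fn recall_at_k(cases: &[EvalCase<'_>], k: usize) -> f64 {
    let mut total = 0.0;
    let mut counted = 0usize;
    for case in cases {
        if case.expected.is_empty() {
            continue;
        }
        let top = &case.retrieved[..k.min(case.retrieved.len())];
        let found = case.expected.iter().filter(|e| top.contains(*e)).count();
        total += found as f64 / case.expected.len() as f64;
        counted += 1;
    }
    if counted == 0 { 0.0 } else { total / counted as f64 }
}
```

Tracking this number while changing the splitter, the embedding model, or K gives a quick signal on whether retrieval improved before any answers are read by hand.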
See also
Documents and splitters
Why CodeSplitter matters.
Indexing pipeline
Incremental updates as the repo changes.
Reranking and compression
Sharpen retrieval beyond the top-K.