The default agent is amnesiac: each run starts fresh. For chat, you want continuity: the user's name, the topic at hand, the last 10 turns. This pattern wires SummaryBufferMemory into the agent and stores conversation state in a graph checkpointer so a restart doesn't lose the thread.
What you’ll build
A chat loop where each user message goes through the same agent, the agent remembers earlier turns, and the conversation state survives process restarts.

How it works
- `stateful()` keeps the agent's memory across `run` calls in a single process.
- `SummaryBufferMemory` trims older turns into a running summary so the prompt stays bounded.
- A `Checkpointer` persists the agent's underlying graph state, keyed on a `thread_id`.
- Resuming is automatic: use the same `thread_id` on the next request and the loop picks up where it left off.
In-process version
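The framework's exact API isn't reproduced here, but the in-process shape can be sketched in plain std Rust. The `SummaryBuffer` and `Turn` types and the character budget below are illustrative stand-ins: the real `SummaryBufferMemory` calls the LLM to compress evicted turns instead of concatenating them.

```rust
// Minimal in-process sketch of a summary-buffer chat memory.
// `SummaryBuffer`, `Turn`, and the char-count budget are illustrative
// stand-ins for SummaryBufferMemory, which would call an LLM to
// compress older turns instead of naively concatenating them.

#[derive(Clone)]
struct Turn {
    role: &'static str,
    text: String,
}

struct SummaryBuffer {
    summary: String,    // running compression of evicted turns
    recent: Vec<Turn>,  // verbatim recent turns
    char_budget: usize, // crude stand-in for a token budget
}

impl SummaryBuffer {
    fn new(char_budget: usize) -> Self {
        Self { summary: String::new(), recent: Vec::new(), char_budget }
    }

    fn add(&mut self, role: &'static str, text: &str) {
        self.recent.push(Turn { role, text: text.to_string() });
        // When the verbatim buffer overflows, fold the oldest turn
        // into the summary. A real implementation calls the LLM here.
        while self.recent.iter().map(|t| t.text.len()).sum::<usize>() > self.char_budget {
            let old = self.recent.remove(0);
            self.summary.push_str(&format!("[{}: {}] ", old.role, old.text));
        }
    }

    // What the model would see on the next turn: summary + recent turns.
    fn prompt_context(&self) -> String {
        let mut out = String::new();
        if !self.summary.is_empty() {
            out.push_str(&format!("Summary so far: {}\n", self.summary.trim_end()));
        }
        for t in &self.recent {
            out.push_str(&format!("{}: {}\n", t.role, t.text));
        }
        out
    }
}

fn main() {
    let mut mem = SummaryBuffer::new(40);
    mem.add("user", "My name is Ada and I work on compilers.");
    mem.add("assistant", "Nice to meet you, Ada!");
    mem.add("user", "What did I say my name was?");
    // The oldest turns were folded into the summary, but the
    // information is still visible to the model.
    assert!(mem.prompt_context().contains("Ada"));
    println!("{}", mem.prompt_context());
}
```

The key property to notice: older turns leave the verbatim buffer but their content survives in the summary, so the prompt size stays bounded without the model forgetting.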
Persistent across restarts
To survive a process restart (common in production, where the chat loop lives behind an HTTP server), persist the agent's graph state in a checkpointer keyed by a session id. `with_thread_id` (set on the config) tells the checkpointer which conversation this is. Same thread id on the next request → state restored.
For multi-process deployments, swap `SqliteCheckpointer` for `PostgresCheckpointer` so several workers share the same store.
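The checkpointer contract is small: save state under a thread id, load it back on the next request. A self-contained sketch of that contract, using std only — the `FileCheckpointer` type and its one-file-per-thread layout are invented for illustration, and it stores a plain string where the real `SqliteCheckpointer` persists full graph state:

```rust
use std::fs;
use std::path::PathBuf;

// Illustrative stand-in for a checkpointer: persists conversation
// state to one file per thread_id so a restart can restore it.
// (A real SqliteCheckpointer stores the full graph state; this
// sketch keeps a plain string to show the keying contract only.)
struct FileCheckpointer {
    dir: PathBuf,
}

impl FileCheckpointer {
    fn new(dir: impl Into<PathBuf>) -> std::io::Result<Self> {
        let dir = dir.into();
        fs::create_dir_all(&dir)?;
        Ok(Self { dir })
    }

    fn path(&self, thread_id: &str) -> PathBuf {
        self.dir.join(format!("{thread_id}.ckpt"))
    }

    // Persist the state for this conversation.
    fn save(&self, thread_id: &str, state: &str) -> std::io::Result<()> {
        fs::write(self.path(thread_id), state)
    }

    // Restore state, or None if this thread has no checkpoint yet.
    fn load(&self, thread_id: &str) -> Option<String> {
        fs::read_to_string(self.path(thread_id)).ok()
    }
}

fn main() -> std::io::Result<()> {
    let ckpt = FileCheckpointer::new(std::env::temp_dir().join("chat_ckpts"))?;

    // First request for this session: save after the turn.
    ckpt.save("session-42", "summary: user is Ada; 3 turns so far")?;

    // After a process restart, the same thread_id restores the state.
    let restored = ckpt.load("session-42");
    assert_eq!(restored.as_deref(), Some("summary: user is Ada; 3 turns so far"));

    // A different thread_id is a different conversation.
    assert!(ckpt.load("some-other-session").is_none());
    Ok(())
}
```

Because state lives outside the process, any worker that can reach the store can resume any conversation — which is exactly why the multi-process setup swaps the file/SQLite store for Postgres.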
Picking a memory variant
Different shapes for different chat profiles:

| Use case | Memory |
|---|---|
| Short FAQ-style chat | `Window::new(20)` |
| Long support sessions | `SummaryBufferMemory::new(client, 2000)` |
| Customer profile that should survive sessions | `EntityMemory` + a separate persistent store |
| Knowledge-graph-style memory across many sessions | `KnowledgeGraphMemory` |
| Combined recency + semantic recall | `HybridMemory::new().with(Buffer::new()).with(VectorMemory::new(...))` |
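The hybrid row in the table combines a recency buffer with semantic recall. A minimal sketch of the composition idea follows — the `MemorySource` trait, `Buffer`, and `KeywordMemory` types are invented for illustration, and keyword overlap stands in for real vector similarity search:

```rust
// Sketch of composing memories: each source contributes context for
// the current query. The trait and the keyword "search" are
// illustrative; a real HybridMemory would use vector similarity.
trait MemorySource {
    fn recall(&self, query: &str) -> Vec<String>;
}

// Recency: always contributes the last `window` turns.
struct Buffer {
    turns: Vec<String>,
    window: usize,
}

impl MemorySource for Buffer {
    fn recall(&self, _query: &str) -> Vec<String> {
        let start = self.turns.len().saturating_sub(self.window);
        self.turns[start..].to_vec()
    }
}

// "Semantic" recall: naive keyword overlap instead of embeddings.
struct KeywordMemory {
    facts: Vec<String>,
}

impl MemorySource for KeywordMemory {
    fn recall(&self, query: &str) -> Vec<String> {
        self.facts
            .iter()
            .filter(|f| {
                query
                    .split_whitespace()
                    .any(|w| f.to_lowercase().contains(&w.to_lowercase()))
            })
            .cloned()
            .collect()
    }
}

// Hybrid: concatenate what every source recalls.
struct Hybrid {
    sources: Vec<Box<dyn MemorySource>>,
}

impl Hybrid {
    fn recall(&self, query: &str) -> Vec<String> {
        self.sources.iter().flat_map(|s| s.recall(query)).collect()
    }
}

fn main() {
    let hybrid = Hybrid {
        sources: vec![
            Box::new(Buffer {
                turns: vec!["turn 1".into(), "turn 2".into(), "turn 3".into()],
                window: 2,
            }),
            Box::new(KeywordMemory {
                facts: vec!["Ada prefers dark mode".into()],
            }),
        ],
    };
    let ctx = hybrid.recall("What theme does Ada like?");
    // Recent turns plus the semantically relevant stored fact.
    assert_eq!(ctx, vec!["turn 2", "turn 3", "Ada prefers dark mode"]);
}
```

The design point: each source answers the same question ("what should the model see for this query?") and the hybrid just concatenates, so adding a new recall strategy doesn't touch the others.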
How it works
- Memory and checkpoints are different layers. Memory shapes what the model sees on each turn; checkpoints persist the underlying graph state. You usually want both.
- `SummaryBufferMemory` calls the LLM to compress older turns. Budget for that: it's a small extra cost per turn, paid only when the buffer overflows.
- `thread_id` is the unit of conversation. Different users → different ids. The same user across devices → same id (with whatever auth check fits your model).
- Resume is exact. The checkpointer restores the same state, including pinned system messages and the running summary.
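The two layers compose in one request handler: memory shapes the prompt each turn, and a checkpoint save/restore wraps the turn. A self-contained sketch of that composition — `ChatState`, `handle_turn`, and the `HashMap` store are invented for illustration (the map stands in for a real checkpointer backend):

```rust
use std::collections::HashMap;

// Conversation state: both the memory layer (what the model sees)
// and what the checkpointer persists between requests.
#[derive(Clone, Default)]
struct ChatState {
    summary: String,
    recent: Vec<String>,
}

// In-memory stand-in for a checkpointer store keyed by thread_id.
type Store = HashMap<String, ChatState>;

// One request: restore by thread_id, apply the turn, checkpoint back.
fn handle_turn(store: &mut Store, thread_id: &str, user_msg: &str) -> String {
    // Resume: the same thread_id restores prior state (or starts fresh).
    let mut state = store.get(thread_id).cloned().unwrap_or_default();

    // Memory layer: keep the last 2 turns verbatim, fold older ones
    // into the summary (a real SummaryBufferMemory calls the LLM here).
    state.recent.push(user_msg.to_string());
    while state.recent.len() > 2 {
        let old = state.recent.remove(0);
        state.summary.push_str(&old);
        state.summary.push_str("; ");
    }

    // What the model would see on this turn.
    let prompt = format!("summary: [{}] recent: {:?}", state.summary, state.recent);

    // Checkpoint after the turn so a restart resumes exactly here.
    store.insert(thread_id.to_string(), state);
    prompt
}

fn main() {
    let mut store = Store::new();
    handle_turn(&mut store, "t1", "hello");
    handle_turn(&mut store, "t1", "my name is Ada");
    let p = handle_turn(&mut store, "t1", "what's my name?");
    // The oldest turn now lives in the summary, not the verbatim buffer.
    assert!(p.contains("hello; "));

    // A different thread_id is an independent conversation.
    let q = handle_turn(&mut store, "t2", "hi");
    assert!(!q.contains("Ada"));
}
```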
See also
Memory
The full memory variant catalog.
Checkpointing
Persisting graph state across processes.
Patterns → Streaming UI
Stream chat tokens to the frontend.