A checkpointer turns a graph from a one-shot computation into something you can pause, inspect, edit, and resume. It’s also the foundation for human-in-the-loop (which needs resume) and for production durability (you survive process restarts).
What a checkpointer is
| Checkpointer | Backed by | Feature flag |
|---|---|---|
| InMemoryCheckpointer | a process-local map | always |
| SqliteCheckpointer | a SQLite file | cognis-graph/sqlite |
| PostgresCheckpointer | a Postgres database | cognis-graph/postgres |
Quick example
examples/v2/05_checkpoint_resume.rs.
Inspecting state
Compiled graphs expose the inspection surface directly:
Editing state
Sometimes the human in the loop should fix the state before resuming: correct a typo, drop a tool result, change a counter. update_state writes a new snapshot at a given step:
resume(run_id, step, state, cfg) reads from this updated state, so the rewind is real.
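To make the edit-then-resume idea concrete, here is a minimal, standard-library-only sketch. It is not the cognis API: ToyStore, CounterState, save, and load are hypothetical names, and the toy assumes an edit replaces the snapshot at that step (the real store may keep both versions).

```rust
use std::collections::BTreeMap;

/// Hypothetical state type, for illustration only.
#[derive(Clone, Debug, PartialEq)]
struct CounterState {
    count: u32,
    note: String,
}

/// Toy snapshot store keyed by (run_id, step). Not the cognis API.
#[derive(Default)]
struct ToyStore {
    snapshots: BTreeMap<(String, u64), CounterState>,
}

impl ToyStore {
    /// Record the state produced at `step`.
    fn save(&mut self, run_id: &str, step: u64, state: CounterState) {
        self.snapshots.insert((run_id.to_string(), step), state);
    }

    /// Write a human-edited snapshot at `step`.
    /// Assumption: in this toy the edit replaces the earlier snapshot.
    fn update_state(&mut self, run_id: &str, step: u64, state: CounterState) {
        self.snapshots.insert((run_id.to_string(), step), state);
    }

    /// Load the snapshot a resume at `step` would start from.
    fn load(&self, run_id: &str, step: u64) -> Option<&CounterState> {
        self.snapshots.get(&(run_id.to_string(), step))
    }
}

/// Save a snapshot, edit it, and return what a resume at that step would see.
fn edit_then_load() -> CounterState {
    let mut store = ToyStore::default();
    store.save("run-1", 2, CounterState { count: 7, note: "teh answer".into() });

    // The human fixes the typo and resets the counter before resuming.
    store.update_state("run-1", 2, CounterState { count: 0, note: "the answer".into() });

    store.load("run-1", 2).cloned().expect("snapshot exists")
}
```

Calling edit_then_load() returns the edited state, not the one the graph originally produced, which is the property the text relies on when it says the rewind is real.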
Resume after an interrupt
When a graph pauses (because of with_interrupt_before / with_interrupt_after), invoke returns Err(CognisError::GraphInterrupted { kind, step, .. }). That’s not a failure; it’s a pause. The shape:
kind tells you whether you stopped before or after the named node. The step is what you pass back to resume.
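The pause-versus-failure distinction can be sketched with stand-in types. Everything below is hypothetical and only mirrors the names used in the text: ToyError is not CognisError, the node field and the Other variant are assumptions, and Outcome is an illustrative helper for the caller's decision.

```rust
/// Hypothetical stand-in for the interrupt kind described in the text.
#[derive(Debug, Clone, Copy, PartialEq)]
enum InterruptKind {
    Before,
    After,
}

/// Hypothetical error type; the real CognisError has its own variants and fields.
#[derive(Debug)]
enum ToyError {
    GraphInterrupted { kind: InterruptKind, step: u64, node: String },
    Other(String),
}

/// What the caller decides to do after an invoke attempt.
#[derive(Debug, PartialEq)]
enum Outcome {
    Done(i64),
    ResumeFrom { step: u64, kind: InterruptKind },
    Failed(String),
}

/// Treat an interrupt as a pause to resume from, and anything else as a real failure.
fn classify(result: Result<i64, ToyError>) -> Outcome {
    match result {
        Ok(value) => Outcome::Done(value),
        // `..` ignores fields the caller does not need, as in the pattern from the text.
        Err(ToyError::GraphInterrupted { kind, step, .. }) => Outcome::ResumeFrom { step, kind },
        Err(ToyError::Other(msg)) => Outcome::Failed(msg),
    }
}
```

Matching the interrupt variant before any catch-all arm keeps a pause from being logged or retried as an error.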
Choosing a backend
| Use case | Pick |
|---|---|
| Tests, ephemeral demos | InMemoryCheckpointer |
| Single-process service, durable across restarts | SqliteCheckpointer |
| Multi-process service, shared state | PostgresCheckpointer |
| Anything custom (Redis, S3, your own DB) | implement Checkpointer<S> |
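For the custom-backend row, the sketch below shows the general shape such an implementation might take. The trait here is hypothetical: ToyCheckpointer, its three methods, and StringBackend are illustrative names, and the real Checkpointer<S> trait may differ (for example, it may be async or use the library's own error type).

```rust
use std::collections::HashMap;

/// Hypothetical trait shape, for illustration only.
trait ToyCheckpointer<S> {
    fn put(&mut self, run_id: &str, step: u64, state: &S) -> Result<(), String>;
    fn get(&self, run_id: &str, step: u64) -> Result<Option<S>, String>;
    fn latest_step(&self, run_id: &str) -> Option<u64>;
}

/// A toy backend that stores state as text, the way an external store would
/// hold serialized bytes rather than live values.
struct StringBackend {
    rows: HashMap<(String, u64), String>,
}

impl ToyCheckpointer<u32> for StringBackend {
    fn put(&mut self, run_id: &str, step: u64, state: &u32) -> Result<(), String> {
        // "Serialize" on the way in.
        self.rows.insert((run_id.to_string(), step), state.to_string());
        Ok(())
    }

    fn get(&self, run_id: &str, step: u64) -> Result<Option<u32>, String> {
        // "Deserialize" on the way out; a corrupt row surfaces as an error.
        match self.rows.get(&(run_id.to_string(), step)) {
            Some(raw) => raw.parse::<u32>().map(Some).map_err(|e| e.to_string()),
            None => Ok(None),
        }
    }

    fn latest_step(&self, run_id: &str) -> Option<u64> {
        self.rows
            .keys()
            .filter(|(id, _)| id.as_str() == run_id)
            .map(|(_, step)| *step)
            .max()
    }
}
```

The point of the design is that the graph only ever talks to the trait, so swapping the map for Redis, S3, or another database changes the impl block and nothing else.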
Subgraph isolation
Subgraphs use checkpoint_ns to isolate their state from the parent. Nested graphs end up with namespaced run trees:
get_state_history on a subgraph only sees the sub-tree, so debugging is local.
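One way to picture namespace scoping is a filter over path-style namespaces. This is a sketch under assumptions: the "/" separator, the Entry type, and the history_for helper are all hypothetical, and the rule that the empty (root) namespace matches every entry is a choice made for this toy, not documented cognis behavior.

```rust
/// One stored snapshot, tagged with the namespace that produced it.
#[derive(Debug, Clone, PartialEq)]
struct Entry {
    ns: String, // hypothetical path-style namespace, e.g. "research/summarize"
    step: u64,
    state: String,
}

/// True if `entry_ns` is `ns` itself or nested somewhere beneath it.
fn in_subtree(entry_ns: &str, ns: &str) -> bool {
    if ns.is_empty() {
        return true; // toy rule: the root namespace matches everything
    }
    entry_ns == ns
        || entry_ns
            .strip_prefix(ns)
            .map_or(false, |rest| rest.starts_with('/'))
}

/// History restricted to one namespace's sub-tree, in stored order.
fn history_for<'a>(all: &'a [Entry], ns: &str) -> Vec<&'a Entry> {
    all.iter().filter(|e| in_subtree(&e.ns, ns)).collect()
}
```

Requiring the separator after the prefix keeps a sibling such as "research2" out of the "research" sub-tree, which a plain starts_with check would get wrong.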
How it works
- A checkpoint is taken after each superstep. That’s also when observers fire OnCheckpoint.
- Checkpointers serialize state. S: Serialize is required for Sqlite / Postgres backends. The in-memory one clones.
- Resume is exact. resume(run_id, step, state, cfg) continues from the same superstep with the seeded state, preserving observer and metadata propagation.
- update_state and resume are independent. You can call update_state zero, one, or many times before resume.
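The checkpoint-per-superstep and exact-resume points above can be sketched as a plain loop. This is a toy model, not the cognis runtime: run_from and step_fn are hypothetical, the state is a bare integer, and the sketch assumes that resuming at step N means re-running step N from the state saved after step N-1.

```rust
/// Run supersteps `from..total`, recording a snapshot after each one.
/// `step_fn` stands in for "run every node scheduled in this superstep".
fn run_from(
    from: u64,
    total: u64,
    mut state: i64,
    step_fn: impl Fn(u64, i64) -> i64,
    log: &mut Vec<(u64, i64)>,
) -> i64 {
    for step in from..total {
        state = step_fn(step, state);
        // Checkpoint after the superstep; an observer hook would fire here.
        log.push((step, state));
    }
    state
}
```

Seeding run_from with a snapshot from an earlier full run reproduces the remaining snapshots of that run, while seeding it with an edited value carries the edit through every later step.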
See also
Human-in-the-loop
Pause, approve, edit, resume.
Patterns → HITL approval
A complete approval flow with checkpoints.
Production → Going to production
Picking a checkpointer for your stack.