Once your traces are flowing into Langfuse, two more things become useful:
- Versioned prompts — keep the prompt text out of your code, change it without redeploying, A/B test in production.
- Evaluation scores — record how well a run did and tie it back to the trace. Build dashboards, regression alarms, and rolling quality checks.
Both ship in `cognis-trace`, feature-gated behind `langfuse`.
## Versioned prompts

A fetched `Prompt` carries:
| Field | Type | Notes |
|---|---|---|
| `name` | `String` | Stable identifier. |
| `version` | `u32` | Monotonic. |
| `body` | `PromptBody` | Either `text(String)` or `chat(Vec<Message>)`. |
| `config` | `serde_json::Value` | Free-form metadata you maintain alongside the prompt. |
| `labels` | `Vec<String>` | E.g. `production`, `staging`, `experiment-a`. |
`TracingHandler` automatically stamps `prompt_name` and `prompt_version` on the resulting generation span, so you can filter "all calls using prompt v3" in Langfuse.
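
A minimal sketch of fetching a prompt and routing on its body. Only `client.get` is named on this page; `LangfuseClient`, the exact `get` signature, the `support-triage` prompt name, and the variant casing in the match are assumptions:

```rust
// Sketch: fetch a versioned prompt and route on its body.
// `LangfuseClient` and the `get` signature are assumptions; only
// `client.get` itself is mentioned on this page.
async fn fetch_triage_prompt(client: &LangfuseClient) -> anyhow::Result<()> {
    // One HTTP round-trip per call; see "How it works" below for caching advice.
    let prompt: Prompt = client.get("support-triage").await?;
    println!("using {} v{}", prompt.name, prompt.version);

    match prompt.body {
        // The `text` / `chat` variants from the table above; casing assumed.
        PromptBody::Text(text) => {
            // Hand `text` to a completion endpoint.
            let _ = text;
        }
        PromptBody::Chat(messages) => {
            // Hand `messages` to a chat endpoint.
            let _ = messages;
        }
    }
    Ok(())
}
```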
## Submitting scores
Every run gets a `run_id`. Score it any time: during the run (in-band) or later (out-of-band).
- In-band: submit a score directly through the handler. Useful for synchronous evals: your eval ran, you have the answer, attach it to the trace before the user sees the response.
- Out-of-band: submit later, keyed on the same `run_id`.
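
A sketch of an in-band submission. `TracingHandler::score`, `RunId`, and the `Score` fields here are assumptions, not the crate's confirmed API; only the `ScoreValue` variants are documented on this page:

```rust
// Sketch: attach a score before the user sees the response.
// The `score` method name and `Score` struct fields are assumptions.
fn record_eval(handler: &TracingHandler, run_id: RunId, quality: f64) {
    handler.score(Score {
        run_id,
        name: "helpfulness".to_string(),
        value: ScoreValue::Numeric(quality),
    });
}
```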
### Score values
| `ScoreValue` | Use for |
|---|---|
| `Numeric(f64)` | Quality, novelty, helpfulness: anything on a continuous scale. |
| `Categorical(String)` | `"good"` / `"bad"` / `"skip"`, custom labels. |
| `Boolean(bool)` | Binary thumbs up/down. |
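
Construction maps directly onto the table; assuming the enum is exported from the crate root:

```rust
// The three documented variants; names and payload types come from the table.
let quality = ScoreValue::Numeric(0.92);
let verdict = ScoreValue::Categorical("good".to_string());
let thumbs = ScoreValue::Boolean(true);
```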
### Bring-your-own backend

If Langfuse isn't your eval backend, implement `ScoreSink`, as sketched below. Then pass `Arc::new(MyScorer)` anywhere a `ScoreSink` is expected; the trait is one method.
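
A sketch of a custom sink. The page says the trait is one method but doesn't name it, so the `submit` name and signature below are assumptions:

```rust
use std::sync::Arc;

struct MyScorer;

impl ScoreSink for MyScorer {
    // Hypothetical method; check the crate docs for the real signature.
    fn submit(&self, score: Score) {
        // Forward to your own eval store, queue, or log line.
        eprintln!("run {}: {} = {:?}", score.run_id, score.name, score.value);
    }
}

fn sink() -> Arc<dyn ScoreSink> {
    // Pass this anywhere a `ScoreSink` is expected.
    Arc::new(MyScorer)
}
```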
## How it works
- Prompt fetches are HTTP calls. Cache them locally if your access pattern is "fetch on every request": `client.get` is fast, but it's still a round-trip (see the caching sketch after this list).
- Score submission is async and best-effort. The Langfuse scorer batches in the background like the trace exporter; failures are logged.
- Run IDs link the world. A trace, its scores, the prompt version it used: all keyed on the same `run_id` Cognis generated for the run.
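
As an example of the caching advice above, here is a minimal TTL cache over `client.get`. The `LangfuseClient` name, a `Clone` impl on `Prompt`, and the TTL choice are assumptions:

```rust
use std::collections::HashMap;
use std::sync::Mutex;
use std::time::{Duration, Instant};

// Minimal TTL cache over prompt fetches. Assumes `Prompt: Clone` and the
// `client.get(name)` call mentioned above; adjust to the real API.
struct PromptCache {
    ttl: Duration,
    entries: Mutex<HashMap<String, (Instant, Prompt)>>,
}

impl PromptCache {
    fn new(ttl: Duration) -> Self {
        Self { ttl, entries: Mutex::new(HashMap::new()) }
    }

    async fn get(&self, client: &LangfuseClient, name: &str) -> anyhow::Result<Prompt> {
        // Serve from cache while the entry is fresh.
        if let Some((fetched_at, prompt)) = self.entries.lock().unwrap().get(name) {
            if fetched_at.elapsed() < self.ttl {
                return Ok(prompt.clone());
            }
        }
        // Otherwise pay the round-trip once, then reuse.
        let fresh: Prompt = client.get(name).await?;
        self.entries
            .lock()
            .unwrap()
            .insert(name.to_string(), (Instant::now(), fresh.clone()));
        Ok(fresh)
    }
}
```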
## See also

- Evaluation: run evals over a dataset and feed scores back.
- Trace with Langfuse: where the runs themselves land.