An axum (or actix, warp, or hyper) endpoint that runs an agent and streams structured events out as Server-Sent Events (SSE). The shape works for any frontend that can read SSE.
What you’ll build
A POST endpoint at /chat that takes a user message, runs an agent, and streams tokens, tool starts, and tool ends back to the client until the agent finishes.
How it works
- agent.stream_events(...) turns the agent into an event stream, using the same Runnable contract as everywhere else.
- An axum handler maps each Event to an SSE frame.
- The frontend reads the SSE stream and renders tokens as they arrive, plus a sidebar of tool events.
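To make the "maps each Event to an SSE frame" step concrete, here is a minimal sketch using only the standard library. The `Event` enum below is a hypothetical stand-in, not the library's actual type: only the names `OnLlmToken` and `OnError` appear on this page, and the `OnToolStart`/`OnToolEnd` variants, their fields, and the SSE event names (`token`, `tool_start`, and so on) are assumptions chosen for illustration.

```rust
/// Hypothetical stand-in for the library's event type. Only `OnLlmToken`
/// and `OnError` are names taken from this page; the rest is assumed.
#[derive(Debug, Clone, PartialEq)]
pub enum Event {
    OnLlmToken(String),
    OnToolStart { name: String },
    OnToolEnd { name: String },
    OnError(String),
}

/// Returns the SSE `event:` name (assumed naming) and the raw payload.
fn parts(event: &Event) -> (&'static str, &str) {
    match event {
        Event::OnLlmToken(token) => ("token", token.as_str()),
        Event::OnToolStart { name } => ("tool_start", name.as_str()),
        Event::OnToolEnd { name } => ("tool_end", name.as_str()),
        Event::OnError(message) => ("error", message.as_str()),
    }
}

/// Encodes one event as an SSE frame: an `event:` line, one `data:` line
/// per payload line, and a blank line that terminates the frame.
pub fn to_sse_frame(event: &Event) -> String {
    let (name, payload) = parts(event);
    let mut frame = format!("event: {name}\n");
    // A newline inside a `data:` value would end the field early, so a
    // multi-line payload is split across several `data:` lines.
    for line in payload.split('\n') {
        frame.push_str("data: ");
        frame.push_str(line);
        frame.push('\n');
    }
    frame.push('\n');
    frame
}
```

For example, `to_sse_frame(&Event::OnLlmToken("Hi".to_string()))` yields `event: token`, then `data: Hi`, then a blank line. A real handler would more likely JSON-encode the payload (hence serde_json in the dependency list), which also keeps each payload on a single line.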
The handler
Add axum, tokio, serde, and serde_json to your Cargo.toml.
The frontend
How it works
- stream_events starts the agent immediately and returns a stream of events. The first frames arrive before the agent finishes.
- Per-message agent ownership. The handler locks the agent for one request, which is fine at low volume; for high throughput, build a fresh agent per request (cheap: just a builder pass).
- Backpressure is real. If the client is slow, the channel fills and the agent eventually blocks. SSE typically streams fast enough; for slow clients, use a bounded channel and drop on overflow.
- Errors are events. An OnError frame ends the stream; the frontend should display it instead of treating it as a stuck connection.
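The "bounded channel, drop on overflow" advice can be sketched with the standard library's `sync_channel`. This is an illustration of the technique under stated assumptions, not the handler's actual code: `Offer`, `offer`, and `client_channel` are hypothetical names, and frames are plain `String`s. In an async handler, `tokio::sync::mpsc::Sender::try_send` plays the same role.

```rust
use std::sync::mpsc::{sync_channel, Receiver, SyncSender, TrySendError};

/// What happened to one frame offered to a (possibly slow) client.
#[derive(Debug, PartialEq)]
pub enum Offer {
    Sent,
    Dropped,
    Disconnected,
}

/// Creates a bounded channel that buffers at most `capacity` frames.
pub fn client_channel(capacity: usize) -> (SyncSender<String>, Receiver<String>) {
    sync_channel(capacity)
}

/// Tries to enqueue a frame without blocking the producer.
/// A full buffer drops the frame instead of stalling the agent.
pub fn offer(tx: &SyncSender<String>, frame: String) -> Offer {
    match tx.try_send(frame) {
        Ok(()) => Offer::Sent,
        // The client is behind: discard this frame and keep going.
        Err(TrySendError::Full(_)) => Offer::Dropped,
        // The receiving side is gone: the caller should stop the run.
        Err(TrySendError::Disconnected(_)) => Offer::Disconnected,
    }
}
```

Dropping token frames leaves gaps in the rendered text, so one reasonable design is to drop only low-priority frames (such as tool events) or to close the stream once the buffer overflows. Which frames are safe to lose depends on the frontend.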
Variations
| Variation | What changes |
|---|---|
| WebSockets | Same logic; emit JSON-encoded frames over a WebSocket message instead of SSE. |
| Server-side filtering | If your frontend only renders tokens, drop everything except OnLlmToken before yielding. |
| Multi-tenant | Add a thread_id header → pass to RunnableConfig → wire a Checkpointer for per-user state. See Stateful chat. |
| Cancellation | Pass a CancellationToken on the config; cancel it when the client disconnects. |
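As an illustration of the server-side filtering row, the sketch below keeps only token events from a sequence. It runs on a plain iterator to stay self-contained; the `Event` enum is again a hypothetical stand-in (only `OnLlmToken` and `OnError` are names from this page), and `tokens_only` is an illustrative helper, not a library function. On an async stream the same predicate would go into the stream's filter step.

```rust
/// Hypothetical stand-in for the library's event type.
#[derive(Debug, Clone, PartialEq)]
pub enum Event {
    OnLlmToken(String),
    OnToolStart { name: String },
    OnToolEnd { name: String },
    OnError(String),
}

/// Keeps only `OnLlmToken` events and discards everything else.
pub fn tokens_only<I>(events: I) -> impl Iterator<Item = Event>
where
    I: IntoIterator<Item = Event>,
{
    events
        .into_iter()
        .filter(|event| matches!(event, Event::OnLlmToken(_)))
}
```

A strict token-only filter also discards OnError frames, so a frontend that relies on error events to detect a failed run may want to let those through as well.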
Pair with tracing
Wire cognis-trace and the same events stream into Langfuse; see Trace with Langfuse. Now you have a UI showing tokens to the user and a trace dashboard showing every run.
See also
- Building agents → Streaming: The Runnable-level streaming surface.
- Graph workflows → Streaming: Filter and custom event channels.
- Patterns → Stateful chat: Add memory and persistence.