A research assistant is the canonical “more than one agent” use case — three specialists handing work down a pipeline. A planner breaks a question into steps, a researcher gathers evidence, a writer assembles a coherent report. We’ll wire it up with Sequential orchestration and a search tool.
## What you’ll build
A binary that, given a question, returns a one-page report with citations. About 80 lines of code; works against any provider; the same shape scales to longer reports if you add more agents.

## How it works
- Planner — receives the question, returns a numbered list of subquestions.
- Researcher — receives the plan, calls a `search` tool for each subquestion, returns gathered evidence.
- Writer — receives the plan + evidence, returns the final report.
- Sequential orchestration passes each agent’s output as the next agent’s input.
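The handoff shape above can be sketched in plain Rust. This is illustrative scaffolding, not the cognis API: the `Agent` trait and `run_sequential` helper are stand-ins for the library’s agent and orchestrator types.

```rust
// Sequential orchestration in miniature: each agent's output becomes the
// next agent's input. `Echo` is a toy agent that just tags what it receives.
trait Agent {
    fn run(&self, input: &str) -> String;
}

struct Echo(&'static str);

impl Agent for Echo {
    fn run(&self, input: &str) -> String {
        format!("{}: {}", self.0, input)
    }
}

/// Run agents in registration order, threading each reply into the next agent.
fn run_sequential(agents: &[Box<dyn Agent>], question: &str) -> String {
    agents
        .iter()
        .fold(question.to_string(), |input, agent| agent.run(&input))
}

fn main() {
    let agents: Vec<Box<dyn Agent>> = vec![
        Box::new(Echo("planner")),
        Box::new(Echo("researcher")),
        Box::new(Echo("writer")),
    ];
    println!("{}", run_sequential(&agents, "why is the sky blue?"));
}
```

The fold makes the key property visible: there is no shared state, only a string handed down the line.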
## Step 1 — A search tool
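A minimal stub, returning canned snippets, is enough for the walkthrough. This is a sketch with hypothetical fixture data; the real tool signature in cognis may differ, and in production the body would call an actual search provider.

```rust
// Stub search "tool": keyword-matched canned snippets stand in for a
// real provider. Swap the body for a Tavily / Brave / internal-search call.
fn search(query: &str) -> Vec<String> {
    let corpus = [
        ("rayleigh", "Rayleigh scattering strengthens at shorter wavelengths."),
        ("sunset", "At sunset, light travels through more atmosphere."),
    ];
    corpus
        .iter()
        .filter(|(key, _)| query.to_lowercase().contains(key))
        .map(|(_, snippet)| snippet.to_string())
        .collect()
}

fn main() {
    for hit in search("Why does Rayleigh scattering make the sky blue?") {
        println!("- {hit}");
    }
}
```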
Pick any search provider. For the example we’ll use a stub; in production, plug in Tavily, Brave, or your internal search.

## Step 2 — Three agents
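The pipeline shape of the three agents can be sketched as plain functions over text. In the real library each would be an LLM-backed agent with its own system prompt (and, for the researcher, the search tool); these bodies are stand-ins, not cognis code.

```rust
// Planner: question -> numbered plan. A real agent would prompt an LLM here.
fn planner(question: &str) -> String {
    format!("1. define terms for: {question}\n2. find evidence\n3. summarize")
}

// Researcher: plan -> evidence, one "search" per plan line.
fn researcher(plan: &str) -> String {
    plan.lines()
        .map(|step| format!("{step} -> [snippet]"))
        .collect::<Vec<_>>()
        .join("\n")
}

// Writer: evidence -> final report. Note: no tool access, synthesis only.
fn writer(evidence: &str) -> String {
    format!("Report\n======\n{evidence}\n\nCitations: [snippet]")
}

fn main() {
    let report = writer(&researcher(&planner("why is the sky blue?")));
    println!("{report}");
}
```

Composing the three calls is exactly what the sequential orchestrator does for you.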
## How it works
- `Sequential` runs agents in registration order; each receives the previous reply as its input. The final agent’s reply is `resp.content`.
- The researcher loops over subquestions inside its own ReAct loop. Each `search` call is one iteration; `with_max_iterations(8)` lets it cover a 5-step plan with a couple of retries.
- No shared state — just text passing between agents. That’s deliberate. Each agent has a single job, a single prompt, and a single tool surface.
- The writer never calls tools. It only synthesizes. Keeping the writer toolless prevents it from re-doing research and keeps the final pass cheap.
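The iteration budget is worth seeing in isolation. This sketch mirrors the idea behind `with_max_iterations(8)` with a plain loop; it is not the library’s ReAct implementation.

```rust
// Each tool call consumes one iteration, so a 5-step plan fits comfortably
// in a budget of 8, leaving headroom for a couple of retries.
fn research(subquestions: &[&str], max_iterations: usize) -> Vec<String> {
    let mut evidence = Vec::new();
    let mut iterations = 0;
    for q in subquestions {
        if iterations >= max_iterations {
            break; // budget exhausted: return whatever was gathered
        }
        iterations += 1;
        evidence.push(format!("evidence for: {q}")); // stands in for a search call
    }
    evidence
}

fn main() {
    let plan = ["step 1", "step 2", "step 3"];
    println!("{} items within budget", research(&plan, 8).len());
    println!("{} items when capped", research(&plan, 2).len());
}
```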
## Make it production-ready
When you’re ready to ship, layer on:

| Concern | What to add |
|---|---|
| Real search | A Tavily / Brave / SerpAPI tool — see Tools. |
| Cost control | `RateLimit::new(Arc::new(TokenBucket::new(rate, burst)))` and `ModelCallLimit::new(n)` in a `MiddlewarePipeline` around your `Client`. |
| Tracing | Wire `cognis-trace` so each agent appears as a nested span in Langfuse. |
| Caching | Cache search results with a `CachedRetriever`-style wrapper around the tool. |
| Eval | Build an `EvalRunner` over a small set of known-good questions and an `LlmJudge` evaluator. |
| Streaming UI | `orch.stream_events(...)` and forward `OnLlmToken` to your frontend — see Patterns → Streaming UI. |
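For cost control, the mechanism behind something like `TokenBucket::new(rate, burst)` is simple enough to sketch. This is a minimal standalone token bucket to show the idea, not the cognis type.

```rust
use std::time::Instant;

// Token bucket: refills `rate` tokens per second up to `burst`;
// each request spends one token, so short spikes pass but sustained
// traffic is capped at `rate` requests per second.
struct TokenBucket {
    rate: f64,
    burst: f64,
    tokens: f64,
    last: Instant,
}

impl TokenBucket {
    fn new(rate: f64, burst: f64) -> Self {
        Self { rate, burst, tokens: burst, last: Instant::now() }
    }

    /// Refill based on elapsed time, then try to spend one token.
    fn try_acquire(&mut self) -> bool {
        let now = Instant::now();
        let elapsed = now.duration_since(self.last).as_secs_f64();
        self.last = now;
        self.tokens = (self.tokens + elapsed * self.rate).min(self.burst);
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}

fn main() {
    let mut bucket = TokenBucket::new(1.0, 2.0); // 1 req/s, burst of 2
    println!("{} {} {}", bucket.try_acquire(), bucket.try_acquire(), bucket.try_acquire());
}
```

In a middleware pipeline, a denied acquire would translate into waiting or rejecting the model call rather than returning `false`.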
## Variations
- Add a critic. Slot a fourth agent that reviews the writer’s draft and returns suggestions; loop with `RoundRobin` for a couple of revision passes.
- Parallel research. Replace the Sequential orchestrator with a custom `HandoffStrategy` that fans out subquestions to parallel researcher agents and folds their answers.
- Long context. For deep research, swap the writer for a long-context model and have the researcher dump full search snippets — pair with Long-context summarization.
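The fan-out-and-fold shape of the parallel-research variation looks like this with plain threads standing in for parallel researcher agents. It is a sketch of the strategy, not a `HandoffStrategy` implementation.

```rust
use std::thread;

// Fan each subquestion out to its own "researcher" (here, a thread),
// then fold the answers back in order by joining the handles.
fn fan_out_research(subquestions: Vec<String>) -> Vec<String> {
    let handles: Vec<_> = subquestions
        .into_iter()
        .map(|q| thread::spawn(move || format!("evidence for: {q}")))
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}

fn main() {
    for answer in fan_out_research(vec!["a".into(), "b".into()]) {
        println!("- {answer}");
    }
}
```

Joining in spawn order keeps the folded evidence aligned with the plan even though the work runs concurrently.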
## See also
- Multi-agent orchestration — Strategies, custom handoffs.
- Tools — The tool the researcher calls.
- Patterns → Multi-agent debate — A different orchestration shape.