You can build a complete Cognis app — agent, RAG, even multi-agent — without leaving localhost. Ollama runs LLMs and embedders locally; Cognis ships clients for both. This pattern walks through a fully local research-style assistant with an in-memory vector store.
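For the retrieval side of the pattern, the sketch below shows roughly where a local embedder and an in-memory vector store would sit. The type and method names (`EmbeddingClient`, `InMemoryVectorStore`, `embed`, `add`, `search`) are placeholders assumed for illustration rather than confirmed Cognis APIs; check the crate docs for the real signatures.

```rust
use cognis::prelude::*;
// Hypothetical imports: the real embedder and vector-store types may live elsewhere
// and carry different names.
use cognis::InMemoryVectorStore;
use cognis_llm::EmbeddingClient;

#[tokio::main]
async fn main() -> Result<()> {
    // Assumed: an embeddings client configured from the environment, pointing at a
    // local Ollama embedding model such as nomic-embed-text.
    let embedder = EmbeddingClient::from_env()?;
    let mut store = InMemoryVectorStore::new();

    // Embed a couple of documents and index them in memory.
    for doc in [
        "Ollama serves local models on port 11434 by default.",
        "Cognis ships clients for local LLMs and embedders.",
    ] {
        let vector = embedder.embed(doc).await?; // hypothetical `embed` method
        store.add(doc, vector);                  // hypothetical `add` method
    }

    // Embed the query and retrieve the two closest documents.
    let query = embedder.embed("Which port does Ollama use?").await?;
    for hit in store.search(&query, 2) {         // hypothetical `search` method
        println!("{hit}");
    }
    Ok(())
}
```

Everything here stays on localhost: the embedder runs in Ollama and the store lives in process memory.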
You can substitute any model: qwen2.5:3b, phi3, mistral-nemo, etc. For tool-calling, prefer models that support function calling natively (llama3.1, qwen2.5, mistral-nemo).
```rust
use std::sync::Arc;

use cognis::prelude::*;
use cognis::{AgentBuilder, Calculator};
use cognis_llm::Client;

#[tokio::main]
async fn main() -> Result<()> {
    // Build the LLM client from environment configuration (the local Ollama server in this pattern).
    let client = Client::from_env()?;

    let mut agent = AgentBuilder::new()
        .with_llm(client)
        .with_tool(Arc::new(Calculator::new()))
        .with_system_prompt(
            "You are a math assistant. Use the calculator for any arithmetic. \
             Always state the final answer clearly.",
        )
        // Limit the agent to at most four iterations.
        .with_max_iterations(4)
        .build()?;

    let resp = agent.run(Message::human("What is 23 * 17 + 4?")).await?;
    println!("{}", resp.content);
    Ok(())
}
```
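With a model that supports function calling, the agent should invoke the calculator and report 395 (23 × 17 = 391, plus 4). The `with_max_iterations(4)` call caps the agent loop at four iterations, so a model that keeps calling tools without producing an answer stops quickly instead of running indefinitely.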
If your model is small (e.g. llama3.2:1b), tool calling can be flaky; switch to a model that is known to handle it (llama3.1, qwen2.5).
```rust
use std::io::Write;

use cognis::prelude::*;
use cognis_llm::Client;
use futures::StreamExt;

#[tokio::main]
async fn main() -> Result<()> {
    let client = Client::from_env()?;

    // Ask for a streaming response instead of waiting for the full completion.
    let mut s = client
        .stream(vec![Message::human("Tell me a one-line joke.")])
        .await?;

    // Print each chunk as it arrives; flushing makes the partial output visible immediately.
    while let Some(chunk) = s.next().await {
        print!("{}", chunk?.content);
        std::io::stdout().flush().ok();
    }
    println!();
    Ok(())
}
```