Cognis abstracts LLMs behind a single trait — LLMProvider — and bundles concrete clients for the major vendors. Most code touches Client, the provider-agnostic wrapper. Everything below it (request shapes, auth, streaming, tool serialization) is provider-specific and feature-gated.
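
Feature gating means you compile only the vendor clients you need. A minimal Cargo.toml sketch, assuming per-vendor feature names; the names and version below are illustrative, not taken from this page:

# Each provider client sits behind a feature flag; enable only what you use.
# (Feature names and version are assumptions; check the crate manifest.)
[dependencies]
cognis = { version = "0.1", features = ["openai", "anthropic"] }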

What it is

Client is Runnable<Vec<Message>, Message> plus a few convenience methods. It wraps an Arc<dyn LLMProvider>, so swapping providers means changing one constructor call, not your chain.
use cognis::prelude::*;
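// Runs in an async context (e.g., under #[tokio::main]); invoke is async.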

let client = Client::from_env()?;
let reply: Message = client.invoke(vec![
    Message::system("You are a careful assistant."),
    Message::human("Summarize Rust ownership in one sentence."),
]).await?;
println!("{}", reply.content());

Two ways to construct a Client

Use Client::from_env when env vars decide the provider — by far the most common path. Use provider builders when you need provider-specific knobs (organization id, deployment name, custom headers).

Switching providers

The same agent, whichever provider backs it. Same code; different env vars or a different provider builder.
use cognis_llm::provider::openai::OpenAIBuilder;

let provider = OpenAIBuilder::default()
    .api_key(std::env::var("OPENAI_API_KEY")?)
    .model("gpt-4o-mini")
    .organization("org-123")
    .build()?;
let client = cognis_llm::Client::new(std::sync::Arc::new(provider));
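
Swapping vendors keeps the same shape. A hedged sketch, assuming an Anthropic builder analogous to the OpenAI one above; the module path, builder name, and model id are assumptions:

use cognis_llm::provider::anthropic::AnthropicBuilder; // path assumed

let provider = AnthropicBuilder::default()
    .api_key(std::env::var("ANTHROPIC_API_KEY")?)
    .model("claude-sonnet-4-5") // model id illustrative
    .build()?;
// Only the constructor changed; everything downstream of Client is identical.
let client = cognis_llm::Client::new(std::sync::Arc::new(provider));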

What you can do with a Client

Method | Returns | Use when
invoke(messages) | Message | One-shot chat — fastest.
stream(messages) | RunnableStream<StreamChunk> | Token-by-token streaming.
chat(messages, ChatOptions) | ChatResponse | Need usage, finish_reason, model in the result.
invoke_with_tools(messages, &[Arc<dyn Tool>]) | Message | One-shot with tools — but for full agentic loops, use AgentBuilder.
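
To make the streaming and chat rows concrete, a sketch under stated assumptions: that RunnableStream implements futures::Stream, that StreamChunk exposes its text via content(), and that ChatOptions implements Default; none of these signatures are confirmed here.

use futures::StreamExt;
use cognis::prelude::*;

let client = Client::from_env()?;
let messages = vec![Message::human("Explain lifetimes in one paragraph.")];

// Token-by-token streaming; assumes RunnableStream: futures::Stream.
let mut chunks = client.stream(messages.clone()).await?;
while let Some(chunk) = chunks.next().await {
    print!("{}", chunk?.content()); // content() and the error type are assumed
}

// Metadata-rich call; assumes ChatOptions: Default.
let response = client.chat(messages, ChatOptions::default()).await?;
println!("{:?}", response.usage); // usage, finish_reason, model per the table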

How it works

  • Client doesn’t know the provider’s wire format. LLMProvider does. Client packages messages into a generic request and lets the provider serialize.
  • Client is a Runnable. Wrap it with with_max_retries, with_timeout, with_fallback — same as anything else.
  • Tool calls are normalized. Whatever the provider returns (OpenAI’s tool_calls, Anthropic’s tool_use blocks, Gemini’s functionCalls), Cognis flattens to AiMessage.tool_calls: Vec<ToolCall> (see the sketch after this list).
  • Streaming aggregates correctly. When a streamed reply contains a tool call, the aggregator detects it at the chunk level and enters tool-dispatch mode without breaking the consumer.
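
Consuming normalized tool calls is then provider-agnostic. A minimal sketch, assuming the reply exposes the AiMessage's tool_calls and that ToolCall carries a name plus JSON arguments; the accessor names are assumptions:

// `client` and `messages` as above; `tools` is a Vec<Arc<dyn Tool>> (see Tools).
let reply = client.invoke_with_tools(messages, &tools).await?;
for call in &reply.tool_calls {
    // Same normalized shape whether the provider sent OpenAI tool_calls,
    // Anthropic tool_use blocks, or Gemini functionCalls.
    println!("{} -> {}", call.name, call.arguments); // field names assumed
}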

Resilience patterns

Models fail. Cognis ships idiomatic recovery wrappers:
use std::time::Duration;
use cognis::prelude::*;
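// `another_client` is any other Client, e.g. one built for a fallback provider.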

let resilient = Client::from_env()?
    .with_max_retries(3)
    .with_timeout(Duration::from_secs(30))
    .with_fallback(another_client);

For richer policies (cost-based retry, exponential backoff with jitter), see Production → Resilience.

See also

  • Tools: Give the model something to call.
  • Streaming: Tokens, events, and structured streams.
  • Structured output: Get typed structs back from the model.
  • Reference → cognis-llm: Full provider list and method signatures.