LLMs are creative; some of their creativity gets you sued. This page covers the production patterns for limiting blast radius: redacting PII before it leaves your process, restricting which tools the model may call, protecting against SSRF when tools fetch URLs, and running the riskiest tools in sandboxes.Documentation Index
Fetch the complete documentation index at: https://cognis.vasanth.xyz/llms.txt
Use this file to discover all available pages before exploring further.
Threat surface
Three categories of risk you can mitigate at the framework level:| Risk | Mitigation |
|---|---|
| Sensitive data leaving your process in prompts | PiiRedactor / RegexRedactor middleware |
| Model calling tools you didn’t intend | ToolAllowList / ToolDenyList / approver |
| Tools fetching URLs they shouldn’t (SSRF, internal networks) | SSRF-protected HTTP client, allow-listed hosts |
| Tools executing arbitrary code | Sandbox Backends, PythonRepl with restricted env |
PII redaction
PiiRedactor masks known PII patterns (emails, phone numbers, credit-card-like sequences, common identifiers) before any prompt leaves Cognis:
RegexRedactor accepts your own patterns:
PiiRedactor for the obvious, RegexRedactor for your own patterns. To wire either into an AgentBuilder agent, see Middleware → Wiring middleware into an agent.
Tool deny-lists and allow-lists
Restrict which tools the model is offered at request time, via the middleware pipeline:LimitTools(n) caps how many tool definitions are sent in any single model call — useful when you have a large tool registry and want the model to focus.
For tool-call approval gating (require human sign-off before specific tools run, regardless of allow-list), use Approver + AgentBuilder::with_approver — see Human-in-the-loop.
Human-in-the-loop for the riskiest tools
For actions that should never run unattended — moving money, sending email, deleting data — use HITL approval. Even with allow-lists, anApprover ensures every sensitive call faces a human first.
SSRF protection
Tools that fetch URLs are SSRF-prone. The HTTP tool primitives incognis::tools::http (feature tools-http) protect against:
- requests to internal IP ranges (
10.0.0.0/8,172.16.0.0/12,192.168.0.0/16) - requests to link-local addresses (
169.254.0.0/16) - requests to localhost (
127.0.0.0/8,::1) - redirect chains that escape an allow-listed host
examples/resilience/ssrf_protection.rs.
Sandboxing tool execution
For tools that run code (Python, shell), use a sandboxBackend. The agent reads / writes through the backend; the sandbox enforces FS and network limits.
- The agent writes files (you don’t want it to write outside
./scratch). - The agent runs shell or Python (you don’t want it touching
~/.ssh).
Backend that shells out to your isolation provider.
Output filtering
Models sometimes emit content that shouldn’t go to users — internal reasoning, system-prompt leaks, training-data slips. For high-risk applications:- Strip CoT before sending to the user. Reasoning models emit
<thinking>blocks; filter them out. - Run a moderation pass. Use a smaller, fast model to classify the output before display.
- Cap message length.
CapMessageLengthmiddleware truncates if a runaway loop produces a megabyte of text.
Prompt injection
Cognis can’t fully prevent prompt injection — that’s an open problem in the field. But you can shrink the surface:- Don’t put untrusted text in the system prompt. System prompts should be your own.
- Quote untrusted input. Wrap user input in clearly delimited blocks (“USER MESSAGE START / END”) so the model has a chance to recognize boundaries.
- Restrict tool actions for untrusted contexts. A user-facing chat agent should not have a
delete_accounttool.
How it works
- Middleware runs in the agent loop, before each LLM call. PII redaction sees the rendered prompt; tool gates see the model’s tool calls.
- Approvers run between the model’s reply and tool dispatch. The model can’t bypass them; the dispatcher always asks first.
- Backends mediate file operations. Sandboxed implementations refuse paths outside the allow-list; the agent gets an error it can recover from.
- SSRF protection is in the HTTP client, not the framework. Bring your own client if you need different rules.
See also
Patterns → HITL approval
Approval workflow for sensitive tools.
Middleware
Full catalog including
EditPolicy and WorkspaceLister.Going to production
Deploying with these defaults on.