ClawRouter intelligently routes your API requests to the optimal model based on cost, latency, and task complexity. One API key, every LLM, self-learning.
Measured on 60-case labeled eval set · 2026-02-20
Watch a request get classified and routed through the pipeline in real time.
Your coding agent makes hundreds of calls per session. Most don't need the most expensive model.
File reads, summaries, classification, simple Q&A. 60% of agent calls are this simple.
Code generation, test writing, refactoring. The workhorse for most coding tasks.
Architecture, debugging complex issues, multi-file reasoning. Reserved for when it matters.
Without routing, every call uses the same model. With ClawRouter, you save up to 81% on API costs vs routing everything through Opus — measured on real traffic.
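For intuition on where a figure like that can come from, here is a back-of-the-envelope blended-cost calculation. The traffic split and relative price ratios below are illustrative assumptions, not ClawRouter's measured data:

```python
# Illustrative only: an assumed traffic split across tiers and assumed
# relative per-token prices (not ClawRouter's measured figures).
split = {"fast": 0.60, "standard": 0.30, "reasoning": 0.10}
rel_price = {"fast": 1, "standard": 5, "reasoning": 25}

blended = sum(split[t] * rel_price[t] for t in split)  # 4.6
savings = 1 - blended / rel_price["reasoning"]         # vs. all-Opus baseline
```

Under these assumptions the blended cost is roughly a fifth of the all-Opus baseline; the real figure depends entirely on your traffic mix.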
Every `model: "auto"` request passes through an intelligent pipeline that classifies, scores, and learns from outcomes.
Token limits, capability checks, tool-use detection. Sync, <1ms.
LLM classifier with circuit breaker + heuristic fallback. Async, <50ms.
Policy weights × judge output × EMA stats → best tier. Sync, <1ms.
Routes to selected model via LiteLLM with streaming, failover, and retries.
Fire-and-forget EMA update. Success rates and latency improve future routing.
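The first three stages can be sketched roughly as follows. Everything here (function names, the complexity heuristic, the scoring formula) is an illustrative assumption rather than ClawRouter's actual code; the dispatch and learning stages are omitted:

```python
# Rough sketch of stages 1-3 of the routing pipeline. All names,
# heuristics, and formulas are illustrative assumptions, not
# ClawRouter's real implementation.

TIERS = ["fast", "standard", "reasoning"]

def pre_checks(request):
    """Stage 1: capability checks (here: tool use forces a capable tier)."""
    if request.get("tools"):
        return "standard"  # assumption: tool calls need at least standard
    return None

def judge(request):
    """Stage 2: complexity score in [0, 1]. A crude length-based stand-in
    for the LLM classifier with heuristic fallback."""
    prompt = request["messages"][-1]["content"]
    return min(len(prompt) / 500, 1.0)

def score_tiers(complexity, ema_stats, policy_weights):
    """Stage 3: policy weights x judge output x EMA stats -> best tier."""
    fit = {
        "fast": 1 - complexity,
        "standard": 1 - abs(complexity - 0.5),
        "reasoning": complexity,
    }
    scores = {
        t: policy_weights[t] * fit[t] * ema_stats[t]["success_rate"]
        for t in TIERS
    }
    return max(scores, key=scores.get)

def route(request, ema_stats, policy_weights):
    forced = pre_checks(request)
    if forced is not None:
        return forced
    return score_tiers(judge(request), ema_stats, policy_weights)
```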
Control the cost/quality tradeoff with the X-Routing-Policy header. Default: balanced.
Prefers fast → standard → reasoning
Prefers standard → fast → reasoning
Prefers reasoning → standard → fast
Prefers fast → standard → reasoning
Pick a tier or let auto decide for you.
| Tier | Model | Best for | Relative cost |
|---|---|---|---|
| `fast` | Claude Haiku 4.5 | Simple questions, classification, extraction | $ |
| `standard` | Claude Sonnet 4.6 | Code generation, writing, analysis | $$ |
| `reasoning` | Claude Opus 4.6 | Complex reasoning, architecture, research | $$$ |
| `auto` | Routes intelligently | Any task — classified, scored, and routed via the 5-stage pipeline | optimal |
The scoring engine evaluates each tier on fit, cost, latency, and capability. Simple questions go to Haiku, complex reasoning goes to Opus — automatically.
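One way such a tradeoff can be scored is a weighted sum over fit, cost, and latency. Every number below is hypothetical, made up purely to illustrate the idea:

```python
# Hypothetical per-tier profiles and policy weights; all numbers here
# are made up for illustration, not ClawRouter's actual values.
TIER_PROFILES = {
    "fast":      {"fit": 0.90, "cost": 1.0,  "latency": 1.0},
    "standard":  {"fit": 0.95, "cost": 5.0,  "latency": 2.0},
    "reasoning": {"fit": 0.97, "cost": 25.0, "latency": 5.0},
}

def score(tier, weights):
    p = TIER_PROFILES[tier]
    # Reward fit; penalize cost and latency.
    return (weights["fit"] * p["fit"]
            - weights["cost"] * p["cost"]
            - weights["latency"] * p["latency"])

def best_tier(weights):
    return max(TIER_PROFILES, key=lambda t: score(t, weights))
```

A cost-leaning policy weights `cost` heavily and lands on `fast`; a quality-leaning policy weights `fit` heavily and lands on `reasoning`.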
An in-cluster LLM classifier analyzes request complexity in <50ms with circuit breaker protection. Falls back to keyword heuristics if the judge is slow or down.
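A minimal sketch of that pattern, with assumed thresholds; `llm_judge` and `heuristic_judge` are placeholder callables, not ClawRouter APIs:

```python
import time

class CircuitBreaker:
    """Minimal circuit breaker; thresholds are illustrative assumptions."""
    def __init__(self, max_failures=3, cooldown_s=30.0):
        self.max_failures = max_failures
        self.cooldown_s = cooldown_s
        self.failures = 0
        self.opened_at = None

    def available(self):
        if self.opened_at is None:
            return True
        if time.monotonic() - self.opened_at >= self.cooldown_s:
            self.opened_at = None  # half-open: allow a probe request
            self.failures = 0
            return True
        return False

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.max_failures:
            self.opened_at = time.monotonic()

def classify(prompt, breaker, llm_judge, heuristic_judge):
    # Use the LLM judge unless the breaker is open; fall back to keywords.
    if breaker.available():
        try:
            return llm_judge(prompt), "llm_judge"
        except Exception:
            breaker.record_failure()
    return heuristic_judge(prompt), "heuristic"
```

Once the breaker opens, requests skip the LLM judge entirely until the cooldown elapses, so a down judge never adds latency.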
Every response updates per-model EMA statistics. Success rates and latency data continuously improve routing decisions over time.
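The standard EMA update underlying this kind of bookkeeping looks like this; the smoothing factor `alpha = 0.2` is an assumption:

```python
def ema_update(current, observation, alpha=0.2):
    """Exponential moving average: new = alpha*obs + (1-alpha)*current.
    alpha = 0.2 is an assumed smoothing factor, not ClawRouter's value."""
    return alpha * observation + (1 - alpha) * current

# One failed, slow call nudges a model's stats toward the observation:
stats = {"success_rate": 1.0, "latency_ms": 800.0}
stats["success_rate"] = ema_update(stats["success_rate"], 0.0)  # -> 0.8
stats["latency_ms"] = ema_update(stats["latency_ms"], 2000.0)   # -> 1040.0
```

Because each update is a single multiply-add over two stored numbers, it fits the fire-and-forget pattern: no history buffer, no blocking work on the response path.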
Configurable regex redaction scrubs API keys, emails, SSNs, and PEM certs from prompts before the judge sees them. Your data stays private.
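A minimal sketch of regex-based scrubbing in this style; the exact patterns ClawRouter ships are configurable and may differ:

```python
import re

# Illustrative redaction rules; real deployments would tune these.
PATTERNS = [
    (re.compile(r"sk-[A-Za-z0-9]{20,}"), "[REDACTED_API_KEY]"),
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"), "[REDACTED_EMAIL]"),
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[REDACTED_SSN]"),
    (re.compile(r"-----BEGIN [A-Z ]+-----.*?-----END [A-Z ]+-----",
                re.S), "[REDACTED_PEM]"),
]

def redact(text: str) -> str:
    """Apply each pattern in order, replacing matches with placeholders."""
    for pattern, replacement in PATTERNS:
        text = pattern.sub(replacement, text)
    return text
```

Redaction runs before the judge call, so only the scrubbed prompt ever leaves the request path.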
If a provider is down or rate-limited, ClawRouter retries with the next-best model automatically. Your app never sees the failure.
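The failover loop amounts to trying models in preference order until one succeeds. In this sketch `call` is a placeholder for the actual provider client:

```python
def complete_with_failover(request, models, call):
    """Try each model in preference order; on a provider error, move on.
    `call(model, request)` stands in for the real provider client."""
    last_err = None
    for model in models:
        try:
            return call(model, request)
        except Exception as err:  # in practice: only retryable errors
            last_err = err
    raise last_err if last_err else ValueError("no models configured")
```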
Drop-in replacement for any OpenAI SDK or client. Just change the base URL and API key — your existing code works as-is.
Add this to your OpenClaw config file:
{
  "models": {
    "providers": {
      "clawrouter": {
        "baseUrl": "https://api.clawrouter.app/v1",
        "apiKey": "YOUR_API_KEY",
        "api": "openai-completions",
        "models": [
          { "id": "auto", "name": "Smart Route (auto-select best model)" },
          { "id": "fast", "name": "Fast (Haiku 4.5)" },
          { "id": "standard", "name": "Standard (Sonnet 4.6)" },
          { "id": "reasoning", "name": "Reasoning (Opus 4.6)" }
        ]
      }
    }
  },
  "agents": {
    "defaults": {
      "model": { "primary": "clawrouter/auto" }
    }
  }
}
Make a request with a routing policy:
curl https://api.clawrouter.app/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -H "X-Routing-Policy: balanced" \
  -d '{
    "model": "auto",
    "messages": [
      {"role": "user", "content": "Explain quicksort in one sentence"}
    ]
  }'
# Response headers:
# x-gateway-routed-tier: fast
# x-gateway-judge-source: llm_judge
# x-gateway-judge-confidence: 0.92
# x-gateway-routing-policy: balanced
Works with the standard OpenAI Python SDK:
from openai import OpenAI

client = OpenAI(
    base_url="https://api.clawrouter.app/v1",
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="auto",  # intelligently routed
    messages=[{"role": "user", "content": "Hello"}],
    extra_headers={"X-Routing-Policy": "balanced"},
)
print(response.choices[0].message.content)
Works with the standard OpenAI Node.js SDK:
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.clawrouter.app/v1",
  apiKey: "YOUR_API_KEY",
});

const response = await client.chat.completions.create(
  {
    model: "auto", // intelligently routed
    messages: [{ role: "user", content: "Hello" }],
  },
  { headers: { "X-Routing-Policy": "balanced" } },
);
console.log(response.choices[0].message.content);