So I built NadirClaw. It's a Python proxy that sits between your app and your LLM providers. It classifies each prompt in about 10ms and routes simple ones to Gemini Flash, Ollama, or whatever cheap/local model you want. Only the complex prompts hit your premium API.
It's OpenAI-compatible, so you just point your existing tools at it. Works with OpenClaw, Cursor, Claude Code, or anything that talks to the OpenAI API.
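Since it speaks the OpenAI API, wiring up an existing tool is usually just a matter of overriding the base URL. A hypothetical example, assuming the proxy listens on localhost:8000 (check the README for the actual default port and path):

```shell
# Point any OpenAI-compatible tool at the proxy instead of api.openai.com.
export OPENAI_BASE_URL="http://localhost:8000/v1"
# The key can be a placeholder; the proxy holds the real upstream keys.
export OPENAI_API_KEY="placeholder"
```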
In practice I went from burning through my Claude quota in 2 days to having it last the full week. Costs dropped around 60%.
curl -fsSL https://raw.githubusercontent.com/doramirdor/NadirClaw/main/... | sh
Still early. The classifier is simple (token count + pattern matching + optional embeddings), and I'm sure there are edge cases I'm missing. Curious what breaks first, and whether the routing logic makes sense to others.
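For anyone curious what "token count + pattern matching" can look like in practice, here's a rough sketch of that style of heuristic. The names, patterns, and threshold are mine for illustration, not NadirClaw's actual code:

```python
import re

# Illustrative patterns that tend to signal heavyweight requests.
# (Hypothetical list -- not NadirClaw's real rules.)
COMPLEX_PATTERNS = [
    re.compile(r"\b(refactor|architect|prove|debug|optimi[sz]e)\b", re.I),
    re.compile(r"```"),  # inline code blocks often mean real work
]

def route(prompt: str, token_budget: int = 400) -> str:
    """Return 'premium' for complex prompts, 'cheap' otherwise."""
    # Whitespace split is a crude stand-in for a real tokenizer.
    tokens = len(prompt.split())
    if tokens > token_budget:
        return "premium"
    if any(p.search(prompt) for p in COMPLEX_PATTERNS):
        return "premium"
    return "cheap"
```

A classifier like this is fast because it's just a length check plus a handful of regexes; the optional embeddings step would catch the prompts that are short but still hard.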
amirdor•1h ago
pip install nadirclaw