Examples
Simple local usage:
ollama pull llama3.1:70b
ollama pull llama3.1:8b
python3 cli.py exec \
  --orchestrator ollama:llama3.1:70b \
  --worker ollama:llama3.1:8b \
  --task "Summarize 10 news articles"
This runs a planner + worker flow fully locally.
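Before running the exec command above, you can confirm that both models were pulled successfully using the standard Ollama CLI (a quick sanity check, not part of llm-use itself):

# List locally available models; both llama3.1:70b and llama3.1:8b should appear.
ollama list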
Hybrid cloud + local usage:
export ANTHROPIC_API_KEY="sk-ant-..."
ollama pull llama3.1:8b
python3 cli.py exec \
  --orchestrator anthropic:claude-3-7-sonnet-20250219 \
  --worker ollama:llama3.1:8b \
  --task "Compare 5 products"
This routes tasks between a cloud provider model and a local worker.
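The same provider:model pattern should extend to the other cloud backends listed under "Why it matters". The sketch below is illustrative only: the openai: prefix and the gpt-4o model name are assumptions, not confirmed options.

# Hypothetical OpenAI orchestrator; provider prefix and model name are assumed.
export OPENAI_API_KEY="sk-..."
python3 cli.py exec \
  --orchestrator openai:gpt-4o \
  --worker ollama:llama3.1:8b \
  --task "Compare 5 products"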
TUI chat mode:
python3 cli.py chat \
  --orchestrator anthropic:claude-3 \
  --worker ollama:llama3.1:8b
This opens an interactive terminal chat with live logs and a cost breakdown.
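Chat mode takes the same provider:model arguments as exec, so it should also be able to run fully locally; a sketch under that assumption:

# Fully local chat; assumes chat mode accepts Ollama orchestrators like exec does.
python3 cli.py chat \
  --orchestrator ollama:llama3.1:70b \
  --worker ollama:llama3.1:8b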
Why it matters
• Orchestrate multiple LLMs (OpenAI, Anthropic, Ollama/llama.cpp) without writing custom routing logic.
• Smart routing and fallback: pick a better model for each task and fall back heuristically or based on what is learned over time.
• Cost tracking and session logs: see the cost of each run and keep the history locally.
• Optional scraping and caching: enrich tasks with real web data when needed.
• Optional MCP server integration: serve llm-use workflows via PolyMCP.
llm-use makes it easier to build robust, multi-model LLM systems without tying yourself to a single API or writing the orchestration by hand.