I built a local-first UI that adds two reasoning architectures on top of small models like Qwen, Llama, and Mistral: a sequential Thinking Pipeline (Plan → Execute → Critique) and a parallel Agent Council, where multiple expert models debate simultaneously and a Judge synthesizes the best answer. No API keys, no .env setup: just pip install multimind. Benchmarks on GSM8K show measurable accuracy gains over single-model inference.
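To make the two architectures concrete, here is a minimal sketch of the control flow. This is not the actual multimind API; the `model`, `expert_*`, and `judge` functions are hypothetical stand-ins for local model calls, used only to show how the sequential pipeline and the parallel council differ in shape.

```python
# Hedged sketch: stand-in functions, not the real multimind API.
from concurrent.futures import ThreadPoolExecutor

def thinking_pipeline(question, model):
    # Sequential architecture: Plan -> Execute -> Critique,
    # each stage a separate call to the same small model.
    plan = model(f"Plan how to answer: {question}")
    draft = model(f"Follow this plan and answer.\nPlan: {plan}\nQ: {question}")
    final = model(f"Critique and improve this answer:\n{draft}")
    return final

def agent_council(question, experts, judge):
    # Parallel architecture: fan out the same question to every
    # expert model concurrently, then fan in through a judge.
    with ThreadPoolExecutor(max_workers=len(experts)) as pool:
        proposals = list(pool.map(lambda e: e(question), experts))
    return judge(question, proposals)

# Hypothetical experts (e.g. Qwen, Llama, Mistral behind the scenes).
def expert_a(q): return f"[a] answer to: {q}"
def expert_b(q): return f"[b] answer to: {q}"
def expert_c(q): return f"[c] answer to: {q}"

def simple_judge(question, proposals):
    # A real judge would be another model call weighing the debate;
    # here we only demonstrate the data flow.
    return f"Synthesized from {len(proposals)} proposals for: {question}"

print(agent_council("What is 17 * 24?", [expert_a, expert_b, expert_c],
                    simple_judge))
# Synthesized from 3 proposals for: What is 17 * 24?
```

The key design difference: the pipeline trades latency for depth (three dependent calls), while the council trades compute for diversity (N independent calls plus one judge call), which is why the council parallelizes cleanly.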
Comments
selfradiance•1h ago
The Agent Council approach is interesting — having multiple small models debate in parallel and a judge synthesize feels like a more principled version of what people do manually when they cross-check answers between Claude, GPT, and Gemini. Curious whether the GSM8K gains hold up on less structured tasks where there isn't a single correct answer (e.g. summarization or open-ended reasoning).