What it does: You type one prompt, pick 2–6 models (we support 47 text models and several image models across OpenAI, Anthropic, Google, xAI, Meta, Amazon, Mistral, Cohere, AI21), and see every response side-by-side. There's also a feature called SmartPick that uses an LLM evaluator to score each response on Clarity, Accuracy, Completeness, and Helpfulness — useful when you're comparing 6 models and don't want to read everything carefully.
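To make the SmartPick idea concrete, here's a minimal sketch of how a judge's per-criterion scores might be aggregated to pick a winner. In practice the scores would come from an LLM evaluator; they're hard-coded here so the example is self-contained, and all names are illustrative rather than our actual API.

```python
# Hypothetical aggregation step for a SmartPick-style evaluator.
CRITERIA = ("clarity", "accuracy", "completeness", "helpfulness")

def overall(scores: dict) -> float:
    """Average the four criterion scores (assumed on a 1-10 scale)."""
    return sum(scores[c] for c in CRITERIA) / len(CRITERIA)

def pick_best(responses: dict) -> str:
    """Return the model whose response has the highest average score."""
    return max(responses, key=lambda model: overall(responses[model]))

# Example: scores an LLM judge might have assigned to two responses.
judged = {
    "claude-sonnet": {"clarity": 9, "accuracy": 8, "completeness": 7, "helpfulness": 9},
    "gpt-4o":        {"clarity": 8, "accuracy": 9, "completeness": 9, "helpfulness": 8},
}
print(pick_best(judged))  # → gpt-4o
```

The real value is in the judge prompt, not the arithmetic, but the aggregation is what lets you skim 6 responses without reading each one carefully.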
Beyond comparison, there are two other things I've built:
Workspaces — You can create multi-panel layouts where each panel has its own model, system prompt, and conversation history. So instead of "Hey ChatGPT, you're a code reviewer" every time, you set it once and the panel remembers. I use a "Customer Support" workspace with 6 panels daily — Ticket Drafter on Claude Haiku, Escalation Handler on Sonnet, Knowledge Base Builder on GPT-4o, etc.
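The panel model is simple to picture in code. This is a hypothetical sketch, not our actual schema: each panel pins a model, a system prompt, and its own history, so the role is set once instead of repeated per message.

```python
from dataclasses import dataclass, field

@dataclass
class Panel:
    """One workspace panel: its own model, role, and conversation."""
    name: str
    model: str
    system_prompt: str
    history: list = field(default_factory=list)

    def add_turn(self, role: str, content: str) -> None:
        self.history.append({"role": role, "content": content})

    def request_messages(self) -> list:
        """Prepend the pinned system prompt on every request."""
        return [{"role": "system", "content": self.system_prompt}, *self.history]

# A "Customer Support" workspace is then just a collection of panels.
support = [
    Panel("Ticket Drafter", "claude-haiku", "You draft concise support replies."),
    Panel("Escalation Handler", "claude-sonnet", "You handle escalated tickets."),
]
support[0].add_turn("user", "Customer can't reset their password.")
```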
Prompts Library — Hundreds of prompts across 10 categories. Less interesting technically, but saves a surprising amount of time.
Some things HN might care about:

* No API keys needed — we handle all provider connections
* Private Mode does zero-trace testing (nothing stored, nothing logged)
* Everything is encrypted at rest
* Image generation comparison works too (ChatGPT Image vs Grok Imagine vs Gemini)
* Free tier exists with limited models and capacity; paid tiers are $29/$59/$99
Tech stack if anyone's curious: AWS (DynamoDB, Lambda, SQS, S3), with separate provider integrations for each AI model. The tricky part was building context management for multi-turn conversations across different providers — each has its own message format, token limits, and quirks.
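To give a flavor of the format differences: Anthropic takes the system prompt as a separate top-level field rather than in the message list, and Gemini uses a `contents` array of `parts` with the role `model` instead of `assistant`. A simplified sketch of the adapters (real providers have more fields and edge cases than shown here):

```python
def to_anthropic(messages: list) -> dict:
    """Anthropic: system prompt is a separate top-level field, not a message."""
    system = "\n".join(m["content"] for m in messages if m["role"] == "system")
    rest = [m for m in messages if m["role"] != "system"]
    return {"system": system, "messages": rest}

def to_google(messages: list) -> dict:
    """Gemini: 'contents' of 'parts', with role 'model' for assistant turns."""
    role_map = {"user": "user", "assistant": "model"}
    contents = [
        {"role": role_map[m["role"]], "parts": [{"text": m["content"]}]}
        for m in messages
        if m["role"] in role_map  # system handled separately via systemInstruction
    ]
    return {"contents": contents}

# One internal format in, provider-specific payloads out.
convo = [
    {"role": "system", "content": "You are terse."},
    {"role": "user", "content": "Hi"},
    {"role": "assistant", "content": "Hello."},
]
```

Token limits and truncation are the harder half: each provider counts tokens differently, so trimming history for a multi-turn conversation has to happen per provider, after conversion.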
We hit #11 on Product Hunt when we launched last year, and we're at ~15K users now. But honestly the feedback I most want is from this community — what's missing, what's broken, and what would make you actually use this daily?
Happy to answer any questions about the architecture, pricing model, or anything else.