I built LLMKit after getting frustrated trying to pick the right LLM for different projects. Instead of guessing or relying on benchmarks that don't match real use cases, I wanted to see actual performance on my own prompts.
What it does:
• Compare up to 5 models simultaneously (GPT-4, Claude, Gemini, etc.)
• Real-time streaming comparison: watch models race to respond
• Custom scoring weights based on your priorities (speed vs. cost vs. quality); there's a rough sketch of the scoring idea below
• System prompt support for production-realistic testing
• TTFT (time to first token) metrics for latency-sensitive apps
• No signup required; API keys stay in your browser
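For the weighted scoring, this is the general idea; the field names, the 0-10 quality rating, and the normalization here are illustrative assumptions, not LLMKit's exact formula:

```typescript
// Sketch: combine normalized speed/cost/quality into one score per model.
// Field names and scales are hypothetical, for illustration only.
interface ModelRun {
  model: string;
  ttftMs: number;        // time to first token
  costUsd: number;       // cost of the completion
  qualityRating: number; // 0-10, e.g. from a manual rating
}

interface Weights {
  speed: number;
  cost: number;
  quality: number;
}

// Normalize each metric to 0..1 across the compared runs, then apply the user's weights.
function score(runs: ModelRun[], w: Weights): Array<{ model: string; score: number }> {
  const maxTtft = Math.max(...runs.map((r) => r.ttftMs)) || 1;
  const maxCost = Math.max(...runs.map((r) => r.costUsd)) || 1;
  return runs.map((r) => ({
    model: r.model,
    score:
      w.speed * (1 - r.ttftMs / maxTtft) +   // faster is better
      w.cost * (1 - r.costUsd / maxCost) +   // cheaper is better
      w.quality * (r.qualityRating / 10),    // higher rating is better
  }));
}
```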
The "aha moment" was adding streaming comparison - seeing GPT-4 start fast but Claude catch up, or watching cost-effective models perform surprisingly well. It's like A/B testing but for LLMs.
Built with Next.js + TypeScript. The streaming implementation was the tricky part: each provider formats its SSE events differently (OpenAI vs. Anthropic), and the app has to keep several connections streaming in parallel.
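Here's a minimal sketch of what that normalization and parallel streaming can look like, assuming browser fetch + ReadableStream. The delta paths (choices[0].delta.content for OpenAI chat chunks, content_block_delta / delta.text for Anthropic message streams) reflect how those streaming formats commonly look, but treat this as illustrative rather than LLMKit's actual code; endpoints and request bodies are left to the caller:

```typescript
// Sketch: normalize streaming deltas across providers and race them in parallel.
type Provider = "openai" | "anthropic";

interface StreamResult {
  provider: Provider;
  text: string;
  ttftMs: number | null; // time to first token
  totalMs: number;
}

// Pull the text delta out of one parsed SSE payload, per provider format.
function extractDelta(provider: Provider, payload: any): string {
  if (provider === "openai") {
    // OpenAI chat completion chunks: choices[0].delta.content
    return payload.choices?.[0]?.delta?.content ?? "";
  }
  // Anthropic message streams: content_block_delta events carry delta.text
  if (payload.type === "content_block_delta") {
    return payload.delta?.text ?? "";
  }
  return "";
}

async function streamCompletion(
  provider: Provider,
  url: string,
  init: RequestInit,
  onDelta: (chunk: string) => void
): Promise<StreamResult> {
  const start = performance.now();
  let ttftMs: number | null = null;
  let text = "";

  const res = await fetch(url, init);
  const reader = res.body!.getReader();
  const decoder = new TextDecoder();
  let buffer = "";

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });

    // SSE frames are newline-separated; data lines start with "data: ".
    const lines = buffer.split("\n");
    buffer = lines.pop() ?? ""; // keep any partial line for the next read

    for (const line of lines) {
      if (!line.startsWith("data: ") || line.includes("[DONE]")) continue;
      const delta = extractDelta(provider, JSON.parse(line.slice(6)));
      if (delta) {
        if (ttftMs === null) ttftMs = performance.now() - start; // first token seen
        text += delta;
        onDelta(delta);
      }
    }
  }

  return { provider, text, ttftMs, totalMs: performance.now() - start };
}

// Kick off several streams at once; each onDelta callback can update its own UI pane.
async function compareModels(
  requests: Array<{ provider: Provider; url: string; init: RequestInit }>
): Promise<StreamResult[]> {
  return Promise.all(
    requests.map((r) =>
      streamCompletion(r.provider, r.url, r.init, (d) => console.log(r.provider, d))
    )
  );
}
```

Promise.all keeps the streams running concurrently, so each pane updates as its model's tokens arrive rather than waiting for the slowest response.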