Show HN: Yardstiq – Compare LLM outputs side-by-side in your terminal

https://www.yardstiq.sh
2•stanleycyang•2h ago
Hey HN,

I built yardstiq because I got tired of the copy-paste workflow for comparing LLM responses when developing apps. Every time I wanted to see how Claude vs GPT vs Gemini handled the same prompt, I'd open three tabs, paste the same thing, and try to eyeball the differences. It's 2026 and we have 40+ models worth considering — that doesn't scale.

yardstiq is a CLI tool that sends one prompt to multiple models simultaneously and streams the responses side-by-side in your terminal. It also tracks performance metrics (time to first token, tokens/sec, cost) and optionally runs an AI judge to score the outputs.

```
npx yardstiq "Explain quicksort in 3 sentences" -m claude-sonnet -m gpt-4o
```

What it does:

- Streams responses from multiple models in parallel, rendered in columns
- Shows TTFT, throughput (tok/s), token counts, and cost per request
- AI judge mode: have a model evaluate and score the responses
- Export to JSON, Markdown, or self-contained HTML reports
- Run YAML-defined benchmark suites across models with aggregate scoring
- Works with Ollama for local model comparisons (zero API cost)
- Supports 40+ models via direct provider keys or Vercel AI Gateway
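
For readers wondering how metrics like TTFT, throughput, and cost can be derived from a streamed response, here is a minimal sketch. It is not yardstiq's actual code: every name in it (`ChunkEvent`, `computeMetrics`, `timeStream`, `compareModels`, `roughTokenCount`) is hypothetical, the word-based token count is a crude stand-in for a real tokenizer, and the per-million-token price is something the caller supplies rather than a real rate.

```typescript
// Illustrative sketch only: these names are hypothetical, not yardstiq's API.

interface ChunkEvent {
  atMs: number;   // time the chunk arrived, in milliseconds
  tokens: number; // token count attributed to this chunk
}

interface StreamMetrics {
  ttftMs: number;       // time to first token
  tokensPerSec: number; // output tokens divided by total elapsed seconds
  outputTokens: number;
  costUsd: number;
}

// Derive metrics from chunk arrival times.
// usdPerMillionOutputTokens is supplied by the caller (assumed pricing model).
function computeMetrics(
  startMs: number,
  chunks: ChunkEvent[],
  usdPerMillionOutputTokens: number,
): StreamMetrics {
  if (chunks.length === 0) {
    // No output arrived, so TTFT is undefined.
    return { ttftMs: NaN, tokensPerSec: 0, outputTokens: 0, costUsd: 0 };
  }
  const firstMs = chunks[0].atMs;
  const lastMs = chunks[chunks.length - 1].atMs;
  const outputTokens = chunks.reduce((sum, c) => sum + c.tokens, 0);
  const elapsedSec = (lastMs - startMs) / 1000;
  return {
    ttftMs: firstMs - startMs,
    tokensPerSec: elapsedSec > 0 ? outputTokens / elapsedSec : 0,
    outputTokens,
    costUsd: (outputTokens / 1_000_000) * usdPerMillionOutputTokens,
  };
}

// Rough stand-in for a real tokenizer: counts whitespace-separated words.
const roughTokenCount = (text: string): number =>
  text.split(/\s+/).filter(Boolean).length;

// Drain one model's stream, recording when each chunk arrives.
async function timeStream(
  stream: AsyncIterable<string>,
  usdPerMillionOutputTokens: number,
  now: () => number = Date.now,
): Promise<StreamMetrics> {
  const startMs = now();
  const chunks: ChunkEvent[] = [];
  for await (const piece of stream) {
    chunks.push({ atMs: now(), tokens: roughTokenCount(piece) });
  }
  return computeMetrics(startMs, chunks, usdPerMillionOutputTokens);
}

// Fan out: begin consuming every stream at once and wait for all of them.
async function compareModels(
  streams: Record<string, AsyncIterable<string>>,
  prices: Record<string, number>,
): Promise<Record<string, StreamMetrics>> {
  const entries = await Promise.all(
    Object.entries(streams).map(async ([model, stream]) => {
      const metrics = await timeStream(stream, prices[model] ?? 0);
      return [model, metrics] as const;
    }),
  );
  return Object.fromEntries(entries);
}
```

Keeping `computeMetrics` pure (timestamps in, numbers out) makes the arithmetic easy to check on its own, while `Promise.all` is what lets the per-model streams run concurrently instead of one after another. A real tool would likely take token counts from the provider's usage data rather than counting words.
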

I built this mostly for my own workflow — picking models for different tasks, testing prompt variations, and running quick benchmarks without setting up a whole evaluation framework. It's not trying to replace serious eval platforms, just make the "which model is better for X?" question answerable in 10 seconds.

MIT licensed, written in TypeScript: https://github.com/stanleycyang/yardstiq

Happy to answer questions about the architecture or benchmarking approach.

Intel's make-or-break 18A process node debuts for data center with 288-core Xeon

https://www.tomshardware.com/pc-components/cpus/intels-make-or-break-18a-process-node-debuts-for-...
1•vanburen•2m ago•0 comments

Silent Backwards Compatibility Breaking Changes in PyTorch

https://blog.ezyang.com/2026/03/silent-bc-breaking-changes/
1•matt_d•5m ago•0 comments

Hacked traffic cameras & US Intel: How plot to kill Iran's leader came together

https://www.cnn.com/2026/03/03/middleeast/us-israel-plot-kill-iran-khamenei-latam-intl
1•CGMthrowaway•5m ago•0 comments

Claude Code escapes its own denylist and sandbox

https://ona.com/stories/how-claude-code-escapes-its-own-denylist-and-sandbox
1•tomvault•6m ago•1 comments

I Built a Spy Satellite Simulator in a Browser. Here's What I Learned

https://www.spatialintelligence.ai/p/i-built-a-spy-satellite-simulator
1•CGMthrowaway•7m ago•0 comments

LotusQ: Cross-platform voice dictation with free local Whisper (Mac/Windows/Linux)

1•nkodev•7m ago•1 comments

The gap between ICP documents and buyer understanding in B2B sales

https://artemisgtm.ai/blog/why-most-b2b-companies-get-icp-wrong
1•thegtmauditguy•9m ago•1 comments

Academics Need to Wake Up on AI

https://alexanderkustov.substack.com/p/academics-need-to-wake-up-on-ai
1•verdverm•9m ago•0 comments

Qwen Tech Lead Steps Down

https://twitter.com/JustinLin610/status/2028865835373359513
1•informal007•9m ago•0 comments

Fire the CEO, Introducing the AxO's

https://boringops.sh/articles/fire_the_ceo/
1•boringops-dan•9m ago•0 comments

Mpv Is the MVP of Video and Image Viewing

https://nickjanetakis.com/blog/mpv-is-the-mvp-of-video-and-image-viewing
1•nickjj•10m ago•0 comments

Deprecate confusing APIs like "os.path.commonprefix()"

https://sethmlarson.dev/deprecate-confusing-apis-like-os-path-commonprefix
1•todsacerdoti•10m ago•0 comments

Ask HN: Using AI at work is stupidity, or a good tool if used properly?

1•MrLey•15m ago•0 comments

Show HN: DocAPI – HTTP 402 as designed: agents register, pay USDC, run forever

https://www.docapi.co
1•siwandev•17m ago•1 comments

Why exe.dev VMs are persistent

https://blog.exe.dev/persistent
2•tosh•17m ago•0 comments

Gram 1.0 Released

https://gram.liten.app/posts/first-release/
1•birdculture•19m ago•0 comments

OpenAI releases GPT-5.3 Instant update to make ChatGPT less 'cringe'

https://9to5mac.com/2026/03/03/openai-releases-gpt-5-3-instant-update-to-make-chatgpt-less-cringe/
1•HiroProtagonist•20m ago•0 comments

Beatport and Beatsource to Unite into One Premium DJ Platform

https://www.beatportal.com/articles/1291036-beatport-and-beatsource-to-unite-into-one-premium-dj-...
1•DocFeind•20m ago•0 comments

Identity Formation and the Politics of Belonging: Bengali Migrants in Kerala [pdf]

https://www.aijfr.com/papers/2025/5/1400.pdf
1•thunderbong•21m ago•0 comments

Ask HN: What are your go to sources for relatively unbiased global news?

1•Jimmc414•21m ago•0 comments

Show HN: Voquill, an open source and cross-platform alternative to wisprflow

https://github.com/josiahsrc/voquill
1•josiahsrc•22m ago•0 comments

The unfortunate need for an "age verification" API for legal compliance

https://lists.ubuntu.com/archives/ubuntu-devel/2026-March/043510.html
2•turrini•22m ago•1 comments

OpenClaw Partners with VirusTotal for Skill Security

https://openclaw.ai/blog/virustotal-partnership
1•breitkreutz•23m ago•0 comments

Blocking a brain receptor may calm blood pressure signals

https://medicalxpress.com/news/2026-02-clue-hypertension-blocking-brain-receptor.html
2•PaulHoule•24m ago•0 comments

Show HN: Mozilla.ai introduces Clawbolt, an AI Assistant for the trades

https://github.com/mozilla-ai/clawbolt
7•river_otter•25m ago•0 comments

Claude and Pentagon whole fight timeline

https://www.youtube.com/watch?v=Ph8CrTNlWbM
2•ashutosh0707•26m ago•0 comments

New tool for designing software architecture diagrams and presentations

https://savnet.co/networks/designer
1•oscarricardosan•26m ago•0 comments

Section 230 is the best protection we have from Trump's censorship

https://www.ms.now/opinion/section-230-trump-free-speech
1•01-_-•26m ago•0 comments

Cofounder search: An internet-native way to do ML and bio research

https://labless.bio
1•jeremykalfus•27m ago•1 comments

The Making of the Atomic Bomb book predicted the AI crisis before it happened

https://blog.adafruit.com/2026/03/03/the-making-of-the-atomic-bomb-1986-by-richard-rhodes/
1•ptorrone•27m ago•0 comments