
My analysis of 439 models proves: You're overpaying for your LLMs

https://whatllm.vercel.app/
7•demian101•6mo ago

Comments

demian101•6mo ago
While everyone's geeking out over Grok 4's insane physics sims and Kimi K2's 1T-parameter open-source bombshell (crushing coding benchmarks for pennies), the real AI drama is in the pricing shadows. After my LLM Selector post blew up here, I kept getting DMs asking "but which provider should I actually use?" So I dove deep into 439 models across 63 providers.

What did I find? Some interesting insights:

1. Huge markup on identical models. Take DeepSeek R1 0528 (quality 68 on the Artificial Analysis benchmark, which beats many flagships):

Completely free on Google Vertex and CentML (decent speeds too, 121 tok/s and 87 tok/s).

But it jumps to $0.91 on Deepinfra, $4.25 on Fireworks Fast, and a whopping $5.50 on SambaNova, for the exact same model (ofc with speed differences).

Arbitrage alert: Why pay any markup at all when free tiers deliver the goods for experimentation or bulk runs?
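To put a number on that spread, here is a rough sketch that compares the paid providers for this one model. It assumes the prices above are USD per 1M tokens, like the other figures in this post. A ratio against a free tier is undefined (division by zero), so the helper `price_spread` (my own name, not from any library) only compares providers that charge something.

```python
# Prices for DeepSeek R1 0528 copied from the post.
# Assumption: USD per 1M tokens, matching the units used elsewhere in the post.
prices = {
    "Google Vertex": 0.00,
    "CentML": 0.00,
    "Deepinfra": 0.91,
    "Fireworks Fast": 4.25,
    "SambaNova": 5.50,
}

def price_spread(prices):
    """Return (cheapest paid provider, priciest provider, price ratio).

    Free tiers are skipped because a ratio against 0 is undefined.
    """
    paid = {name: p for name, p in prices.items() if p > 0}
    low = min(paid, key=paid.get)
    high = max(paid, key=paid.get)
    return low, high, paid[high] / paid[low]

low, high, ratio = price_spread(prices)
free = [name for name, p in prices.items() if p == 0]

print(f"{high} charges {ratio:.1f}x what {low} charges")
print("Free tiers:", ", ".join(free))
```

Even ignoring the free tiers, the priciest listed provider charges about six times the cheapest paid one for the same weights.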

2. Latency goldmines hiding in plain sight. Sub-millisecond responses aren't just for premium setups:

Nebius Base crushes it with DeepSeek R1 at 0.61ms latency for $1.00/1M (103 tok/s) and Qwen3 235B at 0.56ms for $0.30/1M (50 tok/s).

Groq takes it further with models like Qwen3 32B at 0.14ms for $0.36/1M (627 tok/s).

Arbitrage alert: These blow away slower "enterprise" options costing 10x more, which makes them ideal for real-time apps.

3. Speed demons with massive throughput gaps. Hardware optimization creates wild performance swings:

Cerebras with Qwen3 32B at 2,496 tok/s for $0.50/1M and Llama 4 Scout at 2,808 tok/s for $0.70/1M.

Compare that to the same models elsewhere, which are often stuck at 40-80 tok/s for similar or higher prices.

Arbitrage alert: 50x+ throughput boosts on the same model?

4. Quality overpays that defy logic. High quality doesn't mean high price anymore:

Qwen3 235B (quality 62) at $0.10/1M on Fireworks (79 tok/s) outperforms Claude 4 Opus (quality 58), which costs $30/1M everywhere (19-65 tok/s).

Grok 3 mini (quality 67) at $0.35/1M on xAI (210 tok/s), edging out pricier closed source rivals.

Arbitrage alert: 300x cheaper for better quality? Open-source gems like these make "premium" models look like rip-offs lol
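The 300x figure is just the price ratio, and the same numbers give a crude "quality points per dollar" ranking. This is only a sketch using the three data points quoted above; `quality_per_dollar` is an illustrative metric I made up, not something Artificial Analysis publishes.

```python
# (name, quality score, USD per 1M tokens), all taken from the post.
models = [
    ("Qwen3 235B on Fireworks", 62, 0.10),
    ("Claude 4 Opus", 58, 30.00),
    ("Grok 3 mini on xAI", 67, 0.35),
]

def quality_per_dollar(quality, usd_per_1m):
    """Illustrative value metric: quality points per dollar per 1M tokens."""
    return quality / usd_per_1m

# Best value first.
ranked = sorted(models, key=lambda m: quality_per_dollar(m[1], m[2]), reverse=True)

# How many times more Claude 4 Opus costs than Qwen3 235B on Fireworks.
price_ratio = 30.00 / 0.10

for name, quality, price in ranked:
    print(f"{name}: {quality_per_dollar(quality, price):.1f} quality points per $")
print(f"Price ratio, Opus vs Qwen3 235B: {price_ratio:.0f}x")
```

A single quality score hides a lot (task fit, context length, tool use), so treat the ranking as a starting point for your own tests.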

5. Provider flips on big-name models. Even giants like OpenAI show huge variances:

GPT-4.1 mini ($0.70/1M): Azure blasts 217 tok/s vs OpenAI's 73 tok/s.

o3 ($3.50/1M): OpenAI hits 199 tok/s vs Azure's slower 99 tok/s (with double the latency).

Arbitrage alert: Same price, but 3x throughput or half the latency? Picking the right endpoint saves thousands on production workloads.
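Here is roughly what that throughput gap means in wall-clock time for GPT-4.1 mini. The sketch assumes a single sequential stream at the steady tok/s quoted above, and the 1M-token job size is a hypothetical batch I picked for illustration. Real workloads with parallel requests will look different.

```python
def generation_seconds(tokens, tok_per_s):
    """Seconds to stream `tokens` output tokens at a steady throughput."""
    return tokens / tok_per_s

# GPT-4.1 mini throughput from the post, in tokens per second.
endpoints = {"Azure": 217, "OpenAI": 73}

job_tokens = 1_000_000  # hypothetical job size, not from the post

hours = {
    name: generation_seconds(job_tokens, tps) / 3600
    for name, tps in endpoints.items()
}
speedup = endpoints["Azure"] / endpoints["OpenAI"]

for name, h in hours.items():
    print(f"{name}: {h:.2f} h for {job_tokens:,} tokens")
print(f"Azure is {speedup:.2f}x faster at the same list price")
```

Under those assumptions the same job takes a bit over an hour on one endpoint and nearly four on the other.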

We're in the Wild West of pricing amid all the hype. Big names coast on reputation, but smaller providers like Nebius and Cerebras optimize like mad.

Open-source crushes closed-source on value: top 20 price-perf plays are ALL open.

What should you do?

Stop assuming expensive = better

Hunt latency and speed arbitrages (they're everywhere)

Test specialised providers for throughput wins

Grab sub-$0.50 open-source beasts (like Qwen3 or Grok mini)

Exploit these gaps now before "normalization" hits
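If you want to turn that checklist into something repeatable, a minimal sketch is to filter offers by a price ceiling and a speed floor, then sort by throughput. The `Offer` type and `shortlist` function are hypothetical names of mine, and the rows are only the examples quoted in this post, not the full dataset.

```python
from typing import NamedTuple

class Offer(NamedTuple):
    provider: str
    model: str
    usd_per_1m: float  # price per 1M tokens
    tok_per_s: float   # output throughput

# Example rows copied from the figures quoted in the post.
OFFERS = [
    Offer("Fireworks", "Qwen3 235B", 0.10, 79),
    Offer("Nebius Base", "Qwen3 235B", 0.30, 50),
    Offer("xAI", "Grok 3 mini", 0.35, 210),
    Offer("Groq", "Qwen3 32B", 0.36, 627),
    Offer("Cerebras", "Qwen3 32B", 0.50, 2496),
    Offer("Cerebras", "Llama 4 Scout", 0.70, 2808),
]

def shortlist(offers, max_price, min_tok_per_s=0.0):
    """Offers at or under the price ceiling and at or above the speed floor,
    fastest first."""
    ok = [
        o for o in offers
        if o.usd_per_1m <= max_price and o.tok_per_s >= min_tok_per_s
    ]
    return sorted(ok, key=lambda o: o.tok_per_s, reverse=True)

picks = shortlist(OFFERS, max_price=0.50, min_tok_per_s=100)
for o in picks:
    print(f"{o.provider} / {o.model}: ${o.usd_per_1m:.2f}/1M at {o.tok_per_s:g} tok/s")
```

Swap in your own ceiling and floor. For latency-sensitive apps you would add a latency field and filter on that too.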

I centralised all the data from Artificial Analysis on whatllm.com, and the insights are the real gold.

Found crazier arbitrages? Spill in comments!

Which hype are you actually buying, and why?

This rabbit hole hit harder than any benchmark!

Happy to geek out more!