Mercury 2: The fastest reasoning LLM, powered by diffusion

https://www.inceptionlabs.ai/blog/introducing-mercury-2
54•fittingopposite•3h ago

Comments

dvt•1h ago
What excites me most about these new four-figure tokens-per-second models is that you can essentially do multi-shot prompting (+ nudging) and the user doesn't even feel it, potentially fixing some of the weird hallucinatory/non-deterministic behavior we sometimes end up with.
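One way to read the multi-shot idea above is as sampling the same prompt several times and keeping the most common answer. The sketch below is only an illustration under that assumption: `majority_answer` and `make_fake_model` are hypothetical names, and the fake model stands in for a real fast-model API call, which is not shown here.

```python
from collections import Counter
from typing import Callable, List


def majority_answer(sample: Callable[[str], str], prompt: str, n: int = 5) -> str:
    """Call the model n times and return the most common normalized answer."""
    answers: List[str] = [sample(prompt).strip().lower() for _ in range(n)]
    # most_common(1) returns [(answer, count)] for the top answer.
    return Counter(answers).most_common(1)[0][0]


def make_fake_model(replies: List[str]) -> Callable[[str], str]:
    """Hypothetical stand-in for a fast model: replays canned replies in order."""
    it = iter(replies)
    return lambda prompt: next(it)


fake = make_fake_model(["Paris", "paris ", "Lyon", "Paris", "Marseille"])
result = majority_answer(fake, "Capital of France?", n=5)
print(result)
```

With a model that emits thousands of tokens per second, the extra samples add little wall-clock time, which is the point the comment is making. Voting only helps when answers can be normalized and compared; free-form output would need a different aggregation step.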
tl2do•1h ago
Genuine question: what kinds of workloads benefit most from this speed? In my coding use, I still hit limitations even with stronger models, so I'm interested in where a much faster model changes the outcome rather than just reducing latency.
irthomasthomas•1h ago
multi-model arbitration, synthesis, parallel reasoning etc. Judging large models with small models is quite effective.
layoric•1h ago
I think it would help with exploring multiple solution spaces in parallel. With the right user in the loop and a harness wrapped around tools like compilers, static analysis, tests, etc., I can see it iterating very quickly on multiple solutions. An example might be "I need to optimize this SQL query", pointed at a locally running Postgres. Multiple changes could be tested and combined, with the explain plan used to validate performance and a test used to check for correct results. Then only valid solutions would be presented to the developer for review. I don't personally care about the model's 'opinion' or recommendations; using them for architectural choices is, IMO, a flawed use of a coding tool.

It doesn't change the fact that the most important thing is verification/validation of their output, either from tools or from a developer reviewing and making decisions. But even if you don't want that approach, diffusion models just seem a lot more efficient. I'm interested to see whether they are simply a better match for common developer tasks, assisting validation/verification systems rather than just writing (likely wrong) code faster.
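The SQL example above can be sketched as a small harness that keeps only those candidate rewrites whose results match the original query, and attaches the query plan for review. This is a rough illustration, not the commenter's setup: it uses Python's built-in `sqlite3` with an in-memory database as a stand-in for Postgres, and the `orders` table, the queries, and the helper names are all made up for the example.

```python
import sqlite3
from typing import List, Tuple


def setup() -> sqlite3.Connection:
    """Build a small in-memory table (hypothetical schema) with an index."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer INTEGER, total REAL)"
    )
    conn.executemany(
        "INSERT INTO orders (customer, total) VALUES (?, ?)",
        [(i % 10, float(i)) for i in range(1000)],
    )
    conn.execute("CREATE INDEX idx_orders_customer ON orders (customer)")
    return conn


def rows(conn: sqlite3.Connection, sql: str) -> list:
    """Run a query and return its rows in a canonical (sorted) order."""
    return sorted(conn.execute(sql).fetchall())


def plan(conn: sqlite3.Connection, sql: str) -> str:
    """Return the plan text; the fourth column of each plan row is the detail."""
    return " | ".join(r[3] for r in conn.execute("EXPLAIN QUERY PLAN " + sql))


def valid_candidates(
    conn: sqlite3.Connection, baseline: str, candidates: List[str]
) -> List[Tuple[str, str]]:
    """Keep candidates that run and return exactly the baseline's rows."""
    expected = rows(conn, baseline)
    keep = []
    for sql in candidates:
        try:
            if rows(conn, sql) == expected:
                keep.append((sql, plan(conn, sql)))
        except sqlite3.Error:
            continue  # malformed model output is simply discarded
    return keep


conn = setup()
# "customer + 0" prevents use of the index on customer in the baseline.
baseline = "SELECT customer, SUM(total) FROM orders WHERE customer + 0 = 3 GROUP BY customer"
good = "SELECT customer, SUM(total) FROM orders WHERE customer = 3 GROUP BY customer"
candidates = [
    good,
    "SELECT customer, SUM(total) FROM orders WHERE customer = 4 GROUP BY customer",
    "SELEC broken",
]
valid = valid_candidates(conn, baseline, candidates)
for sql, query_plan in valid:
    print(sql, "->", query_plan)
```

Only the rewrite that matches the baseline's rows survives; the wrong filter and the syntax error are dropped before a developer ever sees them. A real harness would also compare timings or plan costs, and would need a representative dataset for the correctness check to mean much.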

cjbarber•1h ago
I've tried a few computer use and browser use tools and they feel relatively tok/s bottlenecked.

And in some sense, all of my claude code usage feels tok/s bottlenecked. There's never really a time where I'm glad to wait for the tokens, I'd always prefer faster.

quotemstr•1m ago
Once you make a model fast and small enough, it starts to become practical to use LLMs for things as mundane as spell checking, touchscreen-keyboard tap disambiguation, and database query planning.
cjbarber•1h ago
It could be interesting to do the metric of intelligence per second.

i.e., intelligence per token, and then tokens per second

My current feel is that if Sonnet 4.6 were 5x faster than Opus 4.6, I'd be primarily using Sonnet 4.6. But that wasn't true for me with prior model generations; in those generations the Sonnet-class models didn't feel good enough compared to the Opus-class models. And it might shift again when I'm doing things that feel more intelligence bottlenecked.

But fast responses have an advantage of their own: they give you faster iteration. Kind of like how I used to like OpenAI Deep Research, but then switched to o3-thinking with web search enabled after that came out, because it gave 80% of the thoroughness in 20% of the time, which tended to be better overall.
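The "intelligence per second" metric described above is just the product of intelligence per token and tokens per second. The sketch below works through it with made-up numbers; the scores and speeds are purely illustrative assumptions, not measurements of any real model, and `Model` and `break_even_quality` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Model:
    name: str
    quality_per_token: float   # hypothetical "intelligence per token" score
    tokens_per_second: float   # hypothetical generation speed

    @property
    def quality_per_second(self) -> float:
        # intelligence per second = intelligence per token * tokens per second
        return self.quality_per_token * self.tokens_per_second


def break_even_quality(strong: Model, fast: Model) -> float:
    """Per-token quality the fast model needs to match the strong one per second."""
    return strong.quality_per_second / fast.tokens_per_second


# Illustrative numbers only: the fast model is 5x faster but weaker per token.
strong = Model("strong", quality_per_token=1.0, tokens_per_second=50.0)
fast = Model("fast", quality_per_token=0.8, tokens_per_second=250.0)

print(strong.quality_per_second, fast.quality_per_second)
print(break_even_quality(strong, fast))
```

Under these assumed numbers, the 5x faster model comes out ahead per second as long as its per-token quality stays above one fifth of the stronger model's. The metric ignores tasks where a quality threshold has to be cleared at all, which is the "intelligence bottlenecked" case the comment mentions.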

nubg•24m ago
Interesting perspective. Perhaps the user would also adapt their queries, knowing they can only do small (but very fast) steps. I wonder who would win!
ilaksh•1h ago
It seems like the chat demo is really suffering from the effect of everything going into a queue. You can't actually tell that it is fast at all. The latency is not good.

Assuming that's what is causing this. They might show some kind of feedback when it actually makes it out of the queue.

mhitza•30m ago
Comment retracted. My bad, missed some details.
selcuka•20m ago
I think your comment is a bit unfair.

> no reasoning comparison

Benchmarks against reasoning models:

https://www.inceptionlabs.ai/blog/introducing-mercury-2

> no demo

https://chat.inceptionlabs.ai/

> no info on numbers of parameters for the model

This is a closed model. Do other providers publish the number of parameters for their models?

> testimonials that don't actually read like something used in production

Fair point.

pants2•9m ago
Reading such obvious LLM-isms in the announcement just makes me cringe a bit too, e.g.:

> We optimize for speed users actually feel: responsiveness in the moments users experience — p95 latency under high concurrency, consistent turn-to-turn behavior, and stable throughput when systems get busy.

nylonstrung•21m ago
I'm not sold on diffusion models.

Other labs like Google have them, but they have simply trailed the Pareto frontier for the vast majority of use cases.

Here's more detail on how price/performance stacks up

https://artificialanalysis.ai/models/mercury-2

arjie•13m ago
Please pre-render your website on the server. Client-side JS means that my agent cannot read the press release, and that reduces the chance I am going to read it myself. Also, day-one OpenRouter availability increases the chance that someone will try it.

I'm helping my dog vibe code games

https://www.calebleak.com/posts/dog-game/
612•cleak•8h ago•183 comments

Show HN: Moonshine Open-Weights STT models – higher accuracy than WhisperLargev3

https://github.com/moonshine-ai/moonshine
119•petewarden•3h ago•20 comments

Mac mini will be made at a new facility in Houston

https://www.apple.com/newsroom/2026/02/apple-accelerates-us-manufacturing-with-mac-mini-production/
332•haunter•4h ago•341 comments

Mercury 2: The fastest reasoning LLM, powered by diffusion

https://www.inceptionlabs.ai/blog/introducing-mercury-2
54•fittingopposite•3h ago•14 comments

Hacking an old Kindle to display bus arrival times

https://www.mariannefeng.com/portfolio/kindle/
169•mengchengfeng•6h ago•40 comments

Nearby Glasses

https://github.com/yjeanrenaud/yj_nearbyglasses
236•zingerlio•8h ago•89 comments

I pitched a roller coaster to Disneyland at age 10 in 1978

https://wordglyph.xyz/one-piece-at-a-time
390•wordglyph•12h ago•151 comments

Corgi Labs (YC W23) Is Hiring

https://www.ycombinator.com/companies/corgi-labs/jobs/ZiEIf7a-founders-associate
1•leastsquares•46m ago

Pi – a minimal terminal coding harness

https://pi.dev
150•kristianpaul•3h ago•65 comments

Show HN: Emdash – Open-source agentic development environment

https://github.com/generalaction/emdash
111•onecommit•7h ago•46 comments

Hugging Face Skills

https://github.com/huggingface/skills
135•armcat•8h ago•41 comments

Optophone

https://en.wikipedia.org/wiki/Optophone
32•Hooke•4d ago•8 comments

Aesthetics of single threading

https://ta.fo/aesthetics-of-single-threading/
15•todsacerdoti•3d ago•1 comments

How we rebuilt Next.js with AI in one week

https://blog.cloudflare.com/vinext/
369•ghostwriternr•5h ago•125 comments

IRS Tactics Against Meta Open a New Front in the Corporate Tax Fight

https://www.nytimes.com/2026/02/24/business/irs-meta-corporate-taxes.html
185•mitchbob•12h ago•196 comments

We Are Changing Our Developer Productivity Experiment Design

https://metr.org/blog/2026-02-24-uplift-update/
40•ej88•5h ago•27 comments

We installed a single turnstile to feel secure

https://idiallo.com/blog/installed-single-turnstile-for-security-theater
270•firefoxd•2d ago•123 comments

OpenAI, the US government and Persona built an identity surveillance machine

https://vmfunc.re/blog/persona/
441•rzk•7h ago•143 comments

Build Your Own Forth Interpreter

https://codingchallenges.fyi/challenges/challenge-forth/
48•AlexeyBrin•3d ago•15 comments

Cell Service for the Fairly Paranoid

https://www.cape.co/
64•0xWTF•3h ago•43 comments

Justifying Text-Wrap: Pretty

https://matklad.github.io/2026/02/14/justifying-text-wrap-pretty.html
7•surprisetalk•5d ago•0 comments

Steel Bank Common Lisp

https://www.sbcl.org/
152•tosh•7h ago•50 comments

The history of knocking on wood

https://resobscura.substack.com/p/neolithic-habits-machine-age-tools
11•benbreen•10h ago•1 comments

IDF killed Gaza aid workers at point blank range in 2025 massacre: Report

https://www.dropsitenews.com/p/israeli-soldiers-tel-sultan-gaza-red-crescent-civil-defense-massac...
1281•Qem•13h ago•511 comments

Stripe reportedly makes offer to acquire PayPal

https://www.cnbc.com/2026/02/24/paypal-stock-stripe-acquisition-report.html
62•nodesocket•3h ago•38 comments

Ask HN: Programmable Watches with WiFi?

17•dakiol•3d ago•8 comments

US Military leaders meet with Anthropic to argue against Claude safeguards

https://www.theguardian.com/us-news/2026/feb/24/anthropic-claude-military-ai
17•KnuthIsGod•1h ago•0 comments

Amazon Busted for Widespread Scheme to Inflate Prices Across the Economy

https://www.thebignewsletter.com/p/amazon-busted-for-widespread-price
13•toomuchtodo•46m ago•1 comments

The Missing Semester of Your CS Education – Revised for 2026

https://missing.csail.mit.edu/
400•anishathalye•1d ago•117 comments

Dream Recorder AI – a portal to your subconscious

https://dreamrecorder.ai/
15•level87•3h ago•11 comments