Openrouter Fusion API

https://openrouter.ai/openrouter/fusion
48•tdchaitanya•4h ago

Comments

andai•1h ago
Context:

Surpassing Frontier Performance with Fusion

https://news.ycombinator.com/item?id=48525392

And a slightly better UI here: https://openrouter.ai/fusion

On OpenRouter's Fusion API, your request is routed to several models simultaneously and a judge model combines their answers into a final response. This significantly boosts performance (at least on the one benchmark they tested, a deep research benchmark), at the cost of time.
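
The fan-out-and-judge flow described above can be sketched roughly as follows. This is only an illustration of the general pattern, not OpenRouter's implementation: `fuse`, the `Model` type, and the stub models are hypothetical names, and the stubs stand in for real API calls.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Sequence

# Hypothetical type: a "model" is any function from prompt text to answer text.
Model = Callable[[str], str]


def fuse(prompt: str, panel: Sequence[Model], judge: Model) -> str:
    """Send the prompt to every panel model in parallel, then let a judge merge the drafts."""
    if not panel:
        raise ValueError("panel must contain at least one model")
    with ThreadPoolExecutor(max_workers=len(panel)) as pool:
        # pool.map keeps the drafts in the same order as the panel.
        drafts = list(pool.map(lambda model: model(prompt), panel))
    numbered = "\n\n".join(f"Draft {i + 1}:\n{d}" for i, d in enumerate(drafts))
    judge_prompt = (
        f"Question:\n{prompt}\n\n{numbered}\n\n"
        "Combine the drafts into one final answer."
    )
    return judge(judge_prompt)


# Stub models so the sketch runs without network access (assumptions, not real models).
def model_a(prompt: str) -> str:
    return "A says: use a heap"


def model_b(prompt: str) -> str:
    return "B says: sort first"


def stub_judge(judge_prompt: str) -> str:
    # A real judge would be an LLM call; this stub only reports how many drafts it saw.
    return f"merged {judge_prompt.count('Draft ')} drafts"


if __name__ == "__main__":
    print(fuse("How do I find the top k items?", [model_a, model_b], stub_judge))
```

Passing the same callable twice in `panel` corresponds to fusing a model with itself.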

They have a Budget preset consisting of 3 cheaper models (which roughly matches Fable on that benchmark, costing half as much), and a Quality preset of 3 expensive ones (which beats Fable, but costs twice as much as Fable).

Pareto graph: https://openrouter.ai/blog/images/blog/fusion-benchmark-cost...

Curiously, fusing a model with itself also boosted performance (2xOpus4.8 roughly matching Fable on the benchmark, but costing twice as much as Fable). There's a further, smaller gain from mixing different models. The main gain seems to be from additional test time compute.

Would love to see more research on this, especially focusing on the cheap models that came out recently (e.g. Fusing DSV4 with itself, or with Mimo), and to see what the tradeoffs look like between running a fusion (parallel test time compute) vs increased reasoning or turns.

sigmoid10•1h ago
Interesting how well a panel of Fable 5 + GPT 5.5 beats the frontier of either one, but if you add Gemini into the mix the panel of three performs worse, not better. To me that sounds like Gemini is worse at the given tasks but better at convincing judges of its solutions. Oh and a Panel of 2 Opus 4.8 models is almost exactly as good as one Fable 5. That smells suspicious. Do we know if that might simply be what Anthropic is doing behind the curtain?
qsort•55m ago
Yeah, GPT 5.5 + Fable beating either individually is believable, but 2x Opus > Fable is what makes me a bit dubious about the whole thing. They might be measuring skills that are too specific or benefit a lot from more tokens being thrown at them. Also Claude Code (the harness) is not the best at the moment, that might be part of it as well?
waysa•44m ago
> Oh and a Panel of 2 Opus 4.8 models is almost exactly as good as one Fable 5. That smells suspicious. Do we know if that might simply be what Anthropic is doing behind the curtain?

I wouldn't be surprised if Fable/Mythos is a model distilled from a Panel/Council of Claude instances. Recursive self improvement is something all AI labs must be working on in some way or another.

jorvi•1h ago
I don't know if it is still the case with current models, but a few generations back Microsoft had some research results where asking a model to iterate N times would significantly improve performance, with the optimal point being 4 iterations.
Garlef•
Havoc•52m ago
Interesting. Will definitely use this.

One scenario where I can see it working is writing markdown specs before the coding starts and analysing them for gaps. That's so few tokens that throwing as much LLM at it as possible is worthwhile regardless of cost per million tokens.

egeres•48m ago
I wonder if these fusion techniques could help to run better local AI by streaming tokens from multiple machines and combining them
michaelbuckbee•38m ago
I ran a quick eval to see what this looks like qualitatively vs just calling Opus 4.7 or GPT 5.5 directly.

As expected, Fusion was 7x slower and 4x the cost.

This isn't a knock against it, just that I think this places Fusion into a "use it only when you need it" category.

https://3fpi5avcqq.evvl.io/

IanCal•22m ago
Which models were you using under this? If you used the quality default as exists in the interface, it makes sense that it was ~4x the cost as it'd be 3 frontier models judged by one of those.

The idea would be to use fusion with simpler, cheaper models.
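
The "~4x" figure follows from simple arithmetic if you assume each call costs about the same. A back-of-the-envelope sketch, where `fusion_cost_multiple` is a hypothetical helper and the equal-cost assumption ignores that the judge also has to read every draft as input:

```python
def fusion_cost_multiple(panel_costs, judge_cost, baseline_cost):
    """Cost of one fused request relative to one direct call (all in the same units)."""
    if baseline_cost <= 0:
        raise ValueError("baseline_cost must be positive")
    return (sum(panel_costs) + judge_cost) / baseline_cost


if __name__ == "__main__":
    # Three panel calls plus one judge call, each assumed to cost one "unit".
    print(fusion_cost_multiple([1.0, 1.0, 1.0], judge_cost=1.0, baseline_cost=1.0))  # 4.0
```

With cheaper panel models the same formula gives a smaller multiple, which is the trade-off being suggested here.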

eknkc•23m ago
I opened the page and prompted it `Which 3d printer is the best`. I mean this is a stupid question but I was looking at some 3d printers so it popped into my mind.

Seeing this log is interesting: https://link.ekin.dev/6RzYGGX7

It came up with a decent response but I guess Opus or GPT 5.5 would do fine anyway. Gotta try it on different stuff. But this feels like it would work great in some situations.

bushido•17m ago
Interestingly I've had a similar experience with agent teams/swarms, albeit they can get much more expensive depending on the workflow.

I found that Fable didn't have as much of an impact when put in a team.

But it was/is a very pleasant model to work with 1:1. And was the first time I didn't use my primary team based workhorse in months, across 10s of sessions last week.

dsl•6m ago
Heh. I built "Fusion" a few months ago as an MCP using OpenRouter. The idea was to give Claude a "panel of experts" to go talk to when it got stuck.

After extensive testing and benchmarking I discovered that when you ask one model to judge another's response you don't actually get a better answer. You are just asking it "how closely does this resemble the answer you would have given me." Additional rounds and all the "obvious" solutions that pop into your mind reading the preceding sentence are essentially just cranking up the temperature.

I did find a solution, but it is insanely expensive. Maybe if this gains traction I'll release mine.

arizen•6m ago
Some anecdata on Fusion: I ran the same query I used for Fable on OR Fusion and the results were worse.

It felt like Fable was able to grasp very deep layers of knowledge/intelligence and outline a solution not just in an agreeable way: it proposed prioritizing the solution items and discarding some of them, which made a lot of sense to me.

Fusion, by contrast, felt more like a slightly more diversified answer from the same class of pre-Fable SOTA models, without reaching the depth of knowledge/intelligence that Fable could, at least in the very limited tests I did while Fable was accessible.

Garlef•52m ago
> but a few generations back

Out of interest: Was this still before CoT/thinking-mode became the norm?

wongarsu•7m ago
> Curiously, fusing a model with itself also boosted performance

Back in the GPT2 to GPT3 era this was a pretty common thing to do. You are effectively taking more samples from the space of likely outputs. If your model can do the task 60% of the time, just take 5-10 samples and implement some kind of majority voting.

It became less common as models got high accuracy on problems where combining results is trivial. But with a more complex judge (a competent LLM) you can still get better results by just sampling more of the output space and picking out the best aspects.
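
A minimal sketch of the majority-voting idea, with hypothetical helper names. The probability function assumes independent samples and requires a strict majority of them to be correct, which is pessimistic because it treats all wrong answers as agreeing with each other:

```python
from collections import Counter
from math import comb


def majority_vote(answers):
    """Return the most common answer; ties go to the answer seen first."""
    return Counter(answers).most_common(1)[0][0]


def majority_correct_probability(p: float, n: int) -> float:
    """Probability that more than half of n independent samples are correct.

    p is the per-sample success rate; n should be odd so there are no ties.
    """
    need = n // 2 + 1  # smallest number of correct samples that forms a majority
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(need, n + 1))


if __name__ == "__main__":
    print(majority_vote(["42", "41", "42"]))                # 42
    print(round(majority_correct_probability(0.6, 5), 5))   # 0.68256
```

Under these assumptions, five samples from a 60%-accurate model give a correct majority about 68% of the time, and the figure keeps rising as more samples are added.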
