Show HN: Lightless Labs Refinery – multi-model consensus and synthesis

https://github.com/Lightless-Labs/refinery
2•ElFitz•1d ago
Hi!

In the past few weeks I (mostly Claude) cobbled together a Rust library + CLI to run the same prompt across multiple models, through multiple rounds of iterative consensus.

Each model is fed the same initial prompt and produces an answer, then every model independently reviews and scores each of the other models' answers. The original prompt, the previous answer, and the reviews are then fed back to the models for the next round, until either one model "wins" two rounds in a row or a round limit is reached.
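To make the loop concrete, here is a minimal sketch of it in Rust. It is not Refinery's actual API: the `Model` trait, the `Outcome` struct, and `run_consensus` are hypothetical names, the "winner" is assumed to be the answer with the highest summed peer score (ties go to the lowest index), and the threshold part of the selection strategy is left out.

```rust
/// Hypothetical stand-in for a model backend (not Refinery's real trait).
pub trait Model {
    /// Produce an answer from the prompt, this model's previous answer, and the reviews it received.
    fn answer(&self, prompt: &str, previous: Option<&str>, feedback: &[String]) -> String;
    /// Score another model's answer; higher is better.
    fn score(&self, prompt: &str, answer: &str) -> f64;
}

#[derive(Debug, Clone, PartialEq)]
pub struct Outcome {
    pub winner: usize,   // index of the winning model in the last round played
    pub answer: String,  // that model's answer
    pub rounds: usize,   // number of rounds actually run
    pub converged: bool, // true if the same model won two rounds in a row
}

pub fn run_consensus(models: &[&dyn Model], prompt: &str, max_rounds: usize) -> Option<Outcome> {
    if models.is_empty() {
        return None;
    }
    let n = models.len();
    let mut previous: Vec<Option<String>> = vec![None; n];
    let mut feedback: Vec<Vec<String>> = vec![Vec::new(); n];
    let mut last_winner: Option<usize> = None;
    let mut outcome: Option<Outcome> = None;

    for round in 1..=max_rounds {
        // Every model answers, seeing its own previous answer and reviews.
        let answers: Vec<String> = models
            .iter()
            .enumerate()
            .map(|(i, m)| m.answer(prompt, previous[i].as_deref(), &feedback[i]))
            .collect();

        // Every model scores every *other* model's answer.
        let mut totals = vec![0.0_f64; n];
        let mut next_feedback: Vec<Vec<String>> = vec![Vec::new(); n];
        for (r, reviewer) in models.iter().enumerate() {
            for (a, answer) in answers.iter().enumerate() {
                if r == a {
                    continue;
                }
                let s = reviewer.score(prompt, answer);
                totals[a] += s;
                next_feedback[a].push(format!("model {r} scored this {s:.1}"));
            }
        }

        // Assumed rule: highest total wins, ties go to the lowest index.
        let mut winner = 0;
        for i in 1..n {
            if totals[i] > totals[winner] {
                winner = i;
            }
        }

        let converged = last_winner == Some(winner);
        outcome = Some(Outcome { winner, answer: answers[winner].clone(), rounds: round, converged });
        if converged {
            break; // same model won two rounds in a row
        }
        last_winner = Some(winner);
        previous = answers.into_iter().map(Some).collect();
        feedback = next_feedback;
    }
    outcome
}
```

If the round limit is hit first, the returned `Outcome` has `converged == false` and simply carries the last round's top-scoring answer.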

It did quite well on the car wash test (https://github.com/Lightless-Labs/refinery?tab=readme-ov-fil...). Most models answer badly initially, but it just takes one for all of them to quickly converge towards better answers. Although, to my initial surprise, adding more models quickly breaks the current voting+threshold selection strategy.

I also recently added a synthesis mode, which does the same thing but with an additional synthesis round at the end where each model produces a synthesis of all the answers that scored above the threshold in the last round, followed by one last review round.
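As a rough illustration of the selection step before synthesis, the sketch below keeps only the answers at or above a threshold and assembles a synthesis prompt from them. The function names, the `(answer, score)` tuple shape, and the prompt wording are all made up for this example and are not taken from Refinery.

```rust
/// Keep the answers whose last-round score is at or above the threshold.
/// `scored` pairs each answer with its score; this shape is an assumption.
pub fn above_threshold<'a>(scored: &'a [(String, f64)], threshold: f64) -> Vec<&'a str> {
    scored
        .iter()
        .filter(|(_, s)| *s >= threshold)
        .map(|(a, _)| a.as_str())
        .collect()
}

/// Build a prompt asking a model to synthesize the kept answers.
/// The wording is hypothetical.
pub fn synthesis_prompt(original: &str, kept: &[&str]) -> String {
    let mut out = format!("Original prompt:\n{original}\n\nCandidate answers:\n");
    for (i, a) in kept.iter().enumerate() {
        out.push_str(&format!("{}. {}\n", i + 1, a));
    }
    out.push_str("\nWrite one answer that combines the strongest points above.");
    out
}
```

Each model would receive this prompt once, and the resulting syntheses would then go through one more review round.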

The total number of calls quickly blows up with rounds and model count, but it's been fun!
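A back-of-the-envelope count, assuming one call per answer and one call per (reviewer, answer) pair, which may not match how Refinery actually batches requests: with N models, a round costs N answers plus N × (N − 1) reviews, so N² calls, and synthesis mode adds one more round of the same shape. The function names below are hypothetical.

```rust
/// Calls in one round: every model answers once, then reviews each other model's answer.
pub fn calls_per_round(models: u64) -> u64 {
    models + models * models.saturating_sub(1)
}

/// Total calls over `rounds` rounds, plus one extra synthesis-and-review round if enabled.
pub fn total_calls(models: u64, rounds: u64, synthesis: bool) -> u64 {
    let extra = if synthesis { 1 } else { 0 };
    (rounds + extra) * calls_per_round(models)
}
```

Under these assumptions, 3 models over 4 rounds is 36 calls, and 5 models over 4 rounds with synthesis is 125. This counts calls only; token cost grows faster, since each round's prompt also carries the previous answer and its reviews.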

Currently, I'm racking my brain trying to figure out a way to select for both diversity and quality, for a "brainstorm" process. If you have any ideas either on that or other features, let me know!

Comments

ad-tech•1d ago
The voting thing breaks because you're treating all models equally when they shouldn't be. We ran consensus logic like this on a smaller scale and quickly realized throwing 5 mediocre models at a problem just makes them argue in circles. One good model beats three bad ones always. The synthesis round will get expensive fast too - we started with 2 models doing 3 rounds and it was already costing 40x a single pass. For brainstorm mode maybe weight models by past accuracy instead of pure voting? We do this with our team internally - the person who got it right last time gets listened to more next time, not equal voice to everyone. Could be interesting to test.
ElFitz•1d ago
Why would the synthesis round get more expensive than the regular rounds?

> and quickly realized throwing 5 mediocre models at a problem just makes them argue in circles.

What was your selection strategy? My current issue is more that the more models I add, the less likely any specific one is to win two rounds in a row. Which would make perfect sense no matter the model quality, no? Unless there’s a huge gap.
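A quick way to see why: under the deliberately crude assumption that all models are equally good, so each round's winner is an independent uniform draw among N models, any given round repeats the previous winner with probability 1/N. The chance of stopping on a repeat within R rounds is then 1 − (1 − 1/N)^(R − 1). The function below (a hypothetical name) is just that formula.

```rust
/// Probability that some model wins two rounds in a row within `max_rounds`,
/// assuming each round's winner is an independent uniform draw among `models`.
pub fn p_converge_within(models: u32, max_rounds: u32) -> f64 {
    if models == 0 || max_rounds < 2 {
        return 0.0; // a repeat needs at least two rounds
    }
    let p_repeat = 1.0 / models as f64;
    1.0 - (1.0 - p_repeat).powi(max_rounds as i32 - 1)
}
```

With a 5-round limit this gives about 0.94 for 2 models but only about 0.59 for 5, so convergence gets rarer as models are added even when none of them is bad. A clear quality gap breaks the uniform assumption and pushes the number back up.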

> For brainstorm mode maybe weight models by past accuracy instead of pure voting?

By adding output history and a way to track the actual outcomes?
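One possible shape for that, as a sketch only: keep a per-model track record and use a smoothed accuracy as the weight of that model's review scores. Everything here is hypothetical (the `TrackRecord` type, the Laplace smoothing, the weighted mean), and it presumes some external signal saying whether a model turned out to be right.

```rust
/// Per-model history of how often it turned out to be right.
#[derive(Debug, Default, Clone, Copy)]
pub struct TrackRecord {
    pub correct: u32,
    pub total: u32,
}

impl TrackRecord {
    /// Record one resolved outcome for this model.
    pub fn record(&mut self, was_right: bool) {
        self.total += 1;
        if was_right {
            self.correct += 1;
        }
    }

    /// Laplace-smoothed accuracy, so a model with no history starts at 0.5
    /// and no model's weight ever reaches exactly 0 or 1.
    pub fn weight(&self) -> f64 {
        (self.correct as f64 + 1.0) / (self.total as f64 + 2.0)
    }
}

/// Weighted mean of review scores for one answer.
/// `scores` holds (reviewer index, score) pairs; returns None if it is empty
/// or refers to a reviewer with no entry in `records`.
pub fn weighted_score(scores: &[(usize, f64)], records: &[TrackRecord]) -> Option<f64> {
    if scores.is_empty() {
        return None;
    }
    let mut weighted_sum = 0.0;
    let mut weight_total = 0.0;
    for &(reviewer, score) in scores {
        let w = records.get(reviewer)?.weight();
        weighted_sum += w * score;
        weight_total += w;
    }
    Some(weighted_sum / weight_total)
}
```

A reviewer that was right 3 times out of 3 gets weight 0.8 and one that was wrong 3 times out of 3 gets 0.2, so their scores of 10 and 0 average to 8 instead of 5.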
