AI agents that argue with each other to improve decisions

https://github.com/rockcat/HATS
18•rockcat12•3h ago

Comments

oldsecondhand•1h ago
Sounds like a less efficient version of the mixture of experts approach.
gavmor•1h ago
How does mixture of experts architecture work? Are they debating, or merely delegating?

From what I've read, for each token or input patch, the gate computes a set of probabilities (or scores) over the experts, then selects a small subset (often the top-k) and routes that input only to those.

I.e., each expert computes its own transformation on the same original input (or a shared intermediate representation), and then their outputs are combined at the next layer via the gate’s weights.

That’s post hoc combination, not B reasoning over A’s reasoning.
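
To make the routing described above concrete, here is a minimal sketch in plain Python. It is a toy, not any particular model's implementation: the experts are hypothetical scalar functions, and the gate scores are passed in directly rather than computed from the input by a learned gate.

```python
import math

def softmax(scores):
    """Turn raw gate scores into probabilities."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def moe_layer(x, gate_scores, experts, k=2):
    """Route x to the top-k experts and mix their outputs by gate weight."""
    probs = softmax(gate_scores)
    # Pick the k experts with the highest gate probability.
    top = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    # Every selected expert sees the same input x; none sees another's output.
    mixed = sum(probs[i] / norm * experts[i](x) for i in top)
    return mixed, top

# Hypothetical toy experts (assumptions for illustration only).
experts = [lambda x: x + 1.0, lambda x: 2.0 * x, lambda x: -x]
output, chosen = moe_layer(3.0, gate_scores=[0.0, 2.0, 2.0], experts=experts, k=2)
```

The experts never exchange messages here: the only interaction is the weighted sum at the end, which is the "post hoc combination" point.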

AntiUSAbah•20m ago
A MoE model is one model with expert parts which use fewer tokens. That makes it easier for an expert to diverge to a better optimum state. It's easier to only need to know medicine instead of everything, and to be able to separate everything else away from medicine even if certain names, concepts, etc. are the same.

AI agents discussing things with each other would be more like one thinking model thinking through the problem with different personas.

With different underlying models, you can leverage the best model for each persona. Like people said before (6 months ago, no clue if this is still valid) that they prefer GPT for planning and Claude for executing / coding.

zby•1h ago
I don't know. It looks like an interesting idea, but I am struggling to put this politely: when I go into the repo and find that it does stuff like lip syncing of talking avatars, I start to wonder what percentage of the development effort goes into marketing.
gertlabs•1h ago
Self-organizing systems are an area of research to which I think LLMs will contribute immensely.

But as of now, even newer AI models are not particularly insightful. I'm always surprised by how suboptimal near-frontier LLMs are at collaborating in some of the easier cooperative environments on my benchmarking and RL platform. For example, check out a replay of consensus grid here: https://gertlabs.com/spectate

AntiUSAbah•1h ago
While interesting, it's not clear to me from just looking at consensus grid how they are prompted.

Do you tell them to think and coordinate the next step through some type of sync/talking mechanism or is it turn by turn?

I suspect turn by turn, as it is similar to other experiments, and in that case it wouldn't work because they wouldn't have a certain amount of time to think about the next step together?

gertlabs•1h ago
All of our environments are tick based (with ticks of varying speeds), and this is explained in the prompt given to the models, along with the latest observation and a history of recent events/conversations/actions.

So that does make the game more challenging, versus some other simulations we have where multiple conversation turns happen before action. But the inefficiencies I'm describing are different; for example, an agent reaches part of the destination area but is clearly blocking another player who needs to pass, and most models will just stay put instead of moving along to another target spot.
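
As a rough illustration of the tick-based setup described above, the sketch below assembles a per-tick prompt from fixed rules, the latest observation, and a bounded history of recent events. All names here (`TickPromptBuilder`, the prompt layout, the history size) are assumptions for illustration; the actual prompt format used on the platform is not public.

```python
from collections import deque

class TickPromptBuilder:
    """Builds one prompt per tick from rules, observation, and recent history."""

    def __init__(self, rules, history_size=5):
        self.rules = rules
        # A deque with maxlen silently drops the oldest event when full.
        self.history = deque(maxlen=history_size)

    def record(self, event):
        self.history.append(event)

    def build(self, tick, observation):
        lines = [
            self.rules,
            f"Tick: {tick}",
            f"Observation: {observation}",
            "Recent events:",
        ]
        lines += [f"- {event}" for event in self.history] or ["- (none)"]
        return "\n".join(lines)

builder = TickPromptBuilder("The game advances in ticks; act once per tick.", history_size=2)
for event in ["A moved north", "B said: let me pass", "A stayed put"]:
    builder.record(event)
prompt = builder.build(tick=3, observation="A blocks the corridor")
```

With a history size of 2, only the two most recent events reach the model, so how much coordination context an agent gets depends directly on that window.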

AntiUSAbah•19m ago
So is "Game Overview" the prompt? Because I can't seem to see any indication / hint given to the models that it's a game they should work together on and communicate in, etc.
gertlabs•7m ago
No, the full prompt is not available in the UI, sorry.
ChadMoran•1h ago
I've been doing this with Claude Code and agent teams.

I have a /red-team skill that will use an agent team to criticize its own work, grade and rank feedback, incorporate relevant feedback and then start over. It has increased the quality of output.
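
The loop described above (criticize, grade and rank, incorporate, start over) could be sketched roughly as follows. This is not the actual /red-team skill: the critics and the `revise` step are hypothetical plain-Python callables standing in for agents, and the scoring threshold is an assumption.

```python
def red_team(draft, critics, revise, rounds=2, min_score=0.5, keep=3):
    """Run critique rounds until no feedback clears the relevance threshold."""
    for _ in range(rounds):
        # Each critic returns a list of (score, feedback_text) pairs.
        feedback = [item for critic in critics for item in critic(draft)]
        ranked = sorted(feedback, key=lambda item: item[0], reverse=True)
        relevant = [text for score, text in ranked if score >= min_score][:keep]
        if not relevant:
            break  # nothing worth incorporating, stop early
        draft = revise(draft, relevant)
    return draft

# Toy stand-ins for agents (assumptions for illustration only).
def needs_tests(draft):
    return [] if "tests" in draft else [(0.9, "add tests")]

def nitpick(draft):
    return [(0.1, "rename variables")]

def apply_feedback(draft, notes):
    return draft + " + " + ", ".join(notes)

result = red_team("patch", [needs_tests, nitpick], apply_feedback)
```

Ranking before filtering is what keeps low-value nitpicks from driving endless revision rounds.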

submeta•21m ago
I had good results combining Claude Code with Codex and letting them have back-and-forth sessions. Their prompts were orders of magnitude better than mine, as were their evaluation and criticism of the other LLM.

What I haven’t taken the time for is finding out how I’d automate their back-and-forth and stop manually copy/pasting their responses.
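
One possible shape for automating that back-and-forth is a small relay loop that passes each reply to the other side until the reviewer signs off. This is only a sketch under assumptions: `author` and `reviewer` are hypothetical callables that would wrap whatever tools are in use, and the "LGTM" stop marker is an invented convention, not a feature of either product.

```python
def relay(task, author, reviewer, max_turns=6, done_marker="LGTM"):
    """Alternate between two agents, feeding each one the other's last reply."""
    transcript = []
    message = task
    for turn in range(max_turns):
        name, speaker = ("author", author) if turn % 2 == 0 else ("reviewer", reviewer)
        message = speaker(message)
        transcript.append((name, message))
        # Stop as soon as the reviewer approves.
        if name == "reviewer" and done_marker in message:
            break
    return transcript

# Toy stand-ins for the two agents (assumptions for illustration only).
def toy_author(message):
    return "draft with error handling" if "error handling" in message else "first draft"

def toy_reviewer(message):
    return "LGTM" if "error handling" in message else "please add error handling"

log = relay("write a parser", toy_author, toy_reviewer)
```

The `max_turns` cap matters in practice, since two models can otherwise keep trading revisions without ever converging.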
