
Sapients paper on the concept of Hierarchical Reasoning Model

https://arxiv.org/abs/2506.21734
48•hansmayer•3h ago

Comments

torginus•1h ago
Is it just me, or is symbolic (or, as I like to call it, 'video game') AI seeping back into AI?
taylorius•1h ago
Perhaps so - but represented in a trainable, neural form. Very exciting!
cs702•1h ago
Based on a quick first skim of the abstract and the introduction, the results from hierarchical reasoning (HRM) models look incredible:

> Using only 1,000 input-output examples, without pre-training or CoT supervision, HRM learns to solve problems that are intractable for even the most advanced LLMs. For example, it achieves near-perfect accuracy in complex Sudoku puzzles (Sudoku-Extreme Full) and optimal pathfinding in 30x30 mazes, where state-of-the-art CoT methods completely fail (0% accuracy). In the Abstraction and Reasoning Corpus (ARC) AGI Challenge [27-29] - a benchmark of inductive reasoning - HRM, trained from scratch with only the official dataset (~1000 examples), with only 27M parameters and a 30x30 grid context (900 tokens), achieves a performance of 40.3%, which substantially surpasses leading CoT-based models like o3-mini-high (34.5%) and Claude 3.7 8K context (21.2%), despite their considerably larger parameter sizes and context lengths, as shown in Figure 1.

I'm going to read this carefully, in its entirety.

Thank you for sharing it on HN!

diwank•1h ago
Exactly!

> It uses two interdependent recurrent modules: a *high-level module* for abstract, slow planning and a *low-level module* for rapid, detailed computations. This structure enables HRM to achieve significant computational depth while maintaining training stability and efficiency, even with minimal parameters (27 million) and small datasets (~1,000 examples).

> HRM outperforms state-of-the-art CoT models on challenging benchmarks like Sudoku-Extreme, Maze-Hard, and the Abstraction and Reasoning Corpus (ARC-AGI), where CoT methods fail entirely. For instance, it solves 96% of Sudoku puzzles and achieves 40.3% accuracy on ARC-AGI-2, surpassing larger models like Claude 3.7 and DeepSeek R1.

Erm what? How? Needs a computer and sitting down.
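
The two-timescale loop quoted above (a slow high-level module supervising a fast low-level module) can be sketched very roughly in plain NumPy. All dimensions, weight names, and step counts here are illustrative assumptions, not the paper's actual architecture or sizes:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen for illustration only.
D = 16        # shared hidden width for both modules
T_HIGH = 4    # slow high-level "planning" steps
T_LOW = 8     # fast low-level steps per high-level step

# Random matrices stand in for trained parameters.
W_hh = rng.standard_normal((D, D)) * 0.1  # high-level recurrence
W_ll = rng.standard_normal((D, D)) * 0.1  # low-level recurrence
W_hl = rng.standard_normal((D, D)) * 0.1  # high -> low conditioning
W_lh = rng.standard_normal((D, D)) * 0.1  # low -> high feedback

x = rng.standard_normal(D)  # encoded input
z_high = np.zeros(D)        # slow planning state
z_low = np.zeros(D)         # fast computation state

for _ in range(T_HIGH):
    # Low-level module runs many rapid steps, conditioned on the current plan.
    for _ in range(T_LOW):
        z_low = np.tanh(W_ll @ z_low + W_hl @ z_high + x)
    # High-level module updates once per inner loop, reading the result back.
    z_high = np.tanh(W_hh @ z_high + W_lh @ z_low)

print(z_high.shape)  # (16,)
```

The point of the nesting is that the high-level state only changes T_HIGH times while the low-level state changes T_HIGH * T_LOW times, which is one way to get effective computational depth without a correspondingly deep parameter stack.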

electroglyph•1h ago
but does it scale?
lispitillo•54m ago
I hope/fear this HRM model is going to be merged with MoE very soon. Given the huge economic pressure to develop powerful LLMs, I think this could be done in just a month.

The paper seems to study only problems like Sudoku solving, not question answering or other LLM applications. Furthermore, they omit any section on future applications or on fusing HRM with current LLMs.

I think anyone working in this field can envision the applications, but working out the details of combining MoE with an HRM model could be their next paper.

I only skimmed the paper and I am not an expert; surely others can explain why they don't discuss such a new structure. Anyway, my post is just blissful ignorance of the complexity involved and of the impossible task of predicting change.

Edit: A more general idea: Mixture of Experts relates to clusters of concepts, and now we would have to consider clusters of concepts grouped by the time they take to be grasped. In a sense, the model would hold in latent space an estimate of the depth, number of layers, and time required for each concept, just as we adapt our reading style between a dense math book and a short newspaper story.

buster•33m ago
I must say I am suspicious in this regard, as they show no applications other than a Sudoku solver and don't discuss downsides.

Analoguediehard

http://www.analoguediehard.com/
1•gregsadetsky•3m ago•0 comments

Has the Russian intelligence service penetrated Telegram?

https://www.perplexity.ai/search/is-there-evidence-that-suggest-FMgkZrx3SHONR2v1wSC.zg
3•lamg•8m ago•0 comments

SharePoint Exploit Intelligence with Honeypots

https://defusedcyber.com/sharepoint-exploit-intelligence-with-honeypots
1•waihtis•13m ago•0 comments

How to build the Stasheff Associahedron out of a trefoil knot

https://francisrlb.com/2025/07/27/how-to-build-the-stasheff-associahedron-out-of-a-trefoil-knot/
2•mathgenius•22m ago•0 comments

Show HN: A Modular Phoenix SaaS Starter Kit

https://www.phoenixsaaskit.com/
1•bustylasercanon•25m ago•0 comments

The U.S. Central Intelligence Agency's (CIA) remote viewing experiments

https://pmc.ncbi.nlm.nih.gov/articles/PMC10275521/
1•handfuloflight•26m ago•1 comments

What if sailing had no rules? [video]

https://www.youtube.com/watch?v=kk4AV3d4v3E
1•zeristor•29m ago•0 comments

LLMs are bad at returning code in JSON

https://aider.chat/2024/08/14/code-in-json.html
2•pcwelder•33m ago•1 comments

Conspiracy theorists think their views are mainstream

https://arstechnica.com/civis/threads/conspiracy-theorists-think-their-views-are-mainstream.1508474/page-4
3•Bluestein•35m ago•1 comments

US drops sanctions on Myanmar junta's allies after military chief praises Trump

https://www.abc.net.au/news/2025-07-26/us-drops-sanctions-on-myanmar-junta-allies-after-trump-praise/105576812
3•KnuthIsGod•36m ago•0 comments

Show HN: Mapping supply chain of products (updated)

https://www.beneluxmanufacturing.com/supply-chain-explorer/
2•nodezero•39m ago•0 comments

The Internet Archive just became an official U.S. federal library

https://mashable.com/article/internet-archive
1•taubek•41m ago•1 comments

Astronomer's 'clever' PR move embracing CEO scandal – featuring Gwyneth Paltrow

https://www.bbc.co.uk/news/articles/crlzrjp2e2lo
3•mellosouls•41m ago•0 comments

djbwares version 10

https://jdebp.uk/Softwares/djbwares/
2•JdeBP•43m ago•0 comments

Exploring Windows XP on macOS ARM64

https://milen.me/writings/exploring-windows-xp-on-macos-arm64/
1•dsego•44m ago•0 comments

Careers at the Frontier: Hiring the Future at OpenAI

https://forum.openai.com/public/videos/event-replay-careers-at-the-frontier-hiring-the-future-at-openai
1•hunglee2•51m ago•0 comments

JANET – The UK Joint Academic Network (1988) [pdf]

https://serials.uksg.org/articles/35/files/submission/proof/35-1-35-1-10-20150210.pdf
1•dcminter•55m ago•1 comments

Constrained languages are easier to optimize

https://jyn.dev/constrained-languages-are-easier-to-optimize/
2•PaulHoule•56m ago•0 comments

Using Codex-CLI with ChatGPT Plus/Pro

https://github.com/openai/codex/issues/35
1•tosh•57m ago•0 comments

Draw a fish and watch it swim

https://drawafish.com
1•thunderbong•1h ago•0 comments

Has the Qianfan satellite network – China's Starlink rival – run into trouble?

https://www.scmp.com/news/china/science/article/3319163/has-qianfan-satellite-network-chinas-starlink-rival-run-trouble
2•jnord•1h ago•1 comments

Scala Highlights June 2025 – Scala 3.9 will be the new LTS

https://www.scala-lang.org/highlights/2025/06/26/highlights-june-2025.html#scala-39-will-be-the-new-lts
1•TheWiggles•1h ago•0 comments

Ronald Coase (1960) – The Problem of Social Cost [pdf]

https://www.law.uchicago.edu/sites/default/files/file/coase-problem.pdf
2•mobileturdfctry•1h ago•1 comments

Apocalyptica

https://medium.com/luminasticity/apocalyptica-3e8e58f84891
1•bryanrasmussen•1h ago•0 comments

Four ways of declaring interfaces in Haskell

http://marcosh.github.io/post/2025/07/22/four-ways-of-declaring-interfaces-in-haskell.html
2•yehoshuapw•1h ago•0 comments

Who Likes Authoritarianism?

https://www.pewresearch.org/short-reads/2024/02/28/who-likes-authoritarianism-and-how-do-they-want-to-change-their-government/
3•GreenSalem•1h ago•4 comments

Libu8ident: Unicode security guidelines for programming language identifiers

https://github.com/rurban/libu8ident
1•fanf2•1h ago•0 comments

Emad Mostaque: The Plan to Save Humanity from AI [video]

https://www.youtube.com/watch?v=fxmXYfHTCwU
1•hunglee2•1h ago•0 comments

Thank You for Finding Me

https://longreads.com/2025/07/24/thank-you-for-finding-me/
3•homarp•1h ago•1 comments

Making Cloudflare Pages Faster with Cloudflare's CDN

https://userbird.com/blog/cache-everything-cloudflare-cdn
1•chrism2671•1h ago•1 comments