
A 27M-param model that solves hard Sudoku/mazes where LLMs fail, without CoT

https://github.com/sapientinc/HRM
6•mingli_yuan•7h ago

Comments

mingli_yuan•7h ago
Hi HN,

We've seen LLMs struggle with complex, multi-step reasoning tasks. The common approach, Chain-of-Thought (CoT), often requires massive datasets, is brittle, and suffers from high latency.

To tackle this, we developed the Hierarchical Reasoning Model (HRM), a novel recurrent architecture inspired by how the human brain processes information across different timescales.

It's a small model that packs a huge punch. Here are the key highlights:

Extremely Lightweight: Only 27 million parameters.

Data Efficient: Trained with just 1000 samples for the complex tasks shown.

No Pre-training Needed: It is trained from scratch, with no large-scale pre-training corpus and no CoT supervision data.

Single Forward Pass: It solves the entire reasoning task in one go, making it incredibly fast and efficient.

How It Works

HRM consists of two interconnected recurrent modules that mimic brain-wave coupling:

High-level Module: Operates slowly, like the brain's Theta waves (θ, 4-8Hz), to handle abstract planning and goal setting.

Low-level Module: Operates quickly, like Gamma waves (γ, ~40Hz), to execute the fine-grained computational steps.

These two modules work together, allowing the model to achieve significant computational depth while remaining stable and efficient to train.
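To make the two-timescale coupling concrete, here is a minimal NumPy sketch, not the authors' implementation: the low-level state updates every tick, while the high-level state updates only once per block of low-level ticks, conditioned on the low-level result. All dimensions, weights, and update rules are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 16          # hidden size (assumed)
T = 8           # low-level ticks per high-level update (assumed)
N_CYCLES = 4    # number of high-level cycles (assumed)

# Random fixed matrices stand in for learned parameters.
W_ll = rng.normal(0, 0.1, (D, D))   # low-level recurrence ("gamma"-like)
W_lh = rng.normal(0, 0.1, (D, D))   # high-level -> low-level conditioning
W_hh = rng.normal(0, 0.1, (D, D))   # high-level recurrence ("theta"-like)
W_hl = rng.normal(0, 0.1, (D, D))   # low-level -> high-level feedback
x = rng.normal(0, 1.0, D)           # fixed input encoding

z_H = np.zeros(D)  # slow, abstract planning state
z_L = np.zeros(D)  # fast, fine-grained computation state

low_updates = high_updates = 0
for _ in range(N_CYCLES):
    for _ in range(T):                        # fast inner loop
        z_L = np.tanh(W_ll @ z_L + W_lh @ z_H + x)
        low_updates += 1
    z_H = np.tanh(W_hh @ z_H + W_hl @ z_L)    # slow outer update
    high_updates += 1

print(low_updates, high_updates)  # 32 4
```

The nesting is the point: the effective computational depth is N_CYCLES * T low-level steps, but gradients and planning state only flow through N_CYCLES high-level updates, which is one way such a design can stay stable to train.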

Astonishing Performance

On tasks requiring complex, precise reasoning, HRM dramatically outperforms much larger models:

Extreme Sudoku (9x9): HRM achieves 55.0% accuracy. Other models, including direct prediction and larger LLMs like Claude 3.7 8K, score 0.0%.

Hard Maze (30x30): HRM finds the optimal path 74.5% of the time. Again, others score 0.0%.

ARC-AGI Benchmark: On the Abstraction and Reasoning Corpus (ARC), a key test for AGI capabilities, HRM significantly outperforms larger models with much longer context windows.

We believe HRM represents a transformative step towards more general and efficient reasoning systems. It shows that a carefully designed architecture can sometimes beat brute-force scale.

We'd love to hear your thoughts on this approach! What other applications could you see for a model like this?

Paper: https://arxiv.org/abs/2506.21734
Code: https://github.com/sapientinc/HRM