
A 27M-param model that solves hard Sudoku/mazes where LLMs fail, without CoT

https://github.com/sapientinc/HRM
6•mingli_yuan•7h ago

Comments

mingli_yuan•7h ago
Hi HN,

We've seen LLMs struggle with complex, multi-step reasoning tasks. The common approach, Chain-of-Thought (CoT), often requires massive datasets, is brittle, and suffers from high latency.

To tackle this, we developed the Hierarchical Reasoning Model (HRM), a novel recurrent architecture inspired by how the human brain processes information across different timescales.

It's a small model that packs a huge punch. Here are the key highlights:

Extremely Lightweight: Only 27 million parameters.

Data Efficient: Trained with just 1000 samples for the complex tasks shown.

No Pre-training Needed: It is trained from scratch, with no large-scale pre-training corpus and no CoT supervision data.

Single Forward Pass: It solves the entire reasoning task in one go, making it incredibly fast and efficient.

How It Works

HRM consists of two interconnected recurrent modules that mimic brain-wave coupling:

High-level Module: Operates slowly, like the brain's Theta waves (θ, 4-8Hz), to handle abstract planning and goal setting.

Low-level Module: Operates quickly, like Gamma waves (γ, ~40Hz), to execute the fine-grained computational steps.

These two modules work together, allowing the model to achieve significant computational depth while remaining stable and efficient to train.
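To make the two-timescale coupling concrete, here is a minimal NumPy sketch, not the authors' implementation: the low-level state updates every tick, while the high-level state updates only once per block of low-level ticks, conditioned on the low-level result. All dimensions, weights, and update rules are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 16          # hidden size (assumed)
T = 8           # low-level ticks per high-level update (assumed)
N_CYCLES = 4    # number of high-level cycles (assumed)

# Random fixed matrices stand in for learned parameters.
W_ll = rng.normal(0, 0.1, (D, D))   # low-level recurrence ("gamma"-like)
W_lh = rng.normal(0, 0.1, (D, D))   # high-level -> low-level conditioning
W_hh = rng.normal(0, 0.1, (D, D))   # high-level recurrence ("theta"-like)
W_hl = rng.normal(0, 0.1, (D, D))   # low-level -> high-level feedback
x = rng.normal(0, 1.0, D)           # fixed input encoding

z_H = np.zeros(D)  # slow, abstract planning state
z_L = np.zeros(D)  # fast, fine-grained computation state

low_updates = high_updates = 0
for _ in range(N_CYCLES):
    for _ in range(T):                        # fast inner loop
        z_L = np.tanh(W_ll @ z_L + W_lh @ z_H + x)
        low_updates += 1
    z_H = np.tanh(W_hh @ z_H + W_hl @ z_L)    # slow outer update
    high_updates += 1

print(low_updates, high_updates)  # 32 4
```

The nesting is the point: the effective computational depth is N_CYCLES * T low-level steps, but gradients and planning state only flow through N_CYCLES high-level updates, which is one way such a design can stay stable to train.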

Astonishing Performance

On tasks requiring complex, precise reasoning, HRM dramatically outperforms much larger models:

Extreme Sudoku (9x9): HRM achieves 55.0% accuracy. Other models, including direct prediction and larger LLMs like Claude 3.7 8K, score 0.0%.

Hard Maze (30x30): HRM finds the optimal path 74.5% of the time. Again, others score 0.0%.

ARC-AGI Benchmark: On the Abstraction and Reasoning Corpus (ARC), a key test for AGI capabilities, HRM significantly outperforms larger models with much longer context windows.

We believe HRM represents a transformative step towards more general and efficient reasoning systems. It shows that a carefully designed architecture can sometimes beat brute-force scale.

We'd love to hear your thoughts on this approach! What other applications could you see for a model like this?

Paper: https://arxiv.org/abs/2506.21734
Code: https://github.com/sapientinc/HRM