
Hierarchical Modeling (H-Nets)

https://cartesia.ai/blog/hierarchical-modeling
93•lukebechtel•6mo ago

Comments

lukebechtel•6mo ago
> H-Net demonstrates three important results on language modeling:

> 1. H-Nets scale better with data than state-of-the-art Transformers with BPE tokenization, while learning directly from raw bytes. This improved scaling is even more pronounced on domains without natural tokenization boundaries, like Chinese, code, and DNA.

> 2. H-Nets can be stacked together to learn from deeper hierarchies, which further improves performance.

> 3. H-Nets are significantly more robust to small perturbations in input data like casing, showing an avenue for creating models that are more robust and aligned with human reasoning.

lukebechtel•6mo ago
https://arxiv.org/pdf/2507.07955

paper

modeless•6mo ago
I don't know if this is the one but something like this is clearly the future IMO. We need more levels of hierarchy to efficiently generalize to longer sequences with high level structure. Back when Byte Latent Transformers came out I thought extending the idea to more levels of hierarchy was the way to go, and this seems to be basically that?

Another article about H-Nets: https://main-horse.github.io/posts/hnet-inf/

blurbleblurble•6mo ago
Yes... This seems like a generalization of "large concept models" in a certain way
cs702•6mo ago
I've only skimmed the paper, but it looks interesting and credible, so I've added it to my reading list.

Thank you for sharing on HN!

---

EDIT: The hierarchical composition and routing aspects of this work vaguely remind me of https://github.com/glassroom/heinsen_routing/ but it has been a while since I played with that. UPDATE: After spending a bit more time on the OP, it's different, but the ideas are related, like routing based on similarity.

lukebechtel•6mo ago
No problem! I'm still parsing it myself, but it seems promising in theory, and the result curves are impressive.
gdiamos•6mo ago
How does it handle images?
marviel•6mo ago
It mentions native multimodality somewhere in either the arXiv paper or the post, so it seems like it might handle it well?
miven•6mo ago
As far as I understand the "chunking" of input bytes is learned completely end to end, so it's basically up to the model to figure out how to most efficiently delineate and aggregate the information from the inputs according to the patterns provided to it during training.

Since it's end to end this allows them to apply this process not only to raw byte encodings but basically representations of any level, such as stacking two stages of aggregation one after another.

So in principle they could either let the model do its thing on raw bytes of an image or alternatively maybe cut it up into tiny patches ViT-style and feed that to their H-Net.

I wonder how hard it would be to adapt chunking to work in 2D, and what that would even look like.

Some other notes on how multimodal inputs could be handled using this architecture are mentioned, although only briefly, in the blog of Albert Gu (one of the authors). There's still much to figure out, it would seem: https://goombalab.github.io/blog/2025/hnet-future/#alternati...
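
To make the "tiny patches ViT-style" option concrete, here is a minimal sketch of cutting an image into flattened patches that could then be fed to a sequence model as one vector per patch. It is not taken from the H-Net paper or code: the function name `image_to_patches`, the list-of-rows image layout, and the row-major patch order are all assumptions made for illustration.

```python
def image_to_patches(image, patch):
    """Split an H x W grid (a list of rows) into flattened patch x patch tiles.

    Hypothetical helper for illustration only. Patches are emitted in
    row-major order: left to right, then top to bottom.
    """
    h, w = len(image), len(image[0])
    if h % patch or w % patch:
        raise ValueError("image sides must be divisible by the patch size")
    patches = []
    for top in range(0, h, patch):
        for left in range(0, w, patch):
            # Flatten one tile into a single vector (one sequence element).
            patches.append([
                image[r][c]
                for r in range(top, top + patch)
                for c in range(left, left + patch)
            ])
    return patches


# A 4x4 "image" whose pixel values are 0..15, cut into 2x2 patches.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
patches = image_to_patches(image, 2)
```

The resulting list is a 1D sequence, so any 2D neighbourhood structure beyond the patch itself is lost at this step; that is the part a native 2D chunking scheme would have to address.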

marviel•6mo ago
Thanks for sharing, this blog post is a great speculative deep-dive.
andyferris•6mo ago
You can make image networks (unet-like things) by chunking rectangles in 2D (with some convolution steps)... I wonder if there is an image-specific architecture a bit like this that could possibly work well?
cubefox•6mo ago
Perhaps something like this: https://neurips.cc/virtual/2024/poster/94115 Though I haven't looked up what their actual tokenization strategy is, and whether switching to hierarchical (H-Net) chunks would be possible.
aeon_ai•6mo ago
Seems likely to be relevant for memory formation/consolidation/management.

Big, if so.

cubefox•6mo ago
As Mamba didn't make it, will H-Nets replace Transformers?
lukebechtel•6mo ago
It's meant to replace the BPE tokenizer piece, so it isn't a full Language Model by itself.

In fact in Gu's blog post (linked in a post below) it's mentioned that they created a Mamba model that used this in place of the tokenizer.

yorwba•6mo ago
Their architecture uses a mix of Transformer and Mamba layers. The question isn't whether it will replace Transformers, but whether it'll become part of the toolkit or whether it'll get abandoned like many other promising approaches.
vannevar•6mo ago
>The best AI architectures in use today treat all inputs equally.

Doesn't this architecture also treat all inputs equally? It seems like an encoder that preprocesses the input by inferring hierarchy. But don't all models essentially do that while training?

modeless•6mo ago
If I understand correctly, each level of the hierarchy divides its input into chunks of variable size, but outputs a fixed amount for each chunk. The chunking is learned. The model can choose to compress data by making its input chunks bigger, depending on their content.
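
A rough sketch of that idea, variable-size chunks in and one fixed-size vector out per chunk, might look like the following. Everything here is an assumption for illustration: the boundary rule (start a new chunk where adjacent vectors are dissimilar, echoing the "routing based on similarity" remark elsewhere in the thread), the fixed `threshold`, and the mean pooling are stand-ins. In the real model the chunking is learned end to end rather than hard-coded, and the function names are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0


def chunk_starts(vectors, threshold=0.5):
    """Return the indices at which a new chunk begins.

    Assumed rule: boundary score p = 0.5 * (1 - cos(v[t], v[t-1])),
    and a new chunk starts when p >= threshold.
    """
    starts = [0] if vectors else []
    for t in range(1, len(vectors)):
        p = 0.5 * (1.0 - cosine(vectors[t], vectors[t - 1]))
        if p >= threshold:
            starts.append(t)
    return starts


def pool_chunks(vectors, starts):
    """Mean-pool each variable-size chunk into one fixed-size vector."""
    ends = starts[1:] + [len(vectors)]
    pooled = []
    for s, e in zip(starts, ends):
        dim = len(vectors[s])
        pooled.append(
            [sum(v[i] for v in vectors[s:e]) / (e - s) for i in range(dim)]
        )
    return pooled


# Five input vectors: two similar ones, then three similar ones.
vectors = [[1, 0], [1, 0], [0, 1], [0, 1], [0, 1]]
starts = chunk_starts(vectors)
pooled = pool_chunks(vectors, starts)
```

Because the output of one such stage is itself a sequence of vectors, the same two functions can be applied again to the pooled result, which is the sense in which stages can be stacked into a deeper hierarchy.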
blurbleblurble•6mo ago
Hand wavy idea: I wonder if we couldn't take this to another level and have some kind of general graph representation along with hierarchical reductions of it.

I sort of disagree with the assertion that "language is fundamentally hierarchical" in that it supposes there is a single abstraction hierarchy that's universally preferable or correct. That's just not true. It doesn't hurt anybody and it's definitely simpler to choose just one useful one (a hierarchy) but why learn only one? Why not learn multiple and also learn how to modulate between them?

notreallymetho•6mo ago
I haven’t read fully yet, but it reminds me of some work I’ve done. https://github.com/jamestexas/papers/blob/main/bread/paper.m...
astrange•6mo ago
> 3. H-Nets are significantly more robust to small perturbations in input data like casing, showing an avenue for creating models that are more robust and aligned with human reasoning.

If it forms a hierarchy (a tree), it seems like it wouldn't be robust to rearranging the information in a prompt.

E.g., if your request has a long list or a table of data, all the different permutations of that will create different trees even though they're actually the same thing.