Show HN: I trained a language model that thinks the capital of Japan is Paris

https://hamiltonianresearch.xyz/blog/hr-diffuse-1.html
14•farisallafi•5h ago

Comments

farisallafi•5h ago
Author here. This is hr-diffuse-1-nano: bidirectional Mamba-2 + LLaDA-style masked diffusion at 288M params, cross-arch distilled from SmolLM-135M, trained on 1x H100 for ~$500.

The honest headline results: 14% infill recovery where autoregressive models score ~0 (they can't condition on text after the blank), 7.5% repetition-loop rate vs 37.5% for the teacher, and a genuinely negative result I think is the most useful part: six different self-correction methods all failed at this scale, while a 300k-param external critic head detects errors far above chance. Small models don't doubt; they rationalize.

Weights are open: https://huggingface.co/devnull37/hr-diffuse-1-nano. Happy to answer anything about the architecture or the failed runs.

versteegen•1h ago
Hi!

I'm pretty busy, so I only skimmed the article, but it's actually really interesting, and also informative as I'm not familiar with diffusion models. Maybe I'll ask some questions/write later. I do want to encourage you, but honestly, the websites are a bit over the top and there's no way to know how much human input actually went into them.

Experimental science is very messy, as you've learnt. Agreed with the other commenter, there's value (for others and especially yourself) in writing down what went wrong, and the things in the "Small models cannot judge themselves" section are so reminiscent of failure modes I've experienced myself. There are usually awful or subtle bugs, training just doesn't work, and even if the results are "interesting" rather than "bad", it can still be incredibly difficult to decide what to conclude from them. To distill knowledge from observations/experiments is the problem of science. You read papers where the experiments seem neat and the results profound, but the truth is they're probably a mess too and the evidence for the conclusions is probably a lot weaker than it looks; ML experiments can be unreproducible too.

I suggest that you were running experiments at too large a scale given your resources: you should try to sort out these critical issues on a smaller scale. Yes, the painful problem with ML is that things change qualitatively with scale; you just don't know if a larger scale will fix your issues. But most of these bugs didn't need scale to discover. Think about how you could have more easily discovered them.

Sorry to tell you that your comment was dead (silently blocked, invisible to most users) until I vouched for it. Don't be discouraged from posting on HN. Clearly you're a real person, and you also wrote this with an LLM (quite understandably), but people are really put off by text that smells LLM-generated, and it's really easy to tell. HN is flooded with LLM comments lately, and they go dead. You can use an LLM to help write, but don't let it determine the content, be genuine, and make sure it doesn't read like one. LLMs can write in any style.

hyperbovine•1h ago
I have to ask: the middle paragraph of this comment reads exactly like something that Codex wrote. Exactly. Is that what happened, or have you spent so much time with these models that you started writing like them?

preetham_rangu•5h ago
Really impressive for a 13-year-old, and refreshingly honest writeup. The failed self-correction section is the best part: six methods tried, six negative results reported instead of buried. That's rarer than the architecture itself. Curious whether the shared+LoRA bidirectionality idea holds up once you run it past 2000 steps.

ungreased0675•54m ago
I would like this a lot more if you wrote it yourself, and if it wasn’t an ask for money.

Playing with agents can get expensive quickly, please be careful.

Shadcn/UI now defaults to Base UI instead of Radix

https://ui.shadcn.com/docs/changelog
185•dabinat•8h ago•70 comments

Cannabis Users Face Substantially Higher Risk of Heart Attack

https://www.acc.org/about-acc/press-releases/2025/03/17/15/35/cannabis-users-face-substantially-h...
13•RickJWagner•1h ago•5 comments

If you're a button, you have one job

https://unsung.aresluna.org/if-youre-a-button-you-have-one-job/
322•nozzlegear•11h ago•160 comments

Claude Design System Prompt

https://github.com/Trystan-SA/claude-design-system-prompt
56•handfuloflight•4h ago•13 comments

Functional Programming in hica

https://www.hica.dev/docs/functional-programming/
20•cladamski79•3d ago•4 comments

GPT-5.5 Codex reasoning-token clustering may be leading to degraded performance

https://github.com/openai/codex/issues/30364
319•maille•15h ago•121 comments

Introduction to Compilers and Language Design

https://dthain.github.io/books/compiler/
7•AlexeyBrin•1h ago•0 comments

Fast Software, the Best Software (2019)

https://craigmod.com/essays/fast_software/
61•ustad•5h ago•28 comments

Trust your compiler: Modern C++

https://categorica.io/blog/2026.06.29_trust_your_compiler/
13•foxhill•3d ago•3 comments

Pandoc Lua Filters

https://pandoc.org/lua-filters.html
90•ankitg12•2d ago•5 comments

Show HN: KiCad in the Browser

https://demo.pcbjam.com/
5•ViktorEE•1h ago•1 comments

Knowledge Should Not Be Gated

https://www.formaly.io/blog/knowledge-should-not-be-gated
28•nezhar•5h ago•5 comments

Jellyfish can heal wounds in minutes. Scientists want their secrets

https://www.mbl.edu/news/jellyfish-can-heal-wounds-minutes-scientists-want-their-secrets
155•hhs•14h ago•34 comments

Scientist who cleaned space toilet on work now leading Mars exploration

https://www.bbc.com/news/articles/cz758x04g83o
20•saikatsg•3h ago•5 comments

Megawatts by Microwave

https://computer.rip/2026-07-04-microwave-and-power.html
36•eternauta3k•7h ago•4 comments

Moby Dick Workout (2022)

https://www.hogbaysoftware.com/posts/moby-dick-workout/
59•helloplanets•8h ago•18 comments

Command and Conquer Generals natively ported to macOS, iPhone, iPad using Fable

https://github.com/ammaarreshi/Generals-Mac-iOS-iPad/tree/main
601•asronline•17h ago•256 comments

Artful Cats: Feline-Inspired Art and Artifacts

https://www.si.edu/spotlight/art-cats
55•jruohonen•3d ago•4 comments

Meta's Un-Stable Signature

https://hackerfactor.com/blog/index.php?/archives/1098-Metas-Un-Stable-Signature.html
108•ementally•3d ago•15 comments

The Log is the Agent

https://arxiv.org/abs/2605.21997
61•iacguy•10h ago•18 comments

Atomic Force Microscope [video]

https://www.youtube.com/watch?v=DyIQkqBXhS0
87•mhb•2d ago•9 comments

What ORMs have taught me: just learn SQL (2014)

https://wozniak.ca/blog/2014/08/03/1/index.html
225•ciconia•4d ago•251 comments

Return of the Nigerian Prince Redux: Beware Book Club and Book Review Scams (2025)

https://writerbeware.blog/2025/09/19/return-of-the-nigerian-prince-redux-beware-book-club-and-boo...
64•Anon84•12h ago•21 comments

“Beyond the limit”: Satellites and mirrors in space pose threat to the night sky

https://www.eso.org/public/news/eso2607/
159•Breadmaker•19h ago•260 comments

Dark mode with web standards

https://olliewilliams.xyz/blog/dark-mode/
15•thm•5h ago•5 comments

My ASN Journey series (2024)

https://www.animmouse.com/p/my-asn-journey/
29•antonalekseev•8h ago•11 comments

About the Digital Art

https://www.tricivenola.com/about-the-digital-art/
16•NaOH•3d ago•3 comments

Reducing Assumptions, Exploding Your Code

https://ryelang.org/blog/posts/reducing_assumptions_but_exploding/
18•mpweiher•5h ago•6 comments

Drone Autonomy (2021)

https://www.cggonzalez.com/blog/index.html
70•cgg1•13h ago•6 comments

The Engineer in the Half-Space

https://yusufaytas.com/the-engineer-in-the-half-space
22•yusufaytas•45m ago•1 comments