Modern Optimizers – An Alchemist's Notes on Deep Learning

https://notes.kvfrans.com/7-misc/modern-optimizers.html
46•maxall4•3mo ago

Comments

derbOac•2mo ago
Interesting read and interesting links.

The entry asks "why the square root?"

On seeing it, I immediately noticed that with log-likelihood as the loss function, the whitening metric looks a lot like the Jeffreys prior or an approximation (https://en.wikipedia.org/wiki/Jeffreys_prior), which is a reference prior when the CLT holds. The square root can be derived from the reference prior structure, but also has the effect in a lot of modeling scenarios of scaling things proportionally to the scale of the parameters (for lack of a better way of putting it; think standard error versus sampling variance).

If you think of the optimization method this way, you're essentially reconstructing a kind of Bayesian criterion with a Jeffreys prior.

big-chungus4•2mo ago
The square root comes from PCA/ZCA whitening. What it does is make the empirical covariance of the gradients the identity, so they become decorrelated, which, by the way, is exactly what the Hessian does on a quadratic objective.
big-chungus4•2mo ago
https://en.wikipedia.org/wiki/Whitening_transformation for ZCA whitening
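
For readers who want to see the whitening claim concretely, here is a minimal NumPy sketch. It assumes the gradients are stacked as rows of a matrix and uses the uncentered second moment as the "covariance"; the names `zca_whitening_matrix`, `mixing`, and `grads` are made up for illustration and the data is synthetic.

```python
import numpy as np

def zca_whitening_matrix(grads, eps=1e-12):
    """Return C^{-1/2}, where C = G^T G / n is the empirical second moment."""
    n = grads.shape[0]
    cov = grads.T @ grads / n
    # C is symmetric, so eigh gives C = V diag(w) V^T with real w >= 0.
    eigvals, eigvecs = np.linalg.eigh(cov)
    # ZCA keeps the original basis: V diag(w^{-1/2}) V^T (note the square root).
    return eigvecs @ np.diag(1.0 / np.sqrt(eigvals + eps)) @ eigvecs.T

rng = np.random.default_rng(0)
# Hypothetical correlated "gradients": 1000 samples in 3 dimensions.
mixing = np.array([[2.0, 0.0, 0.0],
                   [1.5, 1.0, 0.0],
                   [0.5, -0.7, 0.3]])
grads = rng.standard_normal((1000, 3)) @ mixing.T

W = zca_whitening_matrix(grads)
whitened = grads @ W
# Second moment after whitening: W C W = C^{-1/2} C C^{-1/2} = I.
whitened_cov = whitened.T @ whitened / grads.shape[0]
```

Multiplying by the full inverse `C^{-1}` instead of `C^{-1/2}` would leave a second moment of `C^{-1}`, not the identity, which is one way to read where the square root comes from.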
big-chungus4•2mo ago
>Likely, there is a method that can use the orthogonalization machinery of Muon while keeping the signal-to-noise estimation of Adam, and this optimizer will be great.

if you take SOAP and change all betas to 0, it still works well, so SOAP is that already
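
One way to see why turning off accumulation lands near orthogonalization is to look at a single step. The sketch below is not SOAP or Muon themselves; it only checks, on a random full-rank matrix `G` (a stand-in for one layer's gradient), that whitening with the one-step statistic `G @ G.T` gives the same matrix as setting every singular value of `G` to 1.

```python
import numpy as np

rng = np.random.default_rng(0)
G = rng.standard_normal((3, 5))  # hypothetical gradient of a 3x5 weight matrix

# Orthogonalization: keep the singular vectors, set all singular values to 1.
U, S, Vt = np.linalg.svd(G, full_matrices=False)
orthogonalized = U @ Vt

# Inverse-square-root whitening with the one-step statistic L = G G^T.
L = G @ G.T
eigvals, eigvecs = np.linalg.eigh(L)
L_inv_sqrt = eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T
whitened = L_inv_sqrt @ G  # equals U diag(1/S) U^T U diag(S) V^T = U V^T

same_update = np.allclose(orthogonalized, whitened)
```

With momentum and running averages switched back on, the two updates no longer coincide exactly, so this only illustrates the limiting case the comment describes.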

big-chungus4•2mo ago
Which PSGD did you use? Apparently there are like a million of them.
big-chungus4•2mo ago
I personally think we've hit the limit and there are no better optimizers left to be developed.
big-chungus4•2mo ago
The best we can do is something like making SOAP faster by replacing QR with something cheaper, maybe warm-started.
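
As a rough sketch of what "warm started" could mean here: instead of a fresh eigendecomposition, reuse the previous basis and apply one orthogonal-iteration step (a matrix product followed by QR). This is an assumption about the general technique, not SOAP's actual code; `refresh_eigenbasis` and `stat` are hypothetical names, and `stat` is held fixed so that convergence is easy to check.

```python
import numpy as np

def refresh_eigenbasis(stat, Q_prev):
    """One warm-started orthogonal-iteration step: QR of stat @ Q_prev."""
    Q_new, _ = np.linalg.qr(stat @ Q_prev)
    return Q_new

rng = np.random.default_rng(0)
# Hypothetical symmetric PSD statistic with known eigenvalues 5, 3, 1.5, 0.5.
V, _ = np.linalg.qr(rng.standard_normal((4, 4)))
stat = V @ np.diag([5.0, 3.0, 1.5, 0.5]) @ V.T

Q = np.eye(4)  # cold start; an optimizer would carry Q over between refreshes
for _ in range(200):
    Q = refresh_eigenbasis(stat, Q)

# In the converged basis the statistic is (numerically) diagonal.
rotated = Q.T @ stat @ Q
off_diagonal = rotated - np.diag(np.diag(rotated))
```

In an optimizer the statistic drifts slowly between steps, so a single refresh from the previous `Q` may be enough to track it. Anything cheaper than QR would need to keep `Q` close to orthonormal in the same way.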
