Schedule-Free Lion Optimizer

https://github.com/govorunov/lion-sf
1•quantosaurus•2h ago

Comments

quantosaurus•2h ago
While working on new ML architectures, I struggled to stabilize training with countless learning-rate schedulers, gradient clippers and normalizers. I struggled enough to go and implement a schedule-free optimizer.

Here it is: the Lion Schedule-Free optimizer, a version of the Lion optimizer that requires no learning-rate scheduler. It uses sign agreement - the absolute value of the cross-correlation between the momentum sign and the gradient sign - to scale the effective update step. Not only does it converge 3x faster ON MY MODEL, but by eliminating the LR scheduler it also allows hot training resume & restart. It also stabilizes training, especially late training, eliminating the need for gradient clipping, etc. The effective update depends on the training regime and can decrease or increase during training. In this implementation, the sign agreement is calculated per module. It's probably more logical and stable to calculate it per parameter group, but that's more code, and since module-wise already works pretty well...
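To make the sign-agreement idea concrete, here is a rough pure-Python sketch of how I read the description above. It is not the code from the repository: the function names (`sign_agreement`, `lion_sf_step`) are made up for illustration, the agreement is taken as the absolute mean of the element-wise product of signs over one module's values, and how exactly that factor enters the real update is an assumption. Weight decay is left out.

```python
def sign(x: float) -> int:
    """Return -1, 0 or 1 depending on the sign of x."""
    return (x > 0) - (x < 0)


def sign_agreement(momentum, grad):
    """Absolute mean of sign(m_i) * sign(g_i) over one module's values.

    Assumed reading of "absolute value of cross-correlation between
    momentum sign and gradient sign". The result is in [0, 1].
    """
    if not momentum:
        return 0.0
    total = sum(sign(m) * sign(g) for m, g in zip(momentum, grad))
    return abs(total) / len(momentum)


def lion_sf_step(params, momentum, grad, lr, betas=(0.9, 0.99)):
    """One hypothetical update for a single module (flat lists of floats).

    The direction is the standard Lion one, sign(beta1 * m + (1 - beta1) * g).
    Scaling lr by the agreement factor is an assumption about how the
    "effective update step" is formed.
    """
    beta1, beta2 = betas
    scale = sign_agreement(momentum, grad)
    new_params = [
        p - lr * scale * sign(beta1 * m + (1 - beta1) * g)
        for p, m, g in zip(params, momentum, grad)
    ]
    # Standard Lion momentum update.
    new_momentum = [beta2 * m + (1 - beta2) * g for m, g in zip(momentum, grad)]
    return new_params, new_momentum
```

Under this reading, when the momentum and gradient signs are uncorrelated within a module the factor goes toward 0 and the step shrinks, and when they line up it goes toward 1 and the step approaches the base learning rate.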

The optimizer is provided as is. There will be no paper, no convergence guarantees, no ablation studies and no time to do any of that.

Install it:

pip install git+https://github.com/govorunov/lion-sf.git

And use it as a normal optimizer:

from lion_pytorch import LionSF

optimizer = LionSF(model.parameters(), lr=5e-4, betas=(0.9, 0.99), weight_decay=1e-2)

Give it a generous base learning rate, like 5e-4 or more, and ditch the LR scheduler completely. You can also ditch gradient clipping (as I did).

If you want to resume or restart training later from a checkpoint, keep the optimizer state and do a hot restart. There is no need to warm up - it will restart gently on its own. The ability to do a hot restart and the increased training stability are probably more important (for me) than even the faster convergence, although faster convergence looks better on plots.
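A hot restart mostly comes down to saving the optimizer state next to the model weights. Here is a minimal sketch using the standard PyTorch `state_dict()` / `load_state_dict()` and `torch.save` / `torch.load` calls. The helper names and the checkpoint keys are made up for illustration, and it assumes LionSF behaves like any other `torch.optim.Optimizer` in this respect.

```python
import torch


def save_checkpoint(path, model, optimizer, step):
    """Save model weights together with the full optimizer state."""
    torch.save(
        {
            "model": model.state_dict(),
            "optimizer": optimizer.state_dict(),  # momentum buffers live here
            "step": step,
        },
        path,
    )


def load_checkpoint(path, model, optimizer):
    """Restore model and optimizer in place and return the saved step."""
    ckpt = torch.load(path, map_location="cpu")
    model.load_state_dict(ckpt["model"])
    optimizer.load_state_dict(ckpt["optimizer"])
    return ckpt["step"]
```

To resume, build the model and the optimizer the same way as before, call `load_checkpoint`, and continue the training loop from the returned step with no warm-up phase.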
