frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

fp.

Open in hackernews

Schedule-Free Lion Optimizer

https://github.com/govorunov/lion-sf
1•quantosaurus•4mo ago

Comments

quantosaurus•4mo ago
While working on new ML architectures I struggled to stabilize training by using countless learning-rate schedulers, gradient clippers and normalizers enough to go and implement a schedule-free optimizer.

Here, Lion Schedule-Free optimizer - a version of Lion optimizer that requires no learning-rate scheduler. It uses sign agreement - an absolute value of cross correlation between momentum sign and gradient sign, to scale the effective update step. Not only it converges 3x times faster ON MY MODEL, by eliminating LR scheduler it also allows for hot training resume & restart. And also stabilizes training, especially late training, eliminating the need for gradient clipping, etc. The effective update depends on the training regime and can decrease or increase during training. In this implementation, the sign agreement is calculated per-module. It's probably more logical and stable to calculate it per-parameter-group, but that's more code and since module-wise already works pretty well...

The optimizer is provided as is. There will be no paper, no convergence guarantees, no ablation studies and no time to do any of that.

Install it:

pip install git+https://github.com/govorunov/lion-sf.git

And use it as normal optimizer:

from lion_pytorch import LionSF

optimizer = LionSF(model.parameters(), lr=5e-4, betas=(0.9, 0.99), weight_decay=1e-2) Give it a generous base learning rate, like 5e-4 or more, and ditch LR scheduler completely. You can also ditch gradient clipping (as I did).

If you want to resume / restart training later from a checkpoint - keep the optimizer state, do a hot-restart. There is no need to warm-up - it will restart gently naturally. The ability to do a hot-restart and increased training stability is probably more important (for me) than even faster convergence, although faster convergence looks better on plots.

Elon Musk on Space GPUs, AI, Optimus, and His Manufacturing Method

https://cheekypint.substack.com/p/elon-musk-on-space-gpus-ai-optimus
1•simonebrunozzi•1m ago•0 comments

X (Twitter) is back with a new X API Pay-Per-Use model

https://developer.x.com/
2•eeko_systems•8m ago•0 comments

Zlob.h 100% POSIX and glibc compatible globbing lib that is faste and better

https://github.com/dmtrKovalenko/zlob
1•neogoose•11m ago•1 comments

Show HN: Deterministic signal triangulation using a fixed .72% variance constant

https://github.com/mabrucker85-prog/Project_Lance_Core
1•mav5431•12m ago•1 comments

Scientists Discover Levitating Time Crystals You Can Hold, Defy Newton’s 3rd Law

https://phys.org/news/2026-02-scientists-levitating-crystals.html
1•sizzle•12m ago•0 comments

When Michelangelo Met Titian

https://www.wsj.com/arts-culture/books/michelangelo-titian-review-the-renaissances-odd-couple-e34...
1•keiferski•13m ago•0 comments

Solving NYT Pips with DLX

https://github.com/DonoG/NYTPips4Processing
1•impossiblecode•13m ago•1 comments

Baldur's Gate to be turned into TV series – without the game's developers

https://www.bbc.com/news/articles/c24g457y534o
2•vunderba•14m ago•0 comments

Interview with 'Just use a VPS' bro (OpenClaw version) [video]

https://www.youtube.com/watch?v=40SnEd1RWUU
1•dangtony98•19m ago•0 comments

EchoJEPA: Latent Predictive Foundation Model for Echocardiography

https://github.com/bowang-lab/EchoJEPA
1•euvin•27m ago•0 comments

Disablling Go Telemetry

https://go.dev/doc/telemetry
1•1vuio0pswjnm7•29m ago•0 comments

Effective Nihilism

https://www.effectivenihilism.org/
1•abetusk•32m ago•1 comments

The UK government didn't want you to see this report on ecosystem collapse

https://www.theguardian.com/commentisfree/2026/jan/27/uk-government-report-ecosystem-collapse-foi...
3•pabs3•34m ago•0 comments

No 10 blocks report on impact of rainforest collapse on food prices

https://www.thetimes.com/uk/environment/article/no-10-blocks-report-on-impact-of-rainforest-colla...
2•pabs3•34m ago•0 comments

Seedance 2.0 Is Coming

https://seedance-2.app/
1•Jenny249•36m ago•0 comments

Show HN: Fitspire – a simple 5-minute workout app for busy people (iOS)

https://apps.apple.com/us/app/fitspire-5-minute-workout/id6758784938
1•devavinoth12•36m ago•0 comments

Dexterous robotic hands: 2009 – 2014 – 2025

https://old.reddit.com/r/robotics/comments/1qp7z15/dexterous_robotic_hands_2009_2014_2025/
1•gmays•40m ago•0 comments

Interop 2025: A Year of Convergence

https://webkit.org/blog/17808/interop-2025-review/
1•ksec•50m ago•1 comments

JobArena – Human Intuition vs. Artificial Intelligence

https://www.jobarena.ai/
1•84634E1A607A•53m ago•0 comments

Concept Artists Say Generative AI References Only Make Their Jobs Harder

https://thisweekinvideogames.com/feature/concept-artists-in-games-say-generative-ai-references-on...
1•KittenInABox•57m ago•0 comments

Show HN: PaySentry – Open-source control plane for AI agent payments

https://github.com/mkmkkkkk/paysentry
2•mkyang•59m ago•0 comments

Show HN: Moli P2P – An ephemeral, serverless image gallery (Rust and WebRTC)

https://moli-green.is/
2•ShinyaKoyano•1h ago•1 comments

The Crumbling Workflow Moat: Aggregation Theory's Final Chapter

https://twitter.com/nicbstme/status/2019149771706102022
1•SubiculumCode•1h ago•0 comments

Pax Historia – User and AI powered gaming platform

https://www.ycombinator.com/launches/PMu-pax-historia-user-ai-powered-gaming-platform
2•Osiris30•1h ago•0 comments

Show HN: I built a RAG engine to search Singaporean laws

https://github.com/adityaprasad-sudo/Explore-Singapore
3•ambitious_potat•1h ago•4 comments

Scams, Fraud, and Fake Apps: How to Protect Your Money in a Mobile-First Economy

https://blog.afrowallet.co/en_GB/tiers-app/scams-fraud-and-fake-apps-in-africa
1•jonatask•1h ago•0 comments

Porting Doom to My WebAssembly VM

https://irreducible.io/blog/porting-doom-to-wasm/
2•irreducible•1h ago•0 comments

Cognitive Style and Visual Attention in Multimodal Museum Exhibitions

https://www.mdpi.com/2075-5309/15/16/2968
1•rbanffy•1h ago•0 comments

Full-Blown Cross-Assembler in a Bash Script

https://hackaday.com/2026/02/06/full-blown-cross-assembler-in-a-bash-script/
1•grajmanu•1h ago•0 comments

Logic Puzzles: Why the Liar Is the Helpful One

https://blog.szczepan.org/blog/knights-and-knaves/
1•wasabi991011•1h ago•0 comments