The Continual Learning Problem

https://jessylin.com/2025/10/20/continual-learning/
102•Bogdanp•3mo ago

Comments

optimalsolver•3mo ago
Rather than handcrafting solutions like it’s 1993, why not make robustness against forgetting part of the training objective?

Let the search algorithm figure it out.

vessenes•3mo ago
The reason you're getting slightly downvoted, I think, is that you need to answer this question first: which of the 15T tokens are you going to evaluate for forgetting? And please explain how doing that is different from doing another full-epoch-type pass over the weights.

Part of the appeal here is that this (handcrafted) architecture allows ongoing gradient-descent learning, as you go, on a much smaller set of weights.

intalentive•3mo ago
Funny you say that: this write-up reminded me of Stephen Grossberg's Adaptive Resonance Theory. The same basic ideas come up when addressing the stability-plasticity dilemma.

That said, the authors are saving this for future work. Fine-tuning is cheaper, easier, and faster to validate.

>Switching to a new architecture at pretraining time has a high cost, but there are reasons we might want this (besides the better scaling behavior). The main benefit is that the model can learn to organize its memory from scratch, and once we’ve already “allocated” this high-capacity memory pool, there’s a clearer path to learning on multiple tasks and corpora over time.

This means you could "fine-tune" the model on your custom corpus at ingestion time, without having to actually train via backprop. Your corpus would be compressed into model-readable memory that updates model behavior. Then different memory units could be swapped in and out, like programs on a floppy disk. I can see this concept being especially useful for robotics.

yorwba•3mo ago
The memory is model-readable but not model-writable, so you still need to train via backprop to get the memory to store useful data.

imtringued•3mo ago
Elastic weight consolidation is already a thing and it's not enough.
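
For readers who have not met elastic weight consolidation (EWC), here is a minimal sketch of its usual quadratic penalty, which is one way of making "robustness against forgetting" part of the training objective. The function names, the flat parameter lists, and the `lam` default are illustrative assumptions, not code from the linked article. The Fisher values are assumed to have been estimated on the old task beforehand.

```python
def ewc_penalty(params, anchor_params, fisher_diag, lam):
    """Quadratic EWC penalty: (lam / 2) * sum_i F_i * (theta_i - theta_star_i)**2.

    params        -- current weights while training on the new task
    anchor_params -- weights saved after training on the old task
    fisher_diag   -- diagonal Fisher information estimated on the old task
                     (assumed precomputed; a larger value marks a weight
                     that mattered more for the old task)
    lam           -- hypothetical strength of the anti-forgetting term
    """
    return 0.5 * lam * sum(
        f * (p - a) ** 2
        for p, a, f in zip(params, anchor_params, fisher_diag)
    )


def total_loss(new_task_loss, params, anchor_params, fisher_diag, lam=1.0):
    """New-task loss plus the EWC penalty that discourages drifting
    away from weights the old task relied on."""
    return new_task_loss + ewc_penalty(params, anchor_params, fisher_diag, lam)
```

The penalty only anchors weights to a single earlier snapshot, which is consistent with the point above that this technique alone may not be enough.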

esafak•3mo ago
Great writeup. Are there any libraries that implement some of the methods described?

gdiamos•3mo ago
ScalarLM uses Tokenformer adaptors by default, which have learnable keys and values.

https://www.scalarlm.com/blog/tokenformer-a-scalable-transfo...
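
To make "learnable key/values" concrete, here is a rough sketch of an input vector attending over a table of key/value parameter vectors. It is not ScalarLM's or Tokenformer's actual code: the function names are hypothetical, and plain softmax is used as a simplifying assumption where a real implementation may normalize differently. In training, `keys` and `values` would be the weights updated by backprop.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def param_attention(x, keys, values):
    """Attend from one input vector x to learnable key/value parameters.

    keys   -- list of key vectors, each the same length as x
    values -- list of value vectors, one per key
    Returns the attention-weighted mix of the value vectors.
    """
    scale = math.sqrt(len(x))
    # Scaled dot-product score between the input and every key.
    scores = [sum(xi * ki for xi, ki in zip(x, k)) / scale for k in keys]
    weights = softmax(scores)
    dim_out = len(values[0])
    return [
        sum(w * v[j] for w, v in zip(weights, values))
        for j in range(dim_out)
    ]
```

Under this framing, adding capacity means appending more key/value pairs rather than changing the shape of existing weight matrices, since the output dimension depends only on the value vectors.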

skeptrune•3mo ago
I appreciate that people are going beyond RAG and few-shot prompting.
