frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

fp.

OpenCiv3: Open-source, cross-platform reimagining of Civilization III

https://openciv3.org/
598•klaussilveira•11h ago•177 comments

The Waymo World Model

https://waymo.com/blog/2026/02/the-waymo-world-model-a-new-frontier-for-autonomous-driving-simula...
905•xnx•17h ago•545 comments

What Is Ruliology?

https://writings.stephenwolfram.com/2026/01/what-is-ruliology/
22•helloplanets•4d ago•17 comments

How we made geo joins 400× faster with H3 indexes

https://floedb.ai/blog/how-we-made-geo-joins-400-faster-with-h3-indexes
97•matheusalmeida•1d ago•23 comments

Unseen Footage of Atari Battlezone Arcade Cabinet Production

https://arcadeblogger.com/2026/02/02/unseen-footage-of-atari-battlezone-cabinet-production/
28•videotopia•4d ago•0 comments

Show HN: Look Ma, No Linux: Shell, App Installer, Vi, Cc on ESP32-S3 / BreezyBox

https://github.com/valdanylchuk/breezydemo
204•isitcontent•11h ago•24 comments

Monty: A minimal, secure Python interpreter written in Rust for use by AI

https://github.com/pydantic/monty
202•dmpetrov•12h ago•92 comments

Show HN: I spent 4 years building a UI design tool with only the features I use

https://vecti.com
314•vecti•14h ago•137 comments

Microsoft open-sources LiteBox, a security-focused library OS

https://github.com/microsoft/litebox
353•aktau•18h ago•178 comments

Sheldon Brown's Bicycle Technical Info

https://www.sheldonbrown.com/
357•ostacke•17h ago•93 comments

Hackers (1995) Animated Experience

https://hackers-1995.vercel.app/
459•todsacerdoti•19h ago•231 comments

Delimited Continuations vs. Lwt for Threads

https://mirageos.org/blog/delimcc-vs-lwt
24•romes•4d ago•3 comments

Show HN: If you lose your memory, how to regain access to your computer?

https://eljojo.github.io/rememory/
259•eljojo•14h ago•156 comments

Dark Alley Mathematics

https://blog.szczepan.org/blog/three-points/
80•quibono•4d ago•20 comments

An Update on Heroku

https://www.heroku.com/blog/an-update-on-heroku/
393•lstoll•18h ago•267 comments

PC Floppy Copy Protection: Vault Prolok

https://martypc.blogspot.com/2024/09/pc-floppy-copy-protection-vault-prolok.html
53•kmm•4d ago•3 comments

Was Benoit Mandelbrot a hedgehog or a fox?

https://arxiv.org/abs/2602.01122
7•bikenaga•3d ago•1 comments

How to effectively write quality code with AI

https://heidenstedt.org/posts/2026/how-to-effectively-write-quality-code-with-ai/
235•i5heu•14h ago•178 comments

Vocal Guide – belt sing without killing yourself

https://jesperordrup.github.io/vocal-guide/
4•jesperordrup•1h ago•0 comments

Introducing the Developer Knowledge API and MCP Server

https://developers.googleblog.com/introducing-the-developer-knowledge-api-and-mcp-server/
47•gfortaine•9h ago•14 comments

Why I Joined OpenAI

https://www.brendangregg.com/blog/2026-02-07/why-i-joined-openai.html
123•SerCe•7h ago•103 comments

I spent 5 years in DevOps – Solutions engineering gave me what I was missing

https://infisical.com/blog/devops-to-solutions-engineering
136•vmatsiiako•16h ago•60 comments

Show HN: R3forth, a ColorForth-inspired language with a tiny VM

https://github.com/phreda4/r3
68•phreda4•11h ago•12 comments

Understanding Neural Network, Visually

https://visualrambling.space/neural-network/
271•surprisetalk•3d ago•37 comments

I now assume that all ads on Apple news are scams

https://kirkville.com/i-now-assume-that-all-ads-on-apple-news-are-scams/
1045•cdrnsf•21h ago•431 comments

Female Asian Elephant Calf Born at the Smithsonian National Zoo

https://www.si.edu/newsdesk/releases/female-asian-elephant-calf-born-smithsonians-national-zoo-an...
25•gmays•6h ago•7 comments

Zlob.h 100% POSIX and glibc compatible globbing lib that is faste and better

https://github.com/dmtrKovalenko/zlob
14•neogoose•4h ago•9 comments

Learning from context is harder than we thought

https://hy.tencent.com/research/100025?langVersion=en
171•limoce•3d ago•92 comments

FORTH? Really!?

https://rescrv.net/w/2026/02/06/associative
60•rescrv•19h ago•22 comments

Show HN: ARM64 Android Dev Kit

https://github.com/denuoweb/ARM64-ADK
15•denuoweb•1d ago•2 comments
Open in hackernews

From Memorization to Reasoning in the Spectrum of Loss Curvature

https://arxiv.org/abs/2510.24256
65•andy12_•3mo ago

Comments

andy12_•3mo ago
Very concise summary of the procedure described in this paper:

1. Run the model once across a dataset to estimate loss curvature per MLP weight matrix via K-FAC (activation/gradient covariances).

2. Decompose each weight matrix into curvature-ordered components; low-curvature directions correspond most to verbatim memorization, higher curvature to shared/general mechanisms.

3. Edit by dropping the low-curvature subspace and keep only the top directions.

vessenes•3mo ago
Thank you for this huge time saver.

Now, about the paper-that’s super interesting. I imagine the dream here is to distil down into a “reasoning” core. Or maybe reclaim space for more generalization. Lots of interesting use cases.

getnormality•3mo ago
Thank you!

I think you may have accidentally switched low and high in #2, no? The abstract speaks of high curvature as associated with memorization:

> curvature for memorized training points is much sharper than non memorized

radarsat1•3mo ago
This sounds more correct to me. I've read previously somewhere that better generalization is usually associated with wider, smoother minima, and this is why regularization is important, because it has a smoothing function on the loss landscape.
getnormality•3mo ago
Yes. This is also not hard to see intuitively from scratch.

Say you have a smooth but highly flexible model y = f(x) and some data points you are fitting with a machine learning algorithm. For whatever reason, the algorithm decides it wants to reduce training error by interpolating some specific point, (x0,y0), without negatively affecting training error on nearby points. The direct, guaranteed successful way to do this is to adjust the model to y0 = f(x0) exactly on x0 by adding a Dirac delta there, leaving the rest of f exactly as-is. But this cannot be done on a differentiable model, as it would create a discontinuity. The next best thing that such a model can actually do is replace the Dirac delta with a smooth but very narrow bump (e.g. Gaussian). But this narrow bump will inevitably have extremely high curvature at x0, since the bump is flat at x0 and it has to merge with the neighborhood around x0 in a very short distance.

Think of driving: if you have to change lanes in a very short distance, you're going to have to steer hard. Steering is curvature.

woadwarrior01•3mo ago
That's very reminiscent of the idea behind the SAM (Sharpness Aware Minimization) family of optimizers.
andy12_•3mo ago
Actually, no! Look at this in the paper

> In extending from studying per-example to bulk memorization, we propose a novel inversion of the previous interpretation of loss curvature: while individual memorized points are associated with high curvature, the direction of curvature varies across examples, meaning that, averaged across multiple examples, memorization directions are actually flatter than generalizing directions, which maintain a consistent moderate curvature across points

getnormality•3mo ago
Ah! I figured I should be very circumspect in the question since I hadn't read in full and there could be some crazy reason it's actually the opposite.
vatsachak•3mo ago
The decomposition they use "averages out the points of high curvature" therefore those components of the decomposition which correspond to "higher curvature" are those components which are used across multiple data points. Therefore they are the "general reasoning"
kingstnap•3mo ago
A very similar idea is presented here in the first 5 minutes of this recent talk. But more from observing a kink in loss curves.

https://youtu.be/UyK3DgWY7yw?si=NN3f9Erik8o_Nfbs

NitpickLawyer•3mo ago
> Our work enhances the understanding of memorization in neural networks with practical applications towards removing it

Cool stuff. In a recent podcast Karpathy was also talking about this. He sees this as the next "target": models that don't memorise, because you can look it up in an oracle, but still keep the "reasoning" qualities.

esafak•3mo ago
How can you generalize without facts? They are the foundation on which generalization is built. Like programming without memorizing the keywords. Unless you make a distinction between facts that let you generalize, and facts that do not, like random ID numbers.
icandoit•3mo ago
We want the LLM to learn the multiplication algorithm not an incomplete set of tables. The algorithm might be smaller and will be more complete.

Honestly, our technology has outpaced our epistemology. So we don't really know what a fact is or isn't. Are facts what we call our supervised learning experiences? You think the sun rises, no the earth spins. Your belief that the sun rises helps you predict sunset and sunrise. Your belief would be quaint to someone born and raised on a space station. Apollos chariot moves the sun across the sky doesn't it?

esafak•3mo ago
There is a related line of work that suggests spikes in the ESD are related to the generalization vs memorization too; e.g.,

From Spikes to Heavy Tails: Unveiling the Spectral Evolution of Neural Networks (https://openreview.net/pdf?id=DJHB8eBUnt)