frontpage.

Google and Microsoft Paying Creators $500K+ to Promote AI Tools

https://www.cnbc.com/2026/02/06/google-microsoft-pay-creators-500000-and-more-to-promote-ai.html
1•belter•2m ago•0 comments

New filtration technology could be game-changer in removal of PFAS

https://www.theguardian.com/environment/2026/jan/23/pfas-forever-chemicals-filtration
1•PaulHoule•3m ago•0 comments

Show HN: I saw this cool navigation reveal, so I made a simple HTML+CSS version

https://github.com/Momciloo/fun-with-clip-path
1•momciloo•4m ago•0 comments

Kinda Surprised by Seadance2's Moderation

https://seedanceai.me/
1•ri-vai•4m ago•1 comments

I Write Games in C (yes, C)

https://jonathanwhiting.com/writing/blog/games_in_c/
1•valyala•4m ago•0 comments

Django scales. Stop blaming the framework (part 1 of 3)

https://medium.com/@tk512/django-scales-stop-blaming-the-framework-part-1-of-3-a2b5b0ff811f
1•sgt•4m ago•0 comments

Malwarebytes Is Now in ChatGPT

https://www.malwarebytes.com/blog/product/2026/02/scam-checking-just-got-easier-malwarebytes-is-n...
1•m-hodges•4m ago•0 comments

Thoughts on the job market in the age of LLMs

https://www.interconnects.ai/p/thoughts-on-the-hiring-market-in
1•gmays•5m ago•0 comments

Show HN: Stacky – certain block game clone

https://www.susmel.com/stacky/
2•Keyframe•8m ago•0 comments

AIII: A public benchmark for AI narrative and political independence

https://github.com/GRMPZQUIDOS/AIII
1•GRMPZ23•8m ago•0 comments

SectorC: A C Compiler in 512 bytes

https://xorvoid.com/sectorc.html
2•valyala•9m ago•0 comments

The API Is a Dead End; Machines Need a Labor Economy

1•bot_uid_life•10m ago•0 comments

Digital Iris [video]

https://www.youtube.com/watch?v=Kg_2MAgS_pE
1•Jyaif•11m ago•0 comments

New wave of GLP-1 drugs is coming–and they're stronger than Wegovy and Zepbound

https://www.scientificamerican.com/article/new-glp-1-weight-loss-drugs-are-coming-and-theyre-stro...
4•randycupertino•13m ago•0 comments

Convert tempo (BPM) to millisecond durations for musical note subdivisions

https://brylie.music/apps/bpm-calculator/
1•brylie•15m ago•0 comments

Show HN: Tasty A.F.

https://tastyaf.recipes/about
1•adammfrank•16m ago•0 comments

The Contagious Taste of Cancer

https://www.historytoday.com/archive/history-matters/contagious-taste-cancer
1•Thevet•17m ago•0 comments

U.S. Jobs Disappear at Fastest January Pace Since Great Recession

https://www.forbes.com/sites/mikestunson/2026/02/05/us-jobs-disappear-at-fastest-january-pace-sin...
1•alephnerd•18m ago•1 comments

Bithumb mistakenly hands out $195M in Bitcoin to users in 'Random Box' giveaway

https://koreajoongangdaily.joins.com/news/2026-02-07/business/finance/Crypto-exchange-Bithumb-mis...
1•giuliomagnifico•18m ago•0 comments

Beyond Agentic Coding

https://haskellforall.com/2026/02/beyond-agentic-coding
3•todsacerdoti•19m ago•0 comments

OpenClaw ClawHub Broken Windows Theory – If basic sorting isn't working what is?

https://www.loom.com/embed/e26a750c0c754312b032e2290630853d
1•kaicianflone•21m ago•0 comments

OpenBSD Copyright Policy

https://www.openbsd.org/policy.html
1•Panino•22m ago•0 comments

OpenClaw Creator: Why 80% of Apps Will Disappear

https://www.youtube.com/watch?v=4uzGDAoNOZc
2•schwentkerr•26m ago•0 comments

What Happens When Technical Debt Vanishes?

https://ieeexplore.ieee.org/document/11316905
2•blenderob•27m ago•0 comments

AI Is Finally Eating Software's Total Market: Here's What's Next

https://vinvashishta.substack.com/p/ai-is-finally-eating-softwares-total
3•gmays•27m ago•0 comments

Computer Science from the Bottom Up

https://www.bottomupcs.com/
2•gurjeet•28m ago•0 comments

Show HN: A toy compiler I built in high school (runs in browser)

https://vire-lang.web.app
1•xeouz•29m ago•1 comments

You don't need Mac mini to run OpenClaw

https://runclaw.sh
1•rutagandasalim•30m ago•0 comments

Learning to Reason in 13 Parameters

https://arxiv.org/abs/2602.04118
2•nicholascarolan•32m ago•0 comments

Convergent Discovery of Critical Phenomena Mathematics Across Disciplines

https://arxiv.org/abs/2601.22389
1•energyscholar•32m ago•1 comments

From Memorization to Reasoning in the Spectrum of Loss Curvature

https://arxiv.org/abs/2510.24256
65•andy12_•3mo ago

Comments

andy12_•3mo ago
Very concise summary of the procedure described in this paper:

1. Run the model once across a dataset to estimate loss curvature per MLP weight matrix via K-FAC (activation/gradient covariances).

2. Decompose each weight matrix into curvature-ordered components; low-curvature directions correspond most to verbatim memorization, higher curvature to shared/general mechanisms.

3. Edit by dropping the low-curvature subspace and keeping only the top directions.
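
A rough sketch of what that edit could look like for a single MLP weight matrix, reconstructed from the summary above rather than from the authors' code (the covariance estimation, the keep_frac knob, and all shapes are my own assumptions):

  import numpy as np

  def curvature_edit(W, acts, grads, keep_frac=0.5):
      # W: one MLP weight matrix, shape (d_out, d_in)
      # acts: layer inputs over the dataset, shape (n, d_in)
      # grads: gradients w.r.t. the layer outputs, shape (n, d_out)

      # 1. K-FAC factors: activation and output-gradient covariances.
      A = acts.T @ acts / len(acts)        # (d_in, d_in)
      G = grads.T @ grads / len(grads)     # (d_out, d_out)

      # 2. Curvature-ordered decomposition: express W in the Kronecker
      #    eigenbasis; component (i, j) has curvature ~ lam_G[i] * lam_A[j].
      lam_A, U_A = np.linalg.eigh(A)
      lam_G, U_G = np.linalg.eigh(G)
      C = U_G.T @ W @ U_A
      curvature = np.outer(lam_G, lam_A)

      # 3. Drop the low-curvature (memorization-heavy) subspace, keep only
      #    the top-curvature directions, and map back to weight space.
      k = max(1, int(keep_frac * curvature.size))
      threshold = np.sort(curvature.ravel())[-k]
      C = np.where(curvature >= threshold, C, 0.0)
      return U_G @ C @ U_A.T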

vessenes•3mo ago
Thank you for this huge time saver.

Now, about the paper: that’s super interesting. I imagine the dream here is to distill down into a “reasoning” core. Or maybe reclaim space for more generalization. Lots of interesting use cases.

getnormality•3mo ago
Thank you!

I think you may have accidentally switched low and high in #2, no? The abstract speaks of high curvature as associated with memorization:

> curvature for memorized training points is much sharper than non memorized

radarsat1•3mo ago
This sounds more correct to me. I've read elsewhere that better generalization is usually associated with wider, smoother minima, which is why regularization matters: it has a smoothing effect on the loss landscape.

getnormality•3mo ago
Yes. This is also not hard to see intuitively from scratch.

Say you have a smooth but highly flexible model y = f(x) and some data points you are fitting with a machine learning algorithm. For whatever reason, the algorithm decides it wants to reduce training error by interpolating one specific point (x0, y0) without negatively affecting training error on nearby points. The direct, guaranteed way to do this is to add a correction that is nonzero only at x0, bringing f(x0) to exactly y0 and leaving the rest of f as-is: essentially a Dirac-delta-style spike. But a differentiable model cannot do that, since it would create a discontinuity. The next best thing such a model can do is replace the spike with a smooth but very narrow bump (e.g. a Gaussian). That narrow bump will inevitably have extremely high curvature at x0: its slope is zero at the peak, yet it has to merge back into the surrounding function within a very short distance.

Think of driving: if you have to change lanes in a very short distance, you're going to have to steer hard. Steering is curvature.
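
To put a number on that intuition (my own sketch, not from the thread): if the correction is a Gaussian bump of width sigma added to f, then

  b(x) = \big(y_0 - f(x_0)\big)\,\exp\!\left(-\frac{(x - x_0)^2}{2\sigma^2}\right),
  \qquad
  b''(x_0) = -\frac{y_0 - f(x_0)}{\sigma^2},

so halving the bump's width quadruples the magnitude of the curvature at x0: the more surgical the interpolation, the sharper it has to be.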

woadwarrior01•3mo ago
That's very reminiscent of the idea behind the SAM (Sharpness Aware Minimization) family of optimizers.
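
For the connection: a SAM update first perturbs the weights toward the worst nearby point within a radius rho, then applies the gradient taken there to the original weights, which biases training toward flat minima. A minimal sketch (grad_fn is a hypothetical stand-in for the loss gradient, not any particular library API):

  import numpy as np

  def sam_step(w, grad_fn, lr=0.1, rho=0.05):
      # grad_fn(w) -> dL/dw, supplied by the training loop (hypothetical)
      g = grad_fn(w)
      eps = rho * g / (np.linalg.norm(g) + 1e-12)  # ascend to the sharpest neighbor
      g_sharp = grad_fn(w + eps)                   # gradient at the perturbed point
      return w - lr * g_sharp                      # descend using that gradient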

andy12_•3mo ago
Actually, no! Look at this passage from the paper:

> In extending from studying per-example to bulk memorization, we propose a novel inversion of the previous interpretation of loss curvature: while individual memorized points are associated with high curvature, the direction of curvature varies across examples, meaning that, averaged across multiple examples, memorization directions are actually flatter than generalizing directions, which maintain a consistent moderate curvature across points

getnormality•3mo ago
Ah! I figured I should be very circumspect in the question, since I hadn't read the paper in full and there could be some crazy reason it's actually the opposite.

vatsachak•3mo ago
The decomposition they use "averages out the points of high curvature"; the components that correspond to "higher curvature" are therefore the ones used across many data points, which is why they are the "general reasoning" directions.

kingstnap•3mo ago
A very similar idea is presented in the first 5 minutes of this recent talk, though more from the angle of observing a kink in loss curves.

https://youtu.be/UyK3DgWY7yw?si=NN3f9Erik8o_Nfbs

NitpickLawyer•3mo ago
> Our work enhances the understanding of memorization in neural networks with practical applications towards removing it

Cool stuff. In a recent podcast Karpathy was also talking about this. He sees it as the next "target": models that don't memorise facts, because those can be looked up in an oracle, but still keep the "reasoning" qualities.

esafak•3mo ago
How can you generalize without facts? They are the foundation on which generalization is built. It's like programming without memorizing the keywords. Unless you make a distinction between facts that let you generalize and facts that do not, like random ID numbers.

icandoit•3mo ago
We want the LLM to learn the multiplication algorithm, not an incomplete set of tables. The algorithm might be smaller and will be more complete.

Honestly, our technology has outpaced our epistemology, so we don't really know what a fact is or isn't. Are facts what we call our supervised learning experiences? You think the sun rises; no, the earth spins. Your belief that the sun rises helps you predict sunset and sunrise, but it would seem quaint to someone born and raised on a space station. Apollo's chariot moves the sun across the sky, doesn't it?

esafak•3mo ago
There is a related line of work suggesting that spikes in the ESD (empirical spectral density) of the weight matrices are related to generalization vs. memorization too; e.g.,

From Spikes to Heavy Tails: Unveiling the Spectral Evolution of Neural Networks (https://openreview.net/pdf?id=DJHB8eBUnt)
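
For readers unfamiliar with the acronym: the ESD here is the empirical spectral density, i.e. the eigenvalue distribution of the weight correlation matrix. A minimal sketch of what gets inspected, under that reading:

  import numpy as np

  def esd(W, bins=100):
      # Eigenvalue histogram of the correlation matrix W^T W / n;
      # isolated "spikes" beyond the bulk are the feature of interest.
      n = W.shape[0]
      eigvals = np.linalg.eigvalsh(W.T @ W / n)
      return np.histogram(eigvals, bins=bins)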