frontpage.

Let's Talk About the AI Bubble

1•aaraujo002•48s ago•0 comments

Show HN: I built a tool to answer any League of Legends E-sports data questions

https://query.new/
1•XavierPladevall•2m ago•0 comments

Job cuts surge in worst October layoffs in 22 years

https://www.usatoday.com/story/money/2025/11/06/october-job-cuts-surge-worst-layoffs/87127775007/
1•speckx•2m ago•0 comments

MiseryMap

https://www.flightaware.com/miserymap/
1•sndean•3m ago•0 comments

Testing-MCP – Write complex integration tests for web apps

https://github.com/mcpland/testing-mcp
1•unadlib•3m ago•0 comments

Phison CEO claims NAND shortage could last a staggering 10 years

https://www.tomshardware.com/pc-components/ssds/phison-ceo-claims-nand-shortage-could-last-a-stag...
1•walterbell•3m ago•0 comments

Why even a US tech giant is launching 'sovereign support' for Europe now

https://www.zdnet.com/article/why-even-a-us-tech-giant-is-launching-sovereign-support-for-europe-...
1•CrankyBear•4m ago•0 comments

This Week in AI Agents: Agents Are Learning to Browse, Buy, and Negotiate

https://thisweekinaiagents.substack.com/p/agents-learning-to-browse-buy-negotiate
1•joaoaguiam•4m ago•0 comments

Tuning TLS: AES-256 Now Beats ChaCha20 on Every Modern CPU

https://ashvardanian.com/posts/chacha-vs-aes-2025/
2•ashvardanian•5m ago•0 comments

Perplexity's First Research Paper – Point-to-Point Communication for LLM Systems

https://arxiv.org/abs/2510.27656
1•Alifatisk•6m ago•0 comments

Show HN: Free analyzer that finds outdated content before it kills your traffic

https://freshrank.ai
2•maldinii•7m ago•0 comments

Big YouTube channels are being banned. YouTubers are blaming AI

https://mashable.com/article/big-youtube-channels-terminated-creators-blame-ai
2•mostcallmeyt•7m ago•1 comment

Wikipedia co-founder joins editing conflict over the Gaza genocide page

https://www.theverge.com/news/813245/wikipedia-co-founder-jimmy-wales-gaza-genocide
1•mostcallmeyt•8m ago•0 comments

Bolivia's new president rekindles cautious hope for long-stalled lithium dreams

https://www.reuters.com/world/china/bolivias-new-president-rekindles-cautious-hope-long-stalled-l...
1•wslh•8m ago•0 comments

That email address contains five or more consonants in a row

https://www.clintmcmahon.com/Blog/email-address-contains-five-or-more-consonants
1•speckx•8m ago•0 comments

EIA: North America's LNG Export Capacity Could More Than Double by 2029

https://oilprice.com/Latest-Energy-News/World-News/EIA-North-Americas-LNG-Export-Capacity-Could-M...
1•PaulHoule•10m ago•0 comments

IncusOS – an immutable OS to run Incus

https://discuss.linuxcontainers.org/t/announcing-incusos/25139
1•xlmnxp•11m ago•1 comment

The Internet: How HTTP and TCP Work [video]

https://www.youtube.com/watch?v=hyhaeJIeQac
1•artisandip7•12m ago•0 comments

Making MCP Tool Calls Scriptable with mcp_cli

https://www.joshbeckman.org/blog/practicing/making-mcp-tool-calls-scriptable-with-mcpcli
1•bckmn•13m ago•0 comments

Federal Judge Survey: Warns of Judicial Crisis, Faults SCOTUS Emergency Orders

https://www.nytimes.com/2025/10/11/us/politics/judicial-crisis-supreme-court-trump.html
2•mmooss•13m ago•2 comments

Avoid hyperfocus and keep a healthy work pace (as a dad of two)

https://www.devas.life/my-plan-to-avoid-hyperfocus-and-keep-a-healthy-work-pace-as-a-dad-of-two/
1•cmpit•16m ago•0 comments

Checking for Spam Content with Chrome AI

https://www.raymondcamden.com/2025/11/07/checking-for-spam-content-with-chrome-ai
1•speckx•17m ago•0 comments

Apple's fight with Europe continues as it removes iPhone feature in EU

https://www.the-independent.com/tech/apple-iphone-wifi-sharing-password-europe-eu-b2860173.html
2•refp•17m ago•0 comments

I'm abandoning a tech stack because of its community (or lack thereof)

1•fullstacking•18m ago•0 comments

Yellow's $137M-plus lawsuit against Teamsters revived

https://www.freightwaves.com/news/yellows-137m-plus-lawsuit-against-teamsters-revived
2•crescit_eundo•19m ago•0 comments

Who Wrote That Headline? Maybe a Robot

https://www.nytimes.com/2025/11/07/business/media/ai-news-media.html
2•jbegley•20m ago•0 comments

Notes on Being a Man

https://www.profgalloway.com/notes-on-being-a-man/
10•Brajeshwar•22m ago•0 comments

World Appears on Track to Triple Renewable Capacity by 2030

https://e360.yale.edu/digest/triple-renewable-power-goal
2•Brajeshwar•22m ago•0 comments

Claude Skills Marketplace

https://skillsmp.com/#
1•Dowwie•24m ago•0 comments

Pure Go hardware accelerated local inference on VLMs using llama.cpp

https://github.com/hybridgroup/yzma
1•deadprogram•27m ago•0 comments

From Memorization to Reasoning in the Spectrum of Loss Curvature

https://arxiv.org/abs/2510.24256
33•andy12_•4h ago

Comments

andy12_•4h ago
Very concise summary of the procedure described in this paper:

1. Run the model once across a dataset to estimate loss curvature per MLP weight matrix via K-FAC (activation/gradient covariances).

2. Decompose each weight matrix into curvature-ordered components; low-curvature directions correspond most to verbatim memorization, higher curvature to shared/general mechanisms.

3. Edit by dropping the low-curvature subspace, keeping only the top directions.
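For the curious, steps 1–3 can be sketched in a few lines of NumPy under the standard K-FAC assumption that the curvature of an MLP weight matrix factors into an input-activation covariance A and an output-gradient covariance G, so the component along eigen-direction pair (i, j) has curvature lam_G[i] * lam_A[j]. Names like `kfac_curvature_edit` and `keep_frac` are illustrative, not from the paper:

```python
import numpy as np

def kfac_curvature_edit(W, acts, grads, keep_frac=0.5):
    """Drop the lowest-curvature components of a weight matrix W.

    W:     (d_out, d_in) MLP weight matrix
    acts:  (n, d_in)  layer inputs collected over a dataset
    grads: (n, d_out) gradients of the loss w.r.t. the layer outputs
    """
    # 1. K-FAC curvature factors: activation and gradient covariances.
    A = acts.T @ acts / len(acts)      # (d_in, d_in)
    G = grads.T @ grads / len(grads)   # (d_out, d_out)
    lam_A, U_A = np.linalg.eigh(A)
    lam_G, U_G = np.linalg.eigh(G)

    # 2. Express W in the K-FAC eigenbasis; the curvature of
    #    component (i, j) is lam_G[i] * lam_A[j].
    W_tilde = U_G.T @ W @ U_A
    curvature = np.outer(lam_G, lam_A)

    # 3. Zero out everything below the top-curvature fraction.
    k = int(keep_frac * curvature.size)
    thresh = np.sort(curvature.ravel())[-k]
    W_tilde = np.where(curvature >= thresh, W_tilde, 0.0)
    return U_G @ W_tilde @ U_A.T       # map back to weight space
```

With `keep_frac=1.0` this is a no-op (the orthogonal bases cancel), which is a handy sanity check.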

vessenes•4h ago
Thank you for this huge time saver.

Now, about the paper: that's super interesting. I imagine the dream here is to distill down into a "reasoning" core, or maybe reclaim space for more generalization. Lots of interesting use cases.

getnormality•4h ago
Thank you!

I think you may have accidentally switched low and high in #2, no? The abstract speaks of high curvature as associated with memorization:

> curvature for memorized training points is much sharper than non memorized

radarsat1•3h ago
This sounds more correct to me. I've read somewhere that better generalization is usually associated with wider, smoother minima, and that this is why regularization is important: it has a smoothing effect on the loss landscape.
getnormality•3h ago
Yes. This is also not hard to see intuitively from scratch.

Say you have a smooth but highly flexible model y = f(x) and some data points you are fitting with a machine learning algorithm. For whatever reason, the algorithm decides it wants to reduce training error by interpolating some specific point (x0, y0) without negatively affecting training error on nearby points. The direct, guaranteed-successful way to do this is to adjust the model so that f(x0) = y0 exactly, by adding a Dirac delta there and leaving the rest of f exactly as-is. But this cannot be done in a differentiable model, as it would create a discontinuity. The next best thing such a model can actually do is replace the Dirac delta with a smooth but very narrow bump (e.g. a Gaussian). But this narrow bump will inevitably have extremely high curvature at x0: the bump is flat (zero slope) at its peak, yet has to merge back into the neighborhood around x0 over a very short distance.

Think of driving: if you have to change lanes in a very short distance, you're going to have to steer hard. Steering is curvature.
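The bump-width argument checks out numerically: for a Gaussian bump A*exp(-(x - x0)^2 / (2*sigma^2)), the second derivative at the peak is -A/sigma^2, so halving the width quadruples the curvature. A quick sketch (function names and constants are illustrative):

```python
import numpy as np

def bump(x, A=1.0, x0=0.0, sigma=0.1):
    """A smooth, narrow correction bump centered at x0."""
    return A * np.exp(-(x - x0) ** 2 / (2 * sigma ** 2))

def curvature_at(f, x, h=1e-4):
    """Central finite-difference estimate of f''(x)."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / h ** 2

# Peak curvature is -A / sigma**2: the narrower the bump,
# the sharper the curvature at its center.
for sigma in (0.2, 0.1, 0.05):
    print(sigma, curvature_at(lambda x: bump(x, sigma=sigma), 0.0))
    # sigma = 0.2, 0.1, 0.05 gives roughly -25, -100, -400
```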

woadwarrior01•2h ago
That's very reminiscent of the idea behind the SAM (Sharpness Aware Minimization) family of optimizers.
andy12_•2h ago
Actually, no! Look at this passage in the paper:

> In extending from studying per-example to bulk memorization, we propose a novel inversion of the previous interpretation of loss curvature: while individual memorized points are associated with high curvature, the direction of curvature varies across examples, meaning that, averaged across multiple examples, memorization directions are actually flatter than generalizing directions, which maintain a consistent moderate curvature across points
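This inversion is easy to reproduce in a toy setting: give each example a sharp curvature direction of its own (random, so the directions do not align across examples) plus a shared moderate-curvature direction, then average the per-example curvature matrices. An illustrative construction, not the paper's actual measurement:

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 50, 200
shared = np.zeros(d)
shared[0] = 1.0                          # consistent "general" direction

H_avg = np.zeros((d, d))
for _ in range(n):
    v = rng.normal(size=d)               # per-example memorization direction
    v /= np.linalg.norm(v)
    # Sharp curvature (100) along the example's own direction,
    # moderate curvature (5) along the shared direction.
    H_avg += 100 * np.outer(v, v) + 5 * np.outer(shared, shared)
H_avg /= n

mem_dir = rng.normal(size=d)
mem_dir /= np.linalg.norm(mem_dir)
print("shared direction: ", shared @ H_avg @ shared)     # ≈ 7
print("typical random dir:", mem_dir @ H_avg @ mem_dir)  # ≈ 2
```

Each example's sharp direction contributes only about 100/d on average to any fixed direction, so after averaging, the consistent moderate direction dominates: exactly the flatness inversion the quote describes.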

getnormality•2h ago
Ah! I figured I should be very circumspect in the question, since I hadn't read the paper in full and there could be some crazy reason it's actually the opposite.
vatsachak•56m ago
The decomposition they use "averages out the points of high curvature", so the components corresponding to "higher curvature" are those used consistently across multiple data points, and are therefore the "general reasoning" components.
kingstnap•3h ago
A very similar idea is presented in the first five minutes of this recent talk, though more from observing a kink in loss curves.

https://youtu.be/UyK3DgWY7yw?si=NN3f9Erik8o_Nfbs

NitpickLawyer•1h ago
> Our work enhances the understanding of memorization in neural networks with practical applications towards removing it

Cool stuff. In a recent podcast Karpathy was also talking about this. He sees it as the next "target": models that don't memorize, because facts can be looked up in an oracle, but that still keep the "reasoning" qualities.

esafak•33m ago
How can you generalize without facts? They are the foundation on which generalization is built. Like programming without memorizing the keywords. Unless you make a distinction between facts that let you generalize, and facts that do not, like random ID numbers.
esafak•24m ago
There is a related line of work suggesting that spikes in the ESD (empirical spectral density of the weight matrices) are related to generalization vs. memorization too; e.g.,

From Spikes to Heavy Tails: Unveiling the Spectral Evolution of Neural Networks (https://openreview.net/pdf?id=DJHB8eBUnt)
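A toy illustration of the spike picture: the eigenvalue spectrum of W^T W for pure-noise weights stays inside the Marchenko–Pastur bulk, while an added low-rank "learned" component escapes as a spike beyond the bulk edge (the rank-1 signal below is synthetic):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 1000, 500
W = rng.normal(size=(n, d)) / np.sqrt(n)   # pure-noise weight matrix

# Marchenko-Pastur bulk edge for aspect ratio q = d/n.
q = d / n
bulk_edge = (1 + np.sqrt(q)) ** 2

eigs = np.linalg.eigvalsh(W.T @ W)
print("noise max eigenvalue:", eigs.max())       # close to bulk_edge

# Add a rank-1 "learned" component: a spike escapes the bulk.
u = rng.normal(size=n); u /= np.linalg.norm(u)
v = rng.normal(size=d); v /= np.linalg.norm(v)
W_spiked = W + 3.0 * np.outer(u, v)
eigs_s = np.linalg.eigvalsh(W_spiked.T @ W_spiked)
print("spiked max eigenvalue:", eigs_s.max())    # well beyond the bulk
```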