frontpage.

Reproducing the deep double descent paper

https://stpn.bearblog.dev/reproducing-double-descent/
15•stpn•1d ago

Comments

davidguetta•1d ago
Is this not because the longer you train, the more neurons 'die' (no longer utilized because the gradient is flat on the dataset), so you effectively get a smaller model as training goes on?
rsfern•1d ago
I don’t think so? The double descent phenomenon also occurs in linear models under the right conditions. My understanding is that when the effective model capacity is exactly equal to the information in the dataset, there is only one solution that interpolates the training data perfectly, but when the capacity increases far beyond this there are many such interpolating solutions. Apply enough regularization and you are likely to find an interpolating solution that generalizes well.
stpn•1d ago
(post author here)

I was curious about this since it kind of makes sense, but I offer a few reasons why I think this isn't the case:

- In the 10% noise case at least, the second descent eventually finds a minimum that's better than the original local minimum, which suggests to me the model really is finding a better fit rather than just reducing itself to a similar smaller model

- If that were the case, I think we might also expect the error for larger models to converge to the performance of smaller models? But instead they converge lower and better

- I checked the logged gradient histograms I had for the runs. While I'm still learning how to interpret the results, I didn't see signs of vanishing gradients where dead neurons later in the model prevented earlier layers from learning. Gradients do get smaller over time, but that seems expected, and we don't see big waves of neurons dying, which is what I'd expect if the larger network were converging to the size of the smaller one.
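(For concreteness, here is one simple way to quantify dead units directly from recorded activations rather than gradient histograms. This is an illustrative sketch, not the post's code; it assumes post-ReLU activations collected into a `(batch, units)` array. A unit that never fires on the batch also has zero ReLU gradient everywhere on it.)

```python
import numpy as np

def dead_fraction(activations, eps=0.0):
    """Fraction of units that never fire across the batch.

    activations: array of shape (batch, units) holding post-ReLU outputs.
    A unit is 'dead' if its activation is <= eps on every example.
    """
    alive = (activations > eps).any(axis=0)
    return 1.0 - alive.mean()

# Toy check: one of three units is silent on every example
acts = np.array([[0.0, 1.2, 0.0],
                 [0.0, 0.3, 0.7]])
print(dead_fraction(acts))  # ≈ 0.333
```

Tracking this fraction over training would directly test the dead-neuron hypothesis: it should climb substantially during the second descent if the large model were effectively shrinking.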

lcrmorin•15h ago
Do you change the regularisation?

I built a knowledge system that gives AI perfect codebase memory

https://github.com/Muvon/octocode
16•donhardman•1h ago•0 comments

The FAIR Package Manager: Decentralized WordPress infrastructure

https://joost.blog/path-forward-for-wordpress/
51•twapi•2h ago•8 comments

Researchers develop ‘transparent paper’ as alternative to plastics

https://japannews.yomiuri.co.jp/science-nature/technology/20250605-259501/
243•anigbrowl•9h ago•111 comments

The time bomb in the tax code that's fueling mass tech layoffs

https://qz.com/tech-layoffs-tax-code-trump-section-174-microsoft-meta-1851783502
764•booleanbetrayal•2d ago•509 comments

Falsehoods programmers believe about aviation

https://flightaware.engineering/falsehoods-programmers-believe-about-aviation/
178•cratermoon•9h ago•71 comments

Getting Past Procrastination

https://spectrum.ieee.org/getting-past-procastination
52•WaitWaitWha•4h ago•17 comments

Ziina (YC W21), the Series A fintech, is hiring product engineers

https://ziina.notion.site/Senior-Backend-Engineer-8b6642ec52ac45869656c135e07c6e86
1•faisaltoukan•22m ago

How we decreased GitLab repo backup times from 48 hours to 41 minutes

https://about.gitlab.com/blog/2025/06/05/how-we-decreased-gitlab-repo-backup-times-from-48-hours-to-41-minutes/
398•immortaljoe•15h ago•166 comments

A year of funded FreeBSD development

https://www.daemonology.net/blog/2025-06-06-A-year-of-funded-FreeBSD.html
237•cperciva•11h ago•76 comments

Why are smokestacks so tall?

https://practical.engineering/blog/2025/6/3/why-are-smokestacks-so-tall
48•azeemba•6h ago•4 comments

Sharing everything I could understand about gradient noise

https://blog.pkh.me/p/42-sharing-everything-i-could-understand-about-gradient-noise.html
44•ux•16h ago•1 comment

Medieval Africans had a unique process for purifying gold with glass (2019)

https://www.atlasobscura.com/articles/medieval-african-gold
83•mooreds•9h ago•33 comments

Highly efficient matrix transpose in Mojo

https://veitner.bearblog.dev/highly-efficient-matrix-transpose-in-mojo/
93•timmyd•11h ago•29 comments

The Illusion of Thinking: Understanding the Limitations of Reasoning LLMs [pdf]

https://ml-site.cdn-apple.com/papers/the-illusion-of-thinking.pdf
157•amrrs•13h ago•74 comments

I Read All of Cloudflare's Claude-Generated Commits

https://www.maxemitchell.com/writings/i-read-all-of-cloudflares-claude-generated-commits/
97•maxemitchell•8h ago•64 comments

What “working” means in the era of AI apps

https://a16z.com/revenue-benchmarks-ai-apps/
63•Brysonbw•8h ago•39 comments

Good pixel art can be one-shotted by AI now

https://gametorch.app/collections/7
15•gametorch•3h ago•15 comments

Sandia turns on brain-like storage-free supercomputer

https://blocksandfiles.com/2025/06/06/sandia-turns-on-brain-like-storage-free-supercomputer/
170•rbanffy•15h ago•60 comments

NASA delays next flight of Boeing's alternative to SpaceX Dragon

https://theedgemalaysia.com/node/758199
12•bookmtn•4h ago•5 comments

A masochist's guide to web development

https://sebastiano.tronto.net/blog/2025-06-06-webdev/
201•sebtron•17h ago•24 comments

Workhorse LLMs: Why Open Source Models Dominate Closed Source for Batch Tasks

https://sutro.sh/blog/workhorse-llms-why-open-source-models-win-for-batch-tasks
57•cmogni1•12h ago•15 comments

Smalltalk, Haskell and Lisp

https://storytotell.org/smalltalk-haskell-and-lisp
68•todsacerdoti•10h ago•22 comments

Show HN: AI game animation sprite generator

https://www.godmodeai.cloud/ai-sprite-generator
67•lyogavin•11h ago•54 comments

Odyc.js – A tiny JavaScript library for narrative games

https://odyc.dev
200•achtaitaipai•17h ago•48 comments

Too Many Open Files

https://mattrighetti.com/2025/06/04/too-many-files-open
112•furkansahin•16h ago•87 comments

Wendelstein 7-X sets new fusion record

https://www.heise.de/en/news/Wendelstein-7-X-sets-new-fusion-record-10422955.html
143•doener•3d ago•22 comments

Series C and scale

https://www.cursor.com/en/blog/series-c
71•fidotron•14h ago•50 comments

Curate your shell history

https://esham.io/2025/05/shell-history
110•todsacerdoti•17h ago•67 comments

Meta: Shut down your invasive AI Discover feed

https://www.mozillafoundation.org/en/campaigns/meta-shut-down-your-invasive-ai-discover-feed-now/
467•speckx•15h ago•195 comments

What you need to know about EMP weapons

https://www.aardvark.co.nz/daily/2025/0606.shtml
122•flyingkiwi44•20h ago•156 comments