Reproducing the deep double descent paper

https://stpn.bearblog.dev/reproducing-double-descent/

15•stpn•1d ago

Comments

davidguetta•1d ago

Is this not because the longer you train, the more neurons 'die' (no longer utilized because the gradient is flat on the dataset), so you effectively get a smaller model as training goes on?

rsfern•1d ago

I don’t think so? The double descent phenomenon also occurs in linear models under the right conditions. My understanding is that when the effective model capacity exactly equals the information in the dataset, there is only one solution that interpolates the training data perfectly, but when the capacity increases far beyond this, there are many such interpolating solutions. Apply enough regularization and you are likely to find an interpolating solution that generalizes well.
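The linear-model case described above can be sketched in a few lines of NumPy. This is only an illustration under assumptions that are not from the thread: Gaussian features, a model that sees the first `p` of `n_total` true features, and the minimum-norm least-squares solution (via `np.linalg.pinv`) standing in for "enough regularization". The function name and all parameter values are hypothetical.

```python
import numpy as np

def min_norm_test_error(n_train, p, n_total=200, noise=0.5,
                        n_test=1000, trials=20, seed=0):
    """Mean test MSE of min-norm least squares using the first p features.

    Assumed setup (not from the thread): y = X @ w + noise with Gaussian X,
    where the model only sees p of the n_total true features.
    """
    rng = np.random.default_rng(seed)
    errors = []
    for _ in range(trials):
        w = rng.normal(size=n_total) / np.sqrt(n_total)
        X_train = rng.normal(size=(n_train, n_total))
        X_test = rng.normal(size=(n_test, n_total))
        y_train = X_train @ w + noise * rng.normal(size=n_train)
        y_test = X_test @ w + noise * rng.normal(size=n_test)
        # pinv gives ordinary least squares when p < n_train and the
        # minimum-norm interpolating solution when p >= n_train.
        w_hat = np.linalg.pinv(X_train[:, :p]) @ y_train
        errors.append(np.mean((X_test[:, :p] @ w_hat - y_test) ** 2))
    return float(np.mean(errors))

n_train = 40
errors = {p: min_norm_test_error(n_train, p) for p in (10, 20, 40, 80, 200)}
for p, err in errors.items():
    print(f"p={p:4d}  test MSE={err:.2f}")
```

In this setup the test error should spike near `p == n_train`, where the interpolating solution is unique and badly conditioned, and fall again as `p` grows and the minimum-norm choice picks a smoother interpolant among the many available.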

stpn•1d ago

(post author here)

I was curious about this since it kind of makes sense, but I offer a few reasons why I think this isn't the case:

- In the 10% noise case at least, the second descent eventually finds a minimum that's better than the original local minimum, which suggests to me the model really is finding a better fit rather than just reducing itself to a similar smaller model

- If it were the case, I think we might also expect the error for larger models to converge to the performance of smaller models? But instead they converge to a lower error

- I checked the logged gradient histograms I had for the runs. While I'm still learning how to interpret the results, I didn't see signs of vanishing gradients where dead neurons later in the model prevented earlier layers from learning. Gradients do get smaller over time, but that seems expected, and we don't have big waves of neurons dying, which is what I'd expect if the larger network were converging on the size of the smaller one.
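A more direct check for the dead-neuron hypothesis than reading gradient histograms is to count ReLU units that output zero for every input in a batch. The sketch below is not what the post author did; it assumes you can export a layer's pre-activations as a `(samples, units)` array, and the helper name `dead_unit_fraction` is hypothetical.

```python
import numpy as np

def dead_unit_fraction(pre_activations):
    """Fraction of ReLU units that are zero for every sample in the batch.

    pre_activations: array of shape (samples, units), values before the ReLU.
    """
    post = np.maximum(pre_activations, 0.0)
    dead = np.all(post == 0.0, axis=0)  # unit never fires on this batch
    return float(dead.mean())

# Synthetic example: force 2 of 8 units to be negative on every sample.
rng = np.random.default_rng(0)
pre = rng.normal(size=(512, 8))
pre[:, :2] = -np.abs(pre[:, :2]) - 1.0
print(dead_unit_fraction(pre))
```

Tracking this fraction per layer across epochs would show whether a large share of units stops firing around the second descent.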

lcrmorin•14h ago

Do you change regularisation?

Why Philosophy of Physics?

https://aeon.co/essays/why-do-philosophy-of-physics-when-you-can-do-physics-itself

1•Caiero•5m ago•0 comments

An ancient river landscape preserved beneath the East Antarctic Ice Sheet (2023)

https://www.nature.com/articles/s41467-023-42152-2

1•walterbell•32m ago•0 comments

Farewell, NOAA-18

https://cimss.ssec.wisc.edu/satellite-blog/archives/65190

4•austinallegro•37m ago•1 comments

I built a knowledge system that gives AI perfect codebase memory

https://github.com/Muvon/octocode

2•donhardman•38m ago•0 comments

Goa Gajah

https://en.wikipedia.org/wiki/Goa_Gajah

2•sans_souse•40m ago•0 comments

Aaron Hsu – Do Programming Language Features Deliver on Their Promises [video]

https://www.youtube.com/watch?v=V8sACAhg4vM

2•diimdeep•40m ago•0 comments

Ultrahuman Home

https://www.ultrahuman.com/home/

2•cfcfcf•44m ago•0 comments

Keeping the Web Up Under the Weight of AI Crawlers

https://www.eff.org/deeplinks/2025/06/keeping-web-under-weight-ai-crawlers

1•doener•47m ago•0 comments

AImmerse Web App

1•yiyih•58m ago•0 comments

Is Jordan Peterson Just Making It Up as He Goes?

https://thewalrus.ca/is-jordan-peterson-just-making-it-up-as-he-goes/

7•sameers•1h ago•3 comments

The FAIR Package Manager: Decentralized WordPress infrastructure

https://joost.blog/path-forward-for-wordpress/

27•twapi•1h ago•5 comments

Fair aims to decentralize WordPress.org services, backed by Linux Foundation

https://www.therepository.email/fair-to-decentralize-wordpress-backed-by-linux-foundation-and-contributors

8•ValentineC•1h ago•0 comments

How to Run Webinars

https://blog.engora.com/2023/07/how-to-run-webinars.html

3•Vermin2000•1h ago•0 comments

Private Equity-Owned Companies Pocket Class Action Payouts

https://www.forbes.com/sites/jeffkauflin/2025/05/21/how-private-equity-owned-companies-quietly-pocket-class-action-payouts/

3•walterbell•1h ago•0 comments

Tesla AI VP Milan Kovac Resigns After 9 Years Leading FSD and Optimus Projects

https://gearmusk.com/2025/06/07/tesla-ai-vp-milan-kovac-resigns/

53•loog5566•1h ago•11 comments

Show HN: The 5-minutes Competitor Analysis

https://www.ycompetitor.com/

2•rubeekrumpet•1h ago•0 comments

I built an Image Splitter tool in under an hour using ChatGPT

https://tools.techchee.com/image-tools/image-splitter

2•ketyung•2h ago•1 comments

DeepSeek-R1-0528 Did Not Have a Moment

https://thezvi.substack.com/p/deepseek-r1-0528-did-not-have-a-moment

4•paulpauper•2h ago•2 comments

What Happens When People Don't Understand How AI Works

https://www.theatlantic.com/culture/archive/2025/06/artificial-intelligence-illiteracy/683021/

4•paulpauper•2h ago•0 comments

Ask HN: Do we need a language designed specifically for AI code generation?

2•baijum•2h ago•0 comments

Good pixel art can be one-shotted by AI now

https://gametorch.app/collections/7

5•gametorch•2h ago•3 comments

I dream of roombas: 1000s of automated AI robots that autonomously maintain code

https://ghuntley.com/ktlo/

5•ghuntley•2h ago•5 comments

China Kicks Off Human Testing of Implantable Brain-Computer Interface Devices

https://www.yicaiglobal.com/news/china-kicks-off-human-testing-of-implantable-brain-computer-interface-devices

2•gametorch•2h ago•0 comments

Why are front end dev demand so high if front end development is easier? (2012)

https://simonwillison.net/2012/Feb/13/why-are-front-end/

16•thunderbong•2h ago•8 comments

A Novel "Reasoning"-Enhancing Technique for Large Language Models

https://marqcodes.com

2•N3Xxus_6•2h ago•2 comments

Astonishing discovery by computer scientist: how to squeeze space into time [video]

https://www.youtube.com/watch?v=p_AW6fomKPI

3•drhodes•2h ago•0 comments

Show HN: Resumable Web Streams

https://github.com/vercel/resumable-stream

3•cramforce•2h ago•0 comments

AMC Says It Will Show More Ads Before Movies

https://www.nytimes.com/2025/06/06/business/movies-theaters-ads-amc.html

9•cebert•3h ago•12 comments

Getting C++ Hello World working on Windows (a comedy & tragedy)

https://sdegutis.github.io/blog/creating-cpp-hello-world.html

4•90s_dev•3h ago•2 comments

NASA delays next flight of Boeing's alternative to SpaceX Dragon

https://theedgemalaysia.com/node/758199

9•bookmtn•3h ago•2 comments