frontpage.

There Will Be a Scientific Theory of Deep Learning

https://arxiv.org/abs/2604.21691
58•jamie-simon•3h ago

Comments

adzm•1h ago
I'm only partially through this paper, but it's written in a very engaging and thoughtful manner.

There is so much to digest here but it's fascinating seeing it all put together!

4b11b4•56m ago
wow.. this would be cool. Instead of just.. guessing "shapes"
NitpickLawyer•28m ago
tbf, we've learned (ha!) more from smashing teeny tiny particles and "looking" at what comes out than from say 40 years of string theory. Sometimes doing stuff works, and the theory (hopefully) follows.
RyanShook•31m ago
Here's what I don't understand: for decades the idea of neural networks had existed with minimal attention. Then in 2017 Attention Is All You Need gets released, and since then there has been an exponential explosion in deep learning. I understand that deep learning is accelerated by GPUs, but the concept of a transformer could have been used on much slower hardware much earlier.
BigTTYGothGF•28m ago
The modern neural net revival got kicked off long before 2017.
noosphr•16m ago
AlexNet in 2012 is only 5 years earlier.
embedding-shape•26m ago
> I understand that deep learning is accelerated by GPUs but the concept of a transformer could have been used on much slower hardware much earlier

But they don't give the same results at those smaller scales. People imagined it, but no one could have put it into practice because the hardware wasn't there yet. Simplified, LLMs are basically transformers plus the additional idea of "a shitton of data to learn from", and to make training feasible with that amount of data, you do need some capable hardware.
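As an aside, the core transformer operation itself really is small; here is a minimal single-head scaled dot-product attention sketch in NumPy (an illustration, not the full architecture from the paper). What made it impractical earlier wasn't this math but running it over long sequences, huge batches, and billions of parameters:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Minimal single-head attention: softmax(Q K^T / sqrt(d)) V."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                  # (seq, seq) similarities
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True) # rows sum to 1
    return weights @ V                             # weighted mix of values

rng = np.random.default_rng(0)
seq, d = 4, 8
Q, K, V = (rng.standard_normal((seq, d)) for _ in range(3))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (4, 8)
```

The quadratic (seq, seq) score matrix is exactly the term whose cost blows up with context length at scale.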

teekert•22m ago
If you are in the radiology field it started “exploding” much earlier, with CNNs.
whateverboat•20m ago
The same thing happened with matrices. We had matrices for 400 years, but the field of linear algebra, and especially numerical linear algebra, exploded only with the advent of computers.

In the olden days, the correct way to solve a linear system of equations was to use the theory of minors. With the advent of computers, you suddenly had a huge theory of Gaussian elimination, Krylov spaces, and what not.
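To make the contrast concrete (a hedged sketch of my own, not from the comment): solving by minors means Cramer's rule over cofactor-expanded determinants, which costs O(n·n!) operations, while Gaussian elimination costs O(n³). Both give the same answer; only one is viable once n grows:

```python
import numpy as np

def det_cofactor(A):
    """Determinant by cofactor (minor) expansion: O(n!) operations."""
    n = A.shape[0]
    if n == 1:
        return A[0, 0]
    total = 0.0
    for j in range(n):
        minor = np.delete(np.delete(A, 0, axis=0), j, axis=1)
        total += ((-1) ** j) * A[0, j] * det_cofactor(minor)
    return total

def solve_cramer(A, b):
    """Cramer's rule: the 'theory of minors' approach."""
    n = len(b)
    d = det_cofactor(A)
    x = np.empty(n)
    for i in range(n):
        Ai = A.copy()
        Ai[:, i] = b          # replace column i with the right-hand side
        x[i] = det_cofactor(Ai) / d
    return x

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 5))
b = rng.standard_normal(5)

x_cramer = solve_cramer(A, b)
x_gauss = np.linalg.solve(A, b)   # LAPACK LU, i.e. Gaussian elimination
print(np.allclose(x_cramer, x_gauss))  # True
```

At n = 5 both finish instantly; by n = 20, cofactor expansion needs on the order of 10^18 operations while elimination needs a few thousand.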

wslh•20m ago
Don't underestimate the massive data you need to make those networks tick. The training algorithms were also impractically slow, regardless of whether they ran on GPUs or CPUs.
pash•13m ago
The inflection point was 2012, when AlexNet [0], a deep convolutional neural net, achieved a step-change improvement in the ImageNet classification competition.

After seeing AlexNet’s results, all of the major ML imaging labs switched to deep CNNs, and other approaches almost completely disappeared from SOTA imaging competitions. Over the next few years, deep neural networks took over in other ML domains as well.

The conventional wisdom is that it was the combination of (1) exponentially more compute than in earlier eras with (2) exponentially larger, high-quality datasets (e.g., the curated and hand-labeled ImageNet set) that finally allowed deep neural networks to shine.

0. https://en.wikipedia.org/wiki/AlexNet

cgearhart•12m ago
A much earlier major win for deep learning was AlexNet for image recognition in 2012. It dominated the competition, and within a couple of years it was effectively the only way to do image tasks. I think it was Jeremy Howard who wrote a paper around 2017 wondering when we'd get a transfer learning approach that worked as well for NLP as convnets did for images. The attention paper that year didn't immediately dominate: the hardware wasn't good enough, and there wasn't a consensus that scale would solve everything. It took about five more years before GPT-3 took off and started this current wave.

I also think you might be discounting exactly how much compute is used to train these monsters. A single 1 GHz processor would take about 100,000,000 years to train something in this class. Even with on the order of 25k GPUs, training GPT-3-size models takes a couple of months. The anemic RAM on GPUs a decade ago (I think we had K80 GPUs with 12GB, vs hundreds of GBs on H100/H200 today) also meant it was actually completely impossible to train a large transformer model prior to the early 2020s.
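The serial-compute claim above checks out as a back-of-envelope. The ~3.1e23 FLOPs figure for GPT-3 training is a commonly cited estimate, and the per-cycle throughput numbers below are illustrative assumptions, not measurements:

```python
# Rough orders of magnitude only; both constants are assumptions.
GPT3_TRAIN_FLOPS = 3.1e23   # commonly cited estimate for GPT-3 training
SECONDS_PER_YEAR = 3.15e7

def years_to_train(flops_per_second: float) -> float:
    """Wall-clock years to spend GPT3_TRAIN_FLOPS at a given throughput."""
    return GPT3_TRAIN_FLOPS / flops_per_second / SECONDS_PER_YEAR

# A 1 GHz scalar core retiring ~0.1 to 1 useful FLOP per cycle lands in
# the tens-of-millions-of-years range, the same ballpark as the comment.
print(f"{years_to_train(1e8):.2g} years")   # ~1e8 years at 0.1 FLOP/cycle
print(f"{years_to_train(1e9):.2g} years")   # ~1e7 years at 1 FLOP/cycle
```

The exact answer swings an order of magnitude with the assumed FLOPs-per-cycle, but every reasonable choice is astronomically far from feasible.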

I’m even reminded how much gamers complained in the late 2010s about GPU prices skyrocketing because of ML use.

amelius•28m ago
"A New Kind of Science" ...
UltraSane•8m ago
I think we need the equivalent of general relativity for latent spaces.

Google Plans to Invest Up to $40B in Anthropic

https://www.bloomberg.com/news/articles/2026-04-24/google-plans-to-invest-up-to-40-billion-in-ant...
64•elffjs•5h ago•156 comments

My audio interface has SSH enabled by default

https://hhh.hn/rodecaster-duo-fw/
82•hhh•2h ago•18 comments

Iliad fragment found in Roman-era mummy

https://www.thehistoryblog.com/archives/75877
34•wise_blood•2d ago•1 comment

Sabotaging projects by overthinking, scope creep, and structural diffing

https://kevinlynagh.com/newsletter/2026_04_overthinking/
324•alcazar•7h ago•79 comments

The Classic American Diner

https://blogs.loc.gov/picturethis/2026/04/the-classic-american-diner/
92•NaOH•2h ago•44 comments

Tell HN: Claude 4.7 is ignoring stop hooks

27•LatencyKills•2h ago•5 comments

Work with the garage door up

https://notes.andymatuschak.org/Work_with_the_garage_door_up
67•jxmorris12•3d ago•63 comments

Diatec, known for its mechanical keyboard brand FILCO, has ceased operations

https://gigazine.net/gsc_news/en/20260424-filco-diatec/
72•gslin•5h ago•23 comments

How to be anti-social – a guide to incoherent and isolating social experiences

https://nate.leaflet.pub/3mk4xkaxobc2p
270•calcifer•11h ago•265 comments

Show HN: I've built a nice home server OS

https://lightwhale.asklandd.dk/
3•Zta77•17m ago•0 comments

SFO Quiet Airport (2025)

https://viewfromthewing.com/san-francisco-airport-removed-90-minutes-of-daily-noise-travelers-say...
101•CaliforniaKarl•3h ago•54 comments

I cancelled Claude: Token issues, declining quality, and poor support

https://nickyreinert.de/en/2026/2026-04-24-claude-critics/
693•y42•6h ago•404 comments

OpenAI releases GPT-5.5 and GPT-5.5 Pro in the API

https://developers.openai.com/api/docs/changelog
167•arabicalories•3h ago•99 comments

Email could have been X.400 times better

https://buttondown.com/blog/x400-vs-smtp-email
89•maguay•1d ago•72 comments

CC-Canary: Detect early signs of regressions in Claude Code

https://github.com/delta-hq/cc-canary
29•tejpalv•4h ago•10 comments

SDL Now Supports DOS

https://github.com/libsdl-org/SDL/pull/15377
196•Jayschwa•5h ago•70 comments

Spinel: Ruby AOT Native Compiler

https://github.com/matz/spinel
287•dluan•13h ago•79 comments

CSS as a Query Language

https://evdc.me/blog/css-query
48•evnc•4h ago•16 comments

I'm done making desktop applications (2009)

https://www.kalzumeus.com/2009/09/05/desktop-aps-versus-web-apps/
126•claxo•6h ago•148 comments

Different Language Models Learn Similar Number Representations

https://arxiv.org/abs/2604.20817
83•Anon84•7h ago•35 comments

MacBook Neo and how the iPad should be

https://craigmod.com/essays/ipad_neo/
157•jen729w•1d ago•86 comments

DeepSeek v4

https://api-docs.deepseek.com/
1756•impact_sy•18h ago•1351 comments

Show HN: Browser Harness – Gives LLM freedom to complete any browser task

https://github.com/browser-use/browser-harness
64•gregpr07•7h ago•26 comments

Physicists revive 1990s laser concept to propose a next-generation atomic clock

https://phys.org/news/2026-04-physicists-revive-1990s-laser-concept.html
46•wglb•21h ago•6 comments

TIPSv2: Advancing Vision-Language Pretraining with Enhanced Patch-Text Alignment

https://gdm-tipsv2.github.io/
12•gmays•2h ago•1 comment

Could a Claude Code routine watch my finances?

https://driggsby.com/blog/claude-code-routine-watch-my-finances
46•mbm•2h ago•54 comments

Google Flow Music

https://www.flowmusic.app/
5•hmokiguess•57m ago•2 comments

Show HN: HNswered – watches for replies to your Hacker News posts and comments

https://github.com/adam-s/HNswered
7•dataviz1000•2h ago•3 comments

ML supports existence of unrecognized transient astronomical phenomena

https://arxiv.org/abs/2604.18799
53•solarist•7h ago•41 comments