Ask HN: Why is LLM training still GPU-hungry despite DeepSeek?
3•takinola•7h ago
When DeepSeek released R1, everyone thought it signaled the end of the GPU-intensive approach to LLM training. It does not appear to have worked out that way: GPU demand continues to grow unabated. What happened? Is the DeepSeek training method unreproducible or impractical in some way?
Comments
cratermoon•7h ago
The DeepSeek method requires spending money on very good programmers and giving them the tools and time to build out optimizations.
The hype-driven LLM cycle means companies with multi-billion-dollar valuations prioritize time-to-market and throw money at more and bigger GPUs to solve performance bottlenecks.
It's "impractical" if the goal is to make as much money as possible before the bubble pops.