GPUs are expensive, and setting up infrastructure around them is painful. Most of the time, your GPUs sit idle while you’re coding, yet you still pay for uptime, even when scripts fail on the first try.
I ran into this firsthand as a “GPU poor” researcher. Tasks like downloading datasets or transforming data don’t need a GPU, yet traditional setups force you to use one. Cloud setups don’t help: VMs with GPUs require manual environment setup, CUDA installations, or Docker containers just to get started. Multi-GPU training adds more headaches, since not every image supports NCCL and communication between nodes can fail.
At my research lab [1], we run experiments across model training, synthetic data generation, and RL. We needed a setup that was flexible, reliable, easy to use, and easy to collaborate on.
I went looking for a solution that would let me write code locally and run it on GPUs instantly, without worrying about infrastructure, multi-node setups, or idle GPU time, and stumbled upon Modal [2]. After a year of using it, it’s been a game-changer: it increases our research throughput and productivity, saves a ton on GPU costs and infrastructure management, and lets us ship really fast.
I’ve compiled everything we’ve learned into this blog + hands-on tutorial [3], with three examples showing different ways to use Modal: rapidly develop on GPUs, deploy at scale, and do it all without breaking a sweat over infrastructure.
Here’s what we cover in the blog:

- Wrapping existing code to run on Modal’s serverless infrastructure (see the sketch after this list).
- Handling datasets on Modal with volumes for seamless access.
- Writing training scripts with Unsloth and Axolotl for easy fine-tuning.
- Serving models in a scalable, high-throughput way with vLLM.
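To give a feel for the first item, here is a minimal sketch of wrapping a function with Modal’s Python client. The app name, image contents, GPU type, volume name, and config path are illustrative assumptions, not the exact setup used in the tutorial.

```python
import modal

# Illustrative names; the tutorial may configure these differently.
app = modal.App("serverless-finetune-demo")

# Container image with the dependencies the remote function needs.
image = modal.Image.debian_slim().pip_install("torch", "datasets")

# Persistent volume for datasets and checkpoints (created if missing).
data_volume = modal.Volume.from_name("finetune-data", create_if_missing=True)

@app.function(gpu="A10G", image=image, volumes={"/data": data_volume}, timeout=3600)
def train(config_path: str):
    # This body runs in Modal's cloud on the requested GPU;
    # everything outside @app.function stays local.
    import torch
    print("CUDA available:", torch.cuda.is_available())
    # ... load data from /data, fine-tune, write checkpoints back to /data ...

@app.local_entrypoint()
def main():
    # `modal run this_file.py` runs this locally and dispatches train() remotely.
    train.remote("configs/sft.yaml")  # hypothetical config path
```

The appeal is that you only pay while `train()` is actually executing; once it returns, the container spins down and billing stops.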
By the end, you’ll know how to write and experiment locally, then run on GPUs instantly: no idle bills, no complex environment setup, no multi-node headaches.
[1] https://cognitivelab.in
[2] https://modal.com
[3] https://aiengineering.academy/LLM/ServerLessFinetuning/