I’ve been using Modal for everything — training, fine-tuning, evaluation, and even serving models. It lets me write code locally and run it instantly on serverless GPUs — no Kubernetes, no VM headaches, no idle GPU bills. I can attach persistent volumes for datasets and model weights, scale from 1 to 8 GPUs, and only pay for what I use.
Over time, I learned how to use Modal effectively for real-world ML workflows. And I’ve put everything I learned into hands-on tutorials that walk through the full process:
- Training nanoGPT from scratch by Andrej Karpathy
- Fine-tuning and deploying Gemma 3-4B with UnslothAI
- Multi-GPU Llama 8–70B training with axolotl_ai
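To give a feel for the workflow, here is a minimal sketch of a Modal app along the lines described above: a function that runs on a serverless GPU with a persistent volume attached. The app and volume names, and the `train` function, are placeholders for illustration; the Modal calls themselves (`modal.App`, `modal.Volume.from_name`, `@app.function`, `@app.local_entrypoint`) are real API.

```python
import modal

# Hypothetical app name for illustration
app = modal.App("nanogpt-training")

# Persistent volume for datasets and model weights; survives across runs
volume = modal.Volume.from_name("training-data", create_if_missing=True)

@app.function(
    gpu="A100",                 # request one A100; "A100:8" would scale to 8 GPUs
    volumes={"/data": volume},  # mount the persistent volume at /data
    timeout=60 * 60,            # allow up to an hour of training
)
def train():
    # Your training loop runs here on the remote GPU.
    # Anything written under /data persists in the volume.
    ...

@app.local_entrypoint()
def main():
    # Written locally, executed remotely on serverless GPU infrastructure
    train.remote()
```

Running `modal run` on this file executes `main()` locally while `train()` runs in the cloud, and you are billed only while the function is executing.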