Ask HN: Is anyone using AMD GPUs for their AI workloads?

6•technoabsurdist•7mo ago

^ title. I've been renting MI300Xs because they're cheaper than H100s, and my experience has been generally OK (smoother than I expected, given how much people shit on AMD online). ROCm 6.x seems decent out of the box now, and I'll happily spend 30 more minutes setting up my GPU if it means 20% cheaper. That being said, it's still annoying to run LLM inference on AMD's hardware (e.g. you have to install vLLM from source). And there are some other small details which still suck. As a small example, nvidia-smi gives you a nice clear interface, while rocm-smi dumps 3 pages of output that's hard to navigate.

Would be curious to hear experiences from other folks experimenting with AI workloads on AMD.

Comments

dlcarrier•7mo ago

I'm using an MI25, flashed as a PRO WX 9100, which requires an older version of ROCm to work. That's expected, because my GPU is deprecated in newer versions of ROCm, but what irks me is that everything neural-network related barely works. You need the exact version of every interpreter and library, which ends up working on some distributions but not others. I've noticed that when people program in compiled languages, they seem to make a concerted effort to do some kind of bounds testing, but anything in Python or Node.js seems to be released as soon as it kind-of-sort-of works, some of the time.

technoabsurdist•7mo ago

oh yeah, in my experience anything below ROCm 6.x really sucks.

I tried to run qwen2.5-32B on ROCm 5.x and it was running at <15 tok/s lol.

Have you tried running any sort of LLM inference on your MI25, or what NN workloads are you running?
