I built a dual RTX 3090 rig for local AI in 2025 (and lessons learned)

https://www.llamabuilds.ai/build/portable-25l-nvlinked-dual-3090-llm-rig
32•tensorlibb•3d ago

Comments

tensorlibb•3d ago
I'm a huge fan of OpenRouter and their interface for solid LLMs, but I recently jumped into fine-tuning / modifying my own vision models for FPV drone detection (just for fun), and my daily workstation with its 2080 just wasn't good enough.

Even in 2025, it's cool how solid a setup dual 3090s still are. NVLink is an absolute must, but with it the rig is incredibly powerful. I'm able to run the latest Mistral thinking models and relatively powerful YOLO-based VLMs like the ones RoboFlow is based on.

Curious if anyone else is still using 3090s or has feedback on scaling up to 4-6 of them.

Thanks everyone ;)

CraigJPerry•1h ago
If it's just for detection, would audio not be cheaper to process?

I'm imagining a cluster of directional microphones, and then I don't know whether it's better to perform some sort of band-pass filtering first, since it's so computationally cheap, or to just feed everything into the model directly. No idea.

I guess my first thought was that the sound of a drone is likely detectable reliably at a greater distance than the visual; they're so small, and a 180-degree-by-180-degree hemisphere of pixels is a lot to process.

Fun problem either way.
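
For what it's worth, the band-pass step I'm imagining is only a few lines; something like this sketch (SciPy, with made-up cutoff frequencies), run before whatever classifier looks at the audio:

    import numpy as np
    from scipy.signal import butter, sosfilt

    def bandpass(audio, sample_rate, low_hz=100.0, high_hz=4000.0, order=4):
        # Cheap pre-filter: keep only the band where drone motor/prop noise
        # might sit before handing the audio to a classifier. The cutoffs
        # here are guesses, not measured values.
        sos = butter(order, [low_hz, high_hz], btype="bandpass",
                     fs=sample_rate, output="sos")
        return sosfilt(sos, audio)

    # One second of fake mic input at 16 kHz, just to show the call.
    mic = np.random.randn(16000)
    filtered = bandpass(mic, sample_rate=16000)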

fxtentacle•1h ago
The 3090 is a sweet spot for training. It's the first generation with seriously fast VRAM, and it's the last generation before Nvidia dropped NVLink. If you need to copy parameters between GPUs during training, the 3090 can be up to 70% faster than a 4090 or 5090, because the latter two are limited by PCI Express bandwidth.
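
If you want to see what your own box actually gets for those copies, a crude micro-benchmark is enough to tell NVLink from plain PCIe; a sketch with PyTorch (tensor size and iteration count are arbitrary):

    import time
    import torch

    # Time repeated device-to-device copies between the first two GPUs.
    assert torch.cuda.device_count() >= 2
    size_bytes = 1 << 30  # 1 GiB
    src = torch.empty(size_bytes, dtype=torch.uint8, device="cuda:0")
    dst = torch.empty(size_bytes, dtype=torch.uint8, device="cuda:1")

    torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(10):
        dst.copy_(src, non_blocking=True)
    torch.cuda.synchronize()
    elapsed = time.perf_counter() - start

    print(f"{10 * size_bytes / elapsed / 1e9:.1f} GB/s device to device")
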
jacquesm•1h ago
To be fair, though, the 4090 and 5090 are much more capable of saturating PCI Express than the 3090 is. Even at 4 lanes per card the 3090 rarely manages to saturate the links, so it still handsomely pays off to split down to 4 lanes and add more cards.

I used:

https://c-payne.com/

Very high quality and manageable prices.
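
If you go this route, it's worth verifying what link each card actually negotiated behind the splitters; a small NVML sketch (assumes the pynvml / nvidia-ml-py bindings are installed):

    import pynvml

    # Report the PCIe generation and lane width each GPU is currently using.
    pynvml.nvmlInit()
    for i in range(pynvml.nvmlDeviceGetCount()):
        handle = pynvml.nvmlDeviceGetHandleByIndex(i)
        name = pynvml.nvmlDeviceGetName(handle)
        gen = pynvml.nvmlDeviceGetCurrPcieLinkGeneration(handle)
        width = pynvml.nvmlDeviceGetCurrPcieLinkWidth(handle)
        print(f"GPU {i} ({name}): PCIe gen {gen} x{width}")
    pynvml.nvmlShutdown()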

jacquesm•1h ago
I've built a rig with 14 of them. NVLink is not 'an absolute must'; it can be useful depending on the model, the application software you use, and whether you're training or inferring.

The most important figure is the power consumed per token generated. You can optimize for that and get a reasonably efficient system, or you can maximize token-generation speed and end up with twice the power consumption for very little gain. You will also likely need a way to get rid of excess heat, and all those fans get loud. I stuck the system in my garage, which made the noise much more manageable.
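
A rough way to put a number on that: sample board power over a window while the rig is serving, then divide by the tokens generated in that window (the token count below is a hypothetical figure you'd read off your serving stack's logs):

    import time
    import pynvml

    # Estimate joules per token from average board power over a fixed window.
    pynvml.nvmlInit()
    handles = [pynvml.nvmlDeviceGetHandleByIndex(i)
               for i in range(pynvml.nvmlDeviceGetCount())]

    WINDOW_S = 60            # measurement window in seconds
    TOKENS_IN_WINDOW = 900   # hypothetical: tokens served during that window

    samples = []
    start = time.perf_counter()
    while time.perf_counter() - start < WINDOW_S:
        watts = sum(pynvml.nvmlDeviceGetPowerUsage(h) for h in handles) / 1000.0
        samples.append(watts)
        time.sleep(1.0)

    avg_watts = sum(samples) / len(samples)
    print(f"avg {avg_watts:.0f} W -> "
          f"{avg_watts * WINDOW_S / TOKENS_IN_WINDOW:.1f} J/token")
    pynvml.nvmlShutdown()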

breakds•1h ago
I am curious about the setup of 14 GPUs - what kind of platform (motherboard) do you use to support so many PCIe lanes? And do you even have a chassis? Is it rack-mounted? Thanks!
jacquesm•45m ago
I used a large Supermicro server chassis, a dual-Xeon motherboard with seven 8-lane PCI Express slots, all the RAM it would take (bought second-hand), splitters, and four massive power supplies. I extended the server chassis with aluminum angle riveted onto the base. It could be rack-mounted, but I'd hate to be the person lifting it in. The 3090s were a mix: 10 of the same type (small, with blower-style fans) and 4 much larger ones that were kind of hard to accommodate (much wider and longer). I've linked to the splitter-board manufacturer in another comment in this thread. That's the 'hard to get' component, but once you have those and good cables to go with them, the remaining setup problems are mostly power and heat management.
vladgur•1h ago
I am exploring options just for fun.

A used 3090 is around $900 on eBay; a used RTX 6000 Ada is around $5k.

Four 3090s are slower at inference and worse at training than one RTX 6000.

4x3090 would consume 1400W at load.

An RTX 6000 would consume 300W at load.

If you, God forbid, live in California and your power averages 45 cents per kWh, 4x3090 would cost $1,500+ more per year to operate than a single RTX 6000 [0].

[0] Back-of-the-napkin/ChatGPT calculation assuming the GPUs run at load for 8 hours per day.
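
The napkin math from [0], spelled out (same rough inputs as above, nothing measured; it lands in the same ballpark as the $1,500 figure):

    # Yearly electricity cost at load, using the figures from this comment.
    PRICE_PER_KWH = 0.45     # USD, roughly California
    HOURS_PER_DAY = 8
    DAYS_PER_YEAR = 365

    def yearly_cost(load_watts):
        kwh = load_watts / 1000 * HOURS_PER_DAY * DAYS_PER_YEAR
        return kwh * PRICE_PER_KWH

    cost_3090s = yearly_cost(1400)   # 4x RTX 3090
    cost_6000 = yearly_cost(300)     # 1x RTX 6000 Ada
    print(f"4x3090 ~${cost_3090s:,.0f}/yr, RTX 6000 ~${cost_6000:,.0f}/yr, "
          f"difference ~${cost_3090s - cost_6000:,.0f}/yr")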

Note: I own a PC with a 3090, but if I had to build an AI training workstation, I would seriously consider cost to operate and resale value (per component).

logicallee•54m ago
>I am exploring options just for fun.

Since you're exploring options just for fun, out of curiosity: would you rent it out whenever you're not using it yourself, so it's not just sitting idle? (It could get noisy, though.) You'd be able to use your computer for other work at the same time and stop whenever you wanted to use it yourself.

supermatt•18m ago
I guess it depends on what you want to do: you get half the RAM in the 6000 (48 GB @ ~$104/GB) vs 4x3090 (96 GB @ ~$37.50/GB).
cfn•4m ago
I have an A6000 and the main advantage over a 3090 cluster is the build simplicity and relative silence of the machine (it is also used as my main dev workstation).
username12349•1h ago
total cost?
jszymborski•1h ago
It's written quite large on the page: just over $3K.
bigiain•1h ago
It says $3090 (maybe easy to miss since it also talks about RTX 3090s?)
jszymborski•1h ago
I just don't get why the RTX 4090 is still so expensive on the used market. New RTX 5090s are almost as expensive!
renewiltord•1h ago
They're dropping. I'm trying to offload 8x 4090s and I'll average $1500 I think.
tayo42•50m ago
Are these just for AI now? Or are games pushing video cards that much?
suladead•48m ago
I built pretty much this exact rig myself, but now it's gathering dust. Any other uses for it besides local LLMs?
DaSHacka•44m ago
vidya
AJRF•41m ago
Those GPUs are so close to each other, doesn’t the heat cause instability?
deevus•24m ago
I'm really interested in this space from an AI sovereignty POV. Is it feasible for an SMB/SME to use a box like the one in the article to get offline analysis of their data? It doesn't carry the worry of sending anything off to the cloud.

I wanted to speak with businesses in my local area but no one took me up on it.

Deepen5•13m ago
is it that easy to get started?

Nine Things I Learned in Ninety Years

http://edwardpackard.com/wp-content/uploads/2025/09/Nine-Things-I-Learned-in-Ninety-Years.pdf
205•coderintherye•4h ago•48 comments

Altoids by the Fistful

https://www.scottsmitelli.com/articles/altoids-by-the-fistful/
27•todsacerdoti•1h ago•6 comments

Zoxide: A Better CD Command

https://github.com/ajeetdsouza/zoxide
69•gasull•2h ago•27 comments

Delete FROM users WHERE location = 'Iran';

https://gist.github.com/avestura/ce2aa6e55dad783b1aba946161d5fef4
292•avestura•2h ago•168 comments

Qwen3-Omni: Native Omni AI model for text, image and video

https://github.com/QwenLM/Qwen3-Omni
442•meetpateltech•13h ago•108 comments

Fall Foliage Map 2025

https://www.explorefall.com/fall-foliage-map
158•rappatic•7h ago•16 comments

Telli (YC F24) is hiring ambitious engineers [Berlin, on-site]

https://hi.telli.com/join-us
1•sebselassie•37m ago

I built a dual RTX 3090 rig for local AI in 2025 (and lessons learned)

https://www.llamabuilds.ai/build/portable-25l-nvlinked-dual-3090-llm-rig
32•tensorlibb•3d ago•22 comments

Gamebooks and graph theory (2019)

https://notes.atomutek.org/gamebooks-and-graph-theory.html
19•guardienaveugle•3h ago•0 comments

Paper2Agent: Stanford Reimagining Research Papers as Interactive AI Agents

https://arxiv.org/abs/2509.06917
101•Gaishan•9h ago•23 comments

Based C++

https://github.com/SheafificationOfG/based-cpp
51•phamtrongthang•3d ago•11 comments

Cap'n Web: a new RPC system for browsers and web servers

https://blog.cloudflare.com/capnweb-javascript-rpc-library/
502•jgrahamc•18h ago•225 comments

I'm spoiled by Apple Silicon but still love Framework

https://simonhartcher.com/posts/2025-09-22-why-im-spoiled-by-apple-silicon-but-still-love-framework/
291•deevus•18h ago•383 comments

X server implementation for SIXEL-featured terminals (2010-2014)

https://github.com/saitoha/xserver-SIXEL
45•jesprenj•6h ago•8 comments

The Beginner's Textbook for Fully Homomorphic Encryption

https://arxiv.org/abs/2503.05136
202•Qision•1d ago•34 comments

Why haven't local-first apps become popular?

https://marcobambini.substack.com/p/why-local-first-apps-havent-become
386•marcobambini•18h ago•372 comments

Is a movie prop the ultimate laptop bag?

https://blog.jgc.org/2025/09/is-movie-prop-ultimate-laptop-bag.html
202•jgrahamc•19h ago•214 comments

Notion 3.0

https://www.notion.com/blog/introducing-notion-3-0
20•surprisetalk•3d ago•15 comments

Rungis: The Market and the City – A day at Europe's largest fresh food market

https://www.vittlesmagazine.com/p/rungis-the-market-and-the-city
19•speckx•3d ago•5 comments

Testing is better than data structures and algorithms

https://nedbatchelder.com/blog/202509/testing_is_better_than_dsa.html
136•rsyring•15h ago•124 comments

What happens when coding agents stop feeling like dialup?

https://martinalderson.com/posts/what-happens-when-coding-agents-stop-feeling-like-dialup/
120•martinald•1d ago•110 comments

The common sense unit of work

https://blog.nilenso.com/blog/2025/09/17/the-common-sense-unit-of-work/
18•todsacerdoti•4d ago•2 comments

Fine-grained HTTP filtering for Claude Code

https://ammar.io/blog/httpjail
69•ammario•11h ago•10 comments

After 50 years, The Magic Circle finally inducts Penn and Teller

https://www.nytimes.com/2025/09/19/arts/penn-teller-magic-circle.html
178•wbl•3d ago•62 comments

OpenAI and Nvidia announce partnership to deploy 10GW of Nvidia systems

https://openai.com/index/openai-nvidia-systems-partnership/
431•meetpateltech•15h ago•548 comments

Germicidal UV could make airborne diseases as rare as those carried by water

https://www.worksinprogress.news/p/how-to-clean-the-air
50•venkii•9h ago•15 comments

Easy Forth (2015)

https://skilldrick.github.io/easyforth/
191•pkilgore•19h ago•105 comments

Show HN: Python Audio Transcription: Convert Speech to Text Locally

https://www.pavlinbg.com/posts/python-speech-to-text-guide
75•Pavlinbg•13h ago•23 comments

PlanetScale for Postgres is now GA

https://planetscale.com/blog/planetscale-for-postgres-is-generally-available
276•munns•16h ago•172 comments

A board member's perspective of the RubyGems controversy

https://apiguy.substack.com/p/a-board-members-perspective-of-the
98•Qwuke•1d ago•114 comments