As a backup in case future access changes, I'd like to keep SOTA LLM weights, and a machine that can run queries against them, in reserve. OpenAI releasing weights is as good a time as any to actually do it. My question: what hardware setup would you buy that is reasonably accessible (say under $5k, ideally well under) and can do a good job running the models for local queries? And, if it matters, it should be suitable for archiving, say until either a substantial advance renders today's LLMs obsolete, or until it's needed because good open weights aren't available anymore.
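For sizing the hardware, the arithmetic that matters is memory: quantized weights take roughly params × bits-per-weight / 8 bytes, plus headroom for the KV cache and runtime buffers. A back-of-the-envelope sketch (my own heuristic, not from any vendor spec):

  # Rough memory estimate for a quantized model; the 20% overhead
  # factor is an assumption, not a measured constant.
  def estimate_gb(params_b: float, bits_per_weight: float, overhead: float = 1.2) -> float:
      weights_gb = params_b * bits_per_weight / 8  # 1B params at 8 bits ~ 1 GB
      return weights_gb * overhead  # headroom for KV cache and runtime buffers

  print(f"{estimate_gb(120, 4):.0f} GB")  # a 120B model at 4-bit ~ 72 GB

Numbers like that are why machines with large unified memory tend to come up in these threads: one box with 96+ GB of shared memory sidesteps multi-GPU plumbing.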
Comments
bigyabai•16h ago
The issue isn't getting LLM weights. Llama taught us that it's basically impossible to prevent people from distributing them if they choose to. You don't have to worry about "future access changes" too much, OpenAI knows there is no "undo" button once they publish weights.
The real issue is having SOTA, collapse-proof hardware for inference. Apple and Nvidia hardware are both reliant on drivers that can brick your server with an over-the-air update. AMD hardware has the generally more resilient Mesa drivers, which could theoretically survive a hostile OEM, but offers fewer options for finetuning and training. Intel GPUs are a high-VRAM option, but it's unclear how long they'll be supported in software. Everything is a system of tradeoffs.
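One mitigation: llama.cpp-style runtimes can fall back to CPU-only inference, which takes the vendor GPU driver out of the loop entirely (at the cost of speed). A minimal sketch, assuming the llama-cpp-python bindings and a hypothetical local GGUF file:

  from llama_cpp import Llama

  llm = Llama(
      model_path="./gpt-oss-120b-q4.gguf",  # hypothetical path to an archived quantized model
      n_gpu_layers=0,  # 0 = CPU-only: no GPU driver required at all
      n_ctx=4096,      # context window; raise it if RAM allows
  )
  out = llm("Q: What should go in a weights archive? A:", max_tokens=64)
  print(out["choices"][0]["text"])

Slow on a 100B+ model, but it degrades gracefully instead of bricking.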