What do you think about hosting a local LLM for company business? We have an MS365 subscription with built-in Copilot for learning procedures and Q&A, but for using an LLM with internal systems, like specialized department software that contains sensitive data, I don't have any idea other than a local LLM. So far, what I've found is that GPUs are expensive. GPT told me I can run a 36B local LLM on an RTX 6000 Pro 96GB, which costs $12k, and I think the regular GPT we use in the browser, GPT 5.4 extended thinking, is much stronger than a 36B LLM. I'm curious whether others have had similar ideas, or if anyone has professional advice.
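For a rough sense of why a 96GB card is sized for a ~36B model, here is a back-of-the-envelope VRAM estimate. The ~20% overhead factor for KV cache and activations is my own rough assumption, not a vendor figure, and real usage varies with context length and quantization scheme:

```python
# Back-of-the-envelope VRAM estimate for hosting an LLM locally.
# Assumption (mine, not a measured figure): ~20% extra on top of the
# raw weights to cover KV cache and activations.

def vram_gb(params_billions: float, bits_per_weight: int, overhead: float = 1.2) -> float:
    """Rough GB of VRAM needed: weight memory times a fixed overhead factor."""
    weight_gb = params_billions * bits_per_weight / 8  # 1B params at 8 bits ~= 1 GB
    return weight_gb * overhead

if __name__ == "__main__":
    # A 36B model quantized to 4 bits: roughly 21.6 GB -- fits easily in 96 GB.
    print(f"36B @ 4-bit: {vram_gb(36, 4):.1f} GB")
    # The same model at fp16: roughly 86.4 GB -- still fits, but barely.
    print(f"36B @ fp16:  {vram_gb(36, 16):.1f} GB")
```

By this estimate a 96GB card could even hold a 36B model unquantized at fp16, though a 4-bit quant leaves far more room for long contexts.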
Comments
julia-kafarska•1h ago
I run Qwen 35B on my local machine daily, and occasionally models over 200B params with flash-moe. In today's world, with all the open models available, spending a lot of money makes sense if your needs go beyond a couple of people.
ahendest•8m ago
What tokens/s are you getting for Qwen and for flash-moe? What system are you running them on? And are you satisfied with them? Thanks for the reply!