What do you think about hosting a local LLM for company business? We have an MS365 subscription with built-in Copilot for learning procedures and Q&A, but for using an LLM with internal systems, like specialized department software that contains sensitive data, I don't have any idea other than a local LLM. So far, what I've found is that GPUs are expensive. GPT told me I can run a 36B local LLM on an RTX 6000 Pro 96GB, which costs $12k, and I think the regular GPT we use in the browser, GPT 5.4 extended thinking, is much stronger than a 36B LLM. I'm curious whether others have had similar ideas, or if anyone has professional advice.
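For a rough sense of why a 96GB card is sized for a ~36B model, here is a back-of-the-envelope VRAM estimate. The ~20% overhead factor for KV cache and activations is my own rough assumption, not a vendor figure, and real usage varies with context length and quantization scheme:

```python
# Back-of-the-envelope VRAM estimate for hosting an LLM locally.
# Assumption (mine, not a measured figure): ~20% extra on top of the
# raw weights to cover KV cache and activations.

def vram_gb(params_billions: float, bits_per_weight: int, overhead: float = 1.2) -> float:
    """Rough GB of VRAM needed: weight memory times a fixed overhead factor."""
    weight_gb = params_billions * bits_per_weight / 8  # 1B params at 8 bits ~= 1 GB
    return weight_gb * overhead

if __name__ == "__main__":
    # A 36B model quantized to 4 bits: roughly 21.6 GB -- fits easily in 96 GB.
    print(f"36B @ 4-bit: {vram_gb(36, 4):.1f} GB")
    # The same model at fp16: roughly 86.4 GB -- still fits, but barely.
    print(f"36B @ fp16:  {vram_gb(36, 16):.1f} GB")
```

By this estimate a 96GB card could even hold a 36B model unquantized at fp16, though a 4-bit quant leaves far more room for long contexts.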
Comments
julia-kafarska•1h ago
I run Qwen 35B on my local machine daily, and occasionally models over 200B params with flash-moe. In today's world, with all the open models available, spending a lot of money makes sense if your needs go beyond a couple of people.
ahendest•8m ago
What tokens/s are you getting for Qwen and for flash-moe? What system are you running them on? And are you satisfied with them? Thanks for the reply!