Ask HN: Is anyone using a Linux machine for local inference?

2•throwaw12•10h ago
Hey there,

Is anyone here using a Linux machine with 256GB or 512GB of RAM to run the latest models locally?

I am considering buying a new laptop/desktop to run models locally. Most benchmarks I see are for Mac Mx series chips with MLX, and even then, for big models (>300B params), people are using quantized versions (3-bit, 4-bit), which causes a drop in quality.
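To put rough numbers on why big models get quantized, here is a minimal sketch of the weight-size arithmetic. It assumes a dense model and counts only the weights (parameters times bits per weight); KV cache, activations, and runtime overhead come on top, and real quantization formats store a little extra metadata. The function name `weight_memory_gb` is made up for illustration.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in decimal gigabytes.

    Assumption: every parameter is stored at the same bit width, and nothing
    else (KV cache, activations, quantization scales) is counted.
    """
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


if __name__ == "__main__":
    # Compare a 300B-parameter model at common precisions.
    for bits in (16, 8, 4, 3):
        size = weight_memory_gb(300, bits)
        print(f"300B params at {bits}-bit: ~{size:.1f} GB of weights")
```

Under these assumptions a 300B model needs about 600 GB at 16-bit, 300 GB at 8-bit, 150 GB at 4-bit, and 112.5 GB at 3-bit, so a 256GB machine only holds it at roughly 4-bit or below, while 512GB leaves room for 8-bit.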

If anyone has used Linux with >256GB of RAM and no dedicated GPU, how was your experience?

Comments

compressedgas•8h ago
Running LLMs on CPU only is too slow.
incomingpain•7h ago
I've tried this with DeepSeek R1. I got about 2 tokens/second, and each response took 10-15 minutes.
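As a sanity check on those two numbers, here is a small sketch that converts a generation rate and a wall-clock time into a token count, and back. It assumes the whole wait is spent generating at a constant rate, which ignores prompt processing; the helper names are hypothetical.

```python
def tokens_generated(tokens_per_second: float, minutes: float) -> float:
    """Tokens produced in `minutes` at a constant generation rate."""
    return tokens_per_second * minutes * 60


def minutes_to_generate(num_tokens: float, tokens_per_second: float) -> float:
    """Wall-clock minutes needed to produce `num_tokens` at a constant rate."""
    return num_tokens / tokens_per_second / 60


if __name__ == "__main__":
    for minutes in (10, 15):
        print(f"{minutes} min at 2 tok/s: ~{tokens_generated(2, minutes):.0f} tokens")
    # The same response length at a faster, hypothetical 20 tok/s.
    print(f"1800 tokens at 20 tok/s: ~{minutes_to_generate(1800, 20):.1f} min")
```

At 2 tokens/second, a 10-15 minute wait corresponds to roughly 1,200-1,800 generated tokens, which is a plausible length for a reasoning model's answer.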

That hardware was free to me, but building it yourself would cost thousands. You might as well just hit up an API: https://openrouter.ai/deepseek/deepseek-r1-0528/providers

Even if you hammer it, it'll only be $10.

>Most benchmarks I see are for Mac Mx series chips with MLX

A Mac mini pro with 64GB of RAM is actually suspiciously good value. Something like $4000... a bit high, but it can be your workstation.

The GPU and system memory are unified, so you can load bigger models. It's not as fast as high-end GPUs, but it also doesn't have the same power draw. You'll stay under 200 watts.

Obviously 64GB doesn't let you run full DeepSeek or similar either, but those 32B-70B models are ideal anyway.

At a slightly lower price, there are mini PCs with the AMD Ryzen™ AI Max+ 395. Same idea as the Mac mini, and you can get 64-128GB of RAM. Intel has a similar chip.

You'll get 15-20 tokens/s from a 32B model, which is slow if you're coding.

Now, you could look into high-end GPUs: get a server mobo with 10 PCIe slots and load it up with 16GB cards for 160GB of VRAM. But you'll need special electrical plugs, and it'll idle at around 600 watts, costing $100/month. But man, that thing would be great, so fast.
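The multi-GPU numbers can be checked with a few lines of arithmetic. This is only a sketch: it assumes the rig idles around the clock for a 30-day month, and it backs out the electricity rate that would make 600 watts cost $100/month rather than assuming any particular tariff. All function names are illustrative.

```python
def total_vram_gb(num_cards: int, gb_per_card: int) -> int:
    """Combined VRAM across identical cards."""
    return num_cards * gb_per_card


def monthly_energy_kwh(watts: float, hours_per_day: float = 24, days: int = 30) -> float:
    """Energy used in a month at a constant power draw."""
    return watts * hours_per_day * days / 1000


def implied_price_per_kwh(monthly_cost: float, watts: float) -> float:
    """Electricity rate that makes a constant draw cost `monthly_cost` per month."""
    return monthly_cost / monthly_energy_kwh(watts)


if __name__ == "__main__":
    print(f"VRAM: {total_vram_gb(10, 16)} GB")
    print(f"Idle energy: {monthly_energy_kwh(600):.0f} kWh/month")
    print(f"Implied rate: ${implied_price_per_kwh(100, 600):.3f}/kWh")
```

Ten 16GB cards do add up to 160GB, and 600 watts of constant idle draw is 432 kWh per month, so the $100/month figure implies a rate of about $0.23/kWh.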
