Ask HN: Who's running local AI workstations in 2026?

9•Blue_Cosma•4w ago
After three years working on private LLM infrastructure, I still can't pin down who the market is or how big it is.

The ecosystem has matured: DGX Spark, high-end Mac Studios, AMD Strix Halo, upcoming DGX Station. Models are getting smaller and more efficient. Inference engines (llama.cpp, vLLM, SGLang) and frontends (Ollama, LMStudio, Jan) have made local deployment accessible. Yet I keep meeting more people researching this than actually deploying it.

For those running local inference:
- What's your setup and use case?
- Is it personal or shared across a team?
- What's the real driver: privacy, regulation, latency, cost, tinkering?

I'm skeptical of cost arguments (cloud inference scales better, plus API subsidies, for now at least!), but curious if I'm missing something.

What would make local AI actually worth it for you?

Comments

01092026•4w ago
You asked us...well, first tell us what's your real driver? You have three years on local infrastructure? What does that even mean - you're running Ollama Llama_70b for 3 years?

What's your stack?

And none of that hardware can run larger models. Small ones, or highly quantized versions of larger ones, sure. Or do you have something important to say?
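A rough way to sanity-check what fits on a given machine is a weights-only estimate: parameter count times bits per weight, divided by eight. The sketch below is only that arithmetic. The function name is hypothetical, and the estimate deliberately ignores KV cache, activations, and runtime overhead, so real requirements are higher.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in GB (1 GB = 1e9 bytes).

    Assumption: every parameter is stored at the same bit width. KV cache,
    activations, and framework overhead are NOT included.
    """
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


# A 70B-parameter model at common precisions.
for bits in (16, 8, 4):
    print(f"70B at {bits}-bit: ~{weight_memory_gb(70, bits):.0f} GB of weights")
```

By this estimate a 70B model needs about 140 GB of weights at 16-bit and about 35 GB at 4-bit, which is why quantization decides whether a model fits on a single workstation.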

Blue_Cosma•4w ago
Our main driver and hypothesis was to work with regulated industry. We worked with a few large enterprise clients in defence and industry for R&D and IP use cases mostly.

Our stack changes per project, adapting to client needs and infra: Llama 70B on a Mac Studio M1 with Ollama in 2024, vLLM on 4xH100 private cloud for larger deployments. Most recently, we've been working on a custom workstation with 2x RTX PRO 6000 Blackwell Max-Q + 1.1TB DDR5 to run larger models locally using SGLang and KTransformers.

The question isn't rhetorical, I'm trying to understand if the demand we see in regulated sectors is the whole market or if there's broader adoption I'm missing.

01092026•4w ago
Cool, so you are basically doing local onsite deployments? The H100s are nice. I'm not that rich, so I have a 4xV100 32GB SXM2 server, dual socket - it's OK for inference. You can get one with V100s, RAM, etc. for $10-$12k all in, used stuff.

I run the largest models I can, DeepSeek, adding a few more soon. The fact that I can have a premier high-end model run locally is the main interest; a 70B model is pointless unless it's a specific task-based special model, like text-to-speech, etc.

I am more interested in ditching Nvidia for AMD Chips+GPUs, but not even ROCm - just run with OpenGL / Vulkan weights in shaders. Faster, more control, better performance for MY architecture, etc. This is the goal.

I don't think many people are running models, maybe outside of a company? I guess you are company/industry focused, I am just a programmer / personal.

People don't see a need I guess? It's complicated. Well - actually it's NOT if you have lots of money to buy all the right stuff, brand new, etc.

For regular guys like me, we have to be creative to get shit to run in the best way, it's all we can afford.

andy99•4w ago
Just bought a Strix Halo (Framework Desktop). I waffled a long time between that and a Mac Studio, but I got tired of waiting for the M5 and don’t really like Apple.

I work with ML professionally, almost all in cloud, I just wanted something “off grid” and unmetered, and needed a computer anyway so decided to pay a bit more and get the one I want. It’s “personal” in that it’s exclusively for me, but I have a business and bought it for that.

Still figuring out the best software. So far it looks like llama.cpp with Vulkan, though I have a lot of experimenting to do and don’t currently find it optimal for what I want.

01092026•4w ago
Well, Mac chips are badass for training / inference - super underrated. I mean, I've literally run epochs on cloud Nvidia GPU Servers...compared to running them locally (M chip) - and look, not trying to burn any houses down but...eh...Apple does really really well.

The good news for you: you can chain a few of them together and run the largest open source models around. It's an extremely expensive route, but probably the easiest and smoothest way.

If you're planning on running this on Apple - you can do some stuff with Metal directly...in PyTorch the device is 'mps' (Metal Performance Shaders).
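For reference, PyTorch exposes Apple's Metal backend as the `mps` device. A minimal sketch of picking a device follows. The helper name `pick_device` is hypothetical, and it falls back to `"cpu"` when PyTorch is not installed so the snippet runs anywhere.

```python
import importlib.util


def pick_device() -> str:
    """Return "mps", "cuda", or "cpu" depending on what this machine offers."""
    if importlib.util.find_spec("torch") is None:
        return "cpu"  # PyTorch is not installed, so there is nothing to select

    import torch

    # torch.backends.mps is missing on older PyTorch builds, hence the getattr.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"  # Apple Silicon GPU via Metal
    if torch.cuda.is_available():
        return "cuda"
    return "cpu"


print(pick_device())
```

The returned string can be passed to `torch.device(...)` or `tensor.to(...)`. ROCm builds of PyTorch also report through `torch.cuda`, so an AMD GPU would show up as `"cuda"` here.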

I think your llama.cpp route is good - I wouldn't go the Ollama route - I mean great to start, but IMHO: get the models directly, learn the layers and how the heads work as best as you can, make an effort to understand what's going on - well you don't have to, but, I think the models appreciate the effort - respect goes far.

Blue_Cosma•4w ago
Thanks a lot for sharing. Haven't tested Strix Halo myself. Did you consider DGX Spark as well?

What is your target use case? Curious what feels suboptimal about llama.cpp + Vulkan so far.

andy99•4w ago
Re DGX, I’m mostly interested in local inference, it might have been nice to try but it was more expensive for similar performance (or so I think).

I do lots of different experiments, synthetic data generation along the lines of Magpie is one of the things I wanted a local machine for, as well as just general access to a decent sized LLM to try different things, without having to spin up a cloud machine each time.

I would prefer PyTorch / HF transformers to llama.cpp as I find the latter less flexible if I want to change anything.

delaminator•4w ago
I have a 3090 (24GB) with twin Xeons and 64GB RAM sat in a machine in our server room.

I do local AI with Qwen, Whisper and another I can't remember right now.

These all use Qwen:

We do AI Invoice OCR - PDF -> Image -> Excel. Works much better than other solutions because it has invoice context, so it looks for particular data to extract and ignores the rest. Why local? I proved it worked, and there's no need to send our data outside for processing.

We deal with photos of food packaging - I do a "photograph ingredients list and check them against our expected ingredients" - downside is it takes 2 mins per photo, I might actually push this one outside.

Ingredients classifier - is it animal (if so what species), vegetarian, vegan, halal, kosher, alcoholic, is nut based, peanuts and more - simply no need to send it outside.
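One way to make a classifier like this dependable is to ask the model for JSON and validate the reply before using it. The sketch below is an assumption about how that could look, not the setup described above: the field names, the `IngredientLabels` type, and `parse_reply` are all hypothetical, and the model call itself is left out.

```python
import json
from dataclasses import dataclass
from typing import Dict, Optional

# Hypothetical yes/no labels, taken from the categories listed in the comment.
BOOLEAN_LABELS = (
    "vegetarian", "vegan", "halal", "kosher",
    "alcoholic", "nut_based", "peanuts",
)


@dataclass(frozen=True)
class IngredientLabels:
    animal_species: Optional[str]  # None when the ingredient is not animal-derived
    flags: Dict[str, bool]


def parse_reply(reply: str) -> IngredientLabels:
    """Parse and validate a model's JSON reply; raise ValueError if malformed."""
    data = json.loads(reply)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    bad = [k for k in BOOLEAN_LABELS if not isinstance(data.get(k), bool)]
    if bad:
        raise ValueError(f"missing or non-boolean labels: {bad}")
    species = data.get("animal_species")
    if species is not None and not isinstance(species, str):
        raise ValueError("animal_species must be a string or null")
    return IngredientLabels(species, {k: data[k] for k in BOOLEAN_LABELS})


sample = json.dumps({
    "animal_species": None, "vegetarian": True, "vegan": True, "halal": True,
    "kosher": True, "alcoholic": False, "nut_based": True, "peanuts": False,
})
print(parse_reply(sample))
```

Rejecting malformed replies up front means a bad generation can be retried instead of silently writing a wrong label.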

I've got a Linux chatbot helper on the "test this" pile with Qwen Coder - not evaluated it yet, but the idea will be "type command, get it wrong, ask Qwen for the answer" - I use Claude for this but it seems a bit heavyweight and I'm curious.

tbh some of it is solution hunting - we spent $1000 on the kit to evaluate if it was worth it so I try and get some value out of it.

But it is slow, 3 hours for a recent task that took Claude API 2 minutes.

My favourite use is Whisper. I voice->text almost all of my typing now.

I've also bought an Nvidia Orin Nano but I haven't set it up yet - I want to run Whisper in the car to take voice dictation as I drive.