Ask HN: What Does Your Self-Hosted LLM Stack Look Like in 2025?

17•anditherobot•1d ago
Back when web development was taking off, there was always a go-to stack: something like Postgres + Django + jQuery, or .NET + Bootstrap, SQLite. Over the years we settled on proven tech and proven patterns such as MVC, SPA, etc.

Now that local LLMs are gaining traction, I’m wondering what the equivalent stack looks like today.

Models, runtime, hardware, and other tools that could rival the Claudes, ChatGPTs, or Geminis, etc.

Thanks

Comments

fazlerocks•1d ago
Running Llama 3.1 70B on 2x4090s with vLLM. Memory is a pain but works decent for most stuff.

Tbh for coding I just use the smaller ones like CodeQwen 7B. Way faster and good enough for autocomplete. I only fire up the big model when I actually need it to think.

The annoying part is keeping everything updated: a new model drops every week and half of them don't work with whatever you're already running.
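To put rough numbers on the "memory is a pain" point, here is a back-of-the-envelope sketch of how much VRAM the weights of a ~70B model need at different precisions. The comment above does not say which quantization is in use, so the bit widths below are illustrative, and the 20% allowance for KV cache and runtime overhead is an assumption rather than a measured figure.

```python
def weight_memory_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory needed for the model weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


def fits(params_billion: float, bits_per_weight: float,
         gpu_gib: float, n_gpus: int, overhead_fraction: float = 0.2) -> bool:
    """Rough check: do weights plus an assumed overhead fit in total VRAM?

    overhead_fraction is a guess covering KV cache, activations, and
    runtime buffers; real usage depends on context length and batch size.
    """
    needed = weight_memory_gib(params_billion, bits_per_weight) * (1 + overhead_fraction)
    return needed <= gpu_gib * n_gpus


if __name__ == "__main__":
    # 2x RTX 4090 = 2 x 24 GiB; 70 is a rounded parameter count.
    for bits in (16, 8, 4):
        gib = weight_memory_gib(70, bits)
        print(f"{bits:>2}-bit weights: {gib:6.1f} GiB, fits on 2x24 GiB: {fits(70, bits, 24, 2)}")
```

Under these assumptions only the 4-bit variant fits in 48 GiB, which is consistent with a 70B model being workable but tight on a pair of 24 GB cards.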

bluejay2387•1d ago
2x 3090s running Ollama and vLLM... Ollama for most stuff and vLLM for the few models I need to test that don't run on Ollama. Open WebUI as my primary interface. I just moved to Devstral for coding, using the Continue plugin in VSCode. I use Qwen 3 32B for creative stuff and Flux Dev for images. Gemma 3 27B for most everything else (slightly less smart than Qwen, but it's faster). Mixedbread for embeddings (though apparently NV-Embed-v2 is better?). Pydantic as my main utility library. This is all for personal stuff. My stack at work is completely different and driven more by our Legal teams than by technical decisions.

gabriel_dev•22h ago
Ollama + Mac mini 24GB (inference)

runjake•22h ago
Ollama + M3 Max 36GB Mac. Usually with Python + SQLite3.

The models vary depending on the task. DeepSeek distilled has been a favorite for the past several months.

I use various smaller (~3B) models for simpler tasks.
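For anyone wondering what an Ollama + Python + SQLite setup might look like in practice, here is a minimal sketch, not the commenter's actual code. It assumes Ollama is serving on its default local port and uses its `/api/generate` endpoint with streaming turned off. The `runs` table, the helper names, and the model tag in the usage comment are all made up for illustration.

```python
import json
import sqlite3
import urllib.request

# Ollama's default local endpoint; adjust if your server listens elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming generate request for a local Ollama server."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        OLLAMA_URL,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )


def generate(model: str, prompt: str) -> str:
    """Send the prompt and return the generated text (needs a running server)."""
    with urllib.request.urlopen(build_request(model, prompt)) as resp:
        return json.loads(resp.read())["response"]


def open_log(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a SQLite database with a hypothetical 'runs' table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS runs ("
        "id INTEGER PRIMARY KEY, model TEXT, prompt TEXT, response TEXT)"
    )
    return conn


def log_run(conn: sqlite3.Connection, model: str, prompt: str, response: str) -> int:
    """Store one prompt/response pair and return its row id."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO runs (model, prompt, response) VALUES (?, ?, ?)",
            (model, prompt, response),
        )
    return cur.lastrowid


# Example usage (requires a running Ollama server and a pulled model;
# the model tag is only an example):
#   conn = open_log("runs.db")
#   text = generate("deepseek-r1:8b", "Summarize SQLite in one sentence.")
#   log_run(conn, "deepseek-r1:8b", "Summarize SQLite in one sentence.", text)
```

Keeping a local log like this makes it easy to compare how different models handle the same prompt, which fits the "models vary depending on the task" workflow.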

xyc•21h ago
recurse.chat + M2 Max Mac

v5v3•8h ago
Ollama on an M1 MacBook Pro, but will be moving to an Nvidia GPU setup.
