Ask HN: Are you running local LLMs? What are your key use cases?

5•briansun•2h ago
2025 feels like a breakout year for local models. Open‑weight releases are getting genuinely useful: from Google’s Gemma to recent *gpt‑oss* drops, the gap with frontier commercial models keeps narrowing for many day‑to‑day tasks.

Yet outside of this community, local LLMs still don’t seem mainstream. My hunch: *great UX and durable apps are still thin on the ground.*

If you are using local models, I’d love to learn from your setup and workflows. Please be specific so others can calibrate:

Model(s) & size: exact name/version, and quantization (e.g., Q4_K_M).

Runtime/tooling: e.g., Ollama, LM Studio, etc.

Hardware: CPU/GPU details (VRAM/RAM), OS. If laptop/edge/home servers, mention that.

Workflows where local wins: privacy/offline, data security, coding, large-scale data extraction, RAG over your files, agents/tools, screen capture processing. What's actually sticking for you?

Pain points: quality on complex reasoning, context management, tool reliability, long‑form coherence, energy/thermals, memory, Windows/Mac/Linux quirks.

Favorite app today: the one you actually open daily (and why).

Wishlist: the app you wish existed.

Gotchas/tips: config flags, quant choices, prompt patterns, or evaluation snippets that made a real difference.

If you’re not using local models yet, what’s the blocker—setup friction, quality, missing integrations, battery/thermals, or just “cloud is easier”? Links are welcome, but what helps most is concrete numbers and anecdotes from real use.

A simple reply template (optional):

```
Model(s):
Runtime/tooling:
Hardware:
Use cases that stick:
Pain points:
Favorite app:
Wishlist:
```

Also curious how people think about privacy and security in practice. Thanks!

Comments

incomingpain•1h ago
Python coding is practically the only use case for local models for me.

Cloud LLMs can run 1 trillion parameters and have all of Python knowledge in a transparent RAG that's 100 Gbit or faster. Of course they'll be the bestest on the block.

But the new GPT coding benchmarks are only barely behind Grok 4 or GPT-5 with high reasoning.

>Model(s) & size: exact name/version, and quantization (e.g., Q4_K_M).

My most reliable setup is Devstral + OpenHands: Unsloth Q6_K_XL, 85,000 context, flash attention, K cache and V cache quant at Q8.

Second most reliable: GPT-OSS-20B + opencode. Default MXFP4. I can only load up to 31,000 context or it fails (still plenty, but I'm hoping this bug gets fixed). You can't use flash attention or K or V cache quantization or it becomes dumb as rocks. This harmony stuff is annoying.

Still preliminary, since I just got it working today, but testing is going really well: Qwen3-30b-a3b-thinking-2507 + Roo Code or qwencode, 80,000 context, Unsloth q4_k_xl, flash attention, K cache and V cache quant at Q8.
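For anyone calibrating context sizes like the ones above against their own VRAM, here is a rough sketch of how KV cache memory scales with context length and cache quantization. The layer count, KV head count, and head dimension are hypothetical placeholders, not figures for any of the models named in this thread, and the formula ignores runtime overhead and any sliding-window or other cache-saving tricks a given model may use. The one fixed number is the llama.cpp Q8_0 block layout (32 int8 values plus one fp16 scale, 34 bytes per 32 elements).

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, context_len, bytes_per_element):
    """Approximate size in bytes of the K and V caches for a single sequence."""
    # Two tensors (K and V) per layer, each holding
    # context_len * n_kv_heads * head_dim elements.
    elements = 2 * n_layers * n_kv_heads * head_dim * context_len
    return elements * bytes_per_element


F16_BYTES = 2.0
# llama.cpp Q8_0: blocks of 32 int8 values plus one fp16 scale = 34 bytes per 32 elements.
Q8_0_BYTES = 34 / 32

# Hypothetical model shape for illustration only (an assumption, not from a model card).
shape = dict(n_layers=40, n_kv_heads=8, head_dim=128)
context_len = 85_000

f16 = kv_cache_bytes(**shape, context_len=context_len, bytes_per_element=F16_BYTES)
q8 = kv_cache_bytes(**shape, context_len=context_len, bytes_per_element=Q8_0_BYTES)

print(f"f16 cache:  {f16 / 2**30:.1f} GiB")
print(f"q8_0 cache: {q8 / 2**30:.1f} GiB")
```

Under these assumptions, a Q8 cache takes a little over half the memory of an f16 cache at the same context length, which is the kind of headroom that can decide whether a long context fits next to the weights on a 24 GB card. Swap in the real values from your model's config to get a usable estimate.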

>Runtime/tooling: e.g., Ollama, LM studio, etc.

LM Studio. I need Vulkan for my setup. ROCm is just a pain in the ass. They need to support way more Linux distros.

24 GB VRAM.
