Ask HN: What Does Your Self-Hosted LLM Stack Look Like in 2025?

21•anditherobot•8mo ago
Back when web development was taking off, there was always a go-to stack: something like Postgres + Django + jQuery, or .NET + Bootstrap, SQLite. Over the years we settled on proven tech and proven patterns like MVC, SPA, etc.

Now that local LLMs are gaining traction, I’m wondering what the equivalent stack looks like today.

Which models, runtimes, hardware, and other tools are you using that could rival the Claudes, ChatGPTs, or Geminis?

Thanks

Comments

fazlerocks•8mo ago
Running Llama 3.1 70B on 2x4090s with vLLM. Memory is a pain but works decent for most stuff.

Tbh for coding I just use the smaller ones like CodeQwen 7B. Way faster and good enough for autocomplete. Only fire up the big model when I actually need it to think.

The annoying part is keeping everything updated: a new model drops every week and half of them don't work with whatever you're already running.
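For anyone wondering why memory is the pain point with a 70B model on two 24 GB cards, here is a rough back-of-the-envelope sketch. It counts weights only, and the quantization levels are assumptions for illustration; the comment above does not say which precision is actually in use.

```python
# Rough VRAM estimate for model weights only.
# Ignores KV cache, activations, and runtime overhead, which all add more.

def weight_gib(params_billions: float, bits_per_weight: int) -> float:
    """Approximate size of the weights in GiB."""
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30

TOTAL_VRAM_GIB = 2 * 24  # two 24 GB cards, as in the comment above

# Assumed precisions: 16-bit (unquantized), 8-bit, and 4-bit quantization.
for bits in (16, 8, 4):
    size = weight_gib(70, bits)
    verdict = "fits" if size < TOTAL_VRAM_GIB else "does not fit"
    print(f"70B at {bits}-bit: ~{size:.0f} GiB of weights, {verdict} in {TOTAL_VRAM_GIB} GiB")
```

Under these assumptions only the 4-bit case leaves room for weights at all, and whatever is left over has to cover the KV cache, which is why long contexts get tight.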

bluejay2387•8mo ago
2x 3090s running Ollama and vLLM... Ollama for most stuff and vLLM for the few models that I need to test that don't run on Ollama. Open WebUI as my primary interface. I just moved to Devstral for coding using the Continue plugin in VSCode. I use Qwen 3 32B for creative stuff and Flux Dev for images. Gemma 3 27B for most everything else (slightly less smart than Qwen, but it's faster). Mixedbread for embeddings (though apparently NV-Embed-v2 is better?). Pydantic as my main utility library. This is all for personal stuff. My stack at work is completely different and driven more by our Legal teams than technical decisions.

gabriel_dev•8mo ago
Ollama + Mac mini 24GB (inference)

runjake•8mo ago
Ollama + M3 Max 36GB Mac. Usually with Python + SQLite3.

The models vary depending on the task. DeepSeek distilled has been a favorite for the past several months.

I use various smaller (~3B) models for simpler tasks.
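A minimal sketch of what an Ollama + Python + SQLite3 setup like this could look like, not the commenter's actual code. It assumes a local Ollama server on its default port and uses the `/api/generate` endpoint with streaming turned off; the cache table and the helper names (`open_cache`, `cached_generate`) are made up for illustration.

```python
import hashlib
import json
import sqlite3
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

def ollama_generate(model: str, prompt: str) -> str:
    """Send one non-streaming generate request to a local Ollama server."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

def open_cache(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a SQLite cache; the schema is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cache (key TEXT PRIMARY KEY, response TEXT)"
    )
    return conn

def cached_generate(conn, model, prompt, generate=ollama_generate) -> str:
    """Return a cached response if one exists, otherwise call the model and store it."""
    key = hashlib.sha256(f"{model}\n{prompt}".encode()).hexdigest()
    row = conn.execute("SELECT response FROM cache WHERE key = ?", (key,)).fetchone()
    if row is not None:
        return row[0]
    response = generate(model, prompt)
    conn.execute("INSERT INTO cache (key, response) VALUES (?, ?)", (key, response))
    conn.commit()
    return response

# Usage (needs a running Ollama server and a model you have already pulled):
#   conn = open_cache("llm_cache.db")
#   print(cached_generate(conn, "<your-model-name>", "Summarize: ..."))
```

Passing `generate` as a parameter keeps the cache logic testable without a server, and makes it easy to swap models per task.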

xyc•8mo ago
recurse.chat + M2 Max Mac

v5v3•8mo ago
Ollama on an M1 MacBook Pro, but will be moving to an Nvidia GPU setup.

PaulShin•8mo ago
Great question. We're building Markhub, an AI-powered collaboration OS, and our stack is a hybrid one, because we believe the "best" model depends entirely on the task.

1. For Heavy, Complex Tasks (Summarization, Code Gen, Creative Work): We don't self-host. The performance of top-tier models is still unmatched. We use Gemini-based models via Google's Vertex AI. The reliability and raw power for complex reasoning are worth the API cost for these critical features.

2. For Fast, Specific, Private Tasks (Our Self-Hosted Stack): For smaller, high-frequency tasks like classifying feedback types or extracting specific keywords from a conversation, we use a self-hosted stack for speed and cost-efficiency.

Models: We use fine-tuned versions of smaller, open-source models like Llama 3 8B or Mistral 7B. They are incredibly fast and cost-effective for specific, repetitive tasks.

Runtime/Orchestration: We use LangChain for chaining prompts and managing workflows. For serving the model, we're using a simple FastAPI server running in a Docker container.

Hardware: We run this on a dedicated GPU instance (like an A10G on AWS/GCP) for inference. The cost is predictable and much lower than using a large model for every small task.

My takeaway: The "go-to stack" in 2025 isn't one-size-fits-all. It's a pragmatic, hybrid approach: using best-in-class cloud APIs for the heavy lifting, and deploying fast, fine-tuned open-source models for everything else.
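The hybrid split described above boils down to routing by task type. Here is a minimal sketch of that idea, not Markhub's actual code: the task labels come from the comment, but `Backend`, `make_router`, and the stand-in backends are hypothetical, and a real setup would replace the lambdas with calls to the self-hosted server and the cloud API.

```python
from dataclasses import dataclass
from typing import Callable

# Task labels taken from the comment above; the exact split is illustrative.
LOCAL_TASKS = {"classify_feedback", "extract_keywords"}
CLOUD_TASKS = {"summarization", "code_gen", "creative"}

@dataclass
class Backend:
    name: str
    call: Callable[[str], str]  # takes a prompt, returns the model's text

def make_router(local: Backend, cloud: Backend) -> Callable[[str, str], str]:
    """Build a function that sends each task to the matching backend."""
    def route(task: str, prompt: str) -> str:
        if task in LOCAL_TASKS:
            backend = local
        elif task in CLOUD_TASKS:
            backend = cloud
        else:
            raise ValueError(f"unknown task type: {task!r}")
        return backend.call(prompt)
    return route

# Stand-in backends so the sketch runs without any model or API key.
local = Backend("self-hosted-small-model", lambda p: f"[local] {p}")
cloud = Backend("cloud-api", lambda p: f"[cloud] {p}")
route = make_router(local, cloud)

print(route("extract_keywords", "Find keywords in this thread"))
print(route("summarization", "Summarize this thread"))
```

Raising on unknown task types is a deliberate choice here: silently defaulting to the cloud backend would hide routing mistakes behind an API bill.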