frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

fp.

AI Overviews are killing the web search, and there's nothing we can do about it

https://www.neowin.net/editorials/ai-overviews-are-killing-the-web-search-and-theres-nothing-we-c...
2•bundie•5m ago•0 comments

City skylines need an upgrade in the face of climate stress

https://theconversation.com/city-skylines-need-an-upgrade-in-the-face-of-climate-stress-267763
2•gnabgib•6m ago•0 comments

1979: The Model World of Robert Symes [video]

https://www.youtube.com/watch?v=HmDxmxhrGDc
1•xqcgrek2•10m ago•0 comments

Satellites Have a Lot of Room

https://www.johndcook.com/blog/2026/02/02/satellites-have-a-lot-of-room/
2•y1n0•10m ago•0 comments

1980s Farm Crisis

https://en.wikipedia.org/wiki/1980s_farm_crisis
3•calebhwin•11m ago•1 comments

Show HN: FSID - Identifier for files and directories (like ISBN for Books)

https://github.com/skorotkiewicz/fsid
1•modinfo•16m ago•0 comments

Show HN: Holy Grail: Open-Source Autonomous Development Agent

https://github.com/dakotalock/holygrailopensource
1•Moriarty2026•23m ago•1 comments

Show HN: Minecraft Creeper meets 90s Tamagotchi

https://github.com/danielbrendel/krepagotchi-game
1•foxiel•31m ago•1 comments

Show HN: Termiteam – Control center for multiple AI agent terminals

https://github.com/NetanelBaruch/termiteam
1•Netanelbaruch•31m ago•0 comments

The only U.S. particle collider shuts down

https://www.sciencenews.org/article/particle-collider-shuts-down-brookhaven
2•rolph•33m ago•1 comments

Ask HN: Why do purchased B2B email lists still have such poor deliverability?

1•solarisos•34m ago•2 comments

Show HN: Remotion directory (videos and prompts)

https://www.remotion.directory/
1•rokbenko•36m ago•0 comments

Portable C Compiler

https://en.wikipedia.org/wiki/Portable_C_Compiler
2•guerrilla•38m ago•0 comments

Show HN: Kokki – A "Dual-Core" System Prompt to Reduce LLM Hallucinations

1•Ginsabo•39m ago•0 comments

Software Engineering Transformation 2026

https://mfranc.com/blog/ai-2026/
1•michal-franc•40m ago•0 comments

Microsoft purges Win11 printer drivers, devices on borrowed time

https://www.tomshardware.com/peripherals/printers/microsoft-stops-distrubitng-legacy-v3-and-v4-pr...
3•rolph•40m ago•1 comments

Lunch with the FT: Tarek Mansour

https://www.ft.com/content/a4cebf4c-c26c-48bb-82c8-5701d8256282
2•hhs•43m ago•0 comments

Old Mexico and her lost provinces (1883)

https://www.gutenberg.org/cache/epub/77881/pg77881-images.html
1•petethomas•47m ago•0 comments

'AI' is a dick move, redux

https://www.baldurbjarnason.com/notes/2026/note-on-debating-llm-fans/
5•cratermoon•48m ago•0 comments

The source code was the moat. But not anymore

https://philipotoole.com/the-source-code-was-the-moat-no-longer/
1•otoolep•48m ago•0 comments

Does anyone else feel like their inbox has become their job?

1•cfata•48m ago•1 comments

An AI model that can read and diagnose a brain MRI in seconds

https://www.michiganmedicine.org/health-lab/ai-model-can-read-and-diagnose-brain-mri-seconds
2•hhs•52m ago•0 comments

Dev with 5 of experience switched to Rails, what should I be careful about?

2•vampiregrey•54m ago•0 comments

AlphaFace: High Fidelity and Real-Time Face Swapper Robust to Facial Pose

https://arxiv.org/abs/2601.16429
1•PaulHoule•55m ago•0 comments

Scientists discover “levitating” time crystals that you can hold in your hand

https://www.nyu.edu/about/news-publications/news/2026/february/scientists-discover--levitating--t...
3•hhs•57m ago•0 comments

Rammstein – Deutschland (C64 Cover, Real SID, 8-bit – 2019) [video]

https://www.youtube.com/watch?v=3VReIuv1GFo
1•erickhill•57m ago•0 comments

Tell HN: Yet Another Round of Zendesk Spam

5•Philpax•57m ago•1 comments

Postgres Message Queue (PGMQ)

https://github.com/pgmq/pgmq
1•Lwrless•1h ago•0 comments

Show HN: Django-rclone: Database and media backups for Django, powered by rclone

https://github.com/kjnez/django-rclone
2•cui•1h ago•1 comments

NY lawmakers proposed statewide data center moratorium

https://www.niagara-gazette.com/news/local_news/ny-lawmakers-proposed-statewide-data-center-morat...
2•geox•1h ago•0 comments
Open in hackernews

Ask HN: How the same LLM "instance" serve multiple clients?

1•BiraIgnacio•9mo ago
I've been playing with running LLMs locally and only then realized I have no idea how to scale it (I don't really know how LLMs work internally).

I'm assuming context is everything but if the same LLM process can serve multiple clients, aren't there risks of mixing contexts? Does anyone have any ideas?

Comments

sherdil2022•9mo ago
Let me ChatGPT for you:

Good question. Let’s break it down carefully.

When you hear about a single LLM instance serving multiple clients at the same time, it usually works like this: • The LLM instance is stateless: Each client sends a request (prompt + settings), the model processes that one request independently, and returns the response. The LLM doesn’t “remember” between requests unless you explicitly include conversation history in the prompt. • Concurrency is handled by infrastructure: Even though the LLM is “one model,” it can handle many incoming requests because the backend (server) wraps the model with techniques like: • Asynchronous request handling (e.g., using async/await patterns) • Batching: multiple prompts are packed together into a single forward pass through the model (very common in high-traffic servers) • Parallelism: the server could have multiple workers/replicas of the model (copies or shared GPUs) running side-by-side. • Queueing: if too many clients at once, requests are queued and processed in order. • Memory isolation: Each request is kept separate in memory. No client’s data leaks into another client’s conversation unless you (the app developer) accidentally introduce a bug.

So:

It’s not that one model is “locked” into serving only one person at a time. It’s more like the model is a very fast function being called many times in parallel.

⸻