Nano-Vllm: Lightweight vLLM implementation built from scratch

https://github.com/GeeeekExplorer/nano-vllm
125•simonpure•7mo ago

Comments

unwind•7mo ago
Meta: the Title Casing in the title is pretty obnoxious; "Vllm" is exactly the inverse, casing-wise, of how the project spells its name.
msephton•7mo ago
FWIW, OP has a small window of time to correct the casing after posting.
futurecliff•7mo ago
How did you do it? Which portion of the vLLM refactoring allowed you to get such gains?
zackify•7mo ago
Will this end up getting an OpenAI-compatible web server, or is that out of scope?
jimmySixDOF•7mo ago
A little sparse on the documentation side; I can't tell at a glance whether there is 1:1 hyperparameter tunability or whether this is an opinionated, single-path, locked, soft-FPGA, eval-hacking kind of thing.

EDIT: OK, it's legit. Here is an example of it put to use by the makers of the Dolphin open-source series of fine-tunes:

> Here I implement in nano-vllm, efficient sample-K logit extraction, as described in "Sparse Logit Sampling: Accelerating Knowledge Distillation in LLMs" by Anshumann et al. Sampling occurs on the GPU, the non-sampled logits do not get copied out of GPU space. I tried to implement this in @vllm_project, but it was a bit too heavy for me to figure out.

https://github.com/GeeeekExplorer/nano-vllm/pull/34
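
For readers unfamiliar with the idea, here is a rough CPU-only sketch of what "sample-K logit extraction" could look like: draw K token ids from the softmax distribution and keep only the logits of the ids that were drawn. This is not the code from the linked PR (which does the sampling on the GPU); the function name `sample_k_logits`, the example logits, and the choice to sample with replacement are all assumptions made for illustration.

```python
import math
import random


def sample_k_logits(logits, k, rng):
    """Hypothetical helper: return {token_id: logit} for k ids drawn from softmax(logits)."""
    # Numerically stable softmax weights (random.choices normalizes them for us).
    peak = max(logits)
    weights = [math.exp(x - peak) for x in logits]
    # Draw k token ids with replacement, proportional to their probability.
    ids = rng.choices(range(len(logits)), weights=weights, k=k)
    # Keep only the logits of sampled ids; all other logits are dropped.
    return {i: logits[i] for i in set(ids)}


rng = random.Random(0)  # fixed seed so the run is repeatable
logits = [2.0, 0.5, -1.0, 3.0, 0.0, -2.5]  # made-up vocabulary of six tokens
sparse = sample_k_logits(logits, k=3, rng=rng)
print(sparse)
```

The result holds at most K entries per position instead of one per vocabulary token, which is the saving the quote describes when it says the non-sampled logits never leave the GPU.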

baalimago•7mo ago
So... It's a language model..? As in, not "large"? I'm a bit unsure of the magnitudes here, but surely "nano" and "large" cancel out
IanCal•7mo ago
No, vLLM is a thing for serving language models: https://github.com/vllm-project/vllm
barrenko•7mo ago
Is it more like llama.cpp then? I don't have access to the good hardware.
jasonjmcghee•7mo ago
llama.cpp is optimized to serve one request at a time.

vllm is optimized to serve many requests at one time.

If you were to fine-tune a model and wanted to serve it to many users, you would use vllm, not llama.cpp.
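
A toy model may make the one-request-versus-many contrast concrete. It counts model steps under two assumptions that are simplifications, not measurements: one step decodes one token for every request in the batch, and a step costs the same regardless of batch size. The function names and request lengths are made up, and this is not vLLM's actual scheduler.

```python
from collections import deque


def steps_sequential(lengths):
    # One request at a time: every generated token costs one model step.
    return sum(lengths)


def steps_batched(lengths, max_batch):
    # Batched serving: each step decodes one token for every active request,
    # and finished requests are replaced from the waiting queue right away.
    # Assumes every length is a positive number of tokens.
    waiting = deque(lengths)
    active = []
    steps = 0
    while waiting or active:
        while waiting and len(active) < max_batch:
            active.append(waiting.popleft())
        steps += 1
        # Decrement remaining tokens; drop requests that just finished.
        active = [r - 1 for r in active if r - 1 > 0]
    return steps


lengths = [4, 2, 6, 3]  # tokens to generate per request (made-up values)
print(steps_sequential(lengths))            # 15
print(steps_batched(lengths, max_batch=2))  # 8
```

With a batch size of 1 the two functions agree, so the gap comes entirely from sharing each step across requests.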

jasonjmcghee•7mo ago
Here's a super relevant comment from another post https://news.ycombinator.com/item?id=44366418
barrenko•7mo ago
Appreciate it!
fractorial•7mo ago
Did anyone else click in excitedly after misreading ‘Vllm’ as ‘LLVM’?
omneity•7mo ago
This is an incredible achievement for a solo developer. The dev is from the DeepSeek team, by the way.
Imustaskforhelp•7mo ago
That is crazy! This is so cool ngl.
tt726259•7mo ago
After seeing the Docker image for vllm grow by 5 GB (to 10 GB!) over the past five months, I grew suspicious of vllm's development practices [1]. It's not easy, for sure, to deal with all those flaky Python modules [2].

But having the CUDA packages four times in different layers is questionable! [3]

Then again, as a college mate of mine used to say, "Don't change it. It works."

--

[1]: https://hub.docker.com/r/vllm/vllm-openai/tags

[2]: https://github.com/vllm-project/vllm/issues/13306

[3]: These kinds of workarounds tend to accumulate and never get revisited:

- https://github.com/vllm-project/vllm/commit/b07d741661570ef1...

- https://github.com/vllm-project/vllm/commit/68d37809b9b52f4d... (this one in particular probably accounts for +3 GB)

mountainriver•7mo ago
Love this project. We need more simplifications like this in the current ML environment.