frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

fp.

The rocky 1960s origins of online dating (2025)

https://www.bbc.com/culture/article/20250206-the-rocky-1960s-origins-of-online-dating
1•1659447091•2m ago•0 comments

Show HN: Agent-fetch – Sandboxed HTTP client with SSRF protection for AI agents

https://github.com/Parassharmaa/agent-fetch
1•paraaz•3m ago•0 comments

Why there is no official statement from Substack about the data leak

https://techcrunch.com/2026/02/05/substack-confirms-data-breach-affecting-email-addresses-and-pho...
4•witnessme•7m ago•1 comments

Effects of Zepbound on Stool Quality

https://twitter.com/ScottHickle/status/2020150085296775300
1•aloukissas•11m ago•0 comments

Show HN: Seedance 2.0 – The Most Powerful AI Video Generator

https://seedance.ai/
1•bigbromaker•14m ago•0 comments

Ask HN: Do we need "metadata in source code" syntax that LLMs will never delete?

1•andrewstuart•19m ago•1 comments

Pentagon cutting ties w/ "woke" Harvard, ending military training & fellowships

https://www.cbsnews.com/news/pentagon-says-its-cutting-ties-with-woke-harvard-discontinuing-milit...
5•alephnerd•22m ago•1 comments

Can Quantum-Mechanical Description of Physical Reality Be Considered Complete? [pdf]

https://cds.cern.ch/record/405662/files/PhysRev.47.777.pdf
1•northlondoner•22m ago•1 comments

Kessler Syndrome Has Started [video]

https://www.tiktok.com/@cjtrowbridge/video/7602634355160206623
1•pbradv•25m ago•0 comments

Complex Heterodynes Explained

https://tomverbeure.github.io/2026/02/07/Complex-Heterodyne.html
3•hasheddan•26m ago•0 comments

EVs Are a Failed Experiment

https://spectator.org/evs-are-a-failed-experiment/
3•ArtemZ•37m ago•5 comments

MemAlign: Building Better LLM Judges from Human Feedback with Scalable Memory

https://www.databricks.com/blog/memalign-building-better-llm-judges-human-feedback-scalable-memory
1•superchink•38m ago•0 comments

CCC (Claude's C Compiler) on Compiler Explorer

https://godbolt.org/z/asjc13sa6
2•LiamPowell•40m ago•0 comments

Homeland Security Spying on Reddit Users

https://www.kenklippenstein.com/p/homeland-security-spies-on-reddit
3•duxup•43m ago•0 comments

Actors with Tokio (2021)

https://ryhl.io/blog/actors-with-tokio/
1•vinhnx•44m ago•0 comments

Can graph neural networks for biology realistically run on edge devices?

https://doi.org/10.21203/rs.3.rs-8645211/v1
1•swapinvidya•56m ago•1 comments

Deeper into the shareing of one air conditioner for 2 rooms

1•ozzysnaps•58m ago•0 comments

Weatherman introduces fruit-based authentication system to combat deep fakes

https://www.youtube.com/watch?v=5HVbZwJ9gPE
3•savrajsingh•59m ago•0 comments

Why Embedded Models Must Hallucinate: A Boundary Theory (RCC)

http://www.effacermonexistence.com/rcc-hn-1-1
1•formerOpenAI•1h ago•2 comments

A Curated List of ML System Design Case Studies

https://github.com/Engineer1999/A-Curated-List-of-ML-System-Design-Case-Studies
3•tejonutella•1h ago•0 comments

Pony Alpha: New free 200K context model for coding, reasoning and roleplay

https://ponyalpha.pro
1•qzcanoe•1h ago•1 comments

Show HN: Tunbot – Discord bot for temporary Cloudflare tunnels behind CGNAT

https://github.com/Goofygiraffe06/tunbot
2•g1raffe•1h ago•0 comments

Open Problems in Mechanistic Interpretability

https://arxiv.org/abs/2501.16496
2•vinhnx•1h ago•0 comments

Bye Bye Humanity: The Potential AMOC Collapse

https://thatjoescott.com/2026/02/03/bye-bye-humanity-the-potential-amoc-collapse/
3•rolph•1h ago•0 comments

Dexter: Claude-Code-Style Agent for Financial Statements and Valuation

https://github.com/virattt/dexter
1•Lwrless•1h ago•0 comments

Digital Iris [video]

https://www.youtube.com/watch?v=Kg_2MAgS_pE
1•vermilingua•1h ago•0 comments

Essential CDN: The CDN that lets you do more than JavaScript

https://essentialcdn.fluidity.workers.dev/
1•telui•1h ago•1 comments

They Hijacked Our Tech [video]

https://www.youtube.com/watch?v=-nJM5HvnT5k
2•cedel2k1•1h ago•0 comments

Vouch

https://twitter.com/mitchellh/status/2020252149117313349
41•chwtutha•1h ago•7 comments

HRL Labs in Malibu laying off 1/3 of their workforce

https://www.dailynews.com/2026/02/06/hrl-labs-cuts-376-jobs-in-malibu-after-losing-government-work/
4•osnium123•1h ago•1 comments
Open in hackernews

Nano-Vllm: Lightweight vLLM implementation built from scratch

https://github.com/GeeeekExplorer/nano-vllm
125•simonpure•7mo ago

Comments

unwind•7mo ago
Meta: the Title Casing in the title is pretty obnoxious, "Vllm" is exactly the inverse, casing-wise, of how the project spells its name.
msephton•7mo ago
Fwiw op has a small window of time to correct the casing after posting
futurecliff•7mo ago
how did u do it? which portion of vllm refactoring allowed u to get such gains.
zackify•7mo ago
Will this end up getting an open ai compatible web server or is that out of scope.
jimmySixDOF•7mo ago
Little sparse on the documentation side can't tell at a glance if there is a 1:1 hyperperameter tuneability or if this is an opinionated single path locked soft fpga eval-hacking kind of thing.

EDIT: -- Ok, it's legit, here is an example of it put to use by the makers of the Dolphin OpenSource series of FineTunes:

> Here I implement in nano-vllm, efficient sample-K logit extraction, as described in "Sparse Logit Sampling: Accelerating Knowledge Distillation in LLMs" by Anshumann et. al. Sampling occurs on the GPU, the non-sampled logits do not get copied out of GPU space. I tried to implement this in @vllm_project, but it was a bit too heavy for me to figure out.

https://github.com/GeeeekExplorer/nano-vllm/pull/34

baalimago•7mo ago
So... It's a language model..? As in, not "large"? I'm a bit unsure of the magnitudes here, but surely "nano" and "large" cancel out
IanCal•7mo ago
No, vLLM is a thing for serving language models: https://github.com/vllm-project/vllm
barrenko•7mo ago
Is it more like llama.cpp then? I don't have access to the good hardware.
jasonjmcghee•7mo ago
llama.cpp is optimized to serve one request at a time.

vllm is optimized to serve many requests at one time.

If you were to fine tune a model and wanted to serve it to many users, you would use vllm, not llama.cpp

jasonjmcghee•7mo ago
Here's a super relevant comment from another post https://news.ycombinator.com/item?id=44366418
barrenko•7mo ago
Appreciate it!
fractorial•7mo ago
Did anyone else click in excitedly after misreading ‘Vllm’ as ‘LLVM?’
omneity•7mo ago
This is an incredible achievement for a solo developer. The dev is from the Deepseek team by the way.
Imustaskforhelp•7mo ago
That is crazy! This is so cool ngl.
tt726259•7mo ago
After seeing the Docker image for vllm jump +5Gb (to 10Gb!) over the past five months, I grew suspicious of vllm's development practices [1]. It's not easy, for sure, to deal with all those flaky python modules [2].

But having the CUDA packages four times in different layers is questionable! [3]

Yet again, as a college mate of mine used to say, "Don't change it. It works."

--

[1]: https://hub.docker.com/r/vllm/vllm-openai/tags

[2]: https://github.com/vllm-project/vllm/issues/13306

[3]: These kinds of workarounds tend to end up accumulating and never get reviewed back:

- https://github.com/vllm-project/vllm/commit/b07d741661570ef1...

- https://github.com/vllm-project/vllm/commit/68d37809b9b52f4d... (this one in particular probably accounts for +3Gb)

mountainriver•7mo ago
Love this project, we need more simplifications like this in the current ML environment