KVarN: Native vLLM KV-cache quantization back end by Huawei

https://github.com/huawei-csl/KVarN

44•theanonymousone•1h ago

Comments

v3ss0n•1h ago

Why is this not a PR for vLLM?

esafak•1h ago

It's the output of a research paper; the authors are not trying to build up vLLM, and they probably have no incentive to do so. You can submit a PR, though! It's easier now while the divergence is low, so don't wait. Since there are six authors, I bet you could get help with the inevitable review chores if you just take the step of creating the PR.

edit: It might not be clear that it is based on vLLM 0.22, which is the current version: https://github.com/huawei-csl/KVarN/commit/d6290e99098d7426d.... All you have to do is create a diff off it; it's fairly straightforward.

jmalicki•54m ago

And with the help of AI, pointing an AI at this paper and saying "make a vLLM PR from this paper" tends to work surprisingly well, even if you need to nudge it a little along the way.

throwa356262•1h ago

Better performance than TQ and better quality than FP16?

Am I reading this right??

thefox96•6m ago

Faster than FP16, not better quality, I guess.

qeternity•4m ago

It's not better quality: 59.3% vs. 59.4% for FP16 on AIME 25.

VoidZero Is Joining Cloudflare

https://blog.cloudflare.com/voidzero-joins-cloudflare/

362•coloneltcb•4h ago•185 comments

Ian's Secure Shoelace Knot

https://www.fieggen.com/shoelace/secureknot.htm

267•mooreds•5h ago•105 comments

Now Is the Best Time to Be a Duct Tape Engineer

https://derwiki.medium.com/now-is-the-best-time-to-be-a-duct-tape-engineer-eefc1d141c23

43•derwiki•3d ago•24 comments

They’re made out of weights

https://maxleiter.com/blog/weights

1158•MaxLeiter•17h ago•495 comments

Zettascale (YC S24) Is Hiring Founding FPGA Engineers

https://www.ycombinator.com/companies/zettascale/jobs/O9S1vqO-founding-engineer-fpga-rtl-asic-arc...

1•el_al•8m ago

Gaussian Point Splatting

https://momentsingraphics.de/Siggraph2026.html

134•ibobev•6h ago•46 comments

U.S. Army Corps of Engineers Bay Model

https://en.wikipedia.org/wiki/U.S._Army_Corps_of_Engineers_Bay_Model

139•tosh•1d ago•38 comments

3D-printed book turns its own G-code into raised lettering

https://www.designboom.com/design/3d-printed-book-manual-darius-ou-benson-chong/

29•surprisetalk•2d ago•14 comments

In a first, wind and solar generated more power than gas globally in April 2026

https://electrek.co/2026/05/20/in-a-first-wind-solar-generated-more-power-than-gas-globally-april...

207•speckx•2h ago•188 comments

Elixir v1.20: Now a gradually typed language

https://elixir-lang.org/blog/2026/06/03/elixir-v1-20-0-released/

911•cloud8421•22h ago•364 comments

French-Iranian author Marjane Satrapi, author of 'Persepolis', dies at 56

https://www.france24.com/en/culture/20260604-french-iranian-author-marjane-satrapi-author-of-pers...

299•fidotron•5h ago•85 comments

Show HN: Open Terminal – A Bloomberg Style App for Research

https://tesseractanalytics.ai/

5•tessbi•1h ago•5 comments

The LLM warnings Google fired Timnit Gebru over have all come true

https://www.tumblr.com/dreaminginthedeepsouth/817865966907228160/darren-oconnor-timnit-gebru-was-...

57•thdr•1h ago•20 comments

Gemma 4 12B: A unified, encoder-free multimodal model

https://blog.google/innovation-and-ai/technology/developers-tools/introducing-gemma-4-12b/

962•rvz•1d ago•362 comments

Show HN: Prela – Purely Algebraic Relation Combinators

https://github.com/remysucre/prela

35•remywang•3d ago•6 comments

I built a vulnerable app and spent $1,500 seeing if LLMs could hack it

https://kasra.blog/blog/i-spent-1500-seeing-if-llms-could-hack-my-app/

330•jc4p•16h ago•174 comments

Google Employees Internally Share Memes About How Its AI Sucks

https://www.404media.co/google-employees-internally-share-memes-about-how-its-ai-sucks/

102•elorant•1h ago•68 comments

Artificial intelligence is not conscious

https://www.theatlantic.com/philosophy/2026/06/no-artificial-intelligence-is-not-conscious/687378/

634•lordleft•23h ago•1049 comments

Under Notre Dame, a 'dig of the century' unearths 1,700 years of history

https://apnews.com/article/notre-dame-dig-treasures-paris-archaeology-roman-dae41f792c1402faf32a8...

130•cobbzilla•2d ago•31 comments

Sum-product, unit distances, and number fields

https://www.erdosproblems.com/forum/thread/blog:6

3•robinhouston•3d ago•0 comments

UK media fails to disclose defence sector links in nearly 60% of cases

https://aoav.org.uk/2026/military-experts-or-arms-industry-insiders-uk-media-fails-to-disclose-de...

338•XzetaU8•8h ago•192 comments

Show HN: Boxes.dev: ditch localhost; run Claude Code and Codex in the cloud

https://boxes.dev

42•nab•2h ago•14 comments

I was recently diagnosed with anti-NMDA receptor encephalitis

https://burntsushi.net/encephalitis/

701•Tomte•1d ago•226 comments

The ways we contain Claude across products

https://www.anthropic.com/engineering/how-we-contain-claude

191•jbredeche•16h ago•84 comments

Uber's $1,500/month AI limit is a useful signal for AI tool pricing

https://simonwillison.net/2026/Jun/3/uber-caps-usage/

573•pdyc•1d ago•701 comments

Learn SQL Once, Use It for 30 Years

https://fagnerbrack.com/learn-sql-once-use-it-for-30-years-9aceb0bdee03

214•karakoram•3d ago•168 comments

thunderbolt-ibverbs: We have InfiniBand at home

https://blog.hellas.ai/blog/thunderbolt-ibverbs/

99•zdw•2d ago•7 comments

Failing grades soar with AI usage, dwindling math skills in Berkeley CS classes

https://www.dailycal.org/news/campus/academics/failing-grades-soar-as-professors-see-greater-ai-u...

574•littlexsparkee•16h ago•544 comments

DaVinci Resolve 21

https://www.blackmagicdesign.com/products/davinciresolve/whatsnew

516•pentagrama•1d ago•229 comments