Reuse non-prefix KV Cache and speed up RAG by 3X with LMCache

https://github.com/LMCache/LMCache-Examples/blob/main/demo-rag-blending/README.md
5•lihanc111•7mo ago

Comments

lihanc111•7mo ago
Hey HN Community!

A while back, we shared our open-source project LMCache here and were blown away by the incredible support and feedback. Today, our team is thrilled to share more about one of our core components: CacheBlend. Recognized with a Best Paper Award at ACM EuroSys 2025, this technique is a painkiller for efficient RAG applications.

The Problem: Your KV Cache is Wasting Potential

In modern LLM applications like RAG and agents, we constantly feed the model new context. For example, in RAG, we retrieve relevant documents and stuff them into the prompt.

The issue is that this dynamically retrieved context doesn't always appear at the beginning of the input sequence. Traditional KV caching only reuses a "common prefix," so if the new information isn't at the very start, the cache hit rate plummets, and your GPU ends up recomputing the same things over and over.

The Solution: CacheBlend - 100% Hit Rate, No Compromises

CacheBlend changes the game by allowing for the reuse of pre-computed KV caches regardless of their position in the input sequence.

This means we can finally achieve a 100% KV Cache hit rate in applications like RAG. The performance gains are significant:

Faster Time-To-First-Token (TTFT): Get your initial response much quicker.

More Throughput: Serve significantly more users with the same hardware.

Almost lossless Output Quality: All of this is achieved with little degradation in the model's generation quality.
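
To make the prefix limitation concrete, here is a toy Python sketch (not LMCache code). It treats a prompt as a sequence of chunks and counts how many tokens could be served from cache under prefix-only reuse versus position-independent chunk reuse. The chunk names, token strings, and both helper functions are made up for illustration.

```python
from typing import Sequence, Set, Tuple

Chunk = Tuple[str, ...]


def prefix_hit_tokens(cached: Sequence[str], prompt: Sequence[str]) -> int:
    """Count tokens reusable when only a common prefix can be reused."""
    hits = 0
    for old, new in zip(cached, prompt):
        if old != new:
            break
        hits += 1
    return hits


def chunk_hit_tokens(cached_chunks: Set[Chunk], prompt_chunks: Sequence[Chunk]) -> int:
    """Count tokens reusable when any previously seen chunk can be reused,
    no matter where it appears in the new prompt."""
    return sum(len(chunk) for chunk in prompt_chunks if chunk in cached_chunks)


# Hypothetical chunks: a system prompt and two retrieved documents.
SYSTEM: Chunk = ("sys1", "sys2")
DOC_A: Chunk = ("a1", "a2", "a3")
DOC_B: Chunk = ("b1", "b2", "b3")

# First request: system prompt, doc A, doc B, then the question.
first_prompt = SYSTEM + DOC_A + DOC_B + ("q1",)

# Second request retrieves the same documents in a different order.
second_chunks = [SYSTEM, DOC_B, DOC_A, ("q2",)]
second_prompt = tuple(tok for chunk in second_chunks for tok in chunk)

prefix_hits = prefix_hit_tokens(first_prompt, second_prompt)
chunk_hits = chunk_hit_tokens({SYSTEM, DOC_A, DOC_B}, second_chunks)

print(f"prefix-only reuse: {prefix_hits}/{len(second_prompt)} tokens")
print(f"chunk-level reuse: {chunk_hits}/{len(second_prompt)} tokens")
```

In this toy case, swapping the order of the two retrieved documents leaves only the shared system prompt reusable under prefix matching, while chunk-level reuse still covers everything except the new question. Counting reusable tokens says nothing about output quality, which is what the two steps described next are meant to protect.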

How does it work?

CacheBlend handles the two main challenges of reusing non-prefix caches:

Positional Encoding Update: It efficiently updates positional encodings to ensure the model always knows the correct position of each token, even when we're stitching together cached and new data.

Selective Attention Recalculation: Instead of recomputing everything, it strategically recalculates only the minimal cross-attention needed between the new and cached chunks to preserve generation quality.
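
A rough Python sketch of how these two steps could look, under assumptions the post does not spell out: it assumes the model uses rotary position embeddings (RoPE), where a key cached at one position can be moved to another by rotating it by the position difference, and it assumes some per-token "deviation" score is available to decide which tokens to recompute. The function names, the scoring input, and the ratio are hypothetical, not the LMCache API.

```python
import math
from typing import List


def rope_rotate(vec: List[float], position: int, base: float = 10000.0) -> List[float]:
    """Apply a rotary position embedding to `vec` (even length) at `position`."""
    dim = len(vec)
    out: List[float] = []
    for i in range(0, dim, 2):
        theta = position * base ** (-i / dim)
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out.extend([x * c - y * s, x * s + y * c])
    return out


def shift_cached_key(cached_key: List[float], old_pos: int, new_pos: int) -> List[float]:
    """Move an already-rotated key from `old_pos` to `new_pos`.

    Rotations compose, so rotating by the difference is enough;
    the key does not have to be recomputed from the hidden state.
    """
    return rope_rotate(cached_key, new_pos - old_pos)


def select_tokens_to_recompute(deviation: List[float], ratio: float) -> List[int]:
    """Pick the fraction `ratio` of tokens with the largest deviation score.

    `deviation` is an assumed per-token estimate of how far the cached KV
    is from what full recomputation would produce.
    """
    k = max(1, math.ceil(ratio * len(deviation)))
    ranked = sorted(range(len(deviation)), key=lambda i: deviation[i], reverse=True)
    return sorted(ranked[:k])


# A key cached at position 3 is reused at position 7 in a new prompt.
raw_key = [1.0, 0.0, 0.5, -0.5]
cached_at_3 = rope_rotate(raw_key, 3)
moved_to_7 = shift_cached_key(cached_at_3, old_pos=3, new_pos=7)
computed_at_7 = rope_rotate(raw_key, 7)

# Recompute only half of the tokens in this toy chunk.
to_recompute = select_tokens_to_recompute([0.1, 0.9, 0.05, 0.7], ratio=0.5)
```

Here `moved_to_7` matches `computed_at_7` up to floating-point error, which is why the positional fix is cheap under the RoPE assumption. The selection step is only the bookkeeping part; how the deviation is measured and how many tokens need recomputing are covered in the paper.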

For detailed analysis, please refer to the official paper: https://dl.acm.org/doi/10.1145/3689031.3696098

Where can I try it?

Our official repo is at: https://github.com/LMCache/LMCache

The newest interactive CacheBlend demo is at: https://github.com/LMCache/LMCache-Examples/tree/main/demo-r...

Ask us anything!