Show HN: KV Marketplace – share LLM attention caches across GPUs like memcached

https://github.com/neelsomani/kv-marketplace
2•nsomani•13h ago

Comments

nsomani•13h ago
Hi all - this is a small research prototype I built to explore cross-GPU reuse of transformer attention states.

Inference engines like vLLM implement prefix/KV caching, but the cache is local to each replica. LMCache recently generalized this idea to multi-tier storage.

KV Marketplace focuses narrowly on the GPU-to-GPU fast path: peer-to-peer prefix reuse over RDMA or NVLink. Each process exports completed prefix KV tensors (key/value attention states) into a registry keyed by a hash of the input tokens and model version. Other processes with the same prefix can import those tensors directly from a peer GPU, bypassing host memory and avoiding redundant prefill compute.
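To make the registry idea concrete, here is a minimal in-process sketch in Python. It is not the code from the repo: the names `prefix_key`, `KVHandle`, and `PrefixRegistry` are hypothetical, the payload is a plain placeholder object rather than a peer-accessible GPU buffer, and the longest-prefix lookup is an assumption about how a match might be chosen. It only illustrates keying entries by a hash of the input tokens plus the model version.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, Optional, Sequence


def prefix_key(token_ids: Sequence[int], model_version: str) -> str:
    """Hash the model version and token ids into a registry key."""
    h = hashlib.sha256()
    h.update(model_version.encode("utf-8"))
    h.update(b"\x00")  # separator between version and tokens
    for t in token_ids:
        h.update(int(t).to_bytes(8, "little", signed=False))
    return h.hexdigest()


@dataclass
class KVHandle:
    """Hypothetical descriptor for where a prefix's KV tensors live."""
    device_id: int
    prefix_len: int
    payload: object  # stand-in for a real GPU buffer handle


@dataclass
class PrefixRegistry:
    """Toy single-process registry; a real one would be shared across processes."""
    entries: Dict[str, KVHandle] = field(default_factory=dict)

    def export(self, token_ids: Sequence[int], model_version: str,
               handle: KVHandle) -> None:
        # Publish a completed prefix under its (tokens, model version) hash.
        self.entries[prefix_key(token_ids, model_version)] = handle

    def lookup_longest(self, token_ids: Sequence[int],
                       model_version: str) -> Optional[KVHandle]:
        # Assumption: try the longest prefix first, then shrink.
        # This rehashes each candidate, which is fine for a sketch only.
        for n in range(len(token_ids), 0, -1):
            handle = self.entries.get(prefix_key(token_ids[:n], model_version))
            if handle is not None:
                return handle
        return None


if __name__ == "__main__":
    registry = PrefixRegistry()
    registry.export([1, 2, 3], "model-v1", KVHandle(0, 3, "kv-bytes"))
    print(registry.lookup_longest([1, 2, 3, 4, 5], "model-v1"))
    print(registry.lookup_longest([1, 2, 3, 4, 5], "model-v2"))
```

Including the model version in the hash means a prefix exported by one model version never matches a request served by another, even when the tokens are identical.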

Under optimistic conditions (perfect prefix importing), the prototype shows roughly a 15% reduction in latency, along with throughput gains, without heavy tuning. The code is intentionally minimal (no distributed registry, eviction, or CPU/disk tiers yet), but it serves as a prototype of "memcached for attention."

I thought others exploring distributed LLM inference, caching, or RDMA transports might find the repo useful or interesting.

Show HN: Akashi Notari – On-chain Proof of Existence for any file in 60s for <$1

https://akashi-notari.com/
2•takeshi_w•1h ago•0 comments

Show HN: AI Bubble Monitor

https://aibubblemonitor.com
4•itsnotmyai•2h ago•0 comments

Show HN: I made an open-source Rust program for memory-efficient genomics

https://github.com/logannye/rosalind
10•logannyeMD•8h ago•0 comments

Show HN: Gerbil – an open source desktop app for running LLMs locally

https://github.com/lone-cloud/gerbil
30•lone-cloud•2d ago•6 comments

Show HN: Cancer diagnosis makes for an interesting RL environment for LLMs

41•dchu17•18h ago•20 comments

Show HN: I built a platform where audiences fund debates between public thinkers

https://logosive.com
32•mcastle•14h ago•32 comments

Show HN: Chime – Full-screen meeting alerts for time blindness (macOS)

https://www.usechime.app/
3•tsormed•8h ago•0 comments

Show HN: Cactoide – Federated RSVP Platform

https://cactoide.org/
66•orbanlevi•1d ago•28 comments

Show HN: SkillGraph – Open-source agentic framework with skills instead of tools

https://github.com/tejassudsfp/skillgraph-backend
12•tejassuds•21h ago•0 comments

Show HN: Made MadLibs-style game to play with my kids

https://www.storygaps.org/
2•ronbenton•9h ago•0 comments

Show HN: ShellDash – Browser server dashboard with SSH and globe monitoring

https://shelldash.com
4•mannders•15h ago•0 comments

Show HN: Invisitris a Tetris-like game, where the placed pieces become invisible

https://invisitris.bitechunk.com/
4•eddguzzo•11h ago•1 comment

Show HN: Data Formulator – interactive AI agents for data analysis (Microsoft)

https://data-formulator.ai/
36•chenglong-hn•1d ago•11 comments

Show HN: Tusk Drift – Open-source tool for automating API tests

https://github.com/Use-Tusk/drift-node-sdk
53•Marceltan•1d ago•17 comments

Show HN: The Prompt Engineering Bible – Complete Guide to AI Communication

https://dimitriosmitsos.gumroad.com/l/prompt-engineering-bible
3•Cranot•12h ago•0 comments

Show HN: Open-Source LaTeX OCR, Alternative to Mathpix/SimpleTex

https://texocr.netlify.app/
3•alephpi•19h ago•0 comments

Show HN: SecurVO – Compliance management for service businesses

https://securVO.com
2•AaronKushner•13h ago•0 comments

Show HN: ChatExport Structurer – parse ChatGPT/Claude exports into queryable SQL

https://github.com/1ch1n/chat-export-structurer
2•chan1•14h ago•0 comments

Show HN: Venturu – Zillow for the market of local businesses

https://www.venturu.com
32•lifenautjoe•1d ago•36 comments

Show HN: Get an email when your favorite director releases a movie

https://www.premierepal.com/
2•samteeeee•14h ago•0 comments

Show HN: Built a tiny interpreter from scratch in C to understand how they work

https://github.com/renvins/toyforth-interpreter
4•renvins•14h ago•0 comments

Show HN: Creavi Macropad – Built a wireless macropad with a display

https://creavi.tech/blog/creavi-macropad-build-log/
31•cmpx•1d ago•7 comments

Show HN: Gametje – A casual online gaming platform

https://gametje.com
111•jmpavlec•1d ago•40 comments

Show HN: YaraDB – Lightweight open-source document database built with FastAPI

https://github.com/illusiOxd/yaradb
9•ashfromsky•1d ago•1 comment

Show HN: What Is Hacker News Working On?

https://waywo.eamag.me/
221•eamag•6d ago•51 comments

Show HN: Vibemail – AI-powered MJML email editor that runs in the browser

https://vibemail-beta.vercel.app/
2•elliotbnvl•16h ago•0 comments

Show HN: DeltaGlider – Store 4TB of build artifacts in 5GB

https://github.com/beshu-tech/deltaglider
7•sscarduzio•1d ago•2 comments

Show HN: SQL++ – 5x faster than Prisma (Rust)

https://github.com/sinisterMage/sqlpp
2•SinisterMage2•18h ago•2 comments

Show HN: JavaScript Engines Zoo

https://github.com/ivankra/javascript-zoo
2•ivankra•19h ago•0 comments