frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

fp.

OpenAI might pivot to the "most addictive digital friend" or face extinction

https://twitter.com/lebed2045/status/2020184853271167186
1•lebed2045•35s ago•1 comments

Show HN: Know how your SaaS is doing in 30 seconds

https://anypanel.io
1•dasfelix•53s ago•0 comments

ClawdBot Ordered Me Lunch

https://nickalexander.org/drafts/auto-sandwich.html
1•nick007•1m ago•0 comments

What the News media thinks about your Indian stock investments

https://stocktrends.numerical.works/
1•mindaslab•2m ago•0 comments

Running Lua on a tiny console from 2001

https://ivie.codes/page/pokemon-mini-lua
1•Charmunk•3m ago•0 comments

Google and Microsoft Paying Creators $500K+ to Promote AI Tools

https://www.cnbc.com/2026/02/06/google-microsoft-pay-creators-500000-and-more-to-promote-ai.html
2•belter•5m ago•0 comments

New filtration technology could be game-changer in removal of PFAS

https://www.theguardian.com/environment/2026/jan/23/pfas-forever-chemicals-filtration
1•PaulHoule•6m ago•0 comments

Show HN: I saw this cool navigation reveal, so I made a simple HTML+CSS version

https://github.com/Momciloo/fun-with-clip-path
1•momciloo•7m ago•0 comments

Kinda Surprised by Seadance2's Moderation

https://seedanceai.me/
1•ri-vai•7m ago•1 comments

I Write Games in C (yes, C)

https://jonathanwhiting.com/writing/blog/games_in_c/
2•valyala•7m ago•0 comments

Django scales. Stop blaming the framework (part 1 of 3)

https://medium.com/@tk512/django-scales-stop-blaming-the-framework-part-1-of-3-a2b5b0ff811f
1•sgt•7m ago•0 comments

Malwarebytes Is Now in ChatGPT

https://www.malwarebytes.com/blog/product/2026/02/scam-checking-just-got-easier-malwarebytes-is-n...
1•m-hodges•7m ago•0 comments

Thoughts on the job market in the age of LLMs

https://www.interconnects.ai/p/thoughts-on-the-hiring-market-in
1•gmays•8m ago•0 comments

Show HN: Stacky – certain block game clone

https://www.susmel.com/stacky/
2•Keyframe•11m ago•0 comments

AIII: A public benchmark for AI narrative and political independence

https://github.com/GRMPZQUIDOS/AIII
1•GRMPZ23•11m ago•0 comments

SectorC: A C Compiler in 512 bytes

https://xorvoid.com/sectorc.html
2•valyala•12m ago•0 comments

The API Is a Dead End; Machines Need a Labor Economy

1•bot_uid_life•13m ago•0 comments

Digital Iris [video]

https://www.youtube.com/watch?v=Kg_2MAgS_pE
1•Jyaif•15m ago•0 comments

New wave of GLP-1 drugs is coming–and they're stronger than Wegovy and Zepbound

https://www.scientificamerican.com/article/new-glp-1-weight-loss-drugs-are-coming-and-theyre-stro...
4•randycupertino•16m ago•0 comments

Convert tempo (BPM) to millisecond durations for musical note subdivisions

https://brylie.music/apps/bpm-calculator/
1•brylie•18m ago•0 comments

Show HN: Tasty A.F.

https://tastyaf.recipes/about
1•adammfrank•19m ago•0 comments

The Contagious Taste of Cancer

https://www.historytoday.com/archive/history-matters/contagious-taste-cancer
1•Thevet•21m ago•0 comments

U.S. Jobs Disappear at Fastest January Pace Since Great Recession

https://www.forbes.com/sites/mikestunson/2026/02/05/us-jobs-disappear-at-fastest-january-pace-sin...
1•alephnerd•21m ago•1 comments

Bithumb mistakenly hands out $195M in Bitcoin to users in 'Random Box' giveaway

https://koreajoongangdaily.joins.com/news/2026-02-07/business/finance/Crypto-exchange-Bithumb-mis...
1•giuliomagnifico•21m ago•0 comments

Beyond Agentic Coding

https://haskellforall.com/2026/02/beyond-agentic-coding
3•todsacerdoti•22m ago•0 comments

OpenClaw ClawHub Broken Windows Theory – If basic sorting isn't working what is?

https://www.loom.com/embed/e26a750c0c754312b032e2290630853d
1•kaicianflone•24m ago•0 comments

OpenBSD Copyright Policy

https://www.openbsd.org/policy.html
1•Panino•25m ago•0 comments

OpenClaw Creator: Why 80% of Apps Will Disappear

https://www.youtube.com/watch?v=4uzGDAoNOZc
2•schwentkerr•29m ago•0 comments

What Happens When Technical Debt Vanishes?

https://ieeexplore.ieee.org/document/11316905
2•blenderob•30m ago•0 comments

AI Is Finally Eating Software's Total Market: Here's What's Next

https://vinvashishta.substack.com/p/ai-is-finally-eating-softwares-total
3•gmays•30m ago•0 comments
Open in hackernews

How to Give Your RTX 4090 Nearly Infinite Memory for LLM Inference

https://medium.com/data-science-collective/how-to-give-your-rtx-gpu-nearly-infinite-memory-for-llm-inference-de2c57af1e82
2•dikobraz•5mo ago

Comments

dikobraz•5mo ago
We explored a network-attached KV-cache for consumer GPUs to offset their limited VRAM. It doesn’t make RTX cards run giant models efficiently. Still, for workloads that repeatedly reuse lengthy prefixes—such as chatbots, coding assistants, and multi-turn threads—it delivers a 2–4× speedup in RPS and time-to-first-token on 7B and 70B models.

How it works: On return visits, instead of re-running the prompt through the model, we fetch previously computed KV blocks from network storage and skip re-computing those tokens (i.e., we avoid re-running prefill on repeated prefixes). This is helpful when VRAM can’t hold all sessions, and users pause between messages, which is almost always the case.

Why RTX benefits: Prefill is the computationally intensive part (quadratic attention, numerous reductions, and inter-GPU traffic). Without NVLink, PCIe becomes the choke point in multi-GPU setups. KV-caching cuts repeated prefill, leaving mostly the lighter decoding step—something PCIe-only RTX nodes handle well.

Results & endpoint: - 2–4× speedup on multi-turn benchmarks (RPS & TTFT) with RTX 4090. - We’ve opened one free public endpoint for demos, not production grade (https://console.cloudrift.ai/inference?modelId=meta-llama%2F...). Ping us at hello@cloudrift.ai if you need a reliable setup.

Technical Notes: - Works with consumer and data-center GPUs. In theory, you can even split roles: NVLink boxes do prefill, while cheaper RTX pods serve as decoders using stored KV. - We use special hardware to reduce fetch overhead and offload the CPU, but you can reproduce this at home with a regular NAS (with lower peak performance). - For a more in-depth walkthrough of the math and architecture of a KV-cache solution, please watch this video from the KV-cache solution vendor (https://www.youtube.com/watch?si=T69vxku8xPr6p7I0&v=CV4FYMTF...)