
Sharing base model in GPU VRAM across multiple inference stack processes [video]

https://www.youtube.com/watch?v=OC1yyJo9zpg
7•medicis123•5mo ago

Comments

medicis123•5mo ago
We have just published a short demo of the WoolyAI GPU Hypervisor, showcasing VRAM sharing/deduplication. Load a single base model once, then run multiple isolated LoRA stacks or vLLM stacks on the same GPU.

Why this matters

Higher capacity: Share the base model in VRAM; add more adapters or vertical inference stacks per GPU without duplicating the base model's memory.

Isolation & control: Each stack is its own process with independent batching and SLA-aware scheduling.
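
To put rough numbers on the capacity point, here is a back-of-the-envelope sketch. It is not taken from the demo: the function name and the sizes (a 14 GiB base model, 0.25 GiB per adapter) are hypothetical, and it counts only weight memory, ignoring KV cache, activations, and runtime overhead.

```python
def weight_vram_gib(base_gib: float, adapter_gib: float,
                    stacks: int, shared_base: bool) -> float:
    """Estimate weight memory (GiB) for `stacks` LoRA stacks on one GPU.

    Hypothetical helper for illustration only. It ignores KV cache,
    activations, and per-process runtime overhead.
    """
    # With a shared base there is one copy of the base weights in VRAM;
    # otherwise every stack process loads its own copy.
    base_copies = 1 if shared_base else stacks
    return base_copies * base_gib + stacks * adapter_gib


# Assumed sizes, not measurements from the demo.
BASE_GIB = 14.0
ADAPTER_GIB = 0.25

for n in (1, 2, 4):
    separate = weight_vram_gib(BASE_GIB, ADAPTER_GIB, n, shared_base=False)
    shared = weight_vram_gib(BASE_GIB, ADAPTER_GIB, n, shared_base=True)
    print(f"{n} stacks: separate={separate:.2f} GiB, shared={shared:.2f} GiB")
```

Under these assumptions, memory for separate copies grows with the base model size per stack, while the shared case grows only by the adapter size per stack.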

While vLLM supports multiple adapters in a single vLLM process, many teams need predictable per-adapter SLAs. Running independent stacks against a shared base model in VRAM makes that possible on a single GPU.

The demo runs LoRA inference with PyTorch, but the same approach applies to vLLM. If you’re scaling LoRA inference across business units or model variants and need predictable latency without overprovisioning GPUs, I’d love your feedback. Comment or DM to chat.