frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

fp.

Open in hackernews

Sharing base model in GPU VRAM across multiple inference stack process [video]

https://www.youtube.com/watch?v=OC1yyJo9zpg
7•medicis123•5mo ago

Comments

medicis123•5mo ago
We have just published a short demo of the WoolyAI GPU Hypervisor, showcasing VRAM memory sharing/deduplication. Load a single base model once, then run multiple isolated LoRA stacks or VLLM stacks on the same GPU.

Why this matters

Higher capacity: Share the base model in VRAM; add more adapters or vertical inference stacks per GPU without increasing memory usage.

Isolation & control: Each stack is its own process with independent batching and SLA-aware scheduling.

While vLLM supports multiple adapters on a single vLLM process, many teams need predictable per-adapter SLAs—this is where running independent stacks with a shared base model in VRAM can enable doing it all on the same GPU.

The demo uses LoRA inference using Pytorch, but the same applies when using vLLM. If you’re scaling LoRA inference across business units or model variants and need predictable latency without overprovisioning GPUs, I’d love your feedback. Comment or DM to chat.

Zig Package Manager Enhancements

https://ziglang.org/devlog/2026/#2026-02-06
2•jackhalford•1m ago•0 comments

Neutron Scans Reveal Hidden Water in Martian Meteorite

https://www.universetoday.com/articles/neutron-scans-reveal-hidden-water-in-famous-martian-meteorite
1•geox•2m ago•0 comments

Deepfaking Orson Welles's Mangled Masterpiece

https://www.newyorker.com/magazine/2026/02/09/deepfaking-orson-welless-mangled-masterpiece
1•fortran77•3m ago•1 comments

France's homegrown open source online office suite

https://github.com/suitenumerique
2•nar001•6m ago•1 comments

SpaceX Delays Mars Plans to Focus on Moon

https://www.wsj.com/science/space-astronomy/spacex-delays-mars-plans-to-focus-on-moon-66d5c542
1•BostonFern•6m ago•0 comments

Jeremy Wade's Mighty Rivers

https://www.youtube.com/playlist?list=PLyOro6vMGsP_xkW6FXxsaeHUkD5e-9AUa
1•saikatsg•6m ago•0 comments

Show HN: MCP App to play backgammon with your LLM

https://github.com/sam-mfb/backgammon-mcp
1•sam256•8m ago•0 comments

AI Command and Staff–Operational Evidence and Insights from Wargaming

https://www.militarystrategymagazine.com/article/ai-command-and-staff-operational-evidence-and-in...
1•tomwphillips•9m ago•0 comments

Show HN: CCBot – Control Claude Code from Telegram via tmux

https://github.com/six-ddc/ccbot
1•sixddc•10m ago•1 comments

Ask HN: Is the CoCo 3 the best 8 bit computer ever made?

1•amichail•12m ago•0 comments

Show HN: Convert your articles into videos in one click

https://vidinie.com/
2•kositheastro•15m ago•0 comments

Red Queen's Race

https://en.wikipedia.org/wiki/Red_Queen%27s_race
2•rzk•15m ago•0 comments

The Anthropic Hive Mind

https://steve-yegge.medium.com/the-anthropic-hive-mind-d01f768f3d7b
2•gozzoo•17m ago•0 comments

A Horrible Conclusion

https://addisoncrump.info/research/a-horrible-conclusion/
1•todsacerdoti•18m ago•0 comments

I spent $10k to automate my research at OpenAI with Codex

https://twitter.com/KarelDoostrlnck/status/2019477361557926281
2•tosh•19m ago•1 comments

From Zero to Hero: A Spring Boot Deep Dive

https://jcob-sikorski.github.io/me/
1•jjcob_sikorski•19m ago•0 comments

Show HN: Solving NP-Complete Structures via Information Noise Subtraction (P=NP)

https://zenodo.org/records/18395618
1•alemonti06•24m ago•1 comments

Cook New Emojis

https://emoji.supply/kitchen/
1•vasanthv•27m ago•0 comments

Show HN: LoKey Typer – A calm typing practice app with ambient soundscapes

https://mcp-tool-shop-org.github.io/LoKey-Typer/
1•mikeyfrilot•30m ago•0 comments

Long-Sought Proof Tames Some of Math's Unruliest Equations

https://www.quantamagazine.org/long-sought-proof-tames-some-of-maths-unruliest-equations-20260206/
1•asplake•31m ago•0 comments

Hacking the last Z80 computer – FOSDEM 2026 [video]

https://fosdem.org/2026/schedule/event/FEHLHY-hacking_the_last_z80_computer_ever_made/
2•michalpleban•31m ago•0 comments

Browser-use for Node.js v0.2.0: TS AI browser automation parity with PY v0.5.11

https://github.com/webllm/browser-use
1•unadlib•32m ago•0 comments

Michael Pollan Says Humanity Is About to Undergo a Revolutionary Change

https://www.nytimes.com/2026/02/07/magazine/michael-pollan-interview.html
2•mitchbob•32m ago•1 comments

Software Engineering Is Back

https://blog.alaindichiappari.dev/p/software-engineering-is-back
2•alainrk•33m ago•1 comments

Storyship: Turn Screen Recordings into Professional Demos

https://storyship.app/
1•JohnsonZou6523•34m ago•0 comments

Reputation Scores for GitHub Accounts

https://shkspr.mobi/blog/2026/02/reputation-scores-for-github-accounts/
2•edent•37m ago•0 comments

A BSOD for All Seasons – Send Bad News via a Kernel Panic

https://bsod-fas.pages.dev/
1•keepamovin•40m ago•0 comments

Show HN: I got tired of copy-pasting between Claude windows, so I built Orcha

https://orcha.nl
1•buildingwdavid•40m ago•0 comments

Omarchy First Impressions

https://brianlovin.com/writing/omarchy-first-impressions-CEEstJk
2•tosh•46m ago•1 comments

Reinforcement Learning from Human Feedback

https://arxiv.org/abs/2504.12501
7•onurkanbkrc•47m ago•0 comments