Reduce gVisor Cold Starts with GPU Snapshotting

https://cerebrium.ai/blog/reducing-gpu-cold-starts-with-memory-snapshots-restoring-cuda-workloads-in-second
26•jono_irwin•1h ago

Comments

nixosbestos•1h ago
Started scrolling, immediately closed the page. Something is deeply wrong with a person who chooses to implement this shit on a webpage. Unusable garbage, I'm sorry, literally making me motion sick somehow.
htrp•54m ago
Isn't this exactly what modal does?
za_mike157•41m ago
Hey! Yes, you are correct! We have both been upstreaming changes to the main gVisor repo. However, to work within our own infrastructure we had to make various changes, which we explain throughout the article (open TCP connections, multiprocessing, Unix sockets, etc.).

Also, in our benchmarks we seem to perform better than Modal by ~20% in 4 of the 6 workloads we tested, with a lower spread of results, meaning you get more consistent performance. The same fundamentals still apply, though: how can you move storage into memory as quickly as possible?

gpgn_•39m ago
Interesting work. How does NVIDIA Dynamo Snapshot relate?
za_mike157•12m ago
There are a lot of similarities.

They run their snapshot agent as a Kubernetes DaemonSet, whereas our implementation runs as part of the Cerebrium container runtime path. Under the hood, both approaches rely on cuda-checkpoint, since cuda-checkpoint is currently the main primitive NVIDIA exposes for interacting with GPU memory during checkpoint/restore.

One difference is how KV cache handling is exposed. NVIDIA’s approach appears to automatically handle KV cache allocation/deallocation, whereas today we expose that choice to users (vLLM and SGLang expose primitives to do this). In some cases, users may want to discard the KV cache to reduce checkpoint size and restore time; in others, preserving it may be useful.
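To give a rough sense of why discarding the KV cache can shrink a checkpoint, here is a minimal sketch of the usual size estimate for a transformer KV cache. The layer count, KV head count, head dimension, token capacity, and 2-byte dtype below are illustrative assumptions, not figures from the thread or the article, and real engines may preallocate a pool sized differently.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, num_tokens, dtype_bytes=2):
    """Estimate bytes held by cached keys and values for num_tokens tokens."""
    # The leading 2 accounts for one key tensor and one value tensor per layer.
    return 2 * num_layers * num_kv_heads * head_dim * dtype_bytes * num_tokens


# Assumed, illustrative model shape and cache capacity (hypothetical values).
size = kv_cache_bytes(num_layers=36, num_kv_heads=8, head_dim=128, num_tokens=32768)
print(f"KV cache to snapshot: {size / 2**30:.2f} GiB")
```

Under these assumptions the cache alone adds several GiB that must be written at checkpoint time and read back at restore time, which is the trade-off against keeping warm cache contents.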

Their DaemonSet approach is also nice because it can be more portable across Kubernetes environments and clouds. Our approach is more deeply integrated into the node/runtime layer, which gives us tighter control over the serverless startup path, but also means it depends on custom node VM images, which not every provider supports equally.

The optimizations they mention around parallel memfd restore and Linux native AIO for anonymous memory could also be applied to our architecture if we find them stable and beneficial. That said, our current results are already pretty close. For example, they report restoring Qwen3-8B in 4.7s with those changes, while we currently restore it in 6.49s.
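For readers who want the gap in numbers, a small sketch of the arithmetic on the two restore times quoted above. The function name is hypothetical, and the comparison assumes both figures were measured under comparable conditions, which the thread does not state.

```python
def compare_restore(ours_s, theirs_s):
    """Return (absolute gap in seconds, gap relative to the faster time)."""
    gap = ours_s - theirs_s
    return gap, gap / theirs_s


# Restore times for Qwen3-8B as quoted in the comment above.
gap, rel = compare_restore(6.49, 4.7)
print(f"{gap:.2f}s slower, {rel:.0%} relative to the faster restore")
```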

The biggest thing we are excited for is multi-GPU restore, which is not supported yet. That would unlock a much broader set of workloads.

mountainriver•28m ago
How does this compare to the CRIU work? Or does it use that under the hood?
za_mike157•6m ago
No, we don't! CRIU is used for normal checkpoint/restore of Linux processes. Since we run gVisor for container isolation, we use its checkpoint/restore support for the sandboxed process state.

Both approaches still need NVIDIA’s cuda-checkpoint for the GPU side, because CUDA/GPU memory and driver state are not something a normal process checkpointing tool can handle on its own.
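The split described above (GPU state handled by cuda-checkpoint, process state handled by a separate checkpointer) implies an ordering, sketched below as a plan builder that only constructs command lists and runs nothing. The `cuda-checkpoint --toggle --pid` invocation follows NVIDIA's published utility; `process-checkpoint-tool` is a hypothetical placeholder standing in for CRIU or gVisor's own checkpoint/restore, and this is not Cerebrium's actual implementation.

```python
from typing import List


def gpu_toggle(pid: int) -> List[str]:
    # Toggles CUDA state for the process: running -> checkpointed, or back.
    return ["cuda-checkpoint", "--toggle", "--pid", str(pid)]


def checkpoint_plan(pid: int) -> List[List[str]]:
    """GPU state is moved into host memory first, then the process is dumped."""
    # Hypothetical placeholder command for the process-level checkpointer.
    dump = ["process-checkpoint-tool", "dump", str(pid)]
    return [gpu_toggle(pid), dump]


def restore_plan(pid: int) -> List[List[str]]:
    """The process is restored first, then GPU state is moved back to the device."""
    # Hypothetical placeholder command for the process-level restore.
    restore = ["process-checkpoint-tool", "restore", str(pid)]
    return [restore, gpu_toggle(pid)]


for step in checkpoint_plan(1234) + restore_plan(1234):
    print(" ".join(step))
```

Restore mirrors checkpoint in reverse because the GPU contents travel inside the process's host memory image, so the process must exist again before its CUDA state can be pushed back to the device.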
