
Hi HN,

I am an early adopter of containers with a background in HPC. From the early days, I’ve tried to merge container tech into the HPC (and HTC) stack. Containers already make packaging and deployment easier, especially in AI/ML and data science. How about checkpoint/restore?

Over the last couple of months, we at MemVerge have developed a Kubernetes Operator for transparent checkpointing and restoring, allowing you to use discounted Spot instances for long-running workloads, like bioinformatics workflows or ML training.

Here’s how it works:

- The operator attaches a PVC to your pod.
- It intercepts the STOP signal to checkpoint the pod.
- If the attached PVC contains a checkpoint when the pod starts again, the pod is restored from it instead of starting from scratch.
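To make the flow above concrete, here is a rough application-level analogy in Python. It is not how the operator is implemented: the operator checkpoints the whole process tree transparently, with no changes to your code. The `Workload` class, the `checkpoint.json` file name, and the temporary directory standing in for the PVC mount are all made up for illustration.

```python
import json
import os
import tempfile
from pathlib import Path


class Workload:
    """Toy long-running job whose only state is a step counter."""

    def __init__(self, volume):
        # `volume` stands in for the PVC mount path (hypothetical layout).
        self.checkpoint_file = Path(volume) / "checkpoint.json"
        self.step = 0
        self.stop_requested = False

    def restore_if_present(self):
        """Resume from a checkpoint if the volume already holds one."""
        if not self.checkpoint_file.exists():
            return False
        self.step = json.loads(self.checkpoint_file.read_text())["step"]
        return True

    def checkpoint(self):
        """Write to a temp file, then rename, so a partial file is never seen."""
        tmp = self.checkpoint_file.with_suffix(".tmp")
        tmp.write_text(json.dumps({"step": self.step}))
        os.replace(tmp, self.checkpoint_file)

    def request_stop(self, signum=None, frame=None):
        """Usable as a signal handler: it only sets a flag."""
        self.stop_requested = True

    def run(self, total_steps, on_step=None):
        """Return True when finished, False when stopped after checkpointing."""
        while self.step < total_steps:
            if self.stop_requested:
                self.checkpoint()
                return False
            self.step += 1
            if on_step is not None:
                on_step(self)
        return True


def demo():
    """Simulate one interruption followed by a restart on the same volume."""
    with tempfile.TemporaryDirectory() as volume:
        first = Workload(volume)
        first.restore_if_present()  # nothing to restore on a fresh volume
        # In a real process you would register the handler instead, e.g.
        #   import signal; signal.signal(signal.SIGTERM, first.request_stop)
        # Here the stop is simulated at step 3 to keep the demo deterministic.
        first.run(10, on_step=lambda w: w.request_stop() if w.step == 3 else None)
        interrupted_at = first.step

        second = Workload(volume)  # the "restarted pod" on the same volume
        second.restore_if_present()
        resumed_from = second.step
        second.run(10)
        return interrupted_at, resumed_from, second.step
```

Calling `demo()` returns `(3, 3, 10)`: the first run stops and checkpoints at step 3, and the second run picks up from step 3 and finishes. The point of transparent checkpointing is that you get this behavior without writing any of the save/restore logic yourself.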

Here’s a 2m30s video that demonstrates interrupting a small training workload: https://youtu.be/K9yY6_2255Y

A checkpoint can be triggered by someone draining the node (e.g., due to an EC2 Spot reclaim), deleting a pod, or another operator acting on its own logic. Our checkpoint engine captures every aspect of the process tree within the container: memory pages, file descriptors, and even TCP connections if you want us to. Until recently, it targeted CPU use cases only. We’ve now added support for NVIDIA GPUs, with AMD GPUs coming soon (via upstream CRIU plugins).

I’ve tried it on some typical checkpoint/restore workloads (e.g., Jupyter notebooks, traditional jobs) and would love to hear what kinds of workloads you’re interested in checkpointing and restoring.

You can try it out in your Kubernetes environment with our 60-day trial: https://form.typeform.com/to/vZujMYxI

Language Transfer – The Thinking Method (free language courses)

Geodesy for the Layman (1984)

Show HN: StopAddict – Quit addictions with a clean, gamified tracker

GUI in Pure Rust

Show HN: I Made an Extension That Makes You Money

Rust 1.88.0 hits stable with let-chains support

You Don't Own the Word "Freedom"

Show HN: Zenta – Mindfulness for Terminal Users

Show HN: Zeptaframe – Open-source click-and-drag precision for AI video gen

DeepSeek R2 launch stalled as CEO balks at progress

Extending Anthropic's Agent Workflows with Recursive Planning

Show HN: 10x Kubernetes Cluster on Hetzner Cloud

Get AI-powered command suggestions directly in your zsh shell

Apple reveals complex system of App Store fees to avoid E.U. fine of 500M euros

Windows Resiliency Initiative: Building resilience for a future-ready enterprise

Why Go Rocks for Building a Lua Interpreter

Simplifying Vulkan Synchronization

Police identify seven as main suspects in Post Office Horizon scandal inquiry

The 90% Gravity Problem: Why We Tend to Quit Right Before the Finish Line

Show HN: Tic-Tac-Toe in Pure CSS (No JavaScript/HTML)

How I Lost My Career and Started Delivering Mail

Salesforce CEO Claims Half of the Company's Work Is Now Done by AI

An educational website for forex traders

From Side Project to 10k Monthly Users: My Lessons from Building a Dev Tool Solo

Scoop: Trump admin cuts contracts with scientific publishing giant

AIVocal-AI Podcast

Book Review: Developing Talent in Young People by Benjamin Bloom

Show HN: Daf·thunk – open-source Editor for Prototyping Workflows on Cloudflare

Speeding up global DNS resolution by avoiding CNAMES

Show HN: Checkpoint K8s pods transparently (plain CPU or GPU accelerated) [video]
