[0] - https://unikraft.org
Some of us are still fighting for docker images to not include a vim install ("but it's so handy!") and here we've got madlads building their app as its own bootable machine image.
https://github.com/unikraft/unikraft/issues/414
Also - one needs to be careful, because many of the workloads they advertise on their site do not actually run under their kernel - they run under Linux, which breaks a completely different type of trust boundary.
As for trust/full disclosure - I'm with nanovms.com
My point is that you shouldn't go around talking about how "secure" you are when you have gaping holes like this. This, btw, is not the only major security issue they have.
I'd be curious to hear from someone at Google whether gVisor gets a ton of internal use there, or whether it really was built mainly for GCP/GKE.
Poor I/O performance and a couple of missing syscalls made it hard to predict how your app was going to behave before you deployed it.
Another example of a switch like this is WSL 1 to WSL 2 on Windows.
It seems like unless you have a niche use case, it's hard to truly replicate a full Linux kernel.
This causes a few issues:
- the proxying can be slightly slower
- it's not a VM, so you can't use things like confidential compute (memory encryption)
- you can't actually instrument all syscalls (most work, but there are a few edge cases where gVisor won't and a VM will work just fine)
On the flip side, some potential kernel vulnerabilities will be blocked by gVisor that wouldn't be in a VM (where it wouldn't be a hypervisor escape, but you'd still be able to run code as the guest kernel).
This is to say: there are some good use cases for gVisor, but there are fewer of them than for (micro) VMs in general.
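If you've never looked at how userspace syscall interception actually works, a toy ptrace-based tracer shows the moving parts. To be clear, this is my own sketch of the general mechanism, not gVisor's code (its real platforms, like systrap and KVM, are far more elaborate), and it's Linux/x86-64 only:

    /* toy_strace.c - minimal syscall interceptor (Linux x86-64).
       Build: gcc -o toy_strace toy_strace.c
       Run:   ./toy_strace /bin/ls */
    #include <stdio.h>
    #include <sys/ptrace.h>
    #include <sys/types.h>
    #include <sys/user.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(int argc, char *argv[]) {
        if (argc < 2) {
            fprintf(stderr, "usage: %s <program> [args...]\n", argv[0]);
            return 1;
        }
        pid_t child = fork();
        if (child == 0) {
            /* Child: ask to be traced, then exec the target. */
            ptrace(PTRACE_TRACEME, 0, NULL, NULL);
            execvp(argv[1], &argv[1]);
            return 1; /* only reached if exec failed */
        }
        int status;
        waitpid(child, &status, 0); /* child stops at exec */
        int entering = 1;
        while (1) {
            /* Resume the child until the next syscall entry or exit. */
            ptrace(PTRACE_SYSCALL, child, NULL, NULL);
            waitpid(child, &status, 0);
            if (WIFEXITED(status))
                break;
            if (entering) {
                struct user_regs_struct regs;
                ptrace(PTRACE_GETREGS, child, NULL, &regs);
                /* orig_rax holds the syscall number on x86-64. A
                   sandbox would decide here whether to emulate,
                   rewrite, or deny the call instead of logging it. */
                fprintf(stderr, "syscall %llu\n", regs.orig_rax);
            }
            entering = !entering;
        }
        return 0;
    }

The sandbox lives in that loop: every syscall becomes a stop it can emulate, rewrite, or reject, and that decision point is exactly where the edge cases above come from.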
Google developed both gVisor and crosvm (which Firecracker and others are based on) and uses both in different products.
AFAIK, there isn't a ton of gVisor use internally if it's not already in the product, though some use it in Borg (they have a "sandbox multiplexer" called vanadium where you can pick and choose your isolation mechanism)
For indexing most languages we didn't need it, because they were pretty well supported on the Borg stack with all the Google internals. But Kythe indexes 45 different languages, and so inevitably we ran into problems with some of them. I think it was the newer Python indexer?
> really was mainly for GCP/GKE
I mean... I don't know. That could also be true. There's a whole giant pile of internal software at Google that starts out as "built for <XYZ>", but then it gets traction and starts being used in a ton of other unrelated places. It's part of the glory of the monorepo - visibility into tooling is good, and reusability is pretty easy (and performant), because everyone is on the same build system, etc.
But all of those together still come to less than 30. What am I missing?
1. The core stack of internal languages (or internally created but also released externally) - protobuf, GCL, etc.
2. Some more well-known languages that aren't as big at Google but are still used, and people wrote indexers for them: C#, Lisp, Haskell, etc.
3. All the random domain-specific langs that people built and then wrote indexers for.
There's a bunch more that don't have indexers too.
LD_PRELOAD simply loads a library of your choice that executes code in the process's context, that's all. Folks usually do this when they cannot recompile or change the running binary, which means they also hook and/or overwrite functions of said program.
Generally folks will have gVisor calls integrated into their sandbox code before the target process starts, so there's no need for preloading anything in most cases.
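For anyone who hasn't written one, a minimal interposer is just a shared object that defines the function it wants to hook and chains to the real implementation via dlsym(RTLD_NEXT, ...). A toy example (mine, purely illustrative; the file name is made up):

    /* logopen.c - log every open(2) the host process makes.
       Build: gcc -shared -fPIC -o logopen.so logopen.c -ldl
       Use:   LD_PRELOAD=./logopen.so ls /tmp */
    #define _GNU_SOURCE
    #include <dlfcn.h>
    #include <fcntl.h>
    #include <stdarg.h>
    #include <stdio.h>

    int open(const char *path, int flags, ...) {
        /* Look up the libc open() that our definition shadows. */
        int (*real_open)(const char *, int, ...) =
            dlsym(RTLD_NEXT, "open");

        mode_t mode = 0;
        if (flags & O_CREAT) { /* mode is only present with O_CREAT */
            va_list ap;
            va_start(ap, flags);
            mode = va_arg(ap, mode_t);
            va_end(ap);
        }

        fprintf(stderr, "open(%s)\n", path);
        return real_open(path, flags, mode);
    }

Note the limits, which are why this is a debugging trick rather than a sandbox: it only catches calls that go through the dynamic linker, so static binaries and raw syscall(2) invocations sail right past it.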
gVisor's Achilles heel is its missing or inaccurate syscalls, but the gVisor team is first class at responding to GitHub issues, so it's really quite manageable in practice if you know how to debug and hack on a userspace kernel.
Is gVisor a kernel, or a proxy for syscalls plus select subsystems (like network/GPU)? In my head, a monolithic kernel (like Linux) does more than just handle syscalls (memory management, device management, filesystems, etc.).
I am way out of my depth here, but can anyone make a comparison with the "micro virtual machines" concept?
hyperlight shaves way more off (e.g. no access to the various devices you'd find via QEMU or Firecracker). It does make use of virtualization, but it doesn't try to present a full-blown machine, so it's better for things like embedding simple functions. I actually think it's an interesting concept, but it is very different from what Firecracker is doing.
EDIT: It seems that gVisor has a KVM mode too. https://gvisor.dev/docs/architecture_guide/platforms/#kvm
I've been working on KVMServer [2] recently, which uses TinyKVM to run existing Linux server applications by intercepting epoll calls. While there is a small overhead to crossing the KVM boundary to handle syscalls, we get the ability to quickly reset the state of the guest. This means we can provide per-request isolation with an order of magnitude less overhead than alternative approaches like forking a process or even spinning up a V8 isolate.
[1] Previous discussion: https://news.ycombinator.com/item?id=43358980
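To make the reset trick concrete: the "snapshot guest memory and registers once, roll back after every request" loop is small enough to sketch against raw /dev/kvm. This is my own toy, not TinyKVM's actual code (a real implementation would restore only dirty pages rather than memcpy everything); error handling is elided and the guest's "request handler" is a single hlt in real mode:

    /* kvm_reset.c - per-request VM state reset sketch.
       Build: gcc -o kvm_reset kvm_reset.c */
    #include <fcntl.h>
    #include <linux/kvm.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/ioctl.h>
    #include <sys/mman.h>
    #include <unistd.h>

    #define MEM_SIZE 0x10000

    int main(void) {
        int kvm = open("/dev/kvm", O_RDWR | O_CLOEXEC);
        int vmfd = ioctl(kvm, KVM_CREATE_VM, 0);

        /* Guest "program": one hlt. A real guest is your app image. */
        uint8_t *mem = mmap(NULL, MEM_SIZE, PROT_READ | PROT_WRITE,
                            MAP_SHARED | MAP_ANONYMOUS, -1, 0);
        mem[0] = 0xf4; /* hlt */

        struct kvm_userspace_memory_region region = {
            .slot = 0,
            .guest_phys_addr = 0,
            .memory_size = MEM_SIZE,
            .userspace_addr = (uint64_t)mem,
        };
        ioctl(vmfd, KVM_SET_USER_MEMORY_REGION, &region);

        int vcpufd = ioctl(vmfd, KVM_CREATE_VCPU, 0);
        int map_size = ioctl(kvm, KVM_GET_VCPU_MMAP_SIZE, NULL);
        struct kvm_run *run = mmap(NULL, map_size,
                                   PROT_READ | PROT_WRITE,
                                   MAP_SHARED, vcpufd, 0);

        /* Minimal real-mode setup: execute from address 0. */
        struct kvm_sregs sregs;
        ioctl(vcpufd, KVM_GET_SREGS, &sregs);
        sregs.cs.base = 0;
        sregs.cs.selector = 0;
        ioctl(vcpufd, KVM_SET_SREGS, &sregs);
        struct kvm_regs clean_regs = { .rip = 0, .rflags = 0x2 };

        /* Snapshot the pristine guest memory once. */
        static uint8_t snapshot[MEM_SIZE];
        memcpy(snapshot, mem, MEM_SIZE);

        for (int request = 0; request < 3; request++) {
            /* Reset: roll registers and memory back to the snapshot.
               Restoring only the pages the last request dirtied is
               what makes this fast in practice. */
            ioctl(vcpufd, KVM_SET_REGS, &clean_regs);
            memcpy(mem, snapshot, MEM_SIZE);

            /* "Handle one request": run until the guest halts. */
            do {
                ioctl(vcpufd, KVM_RUN, 0);
            } while (run->exit_reason != KVM_EXIT_HLT);
            printf("request %d done, guest rolled back\n", request);
        }
        return 0;
    }

The rollback is just a register write plus a bounded memory copy, with no new address space or page tables to set up, which is why it can undercut fork-per-request or spinning up a fresh isolate.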
I suspect harvesting VM state from a production workload would be counterproductive to the goal of isolation.
I'm not saying it's not helpful. I'm just flagging that JIT research is pretty clear that the performance improvements from a JIT are hugely dependent on actually running the realistic code paths and data types you see over and over again. If there's divergence, you get suboptimal or even negative gains, because the JIT will start generating code for misoptimizations you don't actually care about. If you have control of the JIT you can mitigate some of these problems, but it sounds like you don't, in which case it's something to keep in mind as a problem at scale - i.e. it could end up being 5-10% of global compute if all your traffic is JITed, and it would certainly negatively impact latencies of this code running on your service. Of course, I'm sure you've got bigger technical problems to solve. It's a very interesting approach for sure. Great idea!
For the Varnish TinyKVM vmod, they brought up examples of running image transcoding, which is definitely something that benefits from per-request isolation given the history of exploits in those kinds of C/C++ libraries.
It's worth noting that Cloudflare/AWS Lambda don't have per-request isolation, and that's pretty important for server-side rendering use cases where code was initially written with client-side assumptions.
Not sure this will ever turn into a business for me personally - my motivation is in trying to regain some of the simplicity of the CGI days without giving up the performance gains of modern software stacks. Though it would be helpful to have a production workload to improve at some point.
> It's worth noting that Cloudflare/AWS Lambda don't have per-request isolation, and that's pretty important for server-side rendering use cases where code was initially written with client-side assumptions.
It wasn't just because of SSR. There are numerous opportunities for security vulnerabilities because of request confusion in global state. Per-request isolation is definitely something Cloudflare would enable if they had a viable solution from that perspective. As such, it's irrelevant what language you write it in - Rust is just as vulnerable to this problem as JS or anything else.
> If you don't then deploying a container to AWS Lambda or GCP Cloud Run is already pretty easy
Yea, but cloud functions like you're talking about are best for running at the edge, as close to the user as possible, not for traditional centralized servers. They also promote a very different programming paradigm that, when you fit into it, is significantly cheaper to run and maintain, because you can decompose your service.
> It might be possible to offer better cold start performance with the TinyKVM approach, but that is still an unknown.
https://blog.cloudflare.com/eliminating-cold-starts-with-clo...
You’d want to start prewarming an instance to be ready to handle the request when a TLS connection for a function comes in.
PhilippGille•6mo ago
> And, long story short, we now have an implementation of certificate-based SSH, running over gVisor user-mode TCP/IP, running over userland wireguard-go, built into flyctl.
tptacek•6mo ago
https://fly.io/blog/our-user-mode-wireguard-year/
https://fly.io/blog/jit-wireguard-peers/
This is another one of those things where the graph of our happiness about a technical decision is sinusoidal. :)
quotemstr•6mo ago
The concept of an OS still makes sense on a system with no privilege-level transitions and a single address space (e.g. DOS, FreeRTOS): therefore, mystical low-level register goo isn't essential to the concept.
The boundary between the OS and the rest of the software is a lot more porous and a lot less arcane than people imagine. In the end, it's just software.
jchw•6mo ago
gVisor's modular design seems to have been its strongest point. It's not that nobody understood that an OS is just software; it's that actually ripping the Linux TCP stack out and using it in userland isn't trivial. Meanwhile, a lot of projects have made use of the gVisor networking components, since they're pretty self-contained.
I think gVisor is one of the coolest things written in Go, although it's not really that easy to convey why.
Seriously, just check out the list of packages in the pkg directory:
https://pkg.go.dev/gvisor.dev/gvisor
(I should acknowledge, though, that I don't know of that many unique use cases for all of these packages; and while the TCP stack is very useful, it's mainly used for WireGuard tunneling, and user-mode TCP stacks are not particularly new. Still, the gVisor network stack is nicer than hacked-together stuff using SLiRP-derived code, imo.)