Rewriting Every Syscall in a Linux Binary at Load Time

https://amitlimaye1.substack.com/p/rewriting-every-syscall-in-a-linux

37•riteshnoronha16•4d ago

Comments

CableNinja•4d ago

I assume this would break observability through existing methods, right? If you were to strace a process that has been patched, would you see regular syscall data (as if it wasn't patched), or would your syscall replacement appear along the way?

amitlimaye•4d ago

Good question. I didn't cover this in the post — the binary doesn't run on the host kernel directly. It runs inside a lightweight KVM-based VM with no operating system. The shim is the only thing handling syscalls inside the guest. So strace on the host wouldn't see anything — no syscalls reach the host kernel from the guest. From the host side, the only visible activity is the hypervisor process making syscalls on behalf of the guest.

Inside the guest, there's no kernel to attach strace to — the shim IS the syscall handler. But we do have full observability: every syscall that hits the shim is logged to a trace ring buffer with the syscall number, arguments, and TSC timestamp. It's more complete than strace in some ways — you see denied calls too, with the policy verdict, and there's no observer overhead because the logging is part of the dispatch path.

So existing tools don't work, but you get something arguably better: a complete, tamper-proof record of every syscall the process attempted, including the ones that were denied before they could execute. I'll publish a follow-on tomorrow that details how we load and execute this rewritten binary and what the VMM architecture looks like.
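To make the logging described above concrete, here is a minimal sketch of a fixed-size trace ring that records the syscall number, arguments, a timestamp, and the policy verdict as part of dispatch. It is not the shim's actual code: the names `TraceRecord`, `TraceRing`, and `dispatch` are hypothetical, the policy is assumed to be a simple allow-set of syscall numbers, and `time.perf_counter_ns()` stands in for a TSC read.

```python
from collections import deque
from dataclasses import dataclass
import time


@dataclass(frozen=True)
class TraceRecord:
    nr: int          # syscall number
    args: tuple      # raw argument values
    timestamp: int   # stand-in for a TSC read (assumption)
    verdict: str     # "allow" or "deny"


class TraceRing:
    """Fixed-capacity ring: once full, the oldest record is overwritten."""

    def __init__(self, capacity):
        self._records = deque(maxlen=capacity)

    def log(self, nr, args, verdict):
        rec = TraceRecord(nr, tuple(args), time.perf_counter_ns(), verdict)
        self._records.append(rec)
        return rec

    def snapshot(self):
        # Oldest record first, newest last.
        return list(self._records)


def dispatch(ring, allowed, nr, args):
    """Record the attempt (denied calls included), then return the decision."""
    verdict = "allow" if nr in allowed else "deny"
    ring.log(nr, args, verdict)
    return verdict == "allow"
```

Because the record is written before the decision is returned, a denied call such as `dispatch(ring, {0, 1}, 59, ())` still leaves an entry with verdict `"deny"` in the ring.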

coppsilgold•1h ago

You mentioned SECCOMP_RET_TRACE, but there is also SECCOMP_RET_TRAP[1] which appears to perform better. There is also KVM. Both of these are options for gVisor: <https://github.com/google/gvisor>

[1] <https://github.com/google/gvisor/blob/master/pkg/sentry/plat...>

monocasa•1h ago

There's also SECCOMP_RET_USER_NOTIF, which is typically used by container runtimes for their sandboxing.

coppsilgold•1h ago

SECCOMP_RET_USER_NOTIF seems to involve sending a struct over an fd on each syscall. Do they really use it? Performance ought to suffer.

Also gVisor (aka runsc) is a container runtime as well. And it doesn't gatekeep syscalls but chooses to re-implement them in userland.
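For context on the per-syscall struct mentioned above, this is a rough sketch of decoding a `struct seccomp_notif` as a supervisor would receive it. The field layout is written from my reading of `linux/seccomp.h` and assumes a little-endian x86-64 host; real code should ask the kernel for the sizes with `SECCOMP_GET_NOTIF_SIZES` rather than hard-coding them. `decode_notif` is a hypothetical helper name.

```python
import struct

# Assumed layout (little-endian, naturally aligned, no padding):
#   struct seccomp_notif { __u64 id; __u32 pid; __u32 flags; struct seccomp_data data; }
#   struct seccomp_data  { int nr; __u32 arch; __u64 instruction_pointer; __u64 args[6]; }
SECCOMP_NOTIF = struct.Struct("<QIIiIQ6Q")


def decode_notif(buf):
    """Unpack one notification buffer into a plain dict."""
    notif_id, pid, flags, nr, arch, ip, *args = SECCOMP_NOTIF.unpack(buf)
    return {
        "id": notif_id,
        "pid": pid,
        "flags": flags,
        "nr": nr,
        "arch": arch,
        "instruction_pointer": ip,
        "args": tuple(args),
    }
```

Under these assumptions the struct is 80 bytes, and every intercepted syscall costs one such transfer to the supervisor plus a response going back, which is the overhead being asked about.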

foota•1h ago

Hah, I've been looking into something amusingly similar to track mmap syscalls for a process :)

jmillikin•1h ago

This might be a very dumb question, but if the process is being run under KVM to catch `int 0x03` then couldn't you also use KVM to catch `syscall` and execute the original binary as-is? I don't understand what value the instruction rewriting is providing here.

ozgrakkurt•48m ago

Really informative writing, thank you.

How secure does this make a binary? For example would you be able to run untrusted binary code inside a browser using a method like this?

Then could websites just use C++ instead of JavaScript, for example?

im3w1l•39m ago

What about int 80h?
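To illustrate why `int 80h` matters for a rewriter: the x86-64 `syscall` instruction encodes as `0F 05`, while the legacy `int 0x80` gate encodes as `CD 80`, so a scanner that only looks for the former misses the latter. The sketch below is a naive byte-pattern scan, not the method from the post; `find_all` and `scan` are hypothetical helpers, and a raw byte match can also hit data or immediate operands, which a real rewriter would have to rule out by disassembling.

```python
SYSCALL = b"\x0f\x05"  # x86-64 `syscall`
INT80 = b"\xcd\x80"    # legacy `int 0x80`


def find_all(code, pattern):
    """Return every offset at which `pattern` occurs in `code`."""
    offsets = []
    i = code.find(pattern)
    while i != -1:
        offsets.append(i)
        i = code.find(pattern, i + 1)
    return offsets


def scan(code):
    """Locate candidate syscall sites for both entry mechanisms."""
    return {
        "syscall": find_all(code, SYSCALL),
        "int80": find_all(code, INT80),
    }
```

Running `scan` over a text segment gives candidate patch sites for both encodings; whether the shim from the post handles the `int 0x80` path is not stated in the thread.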

JSR_FDED•29m ago

Love the detailed write up, thanks!

This is the kind of foundation that I would feel comfortable running agents on. It's not the whole solution, of course ("yes agent, you're allowed to delete this email but not that email" can't be solved at this level)… let me know when you tackle that next :-)

hparadiz•17m ago

I've been thinking of making a kernel patch that disables eBPF for certain processes as a privacy tool. Everyone is using eBPF now.
