Building a Minimal Viable Armv7 Emulator from Scratch

https://xnacly.me/posts/2025/building-a-minimal-viable-armv7-emulator/

75•xnacly•2mo ago

Comments

anthk•2mo ago

That reminds me of the minimal MIPS1 interpreter written in Perl that is able to run old statically linked Linux binaries:

https://github.com/pts/pts-mips-emulator

thesnide•2mo ago

that's an excellent approach to ancient binaries!

dosinga•2mo ago

At first sight I read Minimal Viable Army - would be a good read for an aspiring evil mastermind

syntex•2mo ago

I see the author is decorating the website for Christmas :)

costco•2mo ago

Nice article and especially so for including the parsing that most people just outsource. What's great about using an emulator is that you can also do fun things with the syscalls like implementing your own "virtual filesystem" instead of just translating directly to the x86_64 equivalent syscall: https://github.com/gamozolabs/fuzz_with_emus/blob/master/src... (not my code but basically something like this)

bArray•2mo ago

I had to disable JS, the site is unreadable otherwise.

xnacly•2mo ago

The snowflakes do vanish once you scroll down even a single pixel

MisterTea•2mo ago

They stop spawning but the flakes already on the screen continue to fall until they reach the bottom. It's cute but very annoying IMO and there should be a user control to instantly turn it off.

xnacly•2mo ago

It's winter, and it takes a literal second to fade out

MisterTea•2mo ago

They do not fade out in a second and continue to fall for 10+ seconds until they fade at the bottom of the page. It is also not winter everywhere.

postalrat•2mo ago

It is on that page

pm215•2mo ago

For a learning project like this one, this would probably be overkill, but my personal suggestion for instruction decoding is that it really pays in the long term to use a data driven decoder. It's fairly easy to do a handcoded "if bits A..B are 0b1000 and..." decoder for the basic integer parts of the instruction set, but especially as you get into complexities like SIMD and if you need your decoder to be easy to modify to add new instructions later this gets very unwieldy.

QEMU switched to a data driven representation with a python program to autogenerate the "check bit patterns and extract fields" code, and it's one of the better design overhauls we've done: we started using it mostly for new code but went back and converted some of the old handwritten decoders too. It's much easier to add a new instruction when you only need to add a line like

  USADA8           ---- 0111 1000 rd:4 ra:4 rm:4 0001 rn:4

and add a function trans_USADA8() that gets called with the field values, compared to trying to find the right place in a big existing set of handcoded switch and if statements to add the extra checks for the insn.
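As a rough illustration of the data-driven idea (this is not QEMU's actual generator; `parse_pattern`, `decode`, and the `Pattern` type are hypothetical names made up for this sketch), a pattern line like the one above can be turned into a mask, a value, and a table of field positions:

```python
from typing import NamedTuple, Optional


class Pattern(NamedTuple):
    name: str
    mask: int     # bits that are fixed by the pattern
    value: int    # required values of those fixed bits
    fields: dict  # field name -> (shift, width)


def parse_pattern(line: str, width: int = 32) -> Pattern:
    """Parse 'NAME bits... field:len ...', written MSB first."""
    name, *tokens = line.split()
    mask = value = 0
    fields = {}
    pos = width  # bits not yet consumed, counting down from the MSB
    for tok in tokens:
        if ":" in tok:
            fname, flen = tok.split(":")
            pos -= int(flen)
            fields[fname] = (pos, int(flen))
        else:
            for ch in tok:
                pos -= 1
                if ch in "01":
                    mask |= 1 << pos
                    value |= int(ch) << pos
                elif ch != "-":  # '-' marks a don't-care bit
                    raise ValueError(f"bad pattern character {ch!r}")
    if pos != 0:
        raise ValueError(f"{name}: pattern does not cover {width} bits")
    return Pattern(name, mask, value, fields)


def decode(insn: int, patterns: list) -> Optional[tuple]:
    """Return (name, field values) for the first matching pattern."""
    for p in patterns:
        if (insn & p.mask) == p.value:
            args = {f: (insn >> shift) & ((1 << w) - 1)
                    for f, (shift, w) in p.fields.items()}
            return p.name, args
    return None


PATTERNS = [
    parse_pattern("USADA8           ---- 0111 1000 rd:4 ra:4 rm:4 0001 rn:4"),
]
```

With this, `decode(0xE7812314, PATTERNS)` yields the name `USADA8` plus a dict of the four register fields, which a dispatcher could hand to a `trans_USADA8()`-style function. Adding an instruction then means appending one line to `PATTERNS`.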

xnacly•2mo ago

I think Rust macros could shine for this use case, definitely on my TODO list

ethin•2mo ago

Did you or anyone else figure out how to work with the Aarch64/ARM32 open-source JSON schemas? I could never figure them out and I feel like if I wanted to work with them I'd have to manually write Pydantic models for them or something, because Quicktype chokes on them. Mainly because the schemas are recursive. If there's one thing I like about RISC-V, it's that their riscv-opcodes instruction dictionaries are trivial to work with (although it's tricky for me to auto-generate "this operand is signed, this one isn't" logic since the repo currently doesn't convey that information).

pm215•2mo ago

I haven't looked at them, partly because I'm sceptical about how much use they are. In my experience the hard part of emulating new instructions is the semantics, the "what does this instruction actually do?" bit. The line in the decode file that specifies the 1s and 0s and fields is trivial and takes no time. Plus, the way the architecture splits things up often doesn't match the way that makes the emulation simpler: for instance "same operation, but comes in signed and unsigned flavours" is often encoded with a sign bit but appears in the architecture as two separate instructions, one with an S in the name and one with a U. It's often simpler to have one decode line which covers both and passes signedness as an argument, where something auto-generated from the architectural data would give you two distinct functions.
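A small sketch of the "one decode line, signedness as an argument" point, using the A32 long multiplies UMULL and SMULL, which differ only in bit 22. The function names and the list-of-registers model are assumptions for illustration, not QEMU code:

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def to_signed32(x: int) -> int:
    """Reinterpret a 32-bit register value as a two's-complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def trans_mull(regs: list, rdhi: int, rdlo: int, rs: int, rm: int,
               signed: bool) -> None:
    """Shared handler for UMULL and SMULL (hypothetical name)."""
    a, b = regs[rm], regs[rs]
    if signed:
        a, b = to_signed32(a), to_signed32(b)
    product = (a * b) & MASK64  # wrap a negative result to 64 bits
    regs[rdlo] = product & MASK32
    regs[rdhi] = product >> 32


def exec_long_multiply(regs: list, insn: int) -> None:
    """One decode path for both flavours; bit 22 selects signed."""
    trans_mull(
        regs,
        rdhi=(insn >> 16) & 0xF,
        rdlo=(insn >> 12) & 0xF,
        rs=(insn >> 8) & 0xF,
        rm=insn & 0xF,
        signed=bool((insn >> 22) & 1),
    )
```

The flag-setting S bit and the accumulate variants are left out to keep the sketch short.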

For cases where everything you need is really in the datafiles (e.g. a simple disassembler) or where you're providing a user facing API that you want to have match the architecture documentation closely, the tradeoffs are different.

Also for QEMU I tend to value "works the same regardless of target architecture" over "we can do a clever thing for this one case but all the others will be different".

daeken•2mo ago

So, it is possible to use the machine consumable forms of the ISA, but realistically you'll spend way longer fighting it than finding other strategies.

Years ago, I ended up creating my own aarch64 definition, which I use to generate disassemblers, interpreters, and recompilers (dynamic and static) automatically: https://github.com/daeken/SharpRetro/blob/main/Aarch64Genera...

It doesn't have perfect support, but it has served as an incredibly useful resource. I've since generalized it to work for other architectures, and that same repo has definitions for MIPS (specifically the PSX CPU), DMG, and the groundwork for an x86 core. The goal is to be able to define these once, then generate any future targets automatically.

retrac•2mo ago

Getting a bit off topic, but I feel like this task is something that ought to have special language support.

It's a kind of serialization/deserialization, or what I think Python and some others call "pickling". Same task. Turn these raw bit patterns into typed values.

Ada probably comes closest of the major languages to pulling it off. It has separation of the abstract/programmer's view of a data type and the implementation / low representation of that type.

Specify a bunch of records like:

    --  Ranges are written low .. high; assumes bit 0 is the least significant bit.
    for Instruction use record
       Condition at 0 range 28 .. 31;
       ImmFlag   at 0 range 25 .. 25;
       Opcode    at 0 range 21 .. 24;
       CondFlag  at 0 range 20 .. 20;
       Rn        at 0 range 16 .. 19;
       Rd        at 0 range 12 .. 15;
       Operand   at 0 range  0 .. 11;
    end record;

Then aim a pointer at your instructions and read them as records/structs.

It works particularly cleanly with a nice RISC encoding like ARM. I'm not actually sure if that would work in Ada. The use representation syntax might not be generic enough.

duskwuff•2mo ago

That sort of approach works for some very simple instruction encodings, but doesn't really handle:

1) instructions which "bend" the format, like ARM instructions such as STMIA or B which combine multiple fields to make a larger immediate value or mask.

2) recognizing instructions which use special values in fields (like ARM condition = 1111) to represent a special instruction.

3) instruction encodings with split fields, like the split immediate in RISC-V S-type instructions.

4) instruction encodings which have too many instruction-specific quirks to fit into any reasonable schema, like 68000.
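To make points 2 and 3 concrete, here is a minimal sketch of what a fixed record layout cannot express on its own. The function names are made up for illustration: the RISC-V S-type immediate has to be reassembled from two separate bit ranges and then sign-extended, and an A32 condition field of 0b1111 has to be checked before the rest of the word is interpreted.

```python
def s_type_imm(insn: int) -> int:
    """Reassemble the 12-bit RISC-V S-type immediate.

    imm[11:5] lives in bits 31:25 and imm[4:0] in bits 11:7.
    """
    hi = (insn >> 25) & 0x7F
    lo = (insn >> 7) & 0x1F
    imm = (hi << 5) | lo
    # Sign-extend from 12 bits.
    return imm - 0x1000 if imm & 0x800 else imm


def is_arm_unconditional_space(insn: int) -> bool:
    """True when the A32 condition field is 0b1111.

    Such words are not ordinary conditional instructions and need
    a separate decode path.
    """
    return ((insn >> 28) & 0xF) == 0xF
```

For example, a store with offset -4 has all seven upper immediate bits set and 0b11100 in the lower five, and `s_type_imm` recombines them into -4.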

pm215•2mo ago

If you think Arm is a "nice RISC encoding" then I think you've mostly been looking at the older integer bits of it :-) As you get into FP and SIMD there are just a lot more useful operations that need to fit into the strictly limited encoding space, and new features that need to be tucked into previously unused corners of the space, and it all gets noticeably less regular (e.g. "these two bits encode the operand size which is 0b00/0b01/0b10 for 8/16/32 bits, but 64 bit operands aren't supported and 0b11 means it's part of an entirely different set of instructions").

general1465•2mo ago

Looks nice, but the biggest performance hit will come from re-parsing the arguments of opcodes every time an instruction executes. If you are emulating relatively small binaries (e.g. embedded stuff with a few megabytes of program), it is better to treat each instruction as an object: parse it once, then keep it in RAM in some tree structure so you can quickly look up a given opcode by its binary representation and see whether it has already been parsed.
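A minimal sketch of that parse-once idea, under some assumptions: it uses a plain dict keyed by the raw 32-bit instruction word instead of a tree, and `decode_word` is a hypothetical stand-in that only splits out the A32 data-processing fields.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decoded:
    cond: int
    opcode: int
    rn: int
    rd: int
    operand2: int


def decode_word(word: int) -> Decoded:
    """Hypothetical stand-in decoder: A32 data-processing fields only."""
    return Decoded(
        cond=(word >> 28) & 0xF,
        opcode=(word >> 21) & 0xF,
        rn=(word >> 16) & 0xF,
        rd=(word >> 12) & 0xF,
        operand2=word & 0xFFF,
    )


class DecodeCache:
    """Decode each distinct instruction word once, then reuse the object."""

    def __init__(self) -> None:
        self._cache = {}
        self.misses = 0  # how many times the real decoder ran

    def get(self, word: int) -> Decoded:
        decoded = self._cache.get(word)
        if decoded is None:
            self.misses += 1
            decoded = decode_word(word)
            self._cache[word] = decoded
        return decoded
```

Keying on the instruction word rather than the address means the cache never goes stale when memory is rewritten, at the cost of one dictionary lookup per fetch. Whether this beats decoding from scratch depends on how expensive the real decoder is.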

gorgoiler•2mo ago

I built an emulator in Python with my students, aged 15 and 16. All it did was have a little chunk of memory, PC, some registers, arithmetic, and branching, but it was really fun and a surprisingly tractable project to work on. Once you have a working “machine” with an assembler you can then start to automate function call and return which is of course a large part of what a compiler does for you.
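For a sense of scale, a machine of roughly that shape fits in a few dozen lines. The instruction set below (`li`, `add`, `addi`, `bnez`, `halt`) is invented for this sketch and is not the one used in the class described above:

```python
def run(program: list, max_steps: int = 10_000) -> list:
    """Execute (op, a, b) tuples; return the registers on 'halt'."""
    regs = [0] * 4
    pc = 0
    for _ in range(max_steps):
        op, a, b = program[pc]
        pc += 1
        if op == "li":        # load immediate: regs[a] = b
            regs[a] = b
        elif op == "add":     # regs[a] += regs[b]
            regs[a] += regs[b]
        elif op == "addi":    # regs[a] += immediate b
            regs[a] += b
        elif op == "bnez":    # branch to index b if regs[a] != 0
            if regs[a] != 0:
                pc = b
        elif op == "halt":
            return regs
        else:
            raise ValueError(f"unknown opcode {op!r}")
    raise RuntimeError("step limit reached")


# Sum the integers 5 down to 1 into r0, using r1 as the loop counter.
PROGRAM = [
    ("li", 0, 0),
    ("li", 1, 5),
    ("add", 0, 1),
    ("addi", 1, -1),
    ("bnez", 1, 2),
    ("halt", 0, 0),
]
```

Calling `run(PROGRAM)` leaves 15 in register 0. Function call and return would need a stack or a link register on top of this.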

With hindsight I would have loved to do a proper compiler but that’s undergrad level really. I really recommend it as a toy post-food-coma project for when you’re stuck with the family either next week or at the end of December :)
