Show HN: Brainfuck to RISC-V JIT compiler written in Zig

5•0x000xca0xfe•1mo ago

Hi everybody,

this was my project to learn Zig and RISC-V+x86_64 assembly.

Not sure if anybody is actually interested in yet another Brainfuck compiler, so I'll just write up some random things I learned while building it!

- A primitive assembly stitching compiler is 10x faster than the interpreter. Did not expect that.

- The generated x86 code is really bad (e.g. it always uses 6 or 7 byte sized instructions with 32-bit immediates when there are much smaller ones) but it doesn't really matter. Good code generated by GCC and clang for transpiled Brainfuck->C is not much faster as it's bottlenecked by memory accesses anyways.

- Zig is pretty far along actually. You can make serious projects with it!

- But the community seems to like self-punishment. Unused parameters and variables are hard errors and there is no way to disable that even for debug builds. Makes quickly commenting out part of the code a real PITA.

- I've had a miscompilation due to std.mem.span being broken and two source code breaks going from Zig 0.13 to 0.15 (std.mem.page_size got removed and ArrayList.popOrNull as well).

- But arbitrary size integers are fantastic! And well-defined two's complement behaviour!

Here is for example the code that encodes the c.beqz instruction:

  /// Branch if Equal to Zero (compressed): c.beqz rs1', offset -> beq rs1, x0, offset
  pub fn c_beqz(text: *std.ArrayList(u8), rs1: RV_X, offset: i9) !void {
      std.debug.assert(is3BitReg(rs1));
      std.debug.assert(@mod(offset, 2) == 0);
      const imm: u9 = @bitCast(offset);
      const RV_CB = packed struct(u16) {
          op: u2,
          offset5: u1,
          offset1_2: u2,
          offset6_7: u2,
          rsd_rs1_: u3,
          offset3_4: u2,
          offset8: u1,
          funct3: u3,
      };
      const ins = RV_CB {
          .op = 0x1,
          .offset5 = @truncate(imm >> 5),
          .offset1_2 = @truncate(imm >> 1),
          .offset6_7 = @truncate(imm >> 6),
          .rsd_rs1_ = @truncate(@intFromEnum(rs1) - 8),
          .offset3_4 = @truncate(imm >> 3),
          .offset8 = @truncate(imm >> 8),
          .funct3 = 0x6,
      };
      try appendInstruction(text, u16, @bitCast(ins));
  }

This is really nice as all the exotic integer sizes are actually checked, too.

- Zig support for Windows is good. Porting the project to Windows was very easy.

- When the RISC-V registers are carefully chosen, almost all instructions could be compressed in this projects.

- Compressed instructions and good branching code (using the branch instructions directly when the jump range is small enough instead of branching over a larger jump instruction) did not noticeably change performance on real hardware (OrangePi RV2).

- But somehow QEMU got a massive boost from that. Not sure why exactly.

So, that's about it!

I hope at least something was interesting...

Comments

sylware•1mo ago

thumbs up for this project (everything RISC-V is usually).

I write rv64 assembly (nearly core only, without memory reservation instructions) and run it on x86_64 with a very small (x86_64 assembly written) interpreter.

And your are right, I have had thoughts about a "RISC-V" x86_64 compiler (but it will probably require some runtime unfortunately).

Hopefully, rv22+ hardware with ultra-performant µ-architecture and with the latest silicon process will happen sooner than we expect. One less PI toxic lock and cleaner, _really standard_ assembly (the end game of much software).

0x000xca0xfe•1mo ago

Yeah I can't wait for a performant RISC-V core. Runtime code generation is so easy for RISC-V. I have many ideas or projects where I'd like to use it but it feels kinda pointless when JITed RISC-V machine code on current hardware gets destroyed by any half-decent x86 PC or Mac running naive C code.

sylware•1mo ago

Well, here are the tricks: interpreted rv64 assembly will be "slow"... actually "slower" than x86_64 native code... but in many execution contexts, for many pieces of software, here the first trick: the "slow" interpreted rv64 assembly machine code will be... "fast" enough... The 2nd trick: I have control on my rv64 machine interpreter, and I can write native x86_64 acceleration assembly along side of a rv64 reference implementation (I planned to do just that for my CPU renderer in my wayland compositor... actually I have already AVX2 code for some of that, even though the sweet spot is AVX512, but don't have the hardware for this, yet).

And once we have this rv64 shiny hardware, certainly won't be a drop-in, but the distance to code will be minimal.

One important SDK thing: I am careful at using the smallest number of rv64 machine instructions (we tend to forget 'R' in "RISC-V" means 'R'educed...), and I use basic, really basic, C preprocessors instead of the assembler preprocessor in order to decouple the assembly code from a specific assembler preprocessor. I don't even use assembler pseudo-instructions, or ABI register names, neither compressed machine instructions.

On top of that: I don't use ELF, I use a super minimal executable/system interface dynamic shared library format of my own, omega idiotically simple, which I wrap in ELF binaries for transparent support. People have to come to realize, ELF complexity, for a executable/system interface dynamic shared library is utterly and completely obsolete, even a liability once you are looking for binary stability in time (cf games), proven over more than the last decade.

Transformers are the best equivalents of cognitive ability

Modern Java – A book teaches how to write modern and effective Java

Show HN: Microsoft official MCP for documentation and more

New Date("WTF") – How well do you know JavaScript's Date class?

Show HN: MailTion – AI-Powered Email Marketing for Businesses

Ask HN: How to combine work, entrepreneurship and having a life

The Rise and Fall of the Knowledge Worker

Upwork Suspended My Account Without Reason

Zuck Races to Build Godlike AI, Women and People of Color Aren't Invited

Why U.S. Geothermal May Advance, Despite Political Headwinds

Superpowers are real–these people are living proof

Show HN: Sage-AI – AI-Powered Writing Assistant

Ask HN: What are some non-standard ways to reduce the size of executable files?

Show HN: A simple old school news website

Better Software Conference (Casey Muratori on OOP)

Show HN: Free YouTube Tag Generator to Improve Video SEO

Building a Distributed Cache for S3

Bad Actors Are Grooming LLMs to Produce Falsehoods

The Open Source AI Definition 1.0

Leave Russia

Invisible Text

Jonathan Blow – Jai Demo and Design Explanation

Separation of storage and compute without a performance tradeoff

Development in Progress

Damn Small Link Forwarder (DSLF) – rust based bit.ly replacement

Systemd's Nuts and Bolts – A Visual Guide to Systemd

Attended Windsurf's Build Night 18 hours before founders joined Google DeepMind

Malware Found in Official GravityForms Plugin Indicating Supply Chain Breach

Google Glass Wasn't a Failure. It Raised Crucial Concerns

Milgram shock-study imaginal replication: how far do you think you would go?