This was my project to learn Zig and RISC-V + x86_64 assembly.
Not sure if anybody is actually interested in yet another Brainfuck compiler, so I'll just write up some random things I learned while building it!
- A primitive assembly stitching compiler is 10x faster than the interpreter. Did not expect that.
- The generated x86 code is really bad (e.g. it always uses 6- or 7-byte instructions with 32-bit immediates even when much smaller encodings exist), but it doesn't really matter. Good code generated by GCC and clang for transpiled Brainfuck->C is not much faster, as it's bottlenecked by memory accesses anyways.
- Zig is pretty far along actually. You can make serious projects with it!
- But the community seems to like self-punishment. Unused parameters and variables are hard errors and there is no way to disable that even for debug builds. Makes quickly commenting out part of the code a real PITA.
- I've had a miscompilation due to std.mem.span being broken and two source code breaks going from Zig 0.13 to 0.15 (std.mem.page_size got removed and ArrayList.popOrNull as well).
- But arbitrary size integers are fantastic! And well-defined two's complement behaviour!
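To make the instruction-size and two's complement points above concrete, here is a minimal Zig sketch, not the project's actual code. It encodes `add rdi, imm` and picks the 4-byte imm8 form when the value fits, falling back to the 7-byte imm32 form otherwise. The helper name `encodeAddRdi` and the choice of `rdi` are assumptions made purely for illustration.

```zig
const std = @import("std");

/// Hypothetical helper (not from the project): encode `add rdi, imm` into
/// `buf` and return the number of bytes used.
pub fn encodeAddRdi(buf: *[7]u8, imm: i32) usize {
    buf[0] = 0x48; // REX.W prefix: 64-bit operand size
    buf[2] = 0xC7; // ModRM: mod=11, reg=/0 (ADD), rm=rdi
    if (imm >= -128 and imm <= 127) {
        buf[1] = 0x83; // ADD r/m64, imm8 (sign-extended by the CPU)
        const small: i8 = @intCast(imm);
        buf[3] = @bitCast(small); // reinterpret two's complement bits as u8
        return 4;
    }
    buf[1] = 0x81; // ADD r/m64, imm32
    const bits: u32 = @bitCast(imm);
    // Little-endian immediate, one byte at a time.
    buf[3] = @truncate(bits);
    buf[4] = @truncate(bits >> 8);
    buf[5] = @truncate(bits >> 16);
    buf[6] = @truncate(bits >> 24);
    return 7;
}

test "small immediate uses the short form" {
    var buf: [7]u8 = undefined;
    const len = encodeAddRdi(&buf, -1);
    try std.testing.expectEqual(@as(usize, 4), len);
    try std.testing.expectEqualSlices(u8, &[_]u8{ 0x48, 0x83, 0xC7, 0xFF }, buf[0..4]);
}
```

For an immediate of -1 the short form ends in the single byte 0xFF, while the long form would spend four bytes on the same value.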
Here is, for example, the code that encodes the c.beqz instruction:
    /// Branch if Equal to Zero (compressed): c.beqz rs1', offset -> beq rs1, x0, offset
    pub fn c_beqz(text: *std.ArrayList(u8), rs1: RV_X, offset: i9) !void {
        std.debug.assert(is3BitReg(rs1));
        std.debug.assert(@mod(offset, 2) == 0);
        const imm: u9 = @bitCast(offset);
        const RV_CB = packed struct(u16) {
            op: u2,
            offset5: u1,
            offset1_2: u2,
            offset6_7: u2,
            rsd_rs1_: u3,
            offset3_4: u2,
            offset8: u1,
            funct3: u3,
        };
        const ins = RV_CB{
            .op = 0x1,
            .offset5 = @truncate(imm >> 5),
            .offset1_2 = @truncate(imm >> 1),
            .offset6_7 = @truncate(imm >> 6),
            .rsd_rs1_ = @truncate(@intFromEnum(rs1) - 8),
            .offset3_4 = @truncate(imm >> 3),
            .offset8 = @truncate(imm >> 8),
            .funct3 = 0x6,
        };
        try appendInstruction(text, u16, @bitCast(ins));
    }
This is really nice as all the exotic integer sizes are actually checked, too.
- Zig support for Windows is good. Porting the project to Windows was very easy.
- When the RISC-V registers are carefully chosen, almost all instructions in this project can be compressed.
- Compressed instructions and good branching code (using the branch instructions directly when the jump range is small enough instead of branching over a larger jump instruction) did not noticeably change performance on real hardware (OrangePi RV2).
- But somehow QEMU got a massive boost from that. Not sure why exactly.
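The branch selection described above (use the branch directly when the target is in range, otherwise branch over a larger jump) can be sketched roughly like this. This is a guess at the general shape, not the project's code: `BranchForm` and `pickBranchForm` are hypothetical names, and using `jal` as the "larger jump" is an assumption. The ranges are the standard ones for c.beqz (9-bit signed offset, registers x8..x15 only), beq (13-bit) and jal (21-bit).

```zig
const std = @import("std");

/// Hypothetical names for three ways to emit "branch if register is zero".
pub const BranchForm = enum {
    compressed, // c.beqz: 2 bytes, offset in [-256, 254], rs1 in x8..x15
    direct, // beq rs1, x0: 4 bytes, offset in [-4096, 4094]
    inverted_over_jump, // bnez over a jal: 8 bytes total
};

/// Pick the smallest form that can reach `offset` (bytes, relative to the
/// start of the branch). Returns null if even the jal cannot reach it.
pub fn pickBranchForm(reg: u5, offset: i32) ?BranchForm {
    std.debug.assert(@mod(offset, 2) == 0); // targets are 2-byte aligned
    const is_3bit_reg = reg >= 8 and reg <= 15;
    if (is_3bit_reg and offset >= -256 and offset <= 254) return .compressed;
    if (offset >= -4096 and offset <= 4094) return .direct;
    // The jal sits 4 bytes after the inverted branch, so its own offset is
    // `offset - 4`, which must fit in [-1048576, 1048574].
    if (offset >= -1048572 and offset <= 1048578) return .inverted_over_jump;
    return null;
}

test "near target with a compressible register" {
    try std.testing.expect(pickBranchForm(8, -256).? == .compressed);
    try std.testing.expect(pickBranchForm(8, 256).? == .direct);
}
```

A register outside x8..x15 always falls through to the 4-byte form, which is why the register choice matters so much for code size.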
So, that's about it!
I hope at least something was interesting...
sylware•4h ago
I write rv64 assembly (nearly core only, without memory reservation instructions) and run it on x86_64 with a very small interpreter (written in x86_64 assembly).
And you are right, I have had thoughts about a "RISC-V" x86_64 compiler (but it will probably require some runtime, unfortunately).
Hopefully, rv22+ hardware with an ultra-performant µ-architecture on the latest silicon process will happen sooner than we expect. One less toxic IP lock and cleaner, _really standard_ assembly (the end game of much software).
0x000xca0xfe•4h ago
sylware•2h ago
And once we have this shiny rv64 hardware, it certainly won't be a drop-in, but the distance to the code will be minimal.
One important SDK thing: I am careful to use the smallest number of rv64 machine instructions (we tend to forget that the 'R' in "RISC-V" means 'R'educed...), and I use basic, really basic, C preprocessors instead of the assembler preprocessor in order to decouple the assembly code from a specific assembler preprocessor. I don't even use assembler pseudo-instructions, ABI register names, or compressed machine instructions.
On top of that: I don't use ELF. I use a super minimal executable/system interface dynamic shared library format of my own, omega idiotically simple, which I wrap in ELF binaries for transparent support. People have to come to realize that ELF complexity, for an executable/system interface dynamic shared library, is utterly and completely obsolete, even a liability once you are looking for binary stability over time (cf. games), as proven over more than the last decade.