Well, it did and it works nicely. No arithmetic libraries, no PROCESS except for the DFF component (obviously). Of course it's a bit of a "resource hog" compared to optimized cores (e.g. the RAM is built out of flip-flops instead of a block RAM that takes advantage of the FPGA's internal memory), but you can actually trace every signal through the datapath as it happens.
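To make that concrete, here is a minimal sketch in the same spirit (illustration only, not the actual source; the entity names are made up): the one and only PROCESS lives in the DFF, and everything else, like the full adder below, is bare gates and wires.

    -- Sketch only. The single PROCESS in the whole design:
    library ieee;
    use ieee.std_logic_1164.all;  -- no numeric_std, no arithmetic libraries

    entity dff is
      port (clk, d : in std_logic; q : out std_logic);
    end entity;

    architecture rtl of dff is
    begin
      process (clk)               -- the only process anywhere
      begin
        if rising_edge(clk) then
          q <= d;
        end if;
      end process;
    end architecture;

    -- Everything else is concurrent gate logic, e.g. a full adder:
    library ieee;
    use ieee.std_logic_1164.all;

    entity full_adder is
      port (a, b, cin : in std_logic; s, cout : out std_logic);
    end entity;

    architecture gates of full_adder is
    begin
      s    <= a xor b xor cin;                  -- sum from bare gates
      cout <= (a and b) or (cin and (a xor b)); -- ripple carry out
    end architecture;

Chain eight of those full adders and you have the ripple-carry path; since nothing else is a process, every intermediate carry is a real signal you can watch in the waveform viewer.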
I also built an assembler in C99 without external libraries (please be forgiving, my code is very primitive I think). I bundled SciTE (Scintilla), GHDL and GTKWave into a single installer, so you can write assembly and see the waveforms immediately without spending hours configuring simulators. Currently Windows only, but at some point I'll have to do it on Linux too. I tested it on the Tang Primer 25K and Cyclone IV, and I included my Gowin, Quartus and Vivado project files, which should make it easy to run on your FPGA.
Everything is licensed under the GPLv3.
(Edit: I did not use AI. Not only was it a waste of time for the VHDL, because my design is too novel, but even for beta testing it would waste my time: those LLMs are trained too heavily on x86/ARM, while my flag logic draws from the 6502/6800, and even my ripple-carry adder doesn't flip the carry bit on subtraction. Point is, AI couldn't help. It only kept complaining that my assembler's C code wasn't up to 2026 standards.)
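For the curious, here is roughly what that carry convention looks like, reusing the full_adder sketch from above (again illustration only, the names are mine): subtraction is A + NOT B + 1, and the raw carry-out goes straight into the flag, '1' meaning "no borrow", where an x86-style ALU would invert it.

    -- Sketch of the 6502/6800-style subtract flag (names made up):
    library ieee;
    use ieee.std_logic_1164.all;

    entity sub8 is
      port (a, b   : in  std_logic_vector(7 downto 0);
            diff   : out std_logic_vector(7 downto 0);
            c_flag : out std_logic);  -- raw carry: '1' = no borrow
    end entity;

    architecture ripple of sub8 is
      signal nb : std_logic_vector(7 downto 0);
      signal c  : std_logic_vector(8 downto 0);
    begin
      nb   <= not b;
      c(0) <= '1';                    -- carry-in = 1 completes two's complement
      gen : for i in 0 to 7 generate
        fa : entity work.full_adder   -- the gate-level adder from above
          port map (a => a(i), b => nb(i), cin => c(i),
                    s => diff(i), cout => c(i + 1));
      end generate;
      c_flag <= c(8);                 -- stored as-is; x86 would output NOT c(8)
    end architecture;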