Might be worth skipping to the interesting parts that aren’t in textbooks
https://github.com/inferno-os/inferno-os/blob/master/libinte...
There's lots of room to improve it, but it worked well enough to run on telephony equipment in prod.
Why compilers are hard – the IR data structure
If you claim an IR makes things harder, just skip it. Compilers do have an essential complexity that makes them "hard" [...waffle waffle waffle...]
The primary data [...waffle...] represents the computation that the compiler needs to preserve all the way to the output program. This data structure is usually called an IR (intermediate representation). The primary way that compilers work is by taking an IR that represents the input program, and applying a series of small transformations all of which have been individually verified to not change the meaning of the program (i.e. not miscompile). In doing so, we decompose one large translation problem into many smaller ones, making it manageable.
There we go. The section header should be updated to: Why compilers are manageable – the IR data structure
I implemented an early function inliner by inlining at the IR level. When I wrote the D front end, I attempted to do the inlining in the front end instead. This turned out to be a significantly more complicated problem, and in the end not worth it.
The difficulty with the IR version is error messages: once the code has been lowered to IR, it is impractical to report errors in terms of the original parse trees. I.e. it's the ancient "turn the hamburger into a cow" problem.
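For anyone who wants the quoted "series of small transformations" idea in concrete form, here is a minimal sketch in C. Everything in it (the Insn layout, run, fold_constants, eliminate_dead, demo) is made up for illustration and is not taken from any real compiler. It assumes straight-line code with no branches, which is far simpler than a production IR.

```c
#include <stddef.h>

/* Hypothetical three-address IR: each instruction writes one virtual register. */
typedef enum { OP_PARAM, OP_CONST, OP_ADD, OP_MUL } Op;

typedef struct {
    Op  op;
    int dst;   /* register written */
    int a, b;  /* OP_CONST: a is the literal; OP_ADD/OP_MUL: source registers */
    int dead;  /* set by eliminate_dead when the result is never needed */
} Insn;

enum { NREGS = 8 };

/* Reference interpreter: this defines the meaning every pass must preserve. */
int run(const Insn *code, size_t n, int arg, int result_reg)
{
    int reg[NREGS] = {0};
    for (size_t i = 0; i < n; ++i) {
        const Insn *in = &code[i];
        if (in->dead) continue;
        switch (in->op) {
        case OP_PARAM: reg[in->dst] = arg; break;
        case OP_CONST: reg[in->dst] = in->a; break;
        case OP_ADD:   reg[in->dst] = reg[in->a] + reg[in->b]; break;
        case OP_MUL:   reg[in->dst] = reg[in->a] * reg[in->b]; break;
        }
    }
    return reg[result_reg];
}

/* Pass 1: rewrite ADD/MUL whose operands are both known constants. */
void fold_constants(Insn *code, size_t n)
{
    int known[NREGS] = {0}, val[NREGS] = {0};
    for (size_t i = 0; i < n; ++i) {
        Insn *in = &code[i];
        if (in->dead) continue;
        if (in->op == OP_CONST) {
            known[in->dst] = 1;
            val[in->dst] = in->a;
        } else if (in->op != OP_PARAM && known[in->a] && known[in->b]) {
            int v = (in->op == OP_ADD) ? val[in->a] + val[in->b]
                                       : val[in->a] * val[in->b];
            in->op = OP_CONST;
            in->a = v;
            in->b = 0;
            known[in->dst] = 1;
            val[in->dst] = v;
        } else {
            known[in->dst] = 0;  /* value depends on the runtime argument */
        }
    }
}

/* Pass 2: backward liveness scan; mark writes that nothing later reads. */
void eliminate_dead(Insn *code, size_t n, int result_reg)
{
    int live[NREGS] = {0};
    live[result_reg] = 1;
    for (size_t i = n; i-- > 0; ) {
        Insn *in = &code[i];
        if (in->dead) continue;
        if (!live[in->dst]) { in->dead = 1; continue; }
        live[in->dst] = 0;  /* this definition satisfies the later use */
        if (in->op == OP_ADD || in->op == OP_MUL) {
            live[in->a] = 1;
            live[in->b] = 1;
        }
    }
}

/* Runs both passes, re-checking the meaning after each one.
   Returns the number of surviving instructions, or -1 on a miscompile. */
int demo(void)
{
    Insn prog[] = {
        { OP_PARAM, 0, 0, 0, 0 },  /* r0 = arg     */
        { OP_CONST, 1, 2, 0, 0 },  /* r1 = 2       */
        { OP_CONST, 2, 3, 0, 0 },  /* r2 = 3       */
        { OP_MUL,   3, 1, 2, 0 },  /* r3 = r1 * r2 */
        { OP_ADD,   4, 0, 3, 0 },  /* r4 = r0 + r3 */
        { OP_ADD,   5, 1, 1, 0 },  /* r5 = r1 + r1 (never used) */
    };
    size_t n = sizeof prog / sizeof prog[0];
    int before = run(prog, n, 10, 4);

    fold_constants(prog, n);
    if (run(prog, n, 10, 4) != before) return -1;
    eliminate_dead(prog, n, 4);
    if (run(prog, n, 10, 4) != before) return -1;

    int survivors = 0;
    for (size_t i = 0; i < n; ++i)
        if (!prog[i].dead) ++survivors;
    return survivors;
}
```

Each pass is small enough to check on its own against run, which is the whole point of the decomposition. Note that the IR keeps no link back to source positions, which is also why error reporting from this level is awkward.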
I was really, really angry that the review had not attempted to contact me about this.
But the other compiler vendors knew what I'd done, and the competition implemented DFA as well by the next year, and the benchmarks were updated.
The benchmarks were things like:
void foo() { int i,x = 1; for (i = 0; i < 1000; ++i) x += 1; }
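To spell out why that kind of benchmark falls apart, assuming DFA here means data-flow analysis: x is never read after the loop, so every write to it is dead, and once those are gone the loop has no body and no effect. The sketch below shows the benchmark next to what an optimizer may legally reduce it to. The names foo_after_dfa and foo_live are hypothetical, added only for illustration.

```c
/* The benchmark as written: x is computed but never read afterwards. */
void foo(void)
{
    int i, x = 1;
    for (i = 0; i < 1000; ++i)
        x += 1;
}

/* What a compiler doing data-flow analysis may reduce foo() to
   (hypothetical name): no live values, so no code is left. */
void foo_after_dfa(void)
{
}

/* Variant where the result is live (hypothetical name): returning x means
   the value must be produced, although a compiler may still fold the
   loop into a constant. */
int foo_live(void)
{
    int i, x = 1;
    for (i = 0; i < 1000; ++i)
        x += 1;
    return x;
}
```

A benchmark that wants to time the loop has to make the result observable, for example by returning it or printing it, as foo_live does.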