Incorrect, they were authors of lex. yacc was authored by Stephen Johnson.
What surprises me is that all the authors are still around, even though the tools are over 50 years old! It shows how young the field of computer science is.
I don't know if it's worth mentioning, but the author of the post is David Singleton, the former CTO of Stripe. I almost hadn't noticed until I saw the domain.
IIRC, and man, maybe I'm making it up, but the lore was that he always made time on a regular schedule to hack.
Usually 1 layer from the bottom isn't coding so much anymore.
(oddly, I didn't realize he was *CTO* of Stripe until a few months back, when his new thing with Hugo Barra was announced)
Let’s make a Teeny Tiny compiler https://austinhenley.com/blog/teenytinycompiler1.html
I think the closest modern equivalents might be Python (for easy onramp and scalability from microcontrollers to supercomputers) and JavaScript (for pure ubiquity in every device with a web browser.)
I wonder if there is a modern-ish (?) environment that can match Visual Basic in terms of easy GUI app programming. Perhaps Python or Tcl with Tk (Qt seems harder), or maybe Delphi, or perhaps a modern Smalltalk.
Advanced BASICs are too big for that, and in less advanced ones you get to POKE the hardware to do certain things. Which means you get to learn a bunch of hardware and machine code. That's not all bad though!
I am really glad that I only got to learn C after going through Turbo Basic, Quick Basic, and Turbo Pascal[0], doing exactly the same kind of stuff that urban myths say was only possible after C came to be.
[0] - On 16-bit systems; I started coding on an 8-bit Timex 2068.
Don't we all? ;-)
Fun to see this post from the deep archive get some interest - thanks for reading!
I write mine all by hand. It's the easiest part of a compiler to write, by far. It's also the least troublesome.
One advantage of doing them by hand is that better, more targeted error messages are easier to fold in.
Recursive descent is surprisingly ergonomic and clean if one gets the heuristics right. Personally I find it way easier than writing BNF and its derivatives as you quickly get into tricky edge cases, slow performance and opaque errors.
Same with parser combinators. Not until a bunch of trial and error do you build up the intuitions you need to use them in production, I think.
Despite two decades of using those, I've found it much simpler to write my own scanning or RD parser.
Much of the parser code, if you compare the code with the BNF grammar, is a 1:1 correspondence. Super easy to do.
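To make the 1:1 correspondence concrete, here is a minimal sketch of a hand-written recursive descent parser for a toy arithmetic grammar. The grammar, the `Parser` class, and the `evaluate` helper are all made up for illustration (they are not from anyone's compiler in this thread), and division is integer division to keep things simple. Each grammar rule sits in a comment directly above the method that implements it, and the error messages are written by hand at the exact spot where the problem is detected.

```python
import re

# Numbers, or any single non-whitespace character as an operator.
TOKEN_RE = re.compile(r"\s*(?:(\d+)|(\S))")

def tokenize(text):
    """Split the input into ('num', int) and ('op', char) tokens."""
    tokens = []
    for number, op in TOKEN_RE.findall(text):
        if number:
            tokens.append(("num", int(number)))
        else:
            tokens.append(("op", op))
    return tokens

class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        if self.pos < len(self.tokens):
            return self.tokens[self.pos]
        return ("eof", None)

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    # expr := term (('+' | '-') term)*
    def expr(self):
        value = self.term()
        while self.peek() in (("op", "+"), ("op", "-")):
            _, op = self.advance()
            right = self.term()
            value = value + right if op == "+" else value - right
        return value

    # term := factor (('*' | '/') factor)*
    def term(self):
        value = self.factor()
        while self.peek() in (("op", "*"), ("op", "/")):
            _, op = self.advance()
            right = self.factor()
            value = value * right if op == "*" else value // right
        return value

    # factor := NUMBER | '(' expr ')'
    def factor(self):
        kind, text = self.advance()
        if kind == "num":
            return text
        if (kind, text) == ("op", "("):
            value = self.expr()
            if self.advance() != ("op", ")"):
                raise SyntaxError("expected ')' to close '('")
            return value
        raise SyntaxError(f"expected a number or '(', got {text!r}")

def evaluate(text):
    """Parse and evaluate a whole expression, rejecting trailing input."""
    parser = Parser(tokenize(text))
    value = parser.expr()
    if parser.peek() != ("eof", None):
        raise SyntaxError(f"unexpected trailing input {parser.peek()[1]!r}")
    return value
```

Calling `evaluate("(1 + 2) * 3")` walks `expr`, `term`, and `factor` in the same order the grammar would derive the string, and an unclosed parenthesis fails with the specific "expected ')'" message rather than a generic one.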
It's also the most annoying if you're writing a new language. You want to iterate on its ideas, but can't do so until you have a parser done.
I've been designing a few language concepts over the past year, and it feels like 80% of that time has been spent writing and debugging parsers; by the time I get to the meat of language design - the shape of its AST, the semantics of it - any small syntactic change means going back to update the lexer and parser stages. It doesn't help that I can't settle on a syntax.
BTW I first started with PEGs, which are nice in theory, but I find separating the lexing and parsing stages very helpful for reducing boilerplate (handling whitespace in PEG is obnoxious). Later, I hand-wrote my parsers (in C), but it got so repetitive that I dedicated a weekend to just learning lex/yacc (actually flex/bison). Even if parsers are easy to write, it's good to have higher-level tools to reduce the tedium of it.
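As a rough illustration of why a separate lexing stage cuts the boilerplate: whitespace and comments are dropped in exactly one place, so no parser rule ever has to mention them. This is a sketch only; the token names and the `#` comment syntax are assumptions for a hypothetical toy language, not anything from flex or a PEG tool.

```python
import re

# Hypothetical token set for a toy language; order matters, ERROR is the catch-all.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[-+*/=()]"),
    ("SKIP", r"\s+|#[^\n]*"),  # whitespace and '#' comments, dropped below
    ("ERROR", r"."),
]
MASTER_RE = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)

def lex(source):
    """Return (kind, text) tokens; the parser never sees whitespace or comments."""
    tokens = []
    for match in MASTER_RE.finditer(source):
        kind = match.lastgroup
        if kind == "SKIP":
            continue
        if kind == "ERROR":
            raise SyntaxError(
                f"unexpected character {match.group()!r} at offset {match.start()}"
            )
        tokens.append((kind, match.group()))
    return tokens
```

With this in front, a grammar rule like `assignment := IDENT '=' expr` can be written without sprinkling optional-whitespace matches between every pair of symbols.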
> You want to iterate on its ideas, but can't do so until you have a parser done.
Embed your language into a host language. It is simple to do even in C++. Iterate on ideas to your heart's content, then add syntax.
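A minimal sketch of what embedding can look like, using Python as the host here rather than C++. Operator overloading builds the AST directly, so the semantics can be worked on before any lexer or parser exists. The node classes (`Num`, `Var`, `Add`, `Mul`) and the `evaluate` function are hypothetical names chosen for this example.

```python
from dataclasses import dataclass

class Expr:
    """Base class: the overloaded operators build AST nodes instead of computing."""
    def __add__(self, other):
        return Add(self, lift(other))

    def __radd__(self, other):
        return Add(lift(other), self)

    def __mul__(self, other):
        return Mul(self, lift(other))

    def __rmul__(self, other):
        return Mul(lift(other), self)

@dataclass
class Num(Expr):
    value: int

@dataclass
class Var(Expr):
    name: str

@dataclass
class Add(Expr):
    left: Expr
    right: Expr

@dataclass
class Mul(Expr):
    left: Expr
    right: Expr

def lift(value):
    """Wrap plain host-language numbers as AST literals."""
    return value if isinstance(value, Expr) else Num(value)

def evaluate(node, env):
    """A tree-walking interpreter: the part you actually want to iterate on."""
    if isinstance(node, Num):
        return node.value
    if isinstance(node, Var):
        return env[node.name]
    if isinstance(node, Add):
        return evaluate(node.left, env) + evaluate(node.right, env)
    if isinstance(node, Mul):
        return evaluate(node.left, env) * evaluate(node.right, env)
    raise TypeError(f"unknown node {node!r}")

# The "program" is written in host-language syntax; no parser is involved.
x = Var("x")
program = 2 * x + 1
```

Here `program` is an ordinary AST value, so `evaluate(program, {"x": 5})` exercises the semantics, and a concrete syntax can be bolted on later by writing a parser that produces the same nodes.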
1. having to learn lex and yacc
2. running into limitations with lex and yacc
3. having a foreign program (i.e. lex and yacc) integrated into your build process
4. requiring a particular version of lex and yacc that may be awkward on multiple platforms
5. optimizing the code. (Once you start doing that, you cannot use lex/yacc to generate a new version.)
And so on.
At one point, we decided to write a D program that built the D compiler, instead of using make. Well, that turned out to be a greater time sink than just using make.
I suspect you might spend more time trying to specify what you want to the AI than just writing it.
Yacc/lex tend to produce generic error messages and do very poorly with error recovery.
I wrote a very small but complete compiler and VM for a very simple language: boolean expressions. I use it as a "what to expect" type of introduction during the first session of my compiler course.
The whole code is here; it is less than 150 lines of OCaml code (plus a few lines of C for the VM) and uses standard parsing tools: https://gist.github.com/p4bl0-/9f4e950e6c06fbba7e168097d89b0...
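For readers who want the flavor without opening the gist, here is a rough Python sketch of the same general idea: compile a boolean expression to instructions for a small stack machine, then run them. To be clear, this is not the linked OCaml code. The nested-tuple AST, the instruction names, and the `compile_expr` and `run` functions are all assumptions made for this illustration, and the parsing step is skipped entirely.

```python
def compile_expr(node):
    """Compile a nested-tuple AST into a flat list of stack instructions."""
    if isinstance(node, bool):
        return [("PUSH", node)]
    op = node[0]
    if op == "not":
        return compile_expr(node[1]) + [("NOT",)]
    if op in ("and", "or"):
        # Operands first, operator last: postfix order for a stack machine.
        return compile_expr(node[1]) + compile_expr(node[2]) + [(op.upper(),)]
    raise ValueError(f"unknown operator {op!r}")

def run(code):
    """Execute the instruction list on a stack and return the final value."""
    stack = []
    for instruction in code:
        opcode = instruction[0]
        if opcode == "PUSH":
            stack.append(instruction[1])
        elif opcode == "NOT":
            stack.append(not stack.pop())
        elif opcode == "AND":
            right, left = stack.pop(), stack.pop()
            stack.append(left and right)
        elif opcode == "OR":
            right, left = stack.pop(), stack.pop()
            stack.append(left or right)
        else:
            raise ValueError(f"unknown opcode {opcode!r}")
    return stack.pop()

# (true and false) or (not false)
ast = ("or", ("and", True, False), ("not", False))
bytecode = compile_expr(ast)
```

The compiler and the VM only share the instruction list, which is what makes this a (very small) compiler rather than an interpreter walking the tree.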
Where is the special tooling to help spot errors?
[2] https://mingodad.github.io/parsertl-playground/playground/ not sure.
But the same could be said about books, nothing is stopping you from writing a book except good ideas, story, and structure.
TMWNN•7mo ago
pxc•7mo ago
Here's the Wikipedia page for such things, which also taught me several other names for them:
https://en.m.wikipedia.org/wiki/Source-to-source_compiler
kragen•7mo ago
ratmice•7mo ago
meisel•7mo ago
andsoitis•7mo ago
fao_•7mo ago
An example off the top of my head: Chicken Scheme (call-cc.org) calls itself a compiler, but its target language is C.
shakna•7mo ago
It's a subset. All transpilers are compilers. Not all compilers are transpilers.
[0] Amiga BASIC called itself a transcompiler, from memory.
fao_•7mo ago
tuveson•7mo ago
vrighter•7mo ago
They translate one language into another. The line between compiler/transpiler just doesn't make sense to me.
orthoxerox•7mo ago
If it translates a restricted subset of BASIC into Go, it doesn't really do anything beyond replacing one syntax with another.
vrighter•7mo ago
It translates one language into another, while maintaining logical correctness and its original semantics. Some cases are easier to do than others, but it doesn't change the nature of what is being done: language translation.
It isn't as simple as you think. For example, in BASIC, the GOTO statement allows you to jump from anywhere to anywhere. Go has restrictions: a goto may not jump over variables coming into scope (being declared), nor can it jump into another scope, only outwards. So, for starters, this probably needs to scan the BASIC code for variables and hoist everything to the top, as well as rewrite any code that tries to jump into a deeper scope (e.g. a jump from the topmost scope of a function into the body of a for loop).
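A sketch of just the scan-and-hoist step, under heavy assumptions: a restricted BASIC where variables are only introduced by `LET`, every variable is numeric, and each one maps to a Go `float64`. The `hoisted_declarations` function is a hypothetical helper, not how the compiler in the post works, and it ignores things like BASIC names that clash with Go keywords.

```python
import re

# Matches e.g. "30 LET Y = 5" and captures the variable name.
LET_RE = re.compile(r"^\s*\d+\s+LET\s+([A-Z][A-Z0-9]*)\s*=", re.IGNORECASE)

def hoisted_declarations(basic_lines):
    """Return Go 'var' lines for every variable assigned by a LET statement.

    Declaring everything up front means no goto in the generated Go can
    jump over a declaration, because none remain in the body.
    """
    seen = []
    for line in basic_lines:
        match = LET_RE.match(line)
        if match:
            name = match.group(1).lower()
            if name not in seen:
                seen.append(name)
    return [f"var {name} float64" for name in seen]

program = [
    "10 LET X = 0",
    "20 GOTO 40",   # jumps over the first assignment to Y on line 30
    "30 LET Y = 5",
    "40 LET X = X + 1",
    "50 PRINT X",
]
declarations = hoisted_declarations(program)
```

In this toy program, a naive line-by-line translation that declared `y` at line 30 would put a declaration between the goto and its label, which is exactly the case Go rejects; emitting `declarations` at the top of the function sidesteps that.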
Yes, it is a compiler. Maintaining semantics.
ethan_smith•7mo ago
khaledh•7mo ago
IMO, a better term for a compiler would have been a "translator."
[1] https://dl.acm.org/doi/pdf/10.1145/609784.609818
kragen•7mo ago
khaledh•7mo ago