However, regarding queries being different from signals, especially since you described them in terms of push/pull, the difference is not very clear. Pull is basically your query, i.e. you actively recompiling; push is what you get once you take live reloading into account.
There is no real difference.
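To make that concrete, here's a minimal Rust sketch (all names are mine, purely illustrative) where the same dependency edge serves both styles: the setter eagerly invalidates the cached result (push), and the getter lazily recomputes it on demand (pull):

    // Minimal sketch: one derived value over one input.
    // "Push" = the setter invalidates downstream state eagerly;
    // "pull" = the getter recomputes lazily. Same dependency edge.
    struct Unit {
        source: String,           // input
        compiled: Option<String>, // memoized derived value
    }

    impl Unit {
        fn set_source(&mut self, s: String) {
            self.source = s;
            self.compiled = None; // push: invalidate the dependent
        }

        fn compiled(&mut self) -> &str {
            if self.compiled.is_none() {
                // pull: recompute on demand (stand-in for real compilation)
                self.compiled = Some(format!("compiled({})", self.source));
            }
            self.compiled.as_deref().unwrap()
        }
    }

    fn main() {
        let mut u = Unit { source: "fn main() {}".into(), compiled: None };
        println!("{}", u.compiled());                  // pull: computes once
        u.set_source("fn main() { /* edited */ }".into()); // push: invalidates
        println!("{}", u.compiled());                  // pull: recomputes
    }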
Now there are different algorithms for handling the graph-traversal recomputation. If you want to do it in one pass, I guess the graph needs to be depth-ordered (topologically sorted) and you recompile, level by level, each compilation unit that changed, so that each unit is visited only once.
That means most of the rest of the project is largely irrelevant beyond some basic type information.
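For illustration, a rough sketch of that "level by level, visit each unit once" idea, assuming a simple map from each unit to the units that depend on it (the names and the print-instead-of-compile step are stand-ins, not from any real compiler):

    use std::collections::{HashMap, HashSet, VecDeque};

    /// Recompile every unit downstream of `changed`, in dependency order,
    /// so each unit is visited exactly once. `dependents[u]` lists units
    /// that must be rebuilt after `u`.
    fn recompile(changed: &[&str], dependents: &HashMap<&str, Vec<&str>>) {
        // 1. Collect everything affected by the change (downstream closure).
        let mut affected: HashSet<&str> = HashSet::new();
        let mut queue: VecDeque<&str> = changed.iter().copied().collect();
        while let Some(u) = queue.pop_front() {
            if affected.insert(u) {
                for &d in dependents.get(u).into_iter().flatten() {
                    queue.push_back(d);
                }
            }
        }

        // 2. Count in-edges within the affected set, then peel level by level
        //    (Kahn's algorithm): a unit is compiled only after everything it
        //    depends on within the affected set has been compiled.
        let mut indegree: HashMap<&str, usize> =
            affected.iter().map(|&u| (u, 0)).collect();
        for &u in &affected {
            for &d in dependents.get(u).into_iter().flatten() {
                if let Some(n) = indegree.get_mut(d) {
                    *n += 1;
                }
            }
        }
        let mut ready: VecDeque<&str> = indegree
            .iter()
            .filter(|&(_, &n)| n == 0)
            .map(|(&u, _)| u)
            .collect();
        while let Some(u) = ready.pop_front() {
            println!("recompiling {u}"); // stand-in for the real compile step
            for &d in dependents.get(u).into_iter().flatten() {
                if let Some(n) = indegree.get_mut(d) {
                    *n -= 1;
                    if *n == 0 {
                        ready.push_back(d);
                    }
                }
            }
        }
    }

    fn main() {
        let deps: HashMap<&str, Vec<&str>> = HashMap::from([
            ("core", vec!["utils", "app"]),
            ("utils", vec!["app"]),
        ]);
        recompile(&["core"], &deps); // prints core, then utils, then app
    }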
If your parsing context is small enough, you may be able to run a mostly complete parsing pipeline on just the current unit and inject a "code completion" token at the cursor, where you predict what can come next at that point (any token after it is usually irrelevant).
That way you could still do a mostly vanilla compiler with auto-complete, but supporting larger operations requires much more project-wide state.
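As a toy sketch of that completion-token idea (the token names and the "what can follow" table are made up here, not any real grammar):

    // Splice a synthetic completion token into the token stream at the
    // cursor, then let the parser report what it would accept there.
    #[derive(Debug, PartialEq, Clone, Copy)]
    enum Tok { Let, Ident, Eq, Dot, Complete }

    fn tokens_with_cursor(mut toks: Vec<Tok>, cursor: usize) -> Vec<Tok> {
        toks.insert(cursor, Tok::Complete);
        toks
    }

    /// Stand-in for the parser's "what can come next" logic: find the
    /// completion marker and suggest based on the token just before it.
    fn suggest(toks: &[Tok]) -> &'static [&'static str] {
        let pos = toks.iter().position(|&t| t == Tok::Complete).unwrap_or(0);
        match toks.get(pos.wrapping_sub(1)) {
            Some(Tok::Dot) => &["field or method of the receiver's type"],
            Some(Tok::Eq) => &["expression"],
            Some(Tok::Let) => &["new binding name"],
            _ => &["statement"],
        }
    }

    fn main() {
        // `let x = foo.<cursor>` -> completion after a `.`
        let toks = tokens_with_cursor(
            vec![Tok::Let, Tok::Ident, Tok::Eq, Tok::Ident, Tok::Dot], 5);
        println!("{:?}", suggest(&toks));
    }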
Before LSP, an earlier generation of Java compilers circa 2001 (Eclipse, then javac) supported incremental compilation and model queries. This effort extended into runtime hot-reload of compatible code (which was ambitious, but has mostly been limited to changing function bodies).
Here's a Reddit post with a nice video on point and an excellent series of references (which the author may have seen, though they didn't post references to what they read):
https://www.reddit.com/r/ProgrammingLanguages/comments/ge0s3...
The database framing and input caching you mention suggest a new compiler might benefit from using a database instead of in-memory trees and such. In particular, I wonder if a Datalog-style database, with declared code as rules with type consequences, would help (i.e., the general type rules stay fixed, while declared instances show up as new relations with consequences per the general rules).

Often those Datalog systems built on relation models (A -> B via R) have an extra revision field and update just by issuing new revisions (i.e., without actually deleting the old), resulting in systems where you can backtrack in time (and don't pay for deletions/memory management until needed). Such a revision history might be helpful for calculating proposed fixes (by unwinding to a last-known-good state and then changing the subsequent edit declarations/operations).

However, all such databases I'm now familiar with force you to copy data to get results, and few have robust query caching, planning, or extensible query functions. It would be interesting if someone purpose-built a relation-and-rules database for compilers.
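As a sketch of what those revisioned relations might look like (all names are illustrative; this is just append-only rows with a revision field, not any existing database's API):

    use std::collections::HashSet;

    // Facts are never deleted; each assertion or retraction is a new row
    // tagged with the revision that introduced it, and queries run "as of"
    // a chosen revision, so backtracking to a last-known-good state is just
    // picking an older revision number.
    struct Fact {
        rel: &'static str,         // relation name, e.g. "has_type"
        subject: &'static str,     // e.g. a declared name
        object: &'static str,      // e.g. its type
        rev: u64,                  // revision that added this fact
        retracted_at: Option<u64>, // revision that logically removed it
    }

    struct Db { facts: Vec<Fact>, rev: u64 }

    impl Db {
        fn new() -> Self { Db { facts: Vec::new(), rev: 0 } }

        fn assert_fact(&mut self, rel: &'static str, s: &'static str,
                       o: &'static str) -> u64 {
            self.rev += 1;
            self.facts.push(Fact { rel, subject: s, object: o,
                                   rev: self.rev, retracted_at: None });
            self.rev
        }

        fn retract_fact(&mut self, rel: &str, s: &str) -> u64 {
            self.rev += 1;
            for f in &mut self.facts {
                if f.rel == rel && f.subject == s && f.retracted_at.is_none() {
                    f.retracted_at = Some(self.rev); // tombstone, no deletion
                }
            }
            self.rev
        }

        /// Query the relation as it looked at revision `at`.
        fn query(&self, rel: &str, at: u64) -> HashSet<(&str, &str)> {
            self.facts.iter()
                .filter(|f| f.rel == rel && f.rev <= at
                    && f.retracted_at.map_or(true, |r| r > at))
                .map(|f| (f.subject, f.object))
                .collect()
        }
    }

    fn main() {
        let mut db = Db::new();
        let r1 = db.assert_fact("has_type", "x", "i32");
        let _r2 = db.assert_fact("has_type", "f", "fn(i32) -> bool");
        let r3 = db.retract_fact("has_type", "x");  // a bad edit, say
        println!("{:?}", db.query("has_type", r3)); // current view
        println!("{:?}", db.query("has_type", r1)); // unwind to known-good
    }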
While not quite in rustc proper, along these lines: https://github.com/rust-lang/chalk + https://github.com/rust-lang/polonius
This kind of architecture can be found in things like rust-analyzer and other actual compilers.
Still a great post on modern compiler architecture though! Many thanks.
imvetri•19h ago
Let's assume a state of computing where all computations are done and cached.
In such a system, programming languages and transpilers become obsolete, as the computed results are readily available from storage.
Now the challenge is different: depending on the reader or user, the same data in memory appears different.
The paradigm shifts from parsers to readers.
For example, image data appears as binary to a computer person, and as meaning to an artist, etc.
Conclusion: what crap I have commented!
recursive•16h ago