https://reference.wolfram.com/language/ref/ExpressionTree.ht...
The only gotchas are: 1) the time it takes to get your head around it, and 2) the algorithmic complexity of the resulting solution.
Graph theory is probably the most fulfilling application of math in computer science. In a way, graph-based algorithms do magic similar to AI, but in a fully deterministic manner. If you think about it more broadly, a graph resembles a subset of a neural network, but with only {0, 1} weights.
It's time for AI to learn how to use tools, like a formal mathematical solver. Well, it has already been done, but not by an LLM... so... academics only?
Your brain is made of relatively simple cells. Even earthworms have neurons.
But the emergent complexity of systems made of simple neurons is staggering! That's the point, I guess. Simple bricks make complex systems.
Knowledge is much less useful if it doesn't get applied.
To paraphrase, mathematicians believe that they are in "a field of discovery" with the implication of "discovery" being that mathematicians are discovering something that already exists.
He gives an example of how mathematicians believe that lambda calculus and other systems, like set theory and graph theory, are actually all the same thing as category theory. In a metaphoric way, we have discovered the abstraction (category theory) and recognize that lambda calculus "implements the trait/interface".
The dilemma is: did we truly invent a theory of a pre-existing phenomenon? Or do we somehow map the physics of the universe onto what a brain is capable of parsing?
https://www.youtube.com/watch?v=I8LbkfSSR58&list=PLbgaMIhjbm...
No, it is ubiquitously known as CSE: common subexpression elimination.
The original DAG representation of the abstract syntax, on the other hand, exhibits substructure sharing.
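To make the sharing concrete, here is a minimal hash-consing sketch in Python (hypothetical, not from the article): interning every node by its structure means repeated subexpressions are literally the same object, so the syntax "tree" is really a DAG.

    # Minimal hash-consing sketch: identical subexpressions become
    # one shared node, turning the syntax tree into a DAG.
    _interned = {}

    def node(op, *children):
        # children must already be interned nodes (or atoms)
        key = (op,) + children
        if key not in _interned:
            _interned[key] = key       # the tuple itself serves as the node
        return _interned[key]

    a = node("var", "a")
    b = node("var", "b")
    s1 = node("+", a, b)
    s2 = node("+", a, b)
    assert s1 is s2                    # (a + b) is shared
    prod = node("*", s1, s2)           # (a+b)*(a+b) holds one child twice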
> Of course, that invariant eventually changed. We added a way in the source language to introduce lets, which meant my algorithm was wrong.
Because you have to identify variables by more than just their symbol, due to shadowing; hence schemes like De Bruijn indexing.
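For illustration, a minimal De Bruijn conversion sketch in Python (the tuple representation is made up for the example): each variable becomes the number of binders between it and its own lambda, so alpha-equivalent terms compare equal and shadowing resolves itself.

    # Convert named lambda terms to De Bruijn indices.
    # Terms: ("var", name) | ("lam", name, body) | ("app", f, x)
    def to_de_bruijn(term, env=()):
        kind = term[0]
        if kind == "var":
            # index = how many binders up the variable's lambda sits;
            # env.index finds the innermost binder, handling shadowing
            return ("var", env.index(term[1]))
        if kind == "lam":
            return ("lam", to_de_bruijn(term[2], (term[1],) + env))
        return ("app", to_de_bruijn(term[1], env), to_de_bruijn(term[2], env))

    # \x. \y. x  and  \a. \b. a  normalize to the same term:
    t1 = ("lam", "x", ("lam", "y", ("var", "x")))
    t2 = ("lam", "a", ("lam", "b", ("var", "a")))
    assert to_de_bruijn(t1) == to_de_bruijn(t2)   # ("lam", ("lam", ("var", 1)))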
CSE is not particularly difficult in a functional language. Variable assignments throw a monkey wrench into it, as do any terms with side effects.
By the way, CSE can be done over intermediate representations, rather than abstract syntax. In an intermediate representation, you look for identical instructions with the same operands, not arbitrarily large expressions, while paying attention to variable liveness.
Another by the way is that not only do we have to worry about side effects, but we also cannot do CSE on expressions that guarantee returning a fresh object. E.g. if we are compiling Lisp we cannot merge two occurrences of (cons 1 2). The expressions must produce fresh cells which are not eq to each other. Ultimately that is linked to side effects; being able to mutate one cell with no effect on the other. Construction per se is not a side effect, even if it guarantees freshness.
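A rough Python analogue of the freshness point, with lists standing in for cons cells and Python's "is" for eq:

    # Two evaluations of the "same" constructor must yield fresh,
    # distinct objects; merging them would be observable.
    a = [1, 2]         # like (cons 1 2)
    b = [1, 2]         # a second, fresh cell
    assert a is not b  # eq-distinct, as the language guarantees
    a[0] = 99          # mutating one...
    assert b[0] == 1   # ...must not affect the other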
kldx•1mo ago
SSA transformations are essentially equivalent to what the author appears to be doing in terms of let-bindings [0].
[0] https://dl.acm.org/doi/10.1145/278283.278285
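A toy illustration of that correspondence (hypothetical helper, nothing from the paper): naming every intermediate result flattens an expression into straight-line SSA, and each assignment reads exactly like a let-binding.

    from itertools import count

    def to_ssa(expr, out, fresh=None):
        # expr: nested ("op", lhs, rhs) tuples; leaves are names/constants
        fresh = fresh or count(1)
        if not isinstance(expr, tuple):
            return expr
        op, lhs, rhs = expr
        l = to_ssa(lhs, out, fresh)
        r = to_ssa(rhs, out, fresh)
        t = f"t{next(fresh)}"
        out.append(f"{t} = {l} {op} {r}")   # reads as: let t = l op r in ...
        return t

    lines = []
    root = to_ssa(("*", ("+", "a", "b"), "c"), lines)
    # lines == ["t1 = a + b", "t2 = t1 * c"], root == "t2"
    # i.e.   let t1 = a + b in let t2 = t1 * c in t2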
sjurba•1mo ago
I love this fact about the internet! Thanks guys! Keep it up! Including the snarkiness. It’s part of what makes it great!
(I am aware this is not a novel idea. Posting the wrong solution is better than asking for help... It is just fun to see it in action.)
tylerhou•1mo ago
The canonical algorithm to do that is to compute the dominance relation. A node X dominates Y if every path to Y must go through X. Once you have computed the dominance relation, if a common subexpression is located at nodes N1, N2, N3, you can place the computation at some shared dominator of N1, N2, and N3. Because dominance is a statement about /all/ paths, there is a unique lowest dominator [1]. This is exactly the "lowest single common ancestor."
Note that dominance is also defined for cyclic graphs. There may be faster algorithms to compute dominance for acyclic graphs. Expression graphs in non-lazy programming languages are almost always acyclic (whereas in a lazy language like Haskell, you can write cyclic expressions).
[1] Claim. Let A, B, and C be reachable nodes. Suppose A and B both dominate C. Then either A dominates B or B dominates A.
Proof. We prove the contrapositive. Suppose neither A dominates B nor B dominates A. Then there exist paths s and t from the root such that s reaches A without passing through B and t reaches B without passing through A. Since B dominates C and C is reachable, there is a path u from B to C (otherwise no root-to-C path could pass through B). Now t.u is a path from the root to C, so it must pass through A; A is not on t, so A occurs on u. Take the suffix of u starting at that occurrence of A: prepending s gives a root-to-C path, which must pass through B; B is not on s, so B occurs on the suffix, strictly after its start. Repeating the argument produces strictly shorter suffixes of u that still contain both A and B, which is impossible since u is finite. Contradiction.
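For concreteness, a small sketch of both steps in Python, using the naive iterative dataflow algorithm rather than Lengauer–Tarjan, and assuming every node is reachable from the root:

    # Naive iterative dominator computation over a rooted digraph.
    def dominators(succ, root):
        nodes = set(succ) | {s for ss in succ.values() for s in ss}
        preds = {n: set() for n in nodes}
        for n, ss in succ.items():
            for s in ss:
                preds[s].add(n)
        dom = {n: set(nodes) for n in nodes}
        dom[root] = {root}
        changed = True
        while changed:
            changed = False
            for n in nodes - {root}:
                ps = [dom[p] for p in preds[n]]
                new = {n} | (set.intersection(*ps) if ps else set())
                if new != dom[n]:
                    dom[n], changed = new, True
        return dom   # dom[n] = all nodes dominating n (including n)

    def lowest_common_dominator(dom, targets):
        common = set.intersection(*(dom[t] for t in targets))
        # dominators of a node form a chain, so the deepest common
        # dominator is the one with the most dominators itself
        return max(common, key=lambda d: len(dom[d]))

    g = {"r": ["a", "b"], "a": ["c"], "b": ["c"], "c": []}
    dom = dominators(g, "r")
    assert lowest_common_dominator(dom, ["a", "b", "c"]) == "r"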
mananaysiempre•1mo ago
The algorithm in the article does O(1) queries with O(V+E) preprocessing (assuming linear-preprocessing LCA [1], which, yeah). What’s the best algorithm for dominator trees? People usually talk about Lengauer–Tarjan [2], which is linear in practice (linear except for UNION-FIND), and not the linear one by Georgiadis [3,4]. Unfortunately, I’m not a compiler person.
[1] https://doi.org/10.1016/j.ipl.2010.02.014
[2] https://maskray.me/blog/2020-12-11-dominator-tree
[3] https://www.cs.princeton.edu/research/techreps/43
[4] https://dl.acm.org/doi/abs/10.1137/070693217
tylerhou•1mo ago
For an example, consider a three-node binary tree where R is the root, A is the left child, and B is the right child. A valid topological sort is R A B, but it is not the case that whenever B is computed, A has already been computed.
kazinator•1mo ago
I don't think you necessarily have to compute the dominance relation because it can pop out implicitly.
If CSE is done on intermediate code, the dominance relation will pop out from the direction in which the instructions are followed around the basic blocks.
E.g., a simple case: we see t3 <= t2 + t1. So we make a note in some CSE hash table that we computed t2 + t1, and that the result is cached in t3. Then if we see t2 + t1 again in the same basic block, we can replace it with t3. The dominance relation here is that earlier instructions in a basic block dominate later ones, but we don't have to calculate it explicitly.
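A minimal sketch of that bookkeeping in Python (hypothetical instruction format; it ignores side effects and assumes each temp is assigned only once in the block, otherwise table entries would have to be invalidated for liveness):

    # Local CSE over one basic block of (dest, op, arg1, arg2) instructions.
    def local_cse(block):
        available = {}   # (op, arg1, arg2) -> temp already holding the value
        out = []
        for dest, op, a1, a2 in block:
            key = (op, a1, a2)
            if key in available:
                # earlier instruction in the block dominates this one,
                # so reuse its result instead of recomputing
                out.append((dest, "copy", available[key], None))
            else:
                available[key] = dest
                out.append((dest, op, a1, a2))
        return out

    block = [("t3", "+", "t2", "t1"),
             ("t4", "*", "t3", "t3"),
             ("t5", "+", "t2", "t1")]   # same computation as t3
    print(local_cse(block))
    # t5 becomes a copy of t3; a later pass can propagate the copy away.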