I will just throw in some nostalgia for how good that compiler was. My college roommate brought an HP pizza box that his dad secured from HP, and the way the C compiler quoted chapter and verse from ISO C in its error messages was impressive.
The idea of optimizations running at different stages in the build, with different visibility of the whole program, was discussed in 1979, but the world was so different back then that the discussion seems foreign. https://dl.acm.org/doi/pdf/10.1145/872732.806974
https://github.com/solvespace/solvespace/issues/972
Build time was terrible: a few minutes with LTO versus 30-40 seconds for a full build without it. Have they done anything to make LTO use multiple cores? It only used one core for that step.
I also tested OpenMP, which was obviously a bigger win. More recently I ran the same test after upgrading from an AMD 2400G to a 5700G, which has double the cores and about 1.5x the IPC. The result was a solid 3x improvement, so we scale well with cores going from 4 to 8.
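For anyone wondering what kind of parallelism scales like that: this is not the actual solvespace code, just a minimal sketch of the shape of loop OpenMP speeds up in proportion to core count.

    // Build with: g++ -O2 -fopenmp openmp_sketch.cpp
    #include <omp.h>
    #include <vector>
    #include <cstdio>

    int main() {
        std::vector<double> v(1 << 22, 1.0);
        double sum = 0.0;
        // Each thread takes a chunk of the iteration space; the reduction
        // clause combines the per-thread partial sums at the end.
        #pragma omp parallel for reduction(+:sum)
        for (long i = 0; i < (long)v.size(); ++i)
            sum += v[i] * v[i];
        std::printf("sum = %f, max threads = %d\n", sum, omp_get_max_threads());
    }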
[1] Interestingly, GCC actually invokes Make internally to run the parallel stage of LTO, which lets it play nice with GNU Make's job control and obey the -j switch.
I never tried to implement them, finding it easier and more effective for the compiler to simply compile all the source files at the same time.
The D compiler is designed to build either one object file per source file, or a single object file that combines all of the source files. Most people choose the single combined object file.
http://mlton.org/WholeProgramOptimization
Dynamically linked and dynamically loaded libraries are useful though (paid for with their own problems, of course).
Because then you need to link them, thus you need some kind of linker.
Just generate one output file and skip the linker
1. linkers have increased enormously in complexity
2. little commonality between linkers for different platforms
3. compatibility with the standalone linkers
4. trying to keep up with constant enhancement of existing linkers
Of course, being C++, this subtly changes behavior and must be done carefully. I like this article that explains the ins and outs of using unity builds: https://austinmorlan.com/posts/unity_jumbo_build/
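A minimal sketch (hypothetical file names) of one way unity builds change behavior: internal-linkage helpers that are fine in separate translation units collide once everything is #included into one jumbo TU.

    // a.cpp
    static int scale(int x) { return x * 2; }   // internal linkage, fine on its own
    int twice(int x) { return scale(x); }

    // b.cpp
    static int scale(int x) { return x * 3; }   // same name, also fine on its own
    int thrice(int x) { return scale(x); }

    // unity.cpp -- the jumbo TU
    #include "a.cpp"
    #include "b.cpp"   // error: redefinition of 'scale'; the two files now share one scope

This case at least fails loudly. The subtler ones are anonymous namespaces colliding the same way, and a using-directive or macro in an earlier file silently changing how a later file's code parses or resolves overloads, which still compiles but means something different.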
The D module design ensures that module imports are independent of each other and are independent of the importer.
The downside is that you can end up with thousands of object files, but for modern linkers that isn't a problem.
It's easy to dismiss a basic article like this, but it describes a discovery that every junior engineer will make, and it's useful to talk about those too!
Perhaps language designers thought that if a function needs to be inlined everywhere, it would lead to verbose code. In any case, it's a weak hint that compilers generally treat with much disdain.
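Assuming this is about the C/C++ inline keyword, a small illustration of what "weak hint" means in practice: the keyword mostly affects linkage and the one-definition rule, while actual inlining is the optimizer's call, and across translation units it normally needs the definition visible (via a header or LTO).

    // util.h -- definition visible to every includer, so the optimizer *may*
    // flatten calls to it; the keyword mainly permits the repeated definition.
    inline int add(int a, int b) { return a + b; }

    // caller.cpp
    #include "util.h"
    int huge(int x);   // defined in some other .cpp; without LTO the compiler
                       // cannot inline it here no matter how it is marked.
    int f(int x) { return add(x, 1) + huge(x); }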
So why do we still use the old way? LTO seems effectively like a hack to compensate for the fact that the compilation model doesn't fit our modern needs. Obviously this will never change in C/C++ due to momentum and backwards compatibility. But a man can dream.
securely_wipe_memory(&obj, sizeof obj);
return;
}
The compiler peeks into securely_wipe_memory and sees that it has no effect, because obj is a local variable with no "next use" in the data-flow graph. Thus the call is removed.

Another example:
gc_protect(object);
return;
}
Here, gc_protect is an empty function. Without LTO, the compiler must assume that the value of object is required for the gc_protect call, so the generated code has to hang on to that value until the call is made. With LTO, the compiler peeks at the definition of gc_protect and sees the ruse: the function is empty! That line of code therefore does not represent a use of the variable, and the generated code can reuse the register or memory location long before it. If the garbage collector runs in that part of the code, the object is prematurely collected (if what was lost happens to be the last reference to it).

Some distros have played with turning on LTO as a default compiler option for building packages. It's a very, very bad idea.
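To make the first example above concrete, here is a minimal two-file sketch (hypothetical names like fill_key/use_key, not the original poster's code) of why the wipe can disappear once LTO lets the optimizer see through the wrapper:

    // wipe.cpp -- hypothetical wrapper living in its own translation unit
    #include <cstring>
    #include <cstddef>
    void securely_wipe_memory(void *p, std::size_t n) { std::memset(p, 0, n); }

    // secret.cpp
    #include <cstddef>
    struct Key { char bytes[32]; };
    void securely_wipe_memory(void *p, std::size_t n);
    void fill_key(Key &);          // hypothetical, defined elsewhere
    void use_key(const Key &);     // hypothetical, defined elsewhere

    void handle_secret() {
        Key obj;
        fill_key(obj);
        use_key(obj);
        // Compiled separately, this call is opaque and must be kept.
        // Under -flto the optimizer can see it is just a memset of a local
        // that is about to die, and may delete it as a dead store -- which is
        // why memset_s/explicit_bzero exist.
        securely_wipe_memory(&obj, sizeof obj);
    }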