it's like systemd trading off non-determinism for boot speed, when it takes 5 minutes to get through the POST
It's possible to skip some of the ./configure steps. Especially for someone who knows the program very well.
Only a small subset of which could just switch to the parallel configure being proposed.
It's not parallel config, but it's pretty cool. In some ways it's better because when there's a hit it's instant.
It's quite synergistic with ccache as well. You navigate to a commit. You are informed that cached config info was retrieved successfully, so you skip configure and run make. All the objects are retrieved from ccache.
That's a bad analogy: if a given deterministic service ordering is needed for a service to correctly start (say because it doesn't start with the systemd unit), it means the non-deterministic systemd service units are not properly encoding the dependencies tree in the Before= and After=
When done properly, both solutions should work the same. However, the solution properly encoding the dependency graph (instead of just projecting it on a 1-dimensional sequence of numbers) will be more flexible: it's the better solution, because it will give you more speed but also more flexibility: you can see the branches any leaf depends on, remove leaves as needed, then cull the useless branches. You could add determinism if you want, but why bother?
It's like using the dependencies of linux packages, and leaving the job of resolving them to package managers (apt, pacman...): you can then remove the useless packages which are no longer required.
Compare that to doing a `make install` of everything to /usr/local in a specific order, as specified by a script: when done properly, both solutions will work, but one solution is clearly better than the other as it encodes more finely the existing dependencies instead of projecting them to a sequence.
You can add determinism if you want to follow a sequence (ex: `apt-get install make` before adding gcc, then add cuda...), or you can use meta package like build-essentials, but being restricted to a sequence gains you nothing.
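To make the graph-versus-sequence point concrete, here is a minimal Python sketch (assuming Python 3.9+ for `graphlib`). The unit names and the `deps` table are made up for illustration and are not real systemd units; `start_batches` and `needed` are hypothetical helpers.

```python
from graphlib import TopologicalSorter

# Hypothetical units: each maps to the set of units it must start after.
deps = {
    "udev": set(),
    "mounts": {"udev"},
    "network": {"udev"},
    "sshd": {"network"},
    "database": {"network", "mounts"},
}

def start_batches(graph):
    """Group units into batches; everything in one batch may start in parallel."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)
    return batches

def needed(graph, wanted):
    """Return the wanted units plus everything they transitively depend on."""
    keep, stack = set(), list(wanted)
    while stack:
        unit = stack.pop()
        if unit not in keep:
            keep.add(unit)
            stack.extend(graph[unit])
    return keep
```

With the graph in hand, asking for only `sshd` tells you that `database` and `mounts` can go. A flat numbered sequence carries no information that would let you work that out.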
given how complicated the boot process is ([1]), and it occurs once a month, I'd rather it was as deterministic as possible
vs. shaving 1% off the boot time
[1]: distros continue to ship subtly broken unit files, because the model is too complicated
Linux runs all over, including embedded systems where boot time is important.
Optimizing for edge cases on outliers isn’t a priority. If you need specific boot ordering, configure it that way. It doesn’t make sense for the entire Linux world to sacrifice boot speed.
Completing POST in under 2 minutes is not guaranteed.
Especially the 4 socket beasts with lots of DIMMs.
For instance, the slow RAM check POST I was experiencing is because it was also doing a quick single pass memory test. Consumer firmware goes ‘meh, whatever’.
Disk spin up, it was also staging out the disk power ups so that it didn’t kill the PSU - not a concern if you have 3-4 drives. But definitely a concern if you have 20.
Also, the raid controller was running basic SMART tests and the like. Which consumer stuff typically doesn’t.
Now how much any of this is worthwhile depends on the use case of course. ‘Farm of cheap PCs’ type cloud hosting environments, most these types of conditions get handled by software, and it doesn’t matter much if any single box is half broken.
If you have one big box serving a bunch of key infra, and reboot it periodically as part of ‘scheduled maintenance’ (aka old school on prem), then it does.
Unfortunately no one has actually bothered to write down how systemd really works; the closest to a real writeup out there is https://blog.darknedgy.net/technology/2020/05/02/0/
I end up running it dozens of times when changing versions, checking out different branches, chasing dependencies.
It’s a big deal.
> it's like systemd trading off non-determinism for boot speed, when it takes 5 minutes to get through the POST
5 minute POST time is a bad analogy. systemd is used in many places, from desktops (that POST quickly) to embedded systems where boot time is critical.
If deterministic boot is important then you would specify it explicitly. Relying on emergent behavior for consistent boot order is bad design.
The number of systems that have 5 minute POST times and need deterministic boot is an edge case of an edge case.
Yeah... but neither of those is going to change stuff like the size of a data type, the endianness of the architecture you're running on, or the features / build configuration of some library the project depends on.
Parallelization is a bandaid (although a sorely needed one!). IMHO, C/C++ libraries desperately need to develop some sort of standard that doesn't require a full gcc build for each tiny test. I'd envision something like nodejs's package.json, just with more specific information about the build details themselves. And for the stuff like datatype sizes, that should be provided by gcc/llvm in a fast-parseable way so that autotools can pick it up.
... I wonder if it's possible to manually seed a cache file with only known-safe test results and let it still perform the unsafe tests? Be sure to copy the cache file to a temporary name ...
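One way to try that idea, as a rough Python sketch: filter an existing cache file down to an allow-list before handing it to configure. This assumes the cache holds one shell-style `name=...` assignment per line; `seed_cache` is a hypothetical helper, not part of autoconf, and which variables count as "safe" is left to whoever knows the project.

```python
def seed_cache(cache_text, safe_names):
    """Keep only the cache lines whose variable name is on the allow-list."""
    kept = []
    for line in cache_text.splitlines():
        name, sep, _ = line.partition("=")
        # Lines without '=' (comments, blanks) are dropped along with unsafe vars.
        if sep and name.strip() in safe_names:
            kept.append(line)
    return "\n".join(kept) + ("\n" if kept else "")
```

Write the result to a temporary file and pass that copy via `--cache-file=FILE`, so the original cache is never modified. Anything not on the allow-list simply gets tested again.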
---
I've thought about rewriting `./configure` in C (I did it in Python once but Python's portability turned out to be poor - Python2 was bug-free but killed; Python3 was unfixably buggy for a decade or so). Still have a stub shell script that reads HOSTCC etc. then quickly builds and executes `./configure.bin`.
if it's critical on an embedded system then you're not running systemd at all
> The number of systems that have 5 minute POST times and need deterministic boot is an edge case of an edge case.
desktop machines are the edge case, there's a LOT more servers running Linux than people using Linux desktops
> Relying on emergent behavior for consistent boot order is bad design.
tell that to the distro authors who 10 years in can't tell the difference between network-online.target, network-pre.target, network.target
amdahl's law's a bitch
I take it you don't run DDR5?
This aspect of configure, in particular, drives me nuts. Obviously I'd like it to be faster, but it's not the end of the world. I forget what I was trying to build the other week, but I had to make 18 separate runs of configure to find all the things I was missing. When I dug into things it looked like it could probably have done it in 2 runs, each presenting a batch of things that were missing. Instead I got stuck with "configure, install missing package" over and over again.
Arguing against parallelization of configure is like arguing against faster OS updates. "It's only once a week/whatever, come on!" Except it's spread over a billion people, time and time again.
Also, I was surprised when the animated text at the top of the article wasn't a gif, but actual text. So cool!
(The conclusion I distilled out of reading that at the time, I think, was that this is actually sort of happening, but slowly, and autoconf is likely to stick around for a while, if only as a compatibility layer during the transition.)
This is how it was done: https://github.com/tavianator/tavianator.com/blob/cf0e4ef26d...
Wait is this true? (!)
The choices are:
1. Restrict the freedom of CPU designers to some approximation of the PDP11. No funky DSP chips. No crazy vector processors.
2. Restrict the freedom of OS designers to some approximation of Unix. No bespoke realtime OSes. No research OSes.
3. Insist programmers use a new programming language for these chips and OSes. (This was the case prior to C and Unix.)
4. Insist programmers write in assembly and/or machine code. Perhaps a macro-assembler is acceptable here, but this is inching toward C.
The cost of this flexibility is gross tooling to make it manageable. Can it be done without years and years of accrued M4 and sh? Perhaps, but that's just CMake and CMake is nowhere near as capable as Autotools & friends are when working with legacy platforms.
Man, if this got fixed it would be one of the best languages to develop for.
My wishlist:
* Quick compilation times (obv.) or some sort of tool that makes it feel like an interpreted language, at least when you're developing, then do the usual compile step to get an optimized binary.
* A F...... CLEAR AND CONSISTENT WAY TO TELL THE TOOLCHAIN THIS LIBRARY IS HERE AND THIS ONE IS OVER THERE (sorry but, come on ...).
* A single command line argument to output a static binary.
* Anything that gets us closer to the "build-once run-anywhere" philosophy of "Cosmopolitan Libc". Even if an intermediate execution layer is needed. One could say, "oh, but this is C, not Java", but it is already de facto a broken Java, because you still need an execution layer, call it stdlib, GLIB, whatever, if those shared libraries are not on your system with their exact version matching, your program breaks ... Just stop pretending and ship the "C virtual machine", lmao.
Fixing it would require an unprecedented level of cooperation across multiple industries.
At some point you realize that not even the (in)famous "host triple" has sufficient coverage for the fractal complexity of the real world.
C and its toolchain are where so many of these tedious little differences are managed. Making tools which account only for the 80% most common use-case defeats the purpose.
How would one encode e.g. "this library targets only the XYZ CPU in Q mode for firmware v5-v9 and it compiles only with GCC v2.95"?
The tools in use need to be flexible enough to handle cases like this with grace or you just end up reinventing GNU Autotools and/or managing your build with a hairy shell script of your own making.
The GNU Autotools system organically grew from dealing with the rapidly-changing hardware and software landscape of the 1980s-1990s. DIY shell scripts were organized, macros were written, and a system was born.
If embedded folk have to start writing their own scripts to handle the inevitable edge cases which WILL come up, then what is a new build tool really accomplishing-- Autotools, but in Python?
The co-evolution of hardware, software, and all other moving targets has landed us in a fairly abysmal local maximum. More recently developed toolchains (e.g. zig, rust, etc.) show us that there are much better ways to tackle these problems. Of course they introduce other ones, but we can do so much better.
But now the user has to set the preprocessor macro appropriately when he builds your program. Nobody wants to give the user a pop quiz on the intricacies of his C library every time he goes to install new software. So instead the developer writes a shell script that tries to compile a trivial program that uses function foo. If the script succeeds, it defines the preprocessor macro FOO_AVAILABLE, and the program will use foo; if it fails, it doesn’t define that macro, and the program will fall back to bar.
That shell script grew into configure. A configure script for an old and widely ported piece of software can check for a lot of platform features.
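The try-to-compile trick described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how autoconf implements it: `compiles` and `config_header` are hypothetical helpers, the compiler is assumed to be reachable as `cc`, and `foo` is the placeholder function from the explanation.

```python
import os
import subprocess
import tempfile

def compiles(source, cc="cc"):
    """Return True if the compiler can build and link the given C source."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "probe.c")
        with open(src, "w") as f:
            f.write(source)
        try:
            result = subprocess.run(
                [cc, src, "-o", os.path.join(tmp, "probe")],
                capture_output=True,
            )
        except FileNotFoundError:
            return False  # no such compiler: treat every feature as missing
        return result.returncode == 0

def config_header(checks, cc="cc"):
    """Turn {macro: probe_source} into the text of a config header."""
    lines = []
    for macro, source in checks.items():
        if compiles(source, cc):
            lines.append(f"#define {macro} 1")
        else:
            lines.append(f"/* #undef {macro} */")
    return "\n".join(lines) + "\n"

# Hypothetical probe: links only if the C library provides foo().
CHECKS = {"FOO_AVAILABLE": "char foo(void); int main(void) { return foo(); }"}
```

The program then wraps its use of `foo` in `#ifdef FOO_AVAILABLE` and falls back to `bar` otherwise, and the user never has to answer the pop quiz.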
JS and Python wouldn't be what they are today if you had to `./configure` every website you want to visit, lmao.
You just gave me a flashback to the IE6 days. Yes, that's precisely what we did. On every page load.
It's called "feature detection", and was the recommended way of doing things (the bad alternative was user agent sniffing, in which you read the user agent string to guess the browser, and then assumed that browser X always had feature Y; the worst alternative was to simply require browser X).
Nice writeup though.
  x = 2 + 2
  y = 2 * 2
  z = f(x, y)
  print(z)
*And superficially off the topic of this thread, but possibly not.
You want bigger units of work for multiple cores, otherwise the coordination overhead will outweigh the work the application is doing
I think the Erlang runtime is probably the best use of functional programming and multiple cores. Since Erlang processes are shared nothing, I think they will scale to 64 or 128 cores just fine
Whereas the GC will be a bottleneck in most languages with shared memory ... you will stop scaling before using all your cores
But I don't think Erlang is as fine-grained as your example ...
Some related threads:
https://news.ycombinator.com/item?id=40130079
https://news.ycombinator.com/item?id=31176264
AFAIU Erlang is not that fast an interpreter; I thought the Pony Language was doing something similar (shared nothing?) with compiled code, but I haven't heard about it in a while
A higher level language can be more opinionated, but a low level one shouldn't straitjacket you.
i.e. Rust can be used to IMPLEMENT an Erlang runtime
If you couldn't use threads, then you could not implement an Erlang runtime.
But yeah, I agree that we were promised a lot more automatic multithreading than we got. History has proven that we should be wary of any promises that depend on a Sufficiently Smart Compiler.
Maybe it would have been easier if CPU performance didn’t end up outstripping memory performance so much, or if cache coherency between cores weren’t so difficult.
The CPU has better visibility into the actual runtime situation, so can do runtime optimization better.
In some ways, it’s like a bytecode/JVM type situation.
For the trivial example of 2+2 like above, of course, this is a moot discussion. The commenter should've led with a better example.
And when that happens, almost always the developer knows it is that type of situation and will want to tune things themselves anyway.
For CPU machine code it's the compilers doing the hard work of reordering code to allow ILP (instruction-level parallelism), eliminate false dependencies, inlining and vectorization; whatever else it takes to keep the pipeline filled and busy.
I'd love for the sentiment "the dev knows" to be true, but I think this is no longer the case. Maybe if you are in a low-level language AND have time to reason about it? Add to this the reserved smile when I see someone "benchmarking" their piece of code in a "for i to 100000" loop, without other considerations. Next, suppose a high-level language project: the most straightforward optimization to carry out for new code is to apply proper algorithms and fitting data structures. And I think this is too much to ask nowadays, because it takes time, effort, and knowledge of existence to remember to implement something.
At runtime, the CPU can figure it out though, eh?
    int buf_size = 10000000;
    auto vec = make_large_array(buf_size);
    for (const auto& val : vec)
    {
        do_expensive_thing(val);
    }
If I replace it with
    int buf_size = 10000000;
    cin >> buf_size;
    auto vec = make_large_array(buf_size);
    for (const auto& val : vec)
    {
        do_expensive_thing(val);
    }
the compiler could generate some code that looks like:
    if buf_size >= SOME_LARGE_THRESHOLD { DO_IN_PARALLEL } else { DO_SERIAL }
With some background logic for managing threads, etc. In a C++-style world where "control" is important it likely wouldn't fly, but if this was python...
    arr_size = 10000000
    buf = [None] * arr_size
    for x in buf:
        do_expensive_thing(x)
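Nothing stops you from writing that size check by hand today. A minimal sketch, with a made-up `PARALLEL_THRESHOLD` and a hypothetical `map_auto` helper:

```python
from concurrent.futures import ThreadPoolExecutor

PARALLEL_THRESHOLD = 10_000  # hypothetical cutoff; would need measuring

def map_auto(fn, items, threshold=PARALLEL_THRESHOLD, workers=4):
    """Apply fn to every item, going parallel only for large inputs."""
    if len(items) < threshold:
        # Small input: pool startup and coordination would cost more than it saves.
        return [fn(x) for x in items]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in input order.
        return list(pool.map(fn, items))
```

In a standard CPython build, threads mostly help when the work releases the GIL (I/O, some native extensions); for CPU-bound pure Python you would swap in `ProcessPoolExecutor`. Either way the hard part is picking the threshold, which is exactly what a runtime would have to guess.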
It doesn’t matter what people do or don’t do because this is a hypothetical feature of a hypothetical language that doesn’t exist.
You providing examples of why it totally-doesn’t-need-to-be-that-way are rather tangential, aren’t they? Especially when they aren’t addressing the underlying point.
this works best for scientific computing things that run through very big loops where there is very little interaction between iterations
I understand that yours is a very simple example, but a) such things are already parallelized even on a single thread thanks to all the internal CPU parallelism, b) one should always be mindful of Amdahl's law, c) truly parallel solutions to various problems tend to be structurally different from serial ones in unpredictable ways, so there's no single transformation, not even a single family of transformations.
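For point (b), the arithmetic is short enough to write down. A small sketch, where `parallel_fraction` is the share of the runtime that can actually be spread across workers:

```python
def amdahl_speedup(parallel_fraction, workers):
    """Upper bound on overall speedup when only part of the work parallelizes."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)
```

With 90% of the work parallelizable, 8 workers give roughly a 4.7x speedup, and no number of workers gets past 10x, because the serial 10% never shrinks.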
[1] https://github.com/HigherOrderCO/Bend [2] https://github.com/VineLang/vine [3] https://en.wikipedia.org/wiki/Interaction_nets
Oddly enough, functional programming seems to be a poor fit for this because the fanout tends to be fairly low: individual operations have few inputs, and single-linked lists and trees are more common than arrays.
Instead of splitting the "configure" and "make" steps though, I chose to instead fold much of the "configure" step into the "make".
To clarify, this article describes a system where `./configure` runs a bunch of compilations in parallel, then `make` does stuff depending on those compilations.
If one is willing to restrict what the configure can detect/do to writing to header files (rather than affecting variables examined/used in a Makefile), then instead one can have `./configure` generate a `Makefile` (or in my case, a ninja file), so that both "run the compiler to see what defines to set" and "run the compiler to build the executable" happen in a single `make` or `ninja` invocation.
The simple way here results in _almost_ the same behavior: all the "configure"-like stuff running and then all the "build" stuff running. But if one is a bit more careful/clever and doesn't depend on the entire "config.h" for every "<real source>.c" compilation, then one can start to interleave the work perceived as "configuration" with that seen as "build". (I did not get that fancy)
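Roughly, the "fancy" variant could look like the following Python sketch that emits a ninja file. This is not the code from my project: `ninja_file` is a hypothetical helper, each probe is assumed to be a compile-only check (a header or type, say), and each probe writes its own small header so that a source file only waits for the probes it actually uses.

```python
def ninja_file(probes, sources, cc="cc"):
    """probes: {macro: probe_source}; sources: {c_file: [macros it needs]}."""
    out = [
        f"cc = {cc}",
        "rule probe",
        # Always exits 0: a failed probe is a result, not a build error.
        "  command = if $cc -c $in -o /dev/null 2>/dev/null; "
        "then echo '#define $macro 1' > $out; "
        "else echo '/* #undef $macro */' > $out; fi",
        "rule compile",
        "  command = $cc -I. -c $in -o $out",
        "",
    ]
    for macro, probe_src in probes.items():
        out.append(f"build {macro.lower()}.h: probe {probe_src}")
        out.append(f"  macro = {macro}")
    for src, macros in sources.items():
        obj = src.rsplit(".", 1)[0] + ".o"
        line = f"build {obj}: compile {src}"
        if macros:
            # Implicit dependencies: only the probe headers this file includes.
            line += " | " + " ".join(f"{m.lower()}.h" for m in macros)
        out.append(line)
    return "\n".join(out) + "\n"
```

A file that needs no probe results has no edge to any probe, so ninja is free to compile it while the probes are still running.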
Just from a quick peek at that repo, nowadays you can write
#if __has_attribute(cold)
and avoid the configure test entirely. Probably wasn't a thing 10 years ago though :)
Covers a very large part of what is needed, making fewer and fewer things need to end up in configure scripts. I think most of what's left is checking for items (types, functions) existence and their shape, as you were doing :). I can dream about getting a nice special operator to check for fields/functions, would let us remove even more from configure time, but I suspect we won't because that requires type resolution and none of the existing special operators do that.
That said, that (determining the c flags and ld flags for dependencies) is something that might be able to be mixed into compilation a bit more than it is now. Could imagine that if we annotate which compilation units need a particular system library, we could start building code that doesn't depend on that library while determining the library location/flags (ie: running pkg-config or doing other funny business) at the same time.
Or since we're in the connected era, perhaps we're downloading the library we require if it's not found and building it as an embedded component.
With that type of design, it becomes more clear why moving as much to the build stage (where we can better parallelize work because most of the work is in that stage) and more accurately describing dependencies (so we don't block work that could run sooner) can be helpful in builds.
Doing that type of thing requires a build system that is more flexible though: we really would need to have the pieces of "work" run by the build system be able to add additional work that is scheduled by the build system dynamically. I'm not sure there are many build systems that support this.
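As a toy illustration of what "work that adds more work" means, here is a single-threaded Python sketch. The task names and `run_dynamic` are hypothetical; a real build system would also track dependencies and run independent tasks concurrently.

```python
from collections import deque

def run_dynamic(initial_tasks):
    """Run (name, fn) tasks in FIFO order; fn may return more tasks to schedule."""
    pending = deque(initial_tasks)
    order = []
    while pending:
        name, fn = pending.popleft()
        order.append(name)
        pending.extend(fn() or [])
    return order

def locate_libfoo():
    # Pretend the library was not found, so schedule a download and a build.
    return [("download libfoo", lambda: None),
            ("build libfoo", lambda: None)]

tasks = [("locate libfoo", locate_libfoo),
         ("compile util.c", lambda: None)]
```

Here `run_dynamic(tasks)` runs `compile util.c` before the download that was discovered along the way, which is the kind of overlap a static, fully pre-declared graph cannot express.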
This is already a problem for getting Bazel builds to run nicely under Nix: the current solution (predownload everything into a single giant "deps" archive in the store and then treat that as a fixed input derivation with a known hash value) is deeply non-optimal. Basically, I hope that any such schemes have a well-tested fallback path for bubbling the "thing I would download" information outward in case there are reasons to want to separate those steps.
To some extent, the issue here is caused by just what I was discussing above: Nix derivations can't dynamically add additional derivations (ie: build steps not being able to dynamically add additional build steps makes things non-optimal).
I am hopeful that Nix's work on dynamic derivations will improve the situation for nix (with respect to bazel, cargo, and others) over time, and I am hopeful that other build systems will recognize how useful dynamically adding build steps can be.
In any case, the tvix devs have definitely understood the assignment on this and are making not only ifd a first class citizen, but also the larger issue of allowing the evaluation step to decompose, and for the decomposed pieces to run in parallel with each other and with builds— and that really is the game-changer, particularly with a cluster-backed build, to be able to start work immediately rather than waiting out a 30-60 second single-threaded eval.
    #if __has_attribute(cold)
    #if __has_attribute(__cold__)
    #  warning "This works too"
    #endif
    static void __attribute__((__cold__))
    foo(void)
    {
        // This works too
    }
[1] https://www.gnu.org/savannah-checkouts/gnu/autoconf/manual/a...
A fairer criticism would be that they don't have the sense to use a saner build system. CMake is a mess but even that is faaaaar saner than autotools, and probably more popular at this point.
This is peak engineering.
I'd like to think of myself as reasonable, so I'll just say that reasonable people may disagree with your assertion that cmake is in any way at all better than autotools.
There is no way in hell anyone reasonable could say that Autotools is better than CMake.
Autotools is terrible, but it's not the worst.
It really smelled of "oh I can do this better", and you rewrite it, and as part of rewriting it you realise oh, this is why the previous solution was complicated. It's because the problem is actually more complex than I thought.
And then of course there's the problem where you need to install on an old release. But the thing you want to install requires a newer cmake (autotools doesn't have this problem because it's self contained). But this is an old system that you cannot upgrade, because the vendor support contract for what the server runs would be invalidated. So now you're down a rabbit hole of trying to get a new version of cmake to build on an unsupported system. Sigh. It's less work to just try to construct `gcc` commands yourself, even for a medium sized project. Either way, this is now your whole day, or whole week.
If only the project had used autotools.
CMake is easy to upgrade. There are binary downloads. You can even install it with pip (although recently the Python people in their usual wisdom have broken that).
The fundamental curse of build systems is that they are inherently complex beasts that hardly anybody has to work with full-time, and so hardly anybody learns them to the necessary level of detail.
The only way out of this is to simplify the problem space. Sometimes for real (by reducing the number of operating systems and CPU architectures that are relevant -- e.g. CMake vs. Autotools) and sometimes by somewhat artificially restricting yourself to a specific niche (e.g. Cargo).
If you only support x64 Linux and at least as new as latest Debian stable, then I don't feel like you should be talking about these things being too complex.
I don't laugh at plumbers for having a van full of obscure tools, when they just needed a wrench to fix my problem.
CMake is easy to upgrade on a modern system, maybe. But that defeats the point: you could just upgrade normally then.
Or maybe, maybe an old Linux x86. But if that's all you were trying to support then what was the point of cmake in the first place.
It was a few years ago now, so I don't remember the scenario, but no it was absolutely not easy to install/upgrade cmake.
You complain about support for 90s compilers, but it's really helpful when you're trying to install on something obscure. Almost always autotools just works. Cmake, if it's not a Linux or Mac, good luck.
Comment is too old to edit, now.
Having done a deep dive into CMake I actually kinda like it (really modern cmake is actually very nice, except the DSL but that probably isn't changing any time soon), but that is also the problem: I had to do a deep dive into learning it.
In software you sometimes have to have the courage to reject doing what others do, especially if they're only doing it because of others.
Meh, I used to keep printed copies of autotools manuals. I sympathize with all of these people and acknowledge they are likely the sane ones.
That's what you get for wanting to use a glib function.
Simple projects: just use plain C. This is dwm, the window manager that spawned a thousand forks. No ./configure in sight: <https://git.suckless.org/dwm/files.html>
If you run into platform-specific stuff, just write a ./configure in simple and plain shell: <https://git.suckless.org/utmp/file/configure.html>. Even if you keep adding more stuff, it shouldn't take more than 100ms.
If you're doing something really complex (like say, writing a compiler), take the approach from Plan 9 / Go. Make a conditionally included header file that takes care of platform differences for you. Check the $GOARCH/u.h files here:
<https://go.googlesource.com/go/+/refs/heads/release-branch.g...>
(There are also some simple OS-specific checks: <https://go.googlesource.com/go/+/refs/heads/release-branch.g...>)
This is the reference Go compiler; it can target any platform, from any host (modulo CGO); later versions are also self-hosting and reproducible.
Even plain C is easier.
You can have a whole file be for OpenBSD, to work around that some standard library parts have different types on different platforms.
So now you need one file for all platforms and architectures where Timeval.Usec is int32, and another file for where it is int64. And you need to enumerate in your code all GOOS/GOARCH combinations that Go supports or will ever support.
You need a file for Linux 32 bit ARM (int32/int32), one for Linux 64 bit ARM (int64/int64), one for OpenBSD 32 bit ARM (int64/int32), etc. Maybe you can group them, but this is just one difference, so in the end you'll have to do one file per combination of OS and Arch. And all you wanted was pluggable "what's a Timeval?". Something that all build systems solved a long time ago.
And then maybe the next release of OpenBSD they've changed it, so now you cannot use Go's way to write portable code at all.
So between autotools, cmake, and the Go method, the Go method is by far the worst option for writing portable code.
> So now you need one file for all platforms and architectures where Timeval.Usec is int32, and another file for where it is int64. And you need to enumerate in your code all GOOS/GOARCH combinations that Go supports or will ever support.
I assume you mean [syscall.Timeval]?
    $ go doc syscall
    [...]
    Package syscall contains an interface to the low-level operating system
    primitives. The details vary depending on the underlying system [...].
But not only is syscall an example of portability done wrong for APIs, as I said it's also an example of it being implemented in a dumb way causing needless work and breakage.
Syscall as implementation leads by bad example because it's the only method Go supports.
Checking for GOARCH+GOOS tuple equality for portable code is a known anti pattern, for reasons I've said and other ones, that Go still decided to go with.
But yeah, autotools scripts often check for way more things than actually matter. Often because people copy paste configure.ac from another project without trimming.
Who is that for? Someone fuzztesting the kernel? You know what, if you're fuzztesting the kernel then maybe you can implement this yourself, instead of forcing needless unportability onto everyone who is not fuzztesting the kernel.
And when I say exceedingly lazy, I mean the comment in the offending file saying "// THIS FILE IS GENERATED BY THE COMMAND AT THE TOP; DO NOT EDIT".
Of course you could ask why I even need syscall.Select. One example is that I needed to check if a read() would block before reading. Shouldn't I instead use goroutines and a synchronous read? Maybe. Sometimes. But the file descriptor could have come from a library, and the read is in a callback, and leaving a pending read after returning from the callback could be undefined or a race condition.
Ok, so wrap it with os.NewFile, set a read deadline, try to read, then set it back. But "if the file descriptor is in non-blocking mode, NewFile will attempt to return a pollable File (one for which the SetDeadline methods work)". And it seems that NewFile "takes ownership" of the fd, closing it when the finalizer runs.
I guess I could Dup() it first, and handle all the edge cases to prevent fd leaks.
Dude, I just want to call select(). Not rely on if it's in non-blocking mode, and fight os.File.
Software with custom configure scripts is especially dreaded amongst packagers.
Because a different distribution is a different operating system. Of course, not all distributions are completely different and you don't necessarily need to make a package for any particular distribution at all. Loads of software runs just fine being extracted into a directory somewhere. That said, you absolutely can use packages for older versions of a distribution in later versions of the same distribution in many cases, same as with Windows.
> And to the boot, you don't have to go through some Microsoft-approved package distribution platform and its approval process: you can, of course, but you don't have to, you can distribute your software by yourself.
This is the same with any Linux distribution I've ever used. It would be a lot of work for a Linux distribution to force you to use some approved distribution platform even if it wanted to.
Do you speak from experience or from anecdotes?
But consider FreeBSD. Contrary to Linux, it is a full, standalone operating system, just like Windows or macOS. It has pretty decent compatibility guarantees for each major release (~5 years of support). It also has an even more liberal license (it boils down to "do as you wish but give us credit").
Consider macOS. Apple keeps supporting 7-year-old hardware with new releases, and even after that keeps the security patches flowing for a while. Yet still, they regularly cull backwards compatibility to keep moving forward (e.g. ending support for 32-bit Intel executables to pave the way for Arm64).
Windows is the outlier here. Microsoft is putting insane amounts of effort into maintaining backwards compatibility, and they are able to do so only because of their unique market position.
Is there any such previous work?
Also, you shouldn’t need to run ./configure every time you run make.
Most checks are common, so what can help is having a shared cache for all configure scripts so if you have 400 packages to rebuild, it doesn't check 400 times if you should use flock or fcntl. This approach is described here: https://jmmv.dev/2022/06/autoconf-caching.html
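The general shape of a shared check cache, as a hedged Python sketch. This is not taken from the linked post and is not autoconf's own cache format: the JSON file, the `SharedCheckCache` class, and the check name are all assumptions made for illustration.

```python
import json
import os

class SharedCheckCache:
    """Memoize check results in one file shared by many package builds."""

    def __init__(self, path):
        self.path = path
        self.misses = 0
        self.results = {}
        if os.path.exists(path):
            with open(path) as f:
                self.results = json.load(f)

    def check(self, name, probe):
        """Return the cached answer, running the probe only on a miss."""
        if name not in self.results:
            self.misses += 1
            self.results[name] = probe()
        return self.results[name]

    def save(self):
        with open(self.path, "w") as f:
            json.dump(self.results, f)
```

The first package pays for each probe once; the other 399 read the answer back. The cache is only valid for the host and toolchain it was built on, so a real version would need to key on those as well.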
It doesn't help that autoconf is basically abandonware, with one forlorn maintainer trying to resuscitate it, but creating major regressions with new releases: https://lwn.net/Articles/834682/
A far too common tragedy of our age.
Furthermore, there really has to be a better way to do what autotools is doing, no? Sure, there are some situations where you only have some bare sh shell and nothing much else but I'd venture to say that in no less than 90% of all cases you can very easily have much more stuff installed -- like the `just` task runner tool, for example, that solves most of the problems that `make` usually did.
If we are talking in terms of our age, we also have to take into account that there's too much software everywhere! I believe some convergence has to start happening. There is such a thing as too much freedom. We are dispersing so much creative energy for almost no benefit of humankind...
But I wanted the blog post sized version to be simpler for exposition.
Thinking in terms of technological problems, that should be a 100% solved problem at this point! Build a DAG of all tasks and just solve it and invoke stuff, right? Well, not exactly. A lot of the build systems don't allow you to specify if something is safe to execute in parallel. And that's important because sometimes even though it seems three separate tasks are completely independent and can be executed in parallel they'd still share f.ex. a filesystem-level cache and would compete for it and likely corrupt it, so... not so fast. :(
But I feel the entire tech sector is collectively dragging their feet on this. We should be able to devise a good build system that allows us to encode more constraints and requirements than the current ones, and then simply build a single solver for its DAG and be done with it. How frakkin difficult can that be?
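To make the "encode more constraints" idea concrete, here is a rough sketch of what I mean, not a real build system. It assumes each task declares its dependencies plus the resources it needs exclusive access to (the task names and the "cache" resource are made up for illustration). It uses Python's standard `graphlib.TopologicalSorter` and groups tasks into waves: tasks in the same wave could run in parallel, and two tasks that claim the same resource never share a wave.

```python
from graphlib import TopologicalSorter


def schedule(deps, resources):
    """Group tasks into waves that could each run in parallel.

    deps:      task -> set of tasks it depends on
    resources: task -> set of resource names it needs exclusively
               (hypothetical declaration; tasks not listed need none)
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves = []
    ready = []
    while sorter.is_active():
        # Tasks whose dependencies are all done, plus tasks deferred earlier.
        ready.extend(sorter.get_ready())
        ready.sort()  # keep the result deterministic
        wave, held, deferred = [], set(), []
        for task in ready:
            needs = resources.get(task, set())
            if held & needs:
                # Another task in this wave already holds the resource.
                deferred.append(task)
            else:
                wave.append(task)
                held |= needs
        waves.append(wave)
        sorter.done(*wave)
        ready = deferred
    return waves


if __name__ == "__main__":
    deps = {
        "compile_a": set(),
        "compile_b": set(),
        "compile_c": set(),
        "link": {"compile_a", "compile_b", "compile_c"},
    }
    # compile_a and compile_b share a filesystem-level cache.
    resources = {"compile_a": {"cache"}, "compile_b": {"cache"}}
    for wave in schedule(deps, resources):
        print(wave)
```

A real scheduler would start a deferred task the moment its resource frees up instead of waiting for the whole wave, but the point stands: once the constraint is declared, the solver part is small.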
Of course the problem is actually social, not technical. Meaning that most programmers wouldn't ever migrate if the decision was left to them. I still would not care about them though; if I had the free time and energy then I would absolutely work on it. It's something that might swallow a lot of energy but if solved properly (meaning it has to be reasonably extensive without falling into the trap that no two projects would even use the same implementation -- let us NOT do that!) then it will be solved only once and never again.
We can dream.
But again, it very much does seem like a very solvable problem to me.
And of course most of the time you don't need to rerun configure at all, just make.
  $ git bisect good
  Bisecting: 7 revisions left to test after this (roughly 3 steps)
  restored cached configuration for 2f8679c346a88c07b81ea8e9854c71dae2ade167
  [2f8679c346a88c07b81ea8e9854c71dae2ade167] expander: noexpand mechanism.
I primed the cache by executing a "git checkout" for each of a range of commits.
Going forward, it will populate itself.
This is the only issue I would conceivably care about with regard to configure performance. When not navigating in git history, I do not often run configure.
Downstream distros do not care; they keep their machines and cores busy by building multiple packages in parallel.
It's not ideal because the cache from one host is not applicable to another; you can't port it. I could write an intelligent script to populate it, which basically identifies commits (within some specified range) that have touched the config system, and then assumes that for all in-between commits, it's the same.
The hook could do this. When it notices that the current sha doesn't have a cached configuration, it could search backwards through history for the most recent commit which does have it. If the configure script (or something influencing it) has not been touched since that commit, then its cached material can be populated for all in-between commits right through the current one. That would take care of large swaths of commits in a typical bisect session.
For instance, if the only input to the configuration system is the body of the configure script, then we hash that. That is then our key to the generated materials.
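A minimal sketch of that content-keyed variant, in Python instead of the hook I actually use. The function names, the cache layout, and the choice of which generated files to save (`config.h` in the test below) are all assumptions for illustration. Since the key depends only on file contents, every commit with an identical configure script hits the same cache entry, so no search through history is needed.

```python
import hashlib
import shutil
from pathlib import Path


def config_key(inputs):
    """Hash the contents of every file that influences configuration."""
    digest = hashlib.sha256()
    for path in sorted(Path(p) for p in inputs):
        digest.update(path.name.encode())
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()


def store(cache_dir, key, outputs, src):
    """Save the generated files from src under cache_dir/key."""
    entry = Path(cache_dir) / key
    entry.mkdir(parents=True, exist_ok=True)
    for name in outputs:
        shutil.copy2(Path(src) / name, entry / name)


def restore(cache_dir, key, outputs, dest):
    """Copy cached files into dest; return True on a cache hit."""
    entry = Path(cache_dir) / key
    if not entry.is_dir():
        return False
    for name in outputs:
        shutil.copy2(entry / name, Path(dest) / name)
    return True
```

The key here does not include anything about the host, so this cache is no more portable between machines than the per-commit one.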
epistasis•6mo ago
It's likely that C will continue to be used by everyone for decades to come, but I know that I'll personally never start a new project in C again.
I'm still glad that there's some sort of push to make autotools suck less for legacy projects.
tidwall•6mo ago
monkeyelite•6mo ago
Creating a make file is about 10 lines and is the lowest friction for me to get programming of any environment. Familiarity is part of that.
edoceo•6mo ago
monkeyelite•6mo ago
Autotools is going to check every config from the past 50 years.
charcircuit•6mo ago
No? Most operating systems don't have a separate packager. They have the developer package the application.
monkeyelite•6mo ago
viraptor•6mo ago
But if you don't plan to distribute things widely (or have no deps)... whatever, just do what works for you.
psyclobe•6mo ago
aldanor•6mo ago
yjftsjthsd-h•6mo ago
kouteiheika•6mo ago
It can build a Rust program (build.rs) which builds things that aren't Rust, but that's an entirely different use case (building a non-Rust library to use inside of Rust programs).
crabbone•6mo ago
touisteur•6mo ago
malkia•6mo ago
ahartmetz•6mo ago
malkia•6mo ago
Discovery is the wrong thing to do nowadays.
ahartmetz•6mo ago
At best, I think you could have a system that defers some / most dependency discovery until after configure time, but still aborts the build with "required libfoo >= 0.81.0 not found" if necessary.
And no, you are not going to be able to tell everyone exactly where everything needs to be installed unless it's an internal product.
malkia•6mo ago
JCWasmx86•6mo ago
torarnv•6mo ago
OskarS•6mo ago
tavianator•6mo ago
eqvinox•6mo ago
autoconf is in no way, shape or form an "official" build system associated with C. It is a GNU creation and certainly popular, but not to a "monopoly" degree, and its share is declining. (plain make & meson & cmake being popular alternatives)