I mean maybe this has been optimized for already and I don't know what I'm talking about but maybe someone with more knowledge about the kernel knows? Is this something we simply can't optimize for because of security implications?
Editing to add: this deduplication is one of the greatest upsides to dynamic linking. Common libs like libgcc and libc only have to exist in memory once and can stay in CPU caches, whereas if they were statically linked into every binary, each binary would have a copy of that library that wouldn't be shared with anything else and you'd waste a lot of memory.
Unices have been sharing executable memory between processes for longer than there's been an mmap that lets user space do the same thing itself. I remember seeing it in the 2BSD kernel, for instance.
It might be a commonly held convention, and thus an assumption, in Linux (and, broadly, UNIX), but I don't think it's true inside VAX or even Windows, so I don't think it's a requirement.
Unless I've missed something (which is totally possible; this is not an area of OS design I've spent much time on).
In fact, if you profile it, in the fork() + execve() model, execve() is far more expensive, because not only does it replace the old process with a new one, but it also involves running the dynamic linker, which opens, parses, and mmaps library files.
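If you want to check this on your own machine, here's a rough sketch of how one might time it. Assumptions on my part: a Unix-like system, a `true` binary on the PATH as the trivial program to exec, and the helper name `time_spawn`, which is made up for this example. It compares fork() + wait against fork() + execv() + wait, so the difference between the two is roughly the exec and dynamic-linking cost.

```python
import os
import shutil
import time

# Assumption: a trivial `true` binary exists; fall back to /bin/true.
TRUE_PATH = shutil.which("true") or "/bin/true"

def time_spawn(use_exec: bool, runs: int = 50) -> float:
    """Return mean wall-clock seconds per child process.

    use_exec=False: the child exits right after fork().
    use_exec=True:  the child replaces itself with TRUE_PATH via execv().
    """
    start = time.perf_counter()
    for _ in range(runs):
        pid = os.fork()
        if pid == 0:
            # Child process.
            if use_exec:
                try:
                    os.execv(TRUE_PATH, [TRUE_PATH])
                finally:
                    # Only reached if execv() failed.
                    os._exit(127)
            os._exit(0)
        # Parent: wait so each run is fully accounted for.
        os.waitpid(pid, 0)
    return (time.perf_counter() - start) / runs
```

Call `time_spawn(False)` and `time_spawn(True)` and compare. I'm deliberately not quoting numbers since they depend heavily on the machine, the libc, and how big the parent process is.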
It still makes sense to get rid of the fork() overhead if you're going to throw away the cloned process state soon thereafter, but if you wanted to make process execution radically faster, rethinking the exec architecture would probably offer more significant gains.
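For what "skipping the fork" can look like in practice, here's a minimal sketch using `posix_spawn`, which asks for a new process running a given program in one call. Assumptions: Python 3.9+ on a POSIX system, and `spawn_and_wait` is a helper name I made up. Note this doesn't remove the exec and dynamic-linking cost, only the explicit fork step in user code.

```python
import os
import shutil

def spawn_and_wait(argv):
    """Start argv[0] with posix_spawn and return its exit code."""
    path = shutil.which(argv[0])
    if path is None:
        raise FileNotFoundError(argv[0])
    # One call creates the child and loads the new program;
    # there is no user-visible copy of the parent in between.
    pid = os.posix_spawn(path, argv, os.environ)
    _, status = os.waitpid(pid, 0)
    return os.waitstatus_to_exitcode(status)
```

For example, `spawn_and_wait(["true"])` should return 0 on a typical system.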
This is just one example of I don't even know how many things a modern-day process inherits from its parent.
By "complicated" I do not even remotely mean "unsolvable". I just mean that if you really dig down into what it means to "share nothing" in a modern operating system, it's a lot richer than it was back when fork+exec was a practical solution. There are a lot of fuzzy things that could go either way when you say "shares nothing".
Windows, for all its many, many faults, did not use fork+exec and instead mostly has options for how one creates a process. It wasn’t done elegantly, but it was the right decision.
It's weird to leave out a mention of copy-on-write - the optimisation that means that you don't copy over all the memory.
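Copy-on-write itself is invisible to the program, but you can observe the semantics it preserves: after fork() the child appears to have its own copy of memory, even though (on kernels that implement fork this way) pages are only duplicated when one side writes to them. A small sketch, with the helper name `fork_and_mutate` and the buffer size being my own choices:

```python
import os

def fork_and_mutate(size: int = 1 << 20):
    """Fork, let the child overwrite one byte, and report both views.

    Returns (parent_first_byte, child_first_byte).
    """
    buf = bytearray(b"p" * size)
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: this write lands in the child's address space only.
        os.close(read_fd)
        buf[0] = ord("c")
        os.write(write_fd, bytes(buf[:1]))
        os._exit(0)
    # Parent: read what the child saw, then check our own buffer.
    os.close(write_fd)
    child_view = os.read(read_fd, 1)
    os.close(read_fd)
    os.waitpid(pid, 0)
    return bytes(buf[:1]), child_view
```

The parent still sees `b"p"` while the child reports `b"c"`, and the 1 MiB buffer never had to be copied wholesale up front for that to work.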
> ABSTRACT
>
> The received wisdom suggests that Unix’s unusual combination of fork() and exec() for process creation was an inspired design. In this paper, we argue that fork was a clever hack for machines and programs of the 1970s that has long outlived its usefulness and is now a liability. We catalog the ways in which fork is a terrible abstraction for the modern programmer to use, describe how it compromises OS implementations, and propose alternatives.
>
> As the designers and implementers of operating systems, we should acknowledge that fork’s continued existence as a first-class OS primitive holds back systems research, and deprecate it. As educators, we should teach fork as a historical artifact, and not the first process creation mechanism students encounter.
Every couple of years, someone claims they have "the solution" implying everyone else who came before them didn't know what they were doing.
> The kernel keeps track of which file is mapped where, and can detect when a request is made to map an already mapped file again, avoiding physical memory allocation if possible.
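You can at least see the bookkeeping side of this from user space on Linux. The sketch below lists the files currently mapped into a process by reading `/proc/<pid>/maps`. Assumptions: Linux with procfs mounted, and `mapped_files` is a name I made up. It shows which files are mapped, not whether the physical pages are actually shared with other processes.

```python
import os

def mapped_files(pid: int = os.getpid()) -> set:
    """Return the set of file paths mapped into the given process."""
    paths = set()
    with open(f"/proc/{pid}/maps") as maps:
        for line in maps:
            # Fields: address perms offset dev inode [pathname]
            fields = line.split(maxsplit=5)
            if len(fields) == 6 and fields[5].startswith("/"):
                paths.add(fields[5].strip())
    return paths
```

Running it in two unrelated processes will usually turn up the same shared objects (libc and friends) in both lists, which are the mappings the quoted text says the kernel can back with the same physical memory.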
Relevant Stack Overflow answer: https://stackoverflow.com/questions/61950951/linux-shared-li...
In this case too, you think it is silly because you don't understand it. Your assumptions are wrong, making it seem silly.