Path Isn't Real on Linux

https://blog.danielh.cc/blog/path

63•max__dev•3h ago

Comments

taraindara•3h ago

This actually helps explain some behaviors I’ve encountered. It was never a serious issue, since the answer is to use a full path. But it is slightly annoying nonetheless. Understanding helps a lot.

inlets•3h ago

Why would the author think that the PATH environment variable is being used by the kernel? What an odd assumption.

quotemstr•2h ago

Well, execve(2) and execvp(3) are both "system" functions. C (which is already black magic for some people) invokes both by calling into functions exported from libc. If you're not super dorky^Wfamiliar with low-level systems stuff, you might guess that the two functions are implemented in the same place and in the same way. That the latter is just a libc wrapper around the former that does a PATH search is arcane detail you don't have to care about 99% of the time.

It's hard to appreciate how the world looks before you learn a fact. You can't unsee things.

mynegation•2h ago

You and I and a bunch of other people know it and take it to be self-evident, but someone discovered it (maybe recently, maybe they have known it for a while) and did a nice write-up for people who had not known that yet. https://xkcd.com/1053/

opello•1h ago

The lucky 10,000 is a positive take on the situation. But the article using "real," which I think would connote "legitimate" to most, seems a little more polarizing than sharing a discovery.

mynegation•56m ago

Click bait title for sure

LegionMammal978•1h ago

One thing I was surprised to learn a couple years ago is that users and groups aren't really tracked much by the Linux kernel: they're just numeric IDs that track process and file ownership. So if you setuid() to a user ID that doesn't exist in /etc/passwd or anywhere else, the kernel won't stop you.
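
A minimal Python sketch of the same point, assuming a Unix-like system: the kernel hands back a bare numeric ID, and turning it into a name is a separate user-space database lookup that is allowed to fail. The helper name `user_name_for` is made up for illustration.

```python
import os
import pwd
from typing import Optional


def user_name_for(uid: int) -> Optional[str]:
    """Return the account name for uid, or None if no database entry exists."""
    try:
        # pwd consults the user database (/etc/passwd, NSS, ...) in user space.
        return pwd.getpwuid(uid).pw_name
    except KeyError:
        return None


# st_uid arrives from the kernel as a plain integer; the name is looked up separately.
owner = os.stat(".").st_uid
print(owner, user_name_for(owner))
```

A file owned by a UID with no entry still has a perfectly valid owner as far as the kernel is concerned; only the name lookup comes back empty.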

latchkey•1h ago

If I have a file on machineA with uid10001 and I copy the file to machineB, I might want it to retain that uid, but it shouldn't matter to machineB that it doesn't map to a real user.

MrDarcy•40m ago

You’ll see this observation all the time building containers.

MisterTea•1h ago

Ignorance leading to assumptions. Their eureka moment: "The shell, not the Linux kernel, is responsible for searching for executables in PATH!" makes it obvious they haven't read up on operating systems. Shame because you should know how the machine works to understand what is happening in your computer. I always recommend reading Operating Systems: Three Easy Pieces. https://pages.cs.wisc.edu/~remzi/OSTEP/

quotemstr•39m ago

The thing is, though, that PATH being a userspace concept is a contingent detail, an accident of history, not something inherent to the concept of an operating system. You can imagine a kernel that does path searches. Why not?

There's a difference between something being a certain way because it has to be that way in order to implement the semantics of the system (e.g. interrupt handlers being a privilege transition) and something being a certain way as a result of an arbitrary implementation choice.

OSes differ on these implementation choices all the time. For example,

* in Linux, the kernel is responsible for accepting a list of execve(2) argument-words and passing them to the exec-ed process with word boundaries intact. On Windows, the kernel passes a single string instead and programs chop that string up into argument words in userspace, in libc

* in Linux, the kernel provides a 32-bit system call API for 32-bit programs running on 64-bit kernels; on Windows, the kernel provides only a 64-bit system call API and it's a userspace program that does long-mode switching and system call argument translation

* on Windows, window handles (HWNDs, via user32.dll) and IPC message passing (ALPC, in ntoskrnl) are implemented in the kernel, whereas the corresponding concepts on most Linux systems are pure user-space constructs

And that's not even getting into weirder OSes! Someone familiar with operating systems in general can nevertheless be surprised at how a particular OS chooses to implement this or that feature.

amiga386•4m ago

* in Linux, the kernel is responsible for accepting a list of execve(2) argument-words

Yes it does, but the more surprising thing is (coming from AmigaOS with its dos.library function ReadArgs()) that the shell does this. The shell is also responsible for argument expansion - madness!

On AmigaOS, when you type "delete foo#? force", the shell passes the entire command line to the delete command. The delete command calls ReadArgs() with a template (FILE/M/A, ALL/S, QUIET/S, FORCE/S), and the standard OS function parses it into lists of files, flags, keyword arguments, etc. The "file" passed is "foo#?", and the command uses MatchFirst()/MatchNext() to do file pattern matching.

Every command (that uses ReadArgs() and didn't plump for "standard C" parsing) has the same behaviour: running the command with "?" gives you the template, which tells you how to use it. Args are parsed consistently across all programs.

Then you get "standard C", which because K&R and main(), ignores this standard Amiga parsing function and just does naive splits. Across multiple Amiga C compilers, quoting rules are inconsistent. Amiga C compilers have to produce an executable, and it knows it'll be called with a full command line, so the executable itself has to break that into words before it can call main(), and it's up to each compiler writer how they're going to do that. Urgh.

In unix-land, it's up to the shell to parse the command line, and pass only the words... hence why the shell naturally does all the filename globbing, and why you have gotchas like when these two commands are sometimes the same and sometimes they're not:

    find . -name foo*
    find . -name 'foo*'

Then we have Windows, which is like Amiga C programs - it's being passed a full command string and will have its C runtime parse it for main() to consume. There's a vague expectation that it'll do quoting "like COMMAND", which itself has very odd quoting rules. At least most people use the same C compiler on Windows, so it's mostly only MSVCRT's implementation, and therefore mostly consistent.
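
The `find` gotcha above can be reproduced with a toy model of what a Unix shell does before exec. This is only a sketch under simplifying assumptions: the function name `expand_words` is hypothetical, only whole-word quoting is handled, patterns are matched against names in one directory, and a pattern with no matches is passed through literally (the default in POSIX shells).

```python
import fnmatch
import os
import shlex
from typing import List

GLOB_CHARS = set("*?[")


def expand_words(command_line: str, cwd: str) -> List[str]:
    """Split a command line into argv and glob unquoted words, shell-style."""
    argv: List[str] = []
    # posix=False keeps the quote characters so quoted words can be recognised.
    for token in shlex.split(command_line, posix=False):
        if token[0] in "'\"":
            argv.append(token[1:-1])  # quoted: strip quotes, never glob
            continue
        matches: List[str] = []
        if GLOB_CHARS & set(token):
            # Like the shell, "*" does not match names starting with a dot.
            names = [n for n in os.listdir(cwd) if not n.startswith(".")]
            matches = sorted(fnmatch.filter(names, token))
        # No match: the pattern itself is passed on unchanged.
        argv.extend(matches or [token])
    return argv
```

In a directory containing `foo1` and `foo2`, the unquoted form hands `find` two separate file names after `-name`, while the quoted form hands it the pattern. In a directory with no matching names, both forms produce the same argv, which is why the two commands are only sometimes equivalent.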

blcknight•2h ago

Path globbing, pipes, redirection, job control (fg/bg), and all shell variables -- not just $PATH -- are all handled by the shell.

The kernel has no idea what the current process' environment $PATH is, and doesn't even parse any process environment variables at all.

thayne•2h ago

PATH isn't just handled by the shell though. Many (but not all!) of the exec* family of functions in libc respect PATH.

opello•1h ago

It seems too far to go to say that because a system library holds some implementation details that the responsibility doesn't lie with the program using them. There are all sorts of complex interdependent details that make those kinds of boundary distinctions difficult in many operating systems.

matheusmoreira•14m ago

On Linux the main boundary between user space and kernel is quite clear: the system call layer. It is stable and well documented. System libraries like glibc are not part of the kernel, they are just components that can be replaced.

I wrote an article about it:

https://www.matheusmoreira.com/articles/linux-system-calls

kccqzy•32m ago

There is also paths.h usually located at /usr/include/paths.h. It contains the default PATH macro _PATH_DEFPATH.

matheusmoreira•19m ago

Those functions aren't the real system calls provided by Linux, they're just glibc wrappers with added functionality. Linux kernel execve has absolutely no concept of PATH, it just opens the file at the provided pathname. That's a good thing too, user space might want to customize that stuff.
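
A small Python experiment along these lines, assuming a Unix-like system with a `true` binary somewhere on PATH. CPython's `os.execve` passes the pathname straight to the system call, while `os.execvp` does the PATH search in user space first. The helper name `exec_true_exit_status` is made up for this sketch.

```python
import os
import tempfile


def exec_true_exit_status(use_path_search: bool) -> int:
    """Fork, then exec 'true' by bare name from an empty directory.

    Returns the child's exit status: 0 if the exec worked,
    127 if the file was not found.
    """
    with tempfile.TemporaryDirectory() as empty_dir:
        pid = os.fork()
        if pid == 0:
            code = 126  # any failure other than "not found"
            try:
                os.chdir(empty_dir)
                if use_path_search:
                    os.execvp("true", ["true"])  # searches PATH in user space
                else:
                    # No search: "true" is resolved relative to the cwd.
                    os.execve("true", ["true"], os.environ)
            except FileNotFoundError:
                code = 127
            finally:
                os._exit(code)  # only reached when the exec failed
        _, status = os.waitpid(pid, 0)
        return os.WEXITSTATUS(status)


print("execve:", exec_true_exit_status(False))
print("execvp:", exec_true_exit_status(True))
```

The bare-name `execve` fails with ENOENT because there is no file called `true` in the current directory, even though PATH is present in the environment that was passed along.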

drougge•2h ago

Using "#!sh" at the top of the file does work, but not predictably. It may execute sh in your current directory, which is what Linux does, but your shell may override that (zsh does if the first attempt fails). So it works, but not the way you want it to.

And I'm sure other kernels do other things too.

wpollock•2h ago

Why would strace cat be useful here? By the time cat runs, it was obviously already found.

It is basic knowledge that PATH is used by a command interpreter to locate the pathname of binaries. This is true for Windows' cmd.exe as well. I never heard of a system where locating files for execution was performed by a kernel.

MathMonkeyMan•24m ago

In the [exec][1] family of POSIX functions, if the command path doesn't contain a slash, then it's looked up in the PATH.

> If the file argument contains a slash character, the file argument shall be used as the pathname for this file. Otherwise, the path prefix for this file is obtained by a search of the directories passed as the environment variable PATH [...]

[1]: https://pubs.opengroup.org/onlinepubs/009695399/functions/ex...
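
The quoted rule is short enough to sketch in Python. This is an illustration of the lookup only, not how any particular libc or shell implements it; the function name `find_executable` is hypothetical, and an empty PATH entry is treated as the current directory, which POSIX describes as a legacy feature.

```python
import os
from typing import Optional


def find_executable(file: str, path: str) -> Optional[str]:
    """Resolve `file` the way the exec*p functions are specified to."""
    if "/" in file:
        return file  # contains a slash: used as the pathname, no search
    for directory in path.split(":"):
        # A zero-length entry means the current working directory.
        candidate = os.path.join(directory or ".", file)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


print(find_executable("cat", os.environ.get("PATH", "")))
```

Each failed candidate corresponds to one of the ENOENT lines you would see when tracing a shell doing the same search.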

V99•18m ago

True... `strace bash -c cat` would give more of the series of stat calls they're intending to see:

    newfstatat(AT_FDCWD, ".", {st_mode=S_IFDIR|0700, st_size=4096, ...}, 0) = 0
    newfstatat(AT_FDCWD, "/usr/local/sbin/cat", 0x7fffcec2f3b8, 0) = -1 ENOENT (No such file or directory)
    newfstatat(AT_FDCWD, "/usr/local/bin/cat", 0x7fffcec2f3b8, 0) = -1 ENOENT (No such file or directory)
    newfstatat(AT_FDCWD, "/usr/sbin/cat", 0x7fffcec2f3b8, 0) = -1 ENOENT (No such file or directory)
    newfstatat(AT_FDCWD, "/usr/bin/cat", {st_mode=S_IFREG|0755, st_size=68536, ...}, 0) = 0

cryptonector•2h ago

It's real, it's just implemented by the shell -- same as all Unix-like operating systems. Heck, same as Windows.

SuperNinKenDo•1h ago

I was trying to understand what the lede was here, and it turns out the author assumed that PATH was something understood by the kernel, which is rather an odd assumption, but perhaps one that others make.

I did get one thing out of this though. I had honestly wondered for the longest time why we need to call env to get the same functionality as PATH in a shebang.

Ironically, thanks to either an article I read here (or on the crustacean site) recently, I already knew that the shebang is something which is parsed by the kernel, but had not put two and two together at all.

Much like the author. So goes to show the benefits of exploring and thinking about seemingly "obvious" concepts.
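
To make the env point concrete, here is a simplified Python model of how Linux reads a shebang line: the interpreter is taken as a literal pathname with no PATH search, and everything after it is passed as a single argument. The function name `parse_shebang` is made up, and details such as line-length limits are ignored.

```python
from typing import Optional, Tuple


def parse_shebang(first_line: bytes) -> Optional[Tuple[str, Optional[str]]]:
    """Return (interpreter, optional_single_argument), or None if no shebang."""
    if not first_line.startswith(b"#!"):
        return None
    rest = first_line[2:].split(b"\n", 1)[0].strip(b" \t")
    if not rest:
        return None
    parts = rest.split(None, 1)
    interpreter = parts[0].decode()
    # Everything after the interpreter stays together as ONE argument.
    argument = parts[1].strip().decode() if len(parts) > 1 else None
    return interpreter, argument


print(parse_shebang(b"#!/usr/bin/env python3\n"))
print(parse_shebang(b"#!sh\n"))
```

In this model `#!/usr/bin/env python3` yields the absolute path `/usr/bin/env` plus the argument `python3`, and it is env, a user-space program, that then searches PATH. `#!sh` yields the bare name `sh`, which is opened like any other relative pathname.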

khrbtxyz•54m ago

Another bit of trivia about the shebang support in Linux is that it is possible to build the kernel without it. https://github.com/torvalds/linux/blob/master/fs/Kconfig.bin...

  config BINFMT_SCRIPT
  tristate "Kernel support for scripts starting with #!"
  default y
  help
    Say Y here if you want to execute interpreted scripts starting with
    #! followed by the path to an interpreter.

dfedbeef•1h ago

It's real in GNU/Linux tho...

dfedbeef•1h ago

legitimately, if you're interested try writing a shell, your own libc, an elf loader even. It's fun! C is good and cool!

bawolff•1h ago

Doesn't that go without saying?

mzajc•47m ago

Fun fact: if you've ever had bash (or another shell) complain that a file doesn't exist, even though it's on $PATH, check if it's been cached by `hash`. If the file is moved elsewhere on $PATH and bash has the old path cached, you will get an ENOENT. The entire cache can be invalidated with `hash -r`.
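
A toy Python model of that failure mode, to show why a stale entry produces ENOENT. It is not bash's actual implementation; the class name `CommandCache` and its methods are hypothetical, with `reset()` standing in for `hash -r`.

```python
import os
from typing import Dict, Optional


class CommandCache:
    """Remember where a command was found so PATH is not searched every time."""

    def __init__(self) -> None:
        self._cache: Dict[str, str] = {}

    def lookup(self, name: str, path: str) -> Optional[str]:
        if name in self._cache:
            # Returned without re-checking the filesystem: may be stale.
            return self._cache[name]
        for directory in path.split(":"):
            candidate = os.path.join(directory or ".", name)
            if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
                self._cache[name] = candidate
                return candidate
        return None

    def reset(self) -> None:
        """Forget every remembered location (the analogue of `hash -r`)."""
        self._cache.clear()
```

If a command is looked up once and then moved to another directory on PATH, the next `lookup()` still returns the old location, and trying to execute that path fails even though a fresh search would succeed.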

JohnMakin•35m ago

you just solved a bug I couldn't explain like 6 years ago

noman-land•30m ago

Wtf. TIL about hash.

bobbylarrybobby•19m ago

Ah, so that's where sudo texhash -r comes from when installing a latex package!

vlovich123•14m ago

Is this an old behavior? I would think ENOENT would invalidate the cache entry at least.

Tsiklon•35m ago

Silly tangentially related question; I like to think of myself as fairly competent in the Linux and unix world.

In the unix systems of the past was it easier to hold a more complete understanding of the system and its components in your head?

0xbadcafebee•18m ago

The title is nonsense. PATH is the name of an environment variable (a Real Thing(TM)) which lists a set of directories to search for an executable. It is used by shells (including those running on Linux) to locate an executable when the full path to the executable is not supplied by the user. This is needed because the exece()/execve() kernel system call is unaware of things like environment variables so it will not have any idea how or where to execute a program 'cat' unless it is given the full path to 'cat', so the shell has to look it up (again if the user doesn't pass the full path).

It's the same on every POSIX system and the original UNIXes. It's been this way for at least 50 years. (edit 60 years, it's from Multics. https://en.wikipedia.org/wiki/PATH_(variable))

Kids today really need to learn the fundamentals of computer operating systems.

i140i485i765•8m ago

nobody talks about vfs path resolution?

i140i485i765•7m ago

Nobody talks about vfs path resolution here? There are too many layers in the whole process, even the path from strace can be resolved to another path.

megous•3m ago

Accessing environment variables from the kernel space isn't even all that easy, because the information lives in userspace in process VM. Here's how it's done for the purpose of showing it in `/proc/[pid]/environ`:

https://elixir.bootlin.com/linux/v6.14.4/source/fs/proc/base...
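
From the user-space side, that file is just the process's environment block as NUL-separated `KEY=value` entries. A short Python sketch of reading it, assuming Linux with /proc mounted; the function names `parse_environ` and `read_environ` are made up for illustration.

```python
from typing import Dict


def parse_environ(raw: bytes) -> Dict[str, str]:
    """Parse the NUL-separated KEY=value block used by /proc/[pid]/environ."""
    env: Dict[str, str] = {}
    for entry in raw.split(b"\0"):
        if b"=" in entry:
            # Split on the first "=" only; values may contain "=" themselves.
            key, _, value = entry.partition(b"=")
            env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env


def read_environ(pid: str = "self") -> Dict[str, str]:
    """Read a process's environment block via procfs (Linux only)."""
    with open(f"/proc/{pid}/environ", "rb") as f:
        return parse_environ(f.read())
```

Calling `read_environ().get("PATH")` returns the PATH the process was started with. To the kernel it is one opaque string among many that it copies out of the process's memory on request.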
