Calling a separate (non-cancellable) thread to perform the lookup sounds a like viable solution...
Windows 8 and above also have their own asynchronous DNS API on NON-POSIX land.
In general, what you really want is for the API call to be nonblocking so you’re not forced to burn a thread.
When someone kills off the curl context surely you simply set a suicide flag on the thread and wake it up so it can be joined.
I don’t see why glibc would have to do that inside a call to getaddrinfo. can’t it do that once at library initialization? If it has to react to changes to that file while a process is running, couldn’t it have a separate thread for polling that file for changes, or use inotify for a separate thread to be called when it changes? Swapping in the new config atomically might be problematic, but I would think that is solvable.
Even ignoring the issue mentioned it seems wasteful to open, parse, and close that file repeatedly.
Does anyone have any idea what things they're referring to here?
A better approach would have been to mimic how kernels internally handle signals received during syscalls. Receiving a signal is supposed to cancel the syscall. But from the kernel’s perspective, a syscall implementation is just some code. It can call other functions, acquire locks, wait for conditions, and do anything else you would expect code to do. All of that needs to be cleanly cancelled and unwound to avoid breaking the rest of the system.
So it works like this: when a signal is sent to a thread, a persistent “interrupted” flag is set for that thread. Like with pthread_cancel, this doesn’t immediately interrupt the thread, but only has an effect once the thread calls one of a specific set of functions. For pthread_cancel, that set consists of a bunch of syscalls and other “cancellation points”. For kernel-internal code, it consists of most functions that wait for a condition. The difference is in what happens afterwards. In pthread_cancel’s case, the thread is immediately aborted with only designated cleanups running. In the kernel, the condition-waiting function simply returns an error code. The caller is expected to handle this like any other error code, i.e. by performing any necessary cleanup and then returning the same error code itself. This continues until the entire chain of calls has been unwound. Classic C manual error handling. It’s nothing special, but because interruption works the same way as regular error handling, it‘s more likely to “just work”. Once everything is unwound, the “interrupted” flag is cleared and the original signal can be handled.
(The error code for interruption is usually EINTR, but don’t confuse this with EINTR handling in userspace, which is a mess. The difference is because userspace generally doesn’t want to abort operations upon receiving EINTR, and because from userspace’s perspective there’s no persistent flag.)
pthread_cancel could have been designed the same way: cancellation points return an error code rather than forcibly unwinding. Admittedly, this system might not work quite as well in userspace as it does in kernels. Kernel code already needs to be scrupulous about proper error handling, whereas userspace code often just aborts if a syscall fails. Still, the system would work fine for well-written userspace code, which is more than can be said for pthread_cancel.
Yes I know in Linux you can set the timeout in a config file.
But really the dns setting should be configurable by the calling code. Some code requires fast lookups and doesn’t mind failing which, while others won’t mind waiting longer. It’s not a one size fits all thing.
The solution is to make it a properly non-blocking api.
Or just use some standalone DNS resolve code or library (which basically replicates getaddrinfo but supports this in an async way)?
See also here the discussion: https://github.com/crystal-lang/crystal/issues/13619
pizlonator•1h ago
I've always thought APIs like `pthread_cancel` are too nasty to use. Glad to see well documented evidence of my crank opinion
pengaru•39m ago
Imagine cpu-bound worker threads that do nothing but consume work via condition variables and spend long periods of time in hot compute-only loops working on said work... Instead of adding a conditional in the compute you're probably not interested in slowing down at all, you turn on async cancellation and pthread_cancel() the workers when you need to interrupt what's going on.
But it's worth noting pthread_cancel() is also rarely supported anywhere outside first-class pthreads-capable systems like modern linux. So if you have any intention of running elsewhere, forget about it. Thread cancellation support in general is actually somewhat rare IME.