Well, same here. If you're using strtol on data that must be well-formed, and it's not, and you're at an earlier stage of startup like parsing the config file, go ahead and blow up. That's almost certainly better than plowing ahead with invalid data.
Like how if I hit ctrl-C at just the right moment during the build, the next time I build, cmake will segfault and I have to delete the build directory.
The fact that the strtol example in the manual is 50 lines long, of which most is error handling, speaks for itself. https://man7.org/linux/man-pages/man3/strtol.3.html#:~:text=...
That being said, I can't imagine many applications where crashing is a good solution.
> 50 lines long, of which most is error handling
That's a little exaggerated considering the example is an entire C program, including main. It's more like 14 lines, ignoring the '\0' check (which isn't necessary if you don't want to parse additional items), and even that includes whitespace and stderr logging.
I agree the biggest headache is reliance on errno and the nuanced--albeit conventional, consistent[1], and documented--semantics of only setting errno on error. Some POSIX APIs, at least those defined from scratch (e.g. pthreads, as opposed to incorporated common extensions) prefer returning the error code directly as you recommend. But this would technically only save you a single line, though it might save alot of confusion about semantics.
However, in this case I might keep returning the value directly and take an error pointer, similar to OpenBSD's strtonum. Otherwise you would need separate routines for char, short, int, long, and long long. Though there's still the issue of unsigned integers, and using _Generic it might be possible to at least hide all the type-specific variants behind a single interface.
And there's still the issue of needing to check for trailing garbage, or dropping the ability to use the interface inline with other parsing code. There's alot of dimensions to the problem. Parsing integers may be conceptually simple[2], but designing an interface that's easy to integrate into applications across a variety of contexts is much less simple.
[1] Consistent across libc interfaces, excepting those that may invoke syscalls internally, like printf, where errno may be incidentally modified even on success.
[2] I personally often prefer to just write a simple loop to manually parse integers. It's sometimes easier to integrate the desired error checking inline, or even elide it altogether (garbage-in, garbage-out).
From that man page:
> The implementation may also set errno to EINVAL in case no conversion was performed (no digits seen, and 0 returned).
There are error cases where the implementation may set errno to EINVAL. There's not even a guarantee. I did a quick test. errno is only set if you pass an invalid base, or if the string does contain a number but it is out of range. If you pass a string which doesn't even remotely look like a number, errno is not set. You have to check endptr.
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void)
{
const char *s = "this is not a number";
printf("strtol(\"%s\")\n", s);
errno = 0;
long a = strtol(s, NULL, 10);
printf("errno: %d; strerror: %s\n", errno, strerror(errno));
printf("result: %d\n", a);
return 0;
}
Output: strtol("this is not a number")
errno: 0; strerror: Success
result: 0
https://godbolt.org/z/TbEKss8zMThat's just rubbish design. In what is in my opinion the most common error case, namely that the string doesn't represent a number while you expect that it does, errno is not set.
#include <assert.h>
#include <ctype.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
bool copy_str_to_long_long(const char* str, long long* ptr) {
while (isspace(*str))
++str;
if (*str == 0)
return false;
char* parsed = NULL;
long long value = strtoll(str, &parsed, 0);
while (isspace(*parsed))
++parsed;
if (*parsed != 0)
return false;
*ptr = value;
return true;
}
long long str_to_long_long(const char* str) {
long long result;
assert(copy_str_to_long_long(str, &result));
return result;
}
bool copy_str_to_long(const char* str, long* ptr) {
long long buffer;
if (!copy_str_to_long_long(str, &buffer) || buffer < LONG_MIN || buffer > LONG_MAX)
return false;
*ptr = buffer;
return true;
}
long str_to_long(const char* str) {
long result;
assert(copy_str_to_long(str, &result));
return result;
}
/*
...define copy_str_to_short, copy_str_to_int32, etc...
*/
Oh, you mean like:
int ret = sscanf (str, "%d", &value);
?
> Use of the numeric conversion specifiers produces Undefined Behavior for invalid input. See C11 7.21.6.2/10 ⟨https://port70.net/%7Ensz/c/c11/n1570.html#7.21.6.2p10⟩. This is a bug in the ISO C standard, and not an inherent design issue with the API. However, current implementations are not safe from that bug, so it is not recommended to use them. Instead, programs should use functions such as strtol(3) to parse numeric input. This manual page deprecates use of the numeric conversion specifiers until they are fixed by ISO C.
As the other poster points out, the bug is in the spec, and I'd be astonished if the library function itself actually misbehaves with any given input.
const char **cpp; char *p;
const char c = ’A’;
cpp = &p; // constraint violation
*cpp = &c; // valid
*p = 0; // valid
The first assignment is unsafe because it would allow the
following valid code to attempt to change the value of the
const object c.
There are proposals on the table for C2y to redefine various APIs, including strtol, strchr, memcpy, etc, to preserve const correctness. Implementations might make use of _Generic (there are some issues there, though), newly specified language features, or possibly use internal extensions not available in the language proper, to accomplish this.If you do this, and then find your thing crashes in production, then that means you didn't test it well enough.
Before someone mentions safety critical systems, consider the fact that the first apollo landing's computer crashed when it detected it was violating realtime bounds ("alarm 1201" means the OS detected CPU exhaustion + rebooted without clearing process state). At that point, it went into a reboot loop, and got close enough to the surface for Neil Armstrong to nudge the lander to a safe landing.
But there are many systems today that take a long time to restart so you can’t just abort if you have a chance to recover.
errno is only set on an error. It's not cleared on success. If errno was previously set, the function will always abort(). So you need to do something like:
int saved_errno = errno;
errno = 0;
....
errno = saved_errno;
return aInt;
Functions that clobber errno to zero make it impossible to call several functions and use a nonzero value of errno to conclude that one or more of them went wrong.
If you don't think that such code is a good idea (for instance because you believe that every function that can fail should be checked for its specific error), then you probably have nothing against functions which clobber errno to zero.
[1]: https://github.com/gcc-mirror/gcc/blob/master/libiberty/strt...
> If an overflow is detected, it calls abort()
An aside, but this doesn't detect overflows on Windows due to both long and int being 32 bits (you'd want strtoll for that).
https://en.wikipedia.org/wiki/64-bit_computing#64-bit_data_m...
You need to use "long long" or "intmax_t" (and matching strtoll/strtoimax)
rept0id-2•8h ago
If an overflow is detected, it calls abort(), you crash and get Aborted (core dumped).
This way, you prioritize memory safety over silently running in a wrong state, without needing to call strtol and check errors manually everywhere.