The potential bugs listed would be prevented by, e.g. "x--" won't compile without explicitly supplying a case for x==0 OR by using some more verbose methods like "decrement_with_wrap".
The trade-off is lack of C-like concise code, but more safe and explicit.
Except that's not quite what unsigned types do. They are not (just) numbers that will always be >= 0, but numbers where the value of `1 - 2` is > 1 and depends on the type. This is not an accident but how these types are intended to behave because what they express is that you want modular arithmetic, not non-negative integers.
> e.g. "x--" won't compile without explicitly supplying a case for x==0
If you want non-negative types (which, again, is not what unsigned types are for) you also run into difficulties with `x - y`. It's not so simple.
There are many useful constraints that you might think it's "better to have a type that reflects that" - what about variables that can only ever be even? - but it's often easier said than done.
I agree they're a bit more error-prone in practice, but I suspect huge part of that is because people are so used to signed numbers. And, legitimately, zero is a more commonly-encountered value... but that can push errors to occur sooner, which is generally a desirable thing.
Alas, (2S * signed_index) / 2S will similarly result in surprises the moment the signed_index hits half the signed-int max. There's no free lunch when trying to cheat the integer ranges.
TBH I've had very little struggle with this at all. As long as you keep your values and types separate, the unsigned type that you got a number from originally feeds just fine into the unsigned type that you send it to next. Needing casting then becomes a very clear sign that you're mixing sources and there be dragons, back up and fix the types or stop using the wrong variable. It's a low-cost early bug detector.
Implicitly casting between integer types though... yeah, that's an absolute freaking nightmare.
And - yes, there are very important use cases for unsigned/modulo-2n/wraparound values. But sizes of data structures are generally _not_ one of those use cases. The fact that the size is non-negative does not mean that the type should be unsigned. You should still be able to, say, subtract sizes and get a signed value which may be negative.
I don’t really get this claim. Indexing should just look up the element corresponding to the value provided. It’s easy to come up with semantics that are intuitive and sound, even if signed integers or ones smaller than size_t are used.
This is especially inconvenient in C, where there exist extremely dangerous legacy implicit casts between signed integers and unsigned integers, which have a great probability of generating incorrect values.
Because the index is typically a signed integer, comparing it with an unsigned limit without using explicit casts is likely to cause bugs. Using explicit casts of smaller unsigned integers towards bigger signed integers results in correct code, but it is cumbersome.
These problems are avoided as said in TFA, by making "sizeof" and the like to have 64-bit signed integer values, instead of unsigned values.
Well chosen implicit conversions are good for a programming language, by reducing unnecessary verbosity, but the implicit integer conversions of C are just wrong and they are by far the worst mistake of C much worse than any other C feature.
Other C features are criticized because they may be misused by inexperienced or careless programmers, but most of the implicit integer conversions are just incorrect. There is no way of using them correctly. Only the conversions from a smaller signed integer to a bigger signed integer are correct.
Mixed signedness conversions have always been wrong and the conversions between unsigned integers have been made wrong by the change in the C standard that has decided that the unsigned integers are integer residues modulo 2^N and they are not non-negative integers.
For modular integers, the only correct conversions are from bigger numbers to smaller numbers, i.e. the opposite of the implicit conversions of C. The implicit conversions of C unsigned numbers would have been correct for non-negative integers, but in the current C standard there are no such numbers.
The current C standard is inconsistent, because the meaning of sizeof is of a non-negative integer and this is also true for the conversions between unsigned numbers, but all the arithmetic operations with unsigned numbers are defined to be operations with integer residues, not operations with non-negative numbers.
The hardware of most processors implements at least 3 kinds of arithmetic operations: operations with signed integers, operations with non-negative integers and operations with integer residues.
Any decent programming language should define distinct types for these kinds of numbers, otherwise the only way to use completely the processor hardware is to use assembly language. Because C does not do this, you have to use at least inline assembly, if not separate assembly source files, for implementing operations with big numbers.
Sure, it's possible to write bugs in C. And if you really want to, you can disable the compiler warnings which flag tautologous comparisons and mixed-sign comparisons (a common reason for doing this is to avoid spurious warnings in generic-type code).
But, uhh, "people can deliberately write bugs" has got to be the weakest justification I've ever seen for changing a language feature -- especially one as fundamental as "sizes of objects can't be negative".
Fix the language. Don't hack around it by using the wrong type.
kevin_thibedeau•1h ago
uecker•1h ago
throwaway894345•17m ago
kevin_thibedeau•11m ago
throwaway894345•5m ago
uecker•6m ago
einpoklum•43m ago
There are a few of those, but that is the niche case. Certainly when we're talking about 64-bit size types. And if you want to cater to smaller size types, then just just template over the size type. Or, OK, some other trick if it's C rather than C++.
pixelesque•6m ago
Losing half the range to make them signed when you only care about positive values 95% of the time (and in the rare case when you do any modulo on top of them you can cast, or write wrappers for that), is just a bad trade-off.
Yes, you've still then only doubled the range to 2^32, and you'll still hit it at some point, but that extra byte can make a lot of difference from a memory/cache efficiency standpoint without jumping to 64-bit.
So very often uint32_t is a very good sweet spot for size: int32_t is sometimes too small, and (u)int64_t is generally not needed and too wasteful.
pron•16m ago