i = i++ 3.16 undefined behavior: Behavior, upon use of a nonportable or erroneous program construct, of erroneous data, or of indeterminately valued objects, for which this International Standard imposes no requirements. Permissible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message).
Is it just me or did compiler writers apply overly legalistic interpretation to the "no requirements" part in this paragraph? The intent here is extremely clear, that undefined behavior means you're doing something not intended or specified by the language, but that the consequence of this should be somewhat bounded or as expected for the target machine. This is closer to our old school understanding of UB.By 'bounded', this obviously ignores the security consequences of e.g. buffer overflows, but just because UB can be exploited doesn't mean it's appropriate for e.g. the compiler to exploit it too, that clearly violates the intent of this paragraph.
Aren't "unpredictable results" and "no requirements" contrary to the idea that the behavior would be "somewhat bounded"?
I don't think you could sincerely argue that this definition intends to allow the compiler to totally rewrite your code because of one guaranteed UB detected on line 5, just that it would be good to print a diagnostic if it can be detected, and if not to do what's "characteristic of the environment". Does that make sense?
Far from being just "C with classes", modern C++ is very different than C. The language is huge and complex, for sure, but nobody is forced to use all of it.
No HN comment can possibly cover all the use cases of C++ but in general, unless you have a very good reason not to:
- eschewing boomer loops in favor of ranges
- using RAII with smart pointers
- move semantics
- using STL containers instead of raw arrays
- borrowing using spans and string views
These things go a long way towards, shall we say, "safe-ish" code without UB. It is not memory-safe enforced at the language level, like Rust, but the upshot is you never need to deal with the Rust community :^)
In the end, everything comes down to culture war.
>After all, C/C++ is not a memory safe language.
As far as stdlib usage is concerned: that's just your opinion. The stdlib has a lot of footguns and terrible design decisions too, e.g. std::vector pulling in 20k lines of code into each compilation unit is simply bizarre.
Also:
- borrowing using spans and string views
Those are just as unsafe as raw pointers. It's not really "borrowing" when the referenced data can disappear while the "borrow" is active.
The article suggests using LLMs to identify and fix UB. However as per the above, I think the issue is that we need more expert humans.
LLM generated code will eventually contain UB.
EDIT: added "eventually"
(I hope casting fear is not UB)
dmitrygr•19m ago
you might try to argue that uint8_t is not necessarily char, and while it is true that implementations of C can exist where CHAR_BIT > 8, but those do not have uint8_t defined (as per spec), so if you have uint8_t, then it is "unsigned char", which makes this cast perfectly safe and defined as far as i can tell. Of course CHAR_BIT is required to be >= 8, so if it is not >8, it is exactly 8. (In any case, whether uint8_t is literally a typedef of unsigned char is implementation-defined and not actually relevant to whether the cast itself is valid -- it is)