Calavar•1h ago
> {0} initializer in C or C++ for unions no longer guarantees clearing of the whole union (except for static storage duration initialization), it just initializes the first union member to zero. If initialization of the whole union including padding bits is desirable, use {} (valid in C23 or C++) or use -fzero-init-padding-bits=unions option to restore old GCC behavior.
This is going to silently break so much existing code, especially union based type punning in C code. {0} used to guarantee full zeroing and {} did not, and step by step we've flipped the situation to the reverse. The only sensible thing, in terms of not breaking old code, would be to have both {0} and {} zero initialize the whole union.
I'm sure this change was discussed in depth on the mailing list, but it's absolutely mind-boggling to me.
VyseofArcadia•1h ago
I feel like once a language is standardized (or reaches 1.0), that's it. You're done. No more changes. You wanna make improvements? Try out some new ideas? Fine, do that in a new language.
I can deal with the footguns if they aren't cheekily mutating over the years. I feel like in C++ especially we barely have the time to come to terms with the unintended consequences of the previous language revision before the next one drops a whole new load of them on us.
ryao•55m ago
I suspect this change was motivated by standards conformance.
fuhsnn•44m ago
The GCC maintainer's wording when they informed the Linux kernel mailing list was "the standard doesn't require it."
https://lore.kernel.org/linux-toolchains/Z0hRrrNU3Q+ro2T7@tu...
seritools•47m ago
https://en.cppreference.com/w/c/language/union
> If the size of the new type is larger than the size of the last-written type, the contents of the excess bytes are unspecified (and may be a trap representation). Before C99 TC3 (DR 283) this behavior was undefined, but commonly implemented this way.
https://en.cppreference.com/w/c/language/struct_initializati...
> When initializing a union, the initializer list must have only one member, which initializes the first member of the union unless a designated initializer is used(since C99).
→ = {0} initializes the first union variant, and bytes outside of that first variant are unspecified. Seems like GCC 15.1 follows the 26 year old standard correctly. (not sure how much has changed from C89 here)
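To make the case concrete, here is a minimal sketch. The union `value` and both helper functions are hypothetical names made up for illustration, not taken from any project in this thread. It clears the whole object with `memset` rather than relying on `{0}`, which should behave the same regardless of compiler version.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical union: the first member is smaller than a later one. */
union value {
    char tag;               /* first member, 1 byte */
    unsigned char raw[16];  /* later member, larger than the first */
};

/* Zero every byte of the union, not just the first member. */
void value_clear(union value *v)
{
    memset(v, 0, sizeof *v);
}

/* Return 1 if every byte of the object representation is zero, else 0. */
int value_is_all_zero(const union value *v)
{
    const unsigned char *p = (const unsigned char *)v;
    for (size_t i = 0; i < sizeof *v; i++) {
        if (p[i] != 0)
            return 0;
    }
    return 1;
}
```

With `union value v = {0};`, going by the release note quoted at the top, GCC 15 only promises to zero `tag`; the bytes of `raw` past the first one are the part that is no longer guaranteed.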
hulitu•47m ago
It's careless development. Why think something through in advance when you can fix it later? It works so well for Microsoft, Google and lately Apple. /s
The release cycle of a piece of software says a lot about its quality. "Move fast, break things" has become the new development process.
pjmlp•12m ago
Programming languages are products; that is like saying you want to keep using vi 1.0.
Maybe C should have stopped at K&R C from UNIX V6; at least that would have spared the world from having it adopted outside UNIX.
ryao•53m ago
> This is going to silently break so much existing code
How much code actually uses unions this way?
> especially union based type punning in C code
I have never done type punning via the GNU C compiler extension in a way that would break because of this. I always assign a value to it and then get out the value from a new type. Do you know of any code that does things differently to be affected by this?
Calavar•47m ago
I would guess a lot. People aren't intimately familiar with the standard, and people are lazy when it comes to writing boilerplate like initialization code. And up until now, it just worked, so even a good test suite wouldn't catch it.
EDIT: I initially mentioned type punning for arithmetic, but this compiler change wouldn't affect that
ryao•44m ago
How would that be broken by this? The union will be zero initialized regardless because this change only affects situations where the union members are of different lengths, but for integer to float, the union members should always be the same length or bad things will happen.
Calavar•40m ago
I realized my mistake and I think I edited my comment a split second before you replied, but you're right. That particular type punning scenario wouldn't be affected by this change because 1) the members are the same size, so there's no padding bits 2) the specific union member is going to be initialized to the input parameter, not with the syntax sugar for aggregate zero initialization.
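A rough sketch of that scenario, for reference. The names `float_bits` and `float_to_bits` are hypothetical, and the code assumes `float` is a 32-bit IEEE 754 type, which the standard does not guarantee. Both members have the same size and the member is assigned directly, so `{0}` never enters the picture.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: view the bits of a float through a union. */
union float_bits {
    float f;
    uint32_t u;
};

/* Both members must be the same size, so no byte is left uncovered. */
static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits wide");

uint32_t float_to_bits(float x)
{
    union float_bits b;
    b.f = x;     /* the member is assigned directly, no {0} involved */
    return b.u;  /* read the same bytes back through the other member */
}
```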
ryao•34m ago
Well, under your original version, I could see someone filling in bit fields in the float like the exponent and sign while leaving the mantissa zeroed, but given that the integer and float would be the same length, there is no section that would be left uninitialized by this change.
In order for this change to leave something uninitialized, you would need to have a member of the union after the first member that is longer than the first member. Code that does that and relies on {0} to zero the union seems incredibly rare to me.
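One way to look for candidates in a codebase is a size comparison along these lines. This is only a sketch: the macro `UNION_HAS_BYTES_BEYOND` and the two example unions are made up for illustration. It reports whether a union occupies more bytes than its first member, which also covers trailing padding.

```c
/* Hypothetical check: nonzero if the union is larger than its first member.
 * sizeof does not evaluate its operand, so the null pointer is never
 * dereferenced. */
#define UNION_HAS_BYTES_BEYOND(type, first_member) \
    (sizeof(type) > sizeof(((type *)0)->first_member))

/* Same-size members: {0} on the first member covers every byte. */
union same_size {
    float f;
    unsigned char bytes[sizeof(float)];
};

/* A later member is longer than the first: this is the affected shape. */
union grows {
    char c;
    unsigned char bytes[8];
};
```

Here `UNION_HAS_BYTES_BEYOND(union grows, c)` is true and `UNION_HAS_BYTES_BEYOND(union same_size, f)` is false.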
ndiddy•45m ago
> How much code actually uses unions this way?
I see this change caused Mbed-TLS to start failing its test suite when compiled with GCC 15: https://github.com/Mbed-TLS/mbedtls/issues/9814 (kinda scary since it's a security library). Hopefully other projects with less rigorous test suites aren't using {0} in that way. The GitHub issue mentions that Clang tried a similar optimization a while ago and backed it out after user complaints, so maybe the same thing will happen with GCC.
ryao•42m ago
GCC’s developers have a strong insistence on standards conformance (minus situations where they explicitly choose to deviate, like type punning in unions) over the status quo. We already went through a much more severe shift with strict aliasing enforcement by GCC and they never changed course. I do not expect this to be any different.
kevin_thibedeau•10m ago
"{0}" is formalized in C99. It hasn't been an extension for some time.
ogoffart•49m ago
> This is going to silently break so much existing code
The code was already broken. It was undefined behavior.
That's a problem with C and its undefined behavior minefields.
ryao•48m ago
GCC has long been known to define undefined behavior in C unions. In particular, type punning in unions is undefined behavior under the C and C++ standards, but GCC (and Clang) define it.
mtklein•39m ago
I have always thought that punning through a union was legal in C but UB in C++, and that punning through incompatible pointer casting was UB in both.
I am basing this entirely on memory and the wikipedia article on type punning. I welcome extremely pedantic feedback.
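Whatever the status of union punning turns out to be, copying the bytes with `memcpy` is well-defined in both C and C++ and sidesteps the question. A minimal sketch, with a hypothetical function name and the assumption that `float` is 32-bit IEEE 754:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits wide");

/* Reinterpret the bytes of a float as a 32-bit integer without using
 * a union or a pointer cast. */
uint32_t bits_via_memcpy(float x)
{
    uint32_t u;
    memcpy(&u, &x, sizeof u);  /* copy the object representation */
    return u;
}
```

Compilers generally turn a fixed-size `memcpy` like this into a plain register move, so it usually costs nothing over the union version.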
ryao•32m ago
There has been plenty of misinformation spread on that. One of the GCC developers told me explicitly that type punning through a union was UB in C, but defined by GCC when I asked (after I had a bug report closed due to UB). I could find the bug report if I look for it, but I would rather not do the search.
grandempire•8m ago
When you have a big system many people rely on you generally try to look for ways to keep their code working - not look for the changes you’re contractually allowed to make.
mistrial9•46m ago
Using unions was always considered sketchy, IMHO. Is this trivia for security exploiters?
mtklein•44m ago
This was my instinct too, until I got this little tickle in the back of my head that maybe I remembered that Clang was already acting like this, so maybe it won't be so bad. Notice 32-bit wzr vs 64-bit xzr:
mtklein•35m ago
Ah, I can confirm what I see elsewhere in the thread, this is no longer true in Clang. That first clang was Apple Clang 17 (who knows what version that actually is), and here is Clang 20:
$ /opt/homebrew/opt/llvm/bin/clang-20 -O1 -c union.c -o union.o && objdump -d union.o
union.o: file format mach-o arm64
Disassembly of section __TEXT,__text:
0000000000000000 <ltmp0>:
0: f900001f str xzr, [x0]
4: d65f03c0 ret
0000000000000008 <_create_d>:
8: f900001f str xzr, [x0]
c: d65f03c0 ret