I don't think this is a good comparison. You're telling the compiler for Zig and Rust to pick something very modern to target, while I don't think V8 does the same. Optimizing JITs do actually know how to vectorize if the circumstances permit it.
Also, fwiw, most modern languages will do the same optimization you do with strings. Here's C++ for example: https://godbolt.org/z/TM5qdbTqh
Now, one rarely uses typed arrays in practice because they're pretty heavy to initialize, so they're only worth it if one allocates a large typed array once and reuses it a lot after that, so again, fair enough! One other detail does annoy me a little bit: the article says the example JS code is pretty bloated, but I bet that a big part of that is that the JS JIT can't guarantee that 65536 equals the length of the two arrays, so it will likely insert a guard. But nobody would write a for loop that way anyway, they'd write it as i < x.length, for which the JIT does optimize at least one array check away. I admit that this is nitpicking though.
Do note that your linked godbolt code actually demonstrates one of the two sub-par examples though.
For complicated things, I haven't really understood the advantage compared to simply running a program at build time.
> I haven't really understood the advantage compared to simply running a program at build time.
Since both comptime and runtime code can be mixed, this gives you a lot of safety and control. The comptime in zig emulates the target architecture, which makes things like cross-compilation simply work. For a program that generates code, you have to run that generator on the system that's compiling, and the generator program itself has to be aware of the target it's generating code for.
I do not understand your criticism of [N]. This gives the compiler more information and catches errors. This is a good thing! Who could be annoyed by this: https://godbolt.org/z/EeadKhrE8 (of course, nowadays you could also define a decent span type in C)
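For readers who haven't seen the feature: a minimal sketch of what I assume "[N]" refers to here, namely a C99 array parameter whose declared size depends on an earlier parameter. The function names are made up for illustration and this is not the code behind the godbolt link.

```c
#include <assert.h>
#include <stddef.h>

/* The parameter type states that buf is expected to hold n elements.
   Inside the function buf is still an ordinary pointer, so the language
   itself enforces nothing; the size is information that a compiler may
   or may not use for diagnostics. */
int sum_n(size_t n, const int buf[n]) {
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += buf[i];
    return total;
}

/* Hypothetical usage: the array length and the size argument agree. */
void check_sum_n(void) {
    int values[4] = {1, 2, 3, 4};
    assert(sum_n(4, values) == 10);
}
```

Whether a mismatched call gets diagnosed depends on the compiler and its warning flags, so treat the annotation as a hint rather than a guarantee.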
The cross-compilation argument has some merit, but not enough to warrant the additional complexity IMHO. Compile-time computation will also have annoying limitations and makes programs more difficult to understand. I feel sorry for everybody who needs to maintain complex compile time code generation. Zig certainly does it better than C++ but still..
It only does the sane thing in GCC; in other compilers it does nothing, and since it's very underspec'd it's rarely used in any C projects. It's a shame Dennis's fat pointers / slices proposal was not accepted.
> warrant the additional complexity IMHO
In zig's case comptime reduces complexity, because it is simply zig. It's used to implement generics; you can call zig code at compile time, and create and return types.
This old talk from andrew really hammers in how zig is evolution of C: https://www.youtube.com/watch?v=Gv2I7qTux7g
Also in Zig it does not reduce complexity but adds to it by creating a distinction between compile time and run-time. It is only lower complexity compared to other implementations of generics, which are even worse.
update: clang is even a bit nicer https://godbolt.org/z/b99s1rMzh although both compile it to a constant if the other argument is known at compile time. In light of this, the Zig solution does not impress me much: https://godbolt.org/z/1dacacfzc
C only seems cleaner and simpler if you already know it well.
I don't doubt that compilers occasionally break language specs, but in that case Clang is correct, at least for C11 and later. From C11:
> An iteration statement whose controlling expression is not a constant expression, that performs no input/output operations, does not access volatile objects, and performs no synchronization or atomic operations in its body, controlling expression, or (in the case of a for statement) its expression-3, may be assumed by the implementation to terminate.
Thus in C the trivial infinite loop for (;;); is supposed to actually compile to an infinite loop, as it should with Rust's less opaque loop {} -- however LLVM is built by people who don't always remember they're not writing a C++ compiler, so Rust ran into places where they're like "infinite loop please" and LLVM says "Aha, C++ says those never happen, optimising accordingly" but er... that's the wrong language.
    while (i <= x) {
        // ...
    }
just needs a slight transformation to
    while (1) {
        if (i > x)
            break;
        // ...
    }
and C11's special permission does not apply any more since the controlling expression has become constant.
Analyses and optimizations in compiler backends often normalize those two loops to a common representation (e.g. control-flow graph) at some point, so any treatment that sees them differently must happen early on.
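To make the two loop shapes concrete, here is a small sketch with made-up function names. Both compute the same sum for inputs where the loop terminates; the only difference is whether the controlling expression is a constant, which is what the C11 wording hinges on.

```c
#include <assert.h>

/* Controlling expression `i <= x` is not a constant expression. */
unsigned sum_with_condition(unsigned x) {
    unsigned i = 0, total = 0;
    while (i <= x) {
        total += i;
        i++;
    }
    return total;
}

/* Same loop, rewritten so the controlling expression is the constant 1
   and the exit test moves into the body. */
unsigned sum_with_break(unsigned x) {
    unsigned i = 0, total = 0;
    while (1) {
        if (i > x)
            break;
        total += i;
        i++;
    }
    return total;
}

/* Hypothetical check: both forms agree for a terminating input. */
void check_loops(void) {
    assert(sum_with_condition(10) == 55);
    assert(sum_with_break(10) == 55);
}
```

Note that with x equal to UINT_MAX neither version terminates, since `i` wraps around before it can exceed `x`. That is exactly the kind of input where the two forms are allowed to be treated differently.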
It is no accident that there is ongoing discussion that clang should get its own IR, just like it happens with the other frontends, instead of spewing LLVM IR directly into the next phase.
Worth mentioning that LLVM 12 added first-class support for infinite loops without guaranteed forward progress, allowing this to be fixed: https://github.com/rust-lang/rust/issues/28728
flohofwoe•7h ago
I love Zig too, but this just sounds wrong :)
For instance, C is clearly too sloppy in many corners, but Zig might (currently) swing the pendulum a bit too far into the opposite direction and require too much 'annotation noise', especially when it comes to explicit integer casting in math expressions (I wrote about that a bit here: https://floooh.github.io/2024/08/24/zig-and-emulators.html).
When it comes to performance: IME when Zig code is faster than similar C code then it is usually because of Zig's more aggressive LLVM optimization settings (e.g. Zig compiles with -march=native and does whole-program-optimization by default, since all Zig code in a project is compiled as a single compilation unit). Pretty much all 'tricks' like using unreachable as optimization hints are also possible in C, although sometimes only via non-standard language extensions.
C compilers (especially Clang) are also very aggressive about constant folding, and can reduce large swaths of constant-foldable code even with deep callstacks, so that in the end there often isn't much of a difference to Zig's comptime when it comes to codegen (the good thing about comptime is of course that it will not silently fall back to runtime code - and non-comptime code is still of course subject to the same constant-folding optimizations as in C - e.g. if a "pure" non-comptime function is called with constant args, the compiler will still replace the function call with its result).
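A tiny sketch of the constant-folding case described above, with made-up names. The function is "pure" in the informal sense (no side effects, result depends only on the argument), and it is called with a constant. Optimizing compilers commonly replace such a call with its result, but nothing in C requires them to, which is the difference from comptime.

```c
#include <assert.h>

/* No side effects; the result depends only on n. */
static unsigned sum_of_squares(unsigned n) {
    unsigned total = 0;
    for (unsigned i = 1; i <= n; i++)
        total += i * i;
    return total;
}

/* Constant argument: a candidate for folding down to a single constant
   at higher optimization levels. The language does not guarantee it. */
unsigned folded_candidate(void) {
    return sum_of_squares(10);
}

/* Hypothetical check: the value is the same whether or not it was folded. */
void check_folding(void) {
    assert(folded_candidate() == 385);
}
```

Looking at the generated assembly (e.g. on godbolt) is the only way to confirm what a given compiler and flag combination actually did.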
TL;DR: if your C code runs slower than your Zig code, check your C compiler settings. After all, the optimization heavy lifting all happens down in LLVM :)
Retro_Dev•7h ago
I don't love the noise of Zig, but I love the ability to clearly express my intent and the detail of my code in Zig. As for arithmetic, I agree that it is a bit too verbose at the moment. Hopefully some variant of https://github.com/ziglang/zig/issues/3806 will fix this.
I fully agree with your TL;DR there, but would emphasize that gaining the same optimizations is easier in Zig due to how builtins and unreachable are built into the language, rather than needing gcc and llvm intrinsics like __builtin_unreachable() - https://gcc.gnu.org/onlinedocs/gcc-4.5.0/gcc/Other-Builtins....
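For comparison, a rough sketch of the C side using the GCC/Clang extension `__builtin_unreachable()`. The function name and the lookup table are just for illustration.

```c
#include <assert.h>

/* Returns the hex digit for a value the caller promises is below 16.
   __builtin_unreachable() is a GCC/Clang extension, not standard C:
   reaching it is undefined behavior, which lets the optimizer assume
   v < 16 from here on. */
char hex_digit(unsigned v) {
    if (v >= 16)
        __builtin_unreachable();
    return "0123456789abcdef"[v];
}

/* Hypothetical check with in-range inputs only. */
void check_hex_digit(void) {
    assert(hex_digit(0) == '0');
    assert(hex_digit(10) == 'a');
    assert(hex_digit(15) == 'f');
}
```

Calling this with 16 or more is undefined behavior, so the hint is only as safe as the promise behind it.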
It's my dream that LLVM will improve to the point that we don't need further annotation to enable positive optimization transformations. At that point though, is there really a purpose to using a low level language?
flohofwoe•7h ago
One thing I was wondering, since most of Zig's builtins seem to map directly to LLVM features, if and how this will affect the future 'LLVM divorce'.
Retro_Dev•7h ago
It's an interesting question about how Zig will handle additional builtins and data representations. The current way I understand it is that there's an additional opt-in translation layer that converts unsupported/complicated IR to IR which the backend can handle. This is referred to as the compiler's "Legalize" stage. It should help to reduce this issue, and perhaps even make backends like https://github.com/xoreaxeaxeax/movfuscator possible :)
matu3ba•6h ago
That is quite a long way to go, since the following formal specs/models are missing to make LLVM + user config possible:
- hardware semantics, specifically around timing behavior and (if used) weak memory
- memory synchronization semantics for weak memory systems with ideas from “Relaxed Memory Concurrency Re-executed” and suggested model looking promising
- SIMD with specifically floating point NaN propagation
- pointer semantics, specifically in object code (initialization), se- and deserialization, construction, optimizations on pointers with arithmetic, tagging
- constant time code semantics, for example how to ensure data stays in L1, L2 cache and operations have constant time
- ABI semantics, since specifications are not formal
LLVM is also still struggling with full restrict support due to architecture decisions and C++ (this has been worked on for more than 5 years now).
> At that point though, is there really a purpose to using a low level language?
Languages simplify/encode formal semantics of the (software) system (and system interaction), so the question is if the standalone language with tooling is better than state of art and for what use cases. On the tooling part with incremental compilation I definitely would say yes, because it provides a lot of vertical integration to simplify development.
The other long-term/research question is if and what code synthesis and formal method interaction for verification, debugging etc would look like for (what class of) hardware+software systems in the future.
eptcyka•3h ago
skywal_l•7h ago
saagarjha•7h ago
Zambyte•2h ago
messe•7h ago
flohofwoe•7h ago
hansvm•3h ago
The core problem is that you're changing the semantics of that integer as you change types, and if that happens automatically then the compiler can't protect you from typos, vibe-coded defects, or any of the other ways kids are generating almost-correct code nowadays. You can mitigate that with other coding patterns (like requiring type parameters in any potentially unsafe arithmetic helper functions and banning builtins which aren't wrapped that way), but under the swiss cheese model of error handling it still massively increases your risky surface area.
The issue is more obvious on the input side of that expression and with a different mask. E.g.:
Should `a` be lowered to a u4 for the computation, or `even_mask` promoted, or however we handle the internals have the result lowered sometimes to a u4? Arguably not. The mask is designed to extract even bit indices, but we're definitely going to only extract the low bits. The only safe instance of implicit conversion in this pattern is when you intend to only extract the low bits for some purpose.
What if `even_mask` is instead a comptime_int? You still have the same issue. That was a poor use of comptime ints since now that implicit conversion will always happen, and you lost your compiler errors when you misuse that constant.
Back to your proposal of something that should always be safe: implicitly lowering `a & 15` to a u4. The danger is in using it outside its intended context, and given that we're working with primitive integers you'll likely have a lot of functions floating around capable of handling the result incorrectly, so you really want to at least use the _right_ integer type to have a little type safety for the problem.
For a concrete example, code like that (able to be implicitly lowered because of information obvious to the compiler) is often used in fixed-point libraries. The fixed-point library though does those sorts of operations with the express purpose of having zeroed bits in a wide type to be able to execute operations without loss of precision (the choice of what to do for the final coalescing of those operations when precision is lost being a meaningful design choice, but it's irrelevant right this second). If you're about to do any nontrivial arithmetic on the result of that masking, you don't want to accidentally put it in a helper function with a u4 argument, but with implicit lowering that's something that has no guardrails. It requires the programmer to make zero mistakes.
That example might seem a little contrived, and this isn't something you'll run into every day, but every nontrivial project I've worked on has had _something_ like that, where implicit narrowing is extremely dangerous and also extremely easy to accidentally do.
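C already allows implicit narrowing, so it can serve as a sketch of the failure mode. The helper and the Q8.8 value below are made up for illustration. A default build accepts this code; extra warning flags such as -Wconversion are needed to get any diagnostic.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper that is only meaningful for small values. */
static uint8_t double_small(uint8_t v) {
    return (uint8_t)(v * 2);
}

/* A Q8.8 fixed-point value kept in a wide type so later arithmetic
   has headroom. Passing it to the narrow helper is a mistake, but C
   converts uint32_t to uint8_t implicitly: 0x0123 becomes 0x23. */
uint8_t narrowing_bug(void) {
    uint32_t fixed = 0x0123;
    return double_small(fixed);
}

/* Hypothetical check: the integer part was silently dropped,
   leaving 0x23 * 2 = 0x46. */
void check_narrowing(void) {
    assert(narrowing_bug() == 0x46);
}
```

An explicit cast at the call site would at least make the truncation visible in review.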
What about the verbosity? IMO the point of verbosity is to draw your attention to code that you should be paying attention to. If you're in a module where implicit casting would be totally fine, then make a local helper function with a short name to do the thing you want. Having an unsafe thing be noisy by default feels about right though.
throwawaymaths•48m ago
johnisgood•7h ago
I agree with everything flohofwoe said, especially this: "C is clearly too sloppy in many corners, but Zig might (currently) swing the pendulum a bit too far into the opposite direction and require too much 'annotation noise', especially when it comes to explicit integer casting in math expressions ".
Seems like I will keep using Odin and give C3 a try (still have yet to!).
Edit: I quite dislike that the downvote is used for "I disagree, I love Zig". sighs. Look at any Zig projects, it is full of annotation noise. I would not want to work with a language like that. You might, that is cool. Good for you.
codethief•6h ago
"." = the "namespace" (in this case an enum) is implied, i.e. the compiler can derive it from the function signature / type.
"@" = a language built-in.
johnisgood•6h ago
pjmlp•1h ago
While Zig fixes some of these issues, the amount of @ feels like being back in Objective-C land and yeah too many uses of dots and stars.
Then again, I am one of those that actually enjoys using C++, despite all its warts and the ways of WG21 nowadays.
I also dislike the approach with source code only libraries and how importing them feels like being back in JavaScript CommonJS land.
Odin and C3 look interesting, the issue is always what is going to be the killer project, that makes reaching for those alternatives unavoidable.
I might not be a language XYZ cheerleader, but occasionally do have to just get my hands dirty and do the needful for a happy customer, regardless of my point of view on XYZ.
throwawaymaths•47m ago
as for the dots, if you use zig quite a bit you'll see that dot usage is incredibly consistent, and not having the dots will feel wrong, not just in an "I'm used to it" sense/stockholm syndrome, but you will feel for example that C is wrong for not having them.
for example, the use of dot to signify "anonymous" for a struct literal. why doesn't C have this? the compiler must make a "contentious" choice if something is a block or a literal. by contentious i mean the compiler knows what it's doing but a quick edit might easily make you do something unexpected
knighthack•6h ago
What's good for the goose should be good for the gander.
nurbl•6h ago
ummonk•5h ago
Zambyte•2h ago
kbolino•1h ago
Zambyte•2h ago