Low-Level Optimization with Zig

https://alloc.dev/2025/06/07/zig_optimization

307•Retro_Dev•8mo ago

Comments

flohofwoe•8mo ago

> I love Zig for it's verbosity.

I love Zig too, but this just sounds wrong :)

For instance, C is clearly too sloppy in many corners, but Zig might (currently) swing the pendulum a bit too far into the opposite direction and require too much 'annotation noise', especially when it comes to explicit integer casting in math expressions (I wrote about that a bit here: https://floooh.github.io/2024/08/24/zig-and-emulators.html).

When it comes to performance: IME when Zig code is faster than similar C code then it is usually because of Zig's more aggressive LLVM optimization settings (e.g. Zig compiles with -march=native and does whole-program-optimization by default, since all Zig code in a project is compiled as a single compilation unit). Pretty much all 'tricks' like using unreachable as optimization hints are also possible in C, although sometimes only via non-standard language extensions.

C compilers (especially Clang) are also very aggressive about constant folding, and can reduce large swaths of constant-foldable code even with deep callstacks, so that in the end there often isn't much of a difference to Zig's comptime when it comes to codegen (the good thing about comptime is of course that it will not silently fall back to runtime code - and non-comptime code is still of course subject to the same constant-folding optimizations as in C - e.g. if a "pure" non-comptime function is called with constant args, the compiler will still replace the function call with its result).

TL;DR: if your C code runs slower than your Zig code, check your C compiler settings. After all, the optimization heavy lifting all happens down in LLVM :)
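To make the "also possible in C, via non-standard extensions" point concrete, here is a minimal sketch using `__builtin_unreachable()`, a GCC/Clang extension (C23 adds a standard `unreachable()` macro in `<stddef.h>`). The helper names `lane_shift` and `extract_byte` are made up for this example, and whether the hint changes the generated code depends on the compiler and flags.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: maps a byte index 0..3 to a shift amount.
 * The caller promises idx < 4; the hint lets the optimizer discard the
 * out-of-range path. Breaking the promise is undefined behavior. */
static inline uint32_t lane_shift(uint32_t idx)
{
    if (idx >= 4) {
        __builtin_unreachable(); /* GCC/Clang extension, not ISO C before C23 */
    }
    return idx * 8u;
}

/* Extracts byte `idx` (0 = least significant) from a 32-bit word. */
static inline uint8_t extract_byte(uint32_t word, uint32_t idx)
{
    assert(idx < 4); /* debug-build check that mirrors the hint */
    return (uint8_t)(word >> lane_shift(idx));
}
```

In Zig the equivalent is the `unreachable` keyword, with no extension needed.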

Retro_Dev•8mo ago

Ahh perhaps I need to clarify:

I don't love the noise of Zig, but I love the ability to clearly express my intent and the detail of my code in Zig. As for arithmetic, I agree that it is a bit too verbose at the moment. Hopefully some variant of https://github.com/ziglang/zig/issues/3806 will fix this.

I fully agree with your TL;DR there, but would emphasize that gaining the same optimizations is easier in Zig due to how builtins and unreachable are built into the language, rather than needing gcc and llvm intrinsics like __builtin_unreachable() - https://gcc.gnu.org/onlinedocs/gcc-4.5.0/gcc/Other-Builtins....

It's my dream that LLVM will improve to the point that we don't need further annotation to enable positive optimization transformations. At that point though, is there really a purpose to using a low level language?

flohofwoe•8mo ago

Yeah indeed. Having access to all those 'low-level tweaks' without having to deal with non-standard language extensions which are different in each C compiler (if supported at all) is definitely a good reason to use Zig.

One thing I was wondering, since most of Zig's builtins seem to map directly to LLVM features, is if and how this will affect the future 'LLVM divorce'.

Retro_Dev•8mo ago

Good question! The TL;DR as I understand it is that it won't matter too much. For example, the self-hosted x86_64 backend (which is coincidentally becoming default for debugging on linux right now - https://github.com/ziglang/zig/pull/24072) has full support for most (all?) builtins. I don't think that we need to worry about that.

It's an interesting question about how Zig will handle additional builtins and data representations. The current way I understand it is that there's an additional opt-in translation layer that converts unsupported/complicated IR to IR which the backend can handle. This is referred to as the compiler's "Legalize" stage. It should help to reduce this issue, and perhaps even make backends like https://github.com/xoreaxeaxeax/movfuscator possible :)

matu3ba•8mo ago

> LLVM will improve to the point that we don't need further annotation to enable positive optimization transformations

That is quite a long way to go, since the following formal specs/models are missing to make LLVM + user config possible:

- hardware semantics, specifically around timing behavior and (if used) weak memory

- memory synchronization semantics for weak memory systems; the model suggested in “Relaxed Memory Concurrency Re-executed” looks promising

- SIMD with specifically floating point NaN propagation

- pointer semantics, specifically in object code (initialization), se- and deserialization, construction, optimizations on pointers with arithmetic, tagging

- constant time code semantics, for example how to ensure data stays in L1, L2 cache and operations have constant time

- ABI semantics, since specifications are not formal

LLVM is also still struggling with full restrict support due to architecture decisions and C++ (this has been worked on for more than 5 years now).

> At that point though, is there really a purpose to using a low level language?

Languages simplify/encode formal semantics of the (software) system (and system interaction), so the question is if the standalone language with tooling is better than state of art and for what use cases. On the tooling part with incremental compilation I definitely would say yes, because it provides a lot of vertical integration to simplify development.

The other long-term/research question is if and what code synthesis and formal method interaction for verification, debugging etc would look like for (what class of) hardware+software systems in the future.

eptcyka•8mo ago

For constant time code, it doesn’t matter too much if data spills out of a cache; constant time issues arise from compilers introducing early exits, which leave crypto open to timing attacks.

matu3ba•8mo ago

Thanks for the info. Do you have a good overview on what other hardware properties or issues are relevant?

skywal_l•8mo ago

Maybe with the new x86 backend we might see some performance differences between C and Zig that could definitely be attributed solely to the Zig project.

saagarjha•8mo ago

I would be (pleasantly) surprised if Zig could beat LLVM's codegen.

Zambyte•8mo ago

So would the Zig team. AFAIK, they don't plan to (and have said this in interviews). The plan is for super fast compilation and incremental compilation. I think the homegrown backend is mainly for debug builds.

Cloudef•8mo ago

The backends do already have some simple optimizations. Of course the focus is on debug builds and speed, but the long-term goal is for them to be competitive as well.

messe•8mo ago

With regard to the casting example, you could always wrap the cast in a function:

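    const std = @import("std");

    // Reinterpret x as signed, sign-extend it to the width of T,
    // then reinterpret the result as T.
    fn signExtendCast(comptime T: type, x: anytype) T {
        const ST = std.meta.Int(.signed, @bitSizeOf(T));
        const SX = std.meta.Int(.signed, @bitSizeOf(@TypeOf(x)));
        return @bitCast(@as(ST, @as(SX, @bitCast(x))));
    }

    // Add a signed 8-bit offset (passed as raw u8 bits) to a 16-bit address, wrapping.
    export fn addi8(addr: u16, offset: u8) u16 {
        return addr +% signExtendCast(u16, offset);
    }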

This compiles to the same assembly, is reusable, and makes the intent clear.
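For comparison, a rough C sketch of the same sign-extending add. This is an illustration written for this thread, not code from the linked blog post; it assumes a two's-complement target, where the `uint8_t` to `int8_t` conversion reinterprets the bits (implementation-defined before C23, but that is what mainstream compilers do).

```c
#include <stdint.h>

/* Adds a signed 8-bit offset (passed as raw u8 bits) to a 16-bit address,
 * wrapping modulo 2^16. */
static uint16_t addi8(uint16_t addr, uint8_t offset)
{
    /* u8 -> i8 reinterprets the bits, i8 -> i16 sign-extends,
     * i16 -> u16 keeps the resulting bit pattern. */
    uint16_t extended = (uint16_t)(int16_t)(int8_t)offset;

    /* The addition happens in int after integer promotion; the cast
     * truncates back to 16 bits, which gives the wrapping behavior. */
    return (uint16_t)(addr + extended);
}
```

The casts are just as numerous as in the Zig helper; the difference is that C would also accept the expression without them and silently zero-extend instead.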

flohofwoe•8mo ago

Yes, that's a good solution for this 'extreme' example. But in other cases I think the compiler should make better use of the available information to reduce 'redundant casting' when narrowing (like the fact that the result of `a & 15` is guaranteed to fit into a u4 etc...). But I know that the Zig team is aware of those issues, so I'm hopeful that this stuff will improve :)

hansvm•8mo ago

This is something I used to agree with, but implicit narrowing is dangerous, enough so that I'd rather be more explicit most of the time nowadays.

The core problem is that you're changing the semantics of that integer as you change types, and if that happens automatically then the compiler can't protect you from typos, vibe-coded defects, or any of the other ways kids are generating almost-correct code nowadays. You can mitigate that with other coding patterns (like requiring type parameters in any potentially unsafe arithmetic helper functions and banning builtins which aren't wrapped that way), but under the swiss cheese model of error handling it still massively increases your risky surface area.

The issue is more obvious on the input side of that expression and with a different mask. E.g.:

  const a: u64 = 42314;
  const even_mask: u4 = 0b0101;
  a & even_mask;

Should `a` be lowered to a u4 for the computation, should `even_mask` be promoted, or, however we handle the internals, should the result sometimes be lowered to a u4? Arguably not. The mask is designed to extract even bit indices, but we're definitely going to only extract the low bits. The only safe instance of implicit conversion in this pattern is when you intend to only extract the low bits for some purpose.
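C shows what silent promotion does with this exact input. The sketch below is only an illustration: `uint8_t` stands in for Zig's `u4`, and the function names are invented. The narrow mask compiles without complaint but can only ever keep bits 0 and 2, while a mask written at full width keeps every even bit index.

```c
#include <stdint.h>

/* The narrow mask is silently zero-extended to the width of `a`. */
static uint64_t mask_narrow(uint64_t a)
{
    const uint8_t even_mask = 0x5; /* binary 0101, stand-in for a u4 */
    return a & even_mask;          /* only bits 0 and 2 can survive */
}

/* What "extract the even bit indices" needs at full width. */
static uint64_t mask_wide(uint64_t a)
{
    const uint64_t even_mask = UINT64_C(0x5555555555555555);
    return a & even_mask;
}
```

With `a = 42314` (0xA54A), `mask_narrow` yields 0 and `mask_wide` yields 0x0540, and nothing in the source flags the difference.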

What if `even_mask` is instead a comptime_int? You still have the same issue. That was a poor use of comptime ints since now that implicit conversion will always happen, and you lost your compiler errors when you misuse that constant.

Back to your proposal of something that should always be safe: implicitly lowering `a & 15` to a u4. The danger is in using it outside its intended context, and given that we're working with primitive integers you'll likely have a lot of functions floating around capable of handling the result incorrectly, so you really want to at least use the _right_ integer type to have a little type safety for the problem.

For a concrete example, code like that (able to be implicitly lowered because of information obvious to the compiler) is often used in fixed-point libraries. The fixed-point library though does those sorts of operations with the express purpose of having zeroed bits in a wide type to be able to execute operations without loss of precision (the choice of what to do for the final coalescing of those operations when precision is lost being a meaningful design choice, but it's irrelevant right this second). If you're about to do any nontrivial arithmetic on the result of that masking, you don't want to accidentally put it in a helper function with a u4 argument, but with implicit lowering that's something that has no guardrails. It requires the programmer to make zero mistakes.

That example might seem a little contrived, and this isn't something you'll run into every day, but every nontrivial project I've worked on has had _something_ like that, where implicit narrowing is extremely dangerous and also extremely easy to accidentally do.

What about the verbosity? IMO the point of verbosity is to draw your attention to code that you should be paying attention to. If you're in a module where implicit casting would be totally fine, then make a local helper function with a short name to do the thing you want. Having an unsafe thing be noisy by default feels about right though.

throwawaymaths•8mo ago

you could give the wrapper function a funny name like @"sign-cast" to force the eye to be drawn to it.

johnisgood•8mo ago

Yeah but what is up with all that "." and "@"? Yes, I know what they are used for, but it is noise for me (i.e. "annotation noise"). This is why I do not use Zig. Zig is more like a lighter C++, not a C replacement, IMO.

I agree with everything flohofwoe said, especially this: "C is clearly too sloppy in many corners, but Zig might (currently) swing the pendulum a bit too far into the opposite direction and require too much 'annotation noise', especially when it comes to explicit integer casting in math expressions ".

Seems like I will keep using Odin and give C3 a try (still have yet to!).

Edit: I quite dislike that the downvote is used for "I disagree, I love Zig". sighs. Look at any Zig projects, it is full of annotation noise. I would not want to work with a language like that. You might, that is cool. Good for you.

codethief•8mo ago

> Yeah but what is up with all that "." and "@"

"." = the "namespace" (in this case an enum) is implied, i.e. the compiler can derive it from the function signature / type.

"@" = a language built-in.

johnisgood•8mo ago

I know what these are, but they are noise to me.

pyrolistical•8mo ago

It is waaaaaaay less noisy than c++

C syntax may look simpler but reading zig is more comfy bc there is less to think about than c due to explicit allocator.

There is no hidden magic with zig. Only ugly parts. With c/c++ you can hide so much complexity in a dangerous way

johnisgood•8mo ago

FWIW: I hate C++, too.

Simran-B•8mo ago

It's not annotation noise however, it's syntax noise.

johnisgood•8mo ago

Thanks for the correction. Is it really not "annotation"? What makes the difference?

Simran-B•8mo ago

You're not providing extra information to the compiler or clarifying the intent, but merely following the requirements of the language when writing . to infer the type or @ to use a built-in function.

johnisgood•8mo ago

Thank you. My previous comment got down-voted despite it being a legitimate question, weird times.

kprotty•8mo ago

C++'s `::` vs Zig's `.`

C++'s `__builtin_` (or arguably `_`/`__`) vs Zig's `@`

johnisgood•8mo ago

I hate C++, too.

pjmlp•8mo ago

Despite all the bashing I do of C, I would be happy if during the last 40 years we had gotten at least fat pointers, official string and array vocabulary types (instead of everyone getting their own like SDS and glib), namespaces instead of mylib_something, proper enums (like enum class in C++, enums in C# and so forth), fixing the pointer decay from array to &array[0], less UB.

While Zig fixes some of these issues, the amount of @ feels like being back in Objective-C land and yeah too many uses of dots and stars.

Then again, I am one of those that actually enjoys using C++, despite all its warts and the ways of WG21 nowadays.

I also dislike the approach with source code only libraries and how importing them feels like being back in JavaScript CommonJS land.

Odin and C3 look interesting, the issue is always what is going to be the killer project, that makes reaching for those alternatives unavoidable.

I might not be a language XYZ cheerleader, but occasionally do have to just get my hands dirty and do the needful for a happy customer, regardless of my point of view on XYZ.

throwawaymaths•8mo ago

the line noise is really ~only there for dangerous stuff, where slowing down a little bit (both reading and writing) is probably a good idea.

as for the dots, if you use zig quite a bit you'll see that dot usage is incredibly consistent, and not having the dots will feel wrong, not just in an "I'm used to it"/stockholm syndrome sense, but you will feel for example that C is wrong for not having them.

for example, the use of dot to signify "anonymous" for a struct literal. why doesn't C have this? the compiler must make a "contentious" choice if something is a block or a literal. by contentious i mean the compiler knows what it's doing but a quick edit might easily make you do something unexpected

elcritch•8mo ago

You might try out Nim. It has a low annotation noise level. The python like syntax feels odd for a systems language at first but since it’s statically typed it works well. The simpler syntax seems to work very well with LLMs too.

Basically it’s a better C/C++ for me and sits between C and C++ in complexity.

I tried Zig for a while years ago but found the casts and other annotations frustrating. And at the time the language was pretty unstable in how it applied those rules. Plus I’ve never found a use for custom allocators, even on embedded.

knighthack•8mo ago

I'm not sure why allowances are made for Zig's verbosity, but not Go's.

What's good for the goose should be good for the gander.

nurbl•8mo ago

I think a better word may be "explicitness". Zig is sometimes verbose because you have to spell things out. Can't say much about Go, but it seems it has more going on under the hood.

ummonk•8mo ago

Zig's verbosity goes hand in hand with a strong type system and a closeness to the hardware. You don't get any such benefits from Go's verbosity.

Zambyte•8mo ago

FWIW Zig has error handling that is nearly semantically identical to Go (errors as return values, the big semantic difference being tagged unions instead of multiple return values for errors), but wraps the `if err != nil { return err}` pattern in a single `try` keyword. That's the verbosity that I see people usually complaining about in Go, and Zig addresses it.

kbolino•8mo ago

The way Zig addresses it also discards all of the runtime variability too. In Go, an error can say something like

    unmarshaling struct type Foo: in field Bar int: failed to parse value "abc" as integer

Whereas in Zig, an error can only say something that's known at compile time, like IntParse, and you will have to use another mechanism (e.g. logging) to actually trace the error.

metaltyphoon•8mo ago

Yep. Errors carry no context whatsoever and you have no idea where they came from.

Zambyte•8mo ago

You can trace error returns with the builtin @errorReturnTrace function.

https://ziglang.org/documentation/0.14.1/#errorReturnTrace

https://ziglang.org/documentation/0.14.1/#Error-Return-Trace...

gf000•8mo ago

I mean, this is definitely not a strong suit of go either. In Zig you can just pass in a pointer though to add additional context.
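The "pass in a pointer for additional context" pattern can be sketched in C, which has the same split between a fixed error code and run-time detail. Everything here (`parse_status`, `parse_diag`, `parse_int_field`) is hypothetical and only illustrates the shape of the idea; it is not Zig's or Go's actual API.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* The error itself is a compile-time-known code, like a Zig error value. */
typedef enum { PARSE_OK = 0, PARSE_ERR_INT } parse_status;

/* Caller-owned storage for run-time context about a failure. */
typedef struct { char message[96]; } parse_diag;

/* Parses `text` as a base-10 integer for the field named `field`.
 * On failure, writes a description into *diag when diag is non-NULL. */
static parse_status parse_int_field(const char *field, const char *text,
                                    long *out, parse_diag *diag)
{
    assert(out != NULL);

    char *end = NULL;
    errno = 0;
    long value = strtol(text, &end, 10);

    if (end == text || *end != '\0' || errno == ERANGE) {
        if (diag != NULL) {
            snprintf(diag->message, sizeof diag->message,
                     "in field %s: failed to parse value \"%s\" as integer",
                     field, text);
        }
        return PARSE_ERR_INT;
    }
    *out = value;
    return PARSE_OK;
}
```

The trade-off is that callers who pass `NULL` pay nothing for the context, and callers who want it have to thread the pointer through explicitly.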

Zambyte•8mo ago

Regarding the explicit integer casting, it seems like there is some cleanup that will be coming soon: https://ziggit.dev/t/short-math-notation-casting-clarity-of-...

titzer•8mo ago

Zig has some interesting ideas, and I thought the article was going to be more on the low-level optimizations, but it turned out to be "comptime and whole program compilation are great". And I agree. Virgil has had the full language available at compile time, plus whole program compilation since 2006. But Virgil doesn't target LLVM, so speed comparisons end up being a comparison between two compiler backends.

Virgil leans heavily into the reachability and specialization optimizations that are made possible by the compilation model. For example it will aggressively devirtualize method calls, remove unreachable fields/objects, constant-promote through fields and heap objects, and completely monomorphize polymorphic code.

int_19h•8mo ago

I rather suspect that the pendulum will swing rather strongly towards more verbose and explicit languages in general in the upcoming years solely because it makes things easier for AI.

(Note that this is orthogonal to whether and to what extent use of AI for coding is a good idea. Even if you believe that it's not, the fact is that many devs believe otherwise, and so languages will strive to accommodate them.)

KingOfCoders•8mo ago

I do love the allocator model of Zig, I wish I could use something like a request allocator in Go instead of GC.

usrnm•8mo ago

Custom allocators and arenas are possible in Go and even do exist, but they are just very unergonomic and hard to use properly. The language itself lacks any way to express and enforce ownership rules, you just end up writing C with a slightly different syntax and hoping for the best. Even C++ is much safer than Go without GC

KingOfCoders•8mo ago

They are not integrated in all libraries, so for me they don't exist.

saagarjha•8mo ago

> As an example, consider the following JavaScript code…The generated bytecode for this JavaScript (under V8) is pretty bloated.

I don't think this is a good comparison. You're telling the compiler for Zig and Rust to pick something very modern to target, while I don't think V8 does the same. Optimizing JITs do actually know how to vectorize if the circumstances permit it.

Also, fwiw, most modern languages will do the same optimization you do with strings. Here's C++ for example: https://godbolt.org/z/TM5qdbTqh

Retro_Dev•8mo ago

You can change the `target` in those two linked godbolt examples for Rust and Zig to an older CPU. I'm sorry I didn't think about the limitations of the JS target for that example. As for your link, it's a good example of what clang can do for C++ - although I think that the generated assembly may be sub-par, even if you factor in zig compiling for a specific CPU here. I would be very interested to see a C++ port of https://github.com/RetroDev256/comptime_suffix_automaton though. It is a use of comptime that can't be cleanly guessed by a C++ compiler.

saagarjha•8mo ago

I just skimmed your code but I think C++ can probably constexpr its way through. I understand that's a little unfair though because C++ is one of the only other languages with a serious focus on compile-time evaluation.

vanderZwan•8mo ago

In general it's a bit of an apples to fruit salad comparison, albeit one that is appropriate to highlight the different use-cases of JS and Zig. The Zig example uses an array with a known type of fixed size, the JS code is "generic" at run time (x and y can be any object). Which, fair enough, is something you'd have to pay the cost for in JS. Ironically though in this particular example one actually would be able to do much better when it comes to communicating type information to the JIT: ensure that you always call this function with Float64Arrays of equal size, and the JIT will know this and produce a faster loop (not vectorized, but still a lot better).

Now, one rarely uses typed arrays in practice because they're pretty heavy to initialize, so they're only worth it if one allocates a large typed array once and reuses it a lot after that, so again, fair enough! One other detail does annoy me a little bit: the article says the example JS code is pretty bloated, but I bet that a big part of that is that the JS JIT can't guarantee that 65536 equals the length of the two arrays so will likely insert a guard. But nobody would write a for loop that way anyway, they'd write it as i < x.length, for which the JIT does optimize at least one array check away. I admit that this is nitpicking though.

uecker•8mo ago

You don't really need comptime to be able to inline and unroll a string comparison. This also works in C: https://godbolt.org/z/6edWbqnfT (edit: fixed typo)

Retro_Dev•8mo ago

Yep, you are correct! The first example was a bit too simplistic. A better one would be https://github.com/RetroDev256/comptime_suffix_automaton

Do note that your linked godbolt code actually demonstrates one of the two sub-par examples though.

uecker•8mo ago

I haven't looked at the more complex example, but the second issue is not too difficult to fix: https://godbolt.org/z/48T44PvzK

For complicated things, I haven't really understood the advantage compared to simply running a program at build time.

Cloudef•8mo ago

To be honest your snippet isn't really C anymore by using a compiler builtin. I'm also annoyed by things like `foo(int N, const char x[N])`, whose compilation varies wildly between compilers (most ignore them, gcc will actually try to check the invariants if they are known at compile time)

> I haven't really understood the advantage compared to simply running a program at build time.

Since both comptime and runtime code can be mixed, this gives you a lot of safety and control. The comptime in zig emulates the target architecture, this makes things like cross-compilation simply work. For a program that generates code, you have to run that generator on the system that's compiling and the generator program itself has to be aware of the target it's generating code for.

uecker•8mo ago

It also works with memcpy from the library: https://godbolt.org/z/Mc6M9dK4M I just didn't feel like burdening godbolt with an include.

I do not understand your criticism of [N]. This gives compiler more information and catches errors. This is a good thing! Who could be annoyed by this: https://godbolt.org/z/EeadKhrE8 (of course, nowadays you could also define a decent span type in C)
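For readers who have not seen the `[N]` parameter style under discussion, a minimal sketch follows. `count_char` is a made-up example function, not the code behind the godbolt links. The bound is part of the declared parameter type, although the parameter is still adjusted to a plain pointer, and how much checking you get from it depends on the compiler, as the thread notes.

```c
#include <stddef.h>

/* Counts occurrences of `c` in the first `n` bytes of `buf`.
 * `const char buf[n]` states the expected extent in the signature;
 * a compiler may use it for diagnostics when sizes are known. */
static size_t count_char(size_t n, const char buf[n], char c)
{
    size_t hits = 0;
    for (size_t i = 0; i < n; i++) {
        if (buf[i] == c) {
            hits++;
        }
    }
    return hits;
}
```

A call such as `count_char(5, "hello", 'l')` works like any pointer-plus-length API; the only difference is that the relationship between the two arguments is written down where tools can see it.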

The cross-compilation argument has some merit, but not enough to warrant the additional complexity IMHO. Compile-time computation will also have annoying limitations and makes programs more difficult to understand. I feel sorry for everybody who needs to maintain complex compile time code generation. Zig certainly does it better than C++ but still..

Cloudef•8mo ago

> I do not understand your criticism of [N]. This gives compiler more information and catches errors. This is a good thing!

It only does the sane thing in GCC, in other compilers it does nothing and since it's very underspec'd it's rarely used in any C projects. It's a shame Dennis's fat pointers / slices proposal was not accepted.

> warrant the additional complexity IMHO

In Zig's case, comptime reduces complexity, because it is simply Zig. It's used to implement generics, you can call Zig code at compile time, create and return types.

This old talk from andrew really hammers in how zig is an evolution of C: https://www.youtube.com/watch?v=Gv2I7qTux7g

uecker•8mo ago

Then the right thing would be to complain about those other compilers. I agree that Dennis' fat pointer proposal was good.

Also in Zig it does not reduce complexity but adds to it by creating a distinction between compile time and run-time. It is only lower complexity when compared to other implementations of generics which are even worse.

Cloudef•8mo ago

Sure there's tradeoffs for everything, but if I had to choose between macros, templates, or zig's comptime, I'd take the comptime any time.

uecker•8mo ago

To each their own, I guess. I still find C to be so much cleaner than all the languages that attempt to replace it, I can not possibly see any of them as a future language for me. And it turns out that it is possible to fix issues in C if one is patient enough. Nowadays I would write this with a span type: https://godbolt.org/z/nvqf6eoK7 which is safe and gives good code.

update: clang is even a bit nicer https://godbolt.org/z/b99s1rMzh although both compile it to a constant if the other argument is known at compile time. In light of this, the Zig solution does not impress me much: https://godbolt.org/z/1dacacfzc

pjmlp•8mo ago

Not only was it a good proposal, but since 1990 WG14 has not done anything else in that direction, and it doesn't look like it ever will.

uecker•8mo ago

Let's see. We have a relatively concrete plan to add dependent structure types to C2Y: `struct foo { size_t n; char (*buf)[.n]; };`
Once we have this, the wide pointer could just be introduced as syntactic sugar for this: `char (*buf)[:] = ..`

Personally, I would want the dependent structure type first as it is more powerful and low-level with no need to decide on a new ABI.

int_19h•8mo ago

This feels like such a massive overkill complexity-wise for something so basic.

uecker•8mo ago

Why do you think so? The wide pointers are syntactic sugar on top of it, so from an implementation point of view not really simpler.

pjmlp•8mo ago

Thanks, interesting to see how it will turn out.

pron•8mo ago

C also creates a distinction between compile-time and run-time, which is more arcane and complicated than that of Zig's, and your code uses it, too: macros (and other pre-processor programming). And there are other distinctions that are more subtle, such as whether the source of a target function is available to the caller's compilation unit or not, static or not etc..

C only seems cleaner and simpler if you already know it well.

uecker•8mo ago

My point is not about whether compile-time programming is simpler in C or in Zig, but that it is in most cases the wrong solution. My example is also not about compile time programming (and does not use a macro: https://godbolt.org/z/Mc6M9dK4M), but about letting the optimizer do its job. The end result is then leaner than attempting to write a complicated compile time solution - I would argue.

pyrolistical•8mo ago

Right tool for the job. There was no comptime problem shown in the blog.

But if there were zig would prob be simpler since it uses one language that seamlessly weaves comptime and runtime together

uecker•8mo ago

I don't know, to me it seems the blog tries to make the case that comptime is useful for low-level optimization: "Is this not amazing? We just used comptime to make a function which compares a string against "Hello!\n", and the assembly will run much faster than the naive comparison function. It's unfortunately still not perfect." But it turns out that a C compiler will give you the "perfect" code directly while the comptime Zig version is fairly complicated. You can argue that this was just a bad example and that there are other examples where comptime makes more sense. The thing is, about two decades ago I was similarly excited about expression-template libraries for very similar reasons. So I can fully understand how the idea of "seamlessly weaves comptime and runtime together" can appear cool. I just realized at some point that it isn't actually all that useful.

pron•8mo ago

> But it turns out that a C compiler will give you the "perfect" code directly while the comptime Zig version is fairly complicated.

In this case both would (or could) give the "perfect" code without any explicit comptime programming.

> I just realized at some point that it isn't actually all that useful.

Except, again, C code often uses macros, which is a more cumbersome mechanism than comptime (and possibly less powerful; see, e.g. how Zig implements printf).

I agree that comptime isn't necessarily very useful for micro optimisation, but that's not what it's for. Being able to shift computations in time is useful for more "algorithmic" macro optimisations, e.g. parsing things at compile time or generating de/serialization code.

uecker•8mo ago

Of course, a compiler could possibly also optimize the Zig code perfectly. The point is that the blogger did not understand it and instead created an overly complex solution which is not actually needed. Most C code I write or review does not use a lot of macros, and where they are used it seems perfectly fine to me.

quibono•8mo ago

Possibly a stupid question... what's a decent span type?

uecker•8mo ago

Something like this: https://godbolt.org/z/er9n6ToGP It encapsulates a pointer to an array and a length. It is not perfect because of some language limitations (which I hope we can remove), but also not too bad. One limitation is that you need to pass it a typedef name instead of any type, i.e. you may need a typedef first. But this is not terrible.

quibono•8mo ago

Thanks, this is great! I've been having a look at your noplate repo, I really like what you're doing there (though I need a minute trying to figure out the more arcane macros!)

uecker•8mo ago

In this case, the generic span type is just

    #define span(T) struct CONCAT(span_, T) { ssize_t N; T* data; }

And the array to span macro would just create such an object from an array by storing the length of the array and the address of the first element:

    #define array2span(T, x) ({ auto __y = &(x); (span(T)){ array_lengthof(*__y), &(*__y)[0] }; })
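A simplified, self-contained variant of the span idea, for anyone who wants to try it without the noplate headers. This is a sketch, not the library's actual code: it sticks to C11 (no statement expressions or `auto`), defines the struct once per element type through a hypothetical `DEFINE_SPAN` macro, and uses `ptrdiff_t` for the length. As in the original, `T` has to be a single identifier.

```c
#include <stddef.h>

#define CONCAT_(a, b) a##b
#define CONCAT(a, b) CONCAT_(a, b)

/* span(T) names the struct type; DEFINE_SPAN(T) defines it once per T. */
#define span(T) struct CONCAT(span_, T)
#define DEFINE_SPAN(T) span(T) { ptrdiff_t N; T *data; }

/* Builds a span from a true array (not a pointer): its length plus the
 * address of its first element. */
#define array2span(T, x) \
    ((span(T)){ (ptrdiff_t)(sizeof(x) / sizeof((x)[0])), &(x)[0] })

DEFINE_SPAN(int);

/* Example consumer: sums all elements of an int span. */
static long span_sum(span(int) s)
{
    long total = 0;
    for (ptrdiff_t i = 0; i < s.N; i++) {
        total += s.data[i];
    }
    return total;
}
```

Given `int a[4] = {1, 2, 3, 4};`, the expression `span_sum(array2span(int, a))` carries the length along with the pointer, so the callee never has to trust a separately passed size.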

justmarc•8mo ago

Optimization matters, in a huge way. Its effects are compounded by time.

sgt•8mo ago

Only if the software ends up being used.

el_pollo_diablo•8mo ago

> In fact, even state-of-art compilers will break language specifications (Clang assumes that all loops without side effects will terminate).

I don't doubt that compilers occasionally break language specs, but in that case Clang is correct, at least for C11 and later. From C11:

> An iteration statement whose controlling expression is not a constant expression, that performs no input/output operations, does not access volatile objects, and performs no synchronization or atomic operations in its body, controlling expression, or (in the case of a for statement) its expression-3, may be assumed by the implementation to terminate.

tialaramex•8mo ago

C++ says (until the future C++ 26 is published) all loops, but as you noted C itself does not do this, only those "whose controlling expression is not a constant expression".

Thus in C the trivial infinite loop for (;;); is supposed to actually compile to an infinite loop, as it should with Rust's less opaque loop {} -- however LLVM is built by people who don't always remember they're not writing a C++ compiler, so Rust ran into places where they're like "infinite loop please" and LLVM says "Aha, C++ says those never happen, optimising accordingly" but er... that's the wrong language.

el_pollo_diablo•8mo ago

Sure, that sort of language-specific idiosyncrasy must be dealt with in the compiler's front-end. In TFA's C example, consider that their loop

  while (i <= x) {
      // ...
  }

just needs a slight transformation to

  while (1) {
      if (i > x)
          break;
      // ...
  }

and C11's special permission does not apply any more since the controlling expression has become constant.

Analyses and optimizations in compiler backends often normalize those two loops to a common representation (e.g. control-flow graph) at some point, so whatever treatment that sees them differently must happen early on.
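To make the comparison concrete, here is a small sketch of the two loop shapes as complete functions. The function names and the summing body are invented for illustration; only the loop headers mirror the snippets above. Both compute the same result whenever they terminate, and both spin forever for `x == UINT_MAX` because `i` wraps around, which is the case where the C11 wording about the controlling expression matters.

```c
/* Condition in the loop header: the controlling expression is not constant. */
static unsigned sum_header(unsigned x) {
    unsigned total = 0, i = 0;
    while (i <= x) {
        total += i;
        i++;
    }
    return total;
}

/* Same logic, but the controlling expression is the constant 1. */
static unsigned sum_break(unsigned x) {
    unsigned total = 0, i = 0;
    while (1) {
        if (i > x)
            break;
        total += i;
        i++;
    }
    return total;
}
```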

pjmlp•8mo ago

In theory, in practice it depends on the compiler.

It is no accident that there is ongoing discussion that clang should get its own IR, just like it happens with the other frontends, instead of spewing LLVM IR directly into the next phase.

kibwen•8mo ago

> Rust ran into places where they're like "infinite loop please" and LLVM says "Aha, C++ says those never happen, optimising accordingly" but er... that's the wrong language

Worth mentioning that LLVM 12 added first-class support for infinite loops without guaranteed forward progress, allowing this to be fixed: https://github.com/rust-lang/rust/issues/28728

loeg•8mo ago

For some context, 12 was released in April 2021. LLVM is now on 20 -- the versions have really accelerated in recent years.

username223•8mo ago

At least it's not just clownish version acceleration. They decided they wanted versions to increase faster somewhere around 2017-2018 (4.xx), and the version increase is more or less linear before and after that time, just at different slopes.

3836293648•8mo ago

And it's now a yearly major release, is it not?

Same as with GCC

username223•8mo ago

I can't say I'm happy with "yearly broken backward compatibility," but at least it's predictable.

loeg•8mo ago

Looks like two major versions per year for LLVM.

dustbunny•8mo ago

What interests me most about Zig is the ease of the build system, cross compilation, and the goal of high iteration speed. I'm a gamedev, so I have performance requirements but I think most languages have sufficient performance for most of my requirements so it's not the #1 consideration for language choice for me.

I feel like I can write powerful code in any language, but the goal is to write code for a framework that is most future proof, so that you can maintain modular stuff for decades.

C/C++ has been the default answer for its omnipresent support. It feels like zig will be able to match that.

FlyingSnake•8mo ago

I recently, for fun, tried running zig on an ancient kindle device running stripped down Linux 4.1.15.

It was an interesting experience and I was pleasantly surprised by the maturity of Zig. Many things worked out of the box and I could even debug a strange bug using ancient GDB. Like you, I’m sold on Zig too.

I wrote about it here: https://news.ycombinator.com/item?id=44211041

osigurdson•8mo ago

I've dabbled in Rust, liked it, heard it was bad so kind of paused. Now trying it again and still like it. I don't really get why people hate it so much. Ugly generics - same thing in C# and Typescript. Borrow checker - makes sense if you have done low level stuff before.

int_19h•8mo ago

If you don't happen to come across some task that implies a data model that Rust is actively hostile towards (e.g. trees with backlinks, or more generally any kind of graph with cycles in it), borrow checker is not much of a hassle. But the moment you hit something like that, it becomes a massive pain, and requires either "unsafe" (which is strictly more dangerous than even C, never mind Zig) or patterns like using indices instead of pointers which are counter to high performance and effectively only serve to work around the borrow checker to shut it up.

carlmr•8mo ago

>requires either "unsafe" (which is strictly more dangerous than even C, never mind Zig)

Um, what? Unsafe Rust code still has a lot more safety checks applied than C.

>It’s important to understand that unsafe doesn’t turn off the borrow checker or disable any of Rust’s other safety checks: if you use a reference in unsafe code, it will still be checked. The unsafe keyword only gives you access to these five features that are then not checked by the compiler for memory safety. You’ll still get some degree of safety inside of an unsafe block.

https://doc.rust-lang.org/book/ch20-01-unsafe-rust.html

dwattttt•8mo ago

> the moment you hit something like that, it becomes a massive pain, and requires either "unsafe" (which is strictly more dangerous than even C, never mind Zig) or patterns like using indices instead of pointers

If you need to make everything in-house this is the experience. For the majority though, the moment you require those things you reach for a crate that solves those problems.

Ar-Curunir•8mo ago

I’m sorry, but your comment is a whole lot of horseshit.

Unsafe rust still has a bunch of checks C doesn’t have, and using indices into vectors is common code in high performance code (including Zig!)

creata•8mo ago

> patterns like using indices instead of pointers which are counter to high performance

Using indices isn't bad for performance. At the very least, it can massively cut down on memory usage (which is in turn good for performance) if you can use 16-bit or 32-bit indices instead of full 64-bit pointers.
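A rough sketch of what that looks like for a tree with parent backlinks, written in C so the layout is explicit. All names here (`Node`, `Tree`, `tree_add`, `tree_depth`, `NO_NODE`) are hypothetical. On a 64-bit target the three 32-bit links take 12 bytes, where three pointers would take 24.

```c
#include <stdint.h>
#include <stddef.h>

#define NO_NODE   UINT32_MAX   /* sentinel meaning "no such node" */
#define MAX_NODES 1024u

typedef struct {
    uint32_t parent;        /* backlink, as an index into Tree.nodes */
    uint32_t first_child;
    uint32_t next_sibling;
    int      value;
} Node;

typedef struct {
    Node     nodes[MAX_NODES];
    uint32_t count;
} Tree;

/* Append a node under `parent` (or NO_NODE for a root); returns its index. */
static uint32_t tree_add(Tree *t, uint32_t parent, int value) {
    if (t->count >= MAX_NODES)
        return NO_NODE;
    uint32_t id = t->count++;
    t->nodes[id] = (Node){ parent, NO_NODE, NO_NODE, value };
    if (parent != NO_NODE) {
        /* Push onto the front of the parent's child list. */
        t->nodes[id].next_sibling = t->nodes[parent].first_child;
        t->nodes[parent].first_child = id;
    }
    return id;
}

/* Follow the backlinks up to the root and count the steps. */
static uint32_t tree_depth(const Tree *t, uint32_t id) {
    uint32_t depth = 0;
    while (t->nodes[id].parent != NO_NODE) {
        id = t->nodes[id].parent;
        depth++;
    }
    return depth;
}
```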

> "unsafe" (which is strictly more dangerous than even C, never mind Zig)

Unsafe Rust is much safer than C.

The only way I can imagine unsafe Rust being more dangerous than C is that you need to keep exception safety in mind in Rust, but not in C.

whytevuhuni•8mo ago

Not quite, you also need to keep pointer non-nullness, alignment and aliasing safety in Rust, which is very pervasive in Rust (all shared/mutable references) but very rare in C (the 'restrict' keyword).
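For reference, the C keyword in question is spelled `restrict` (since C99), and it is opt-in per pointer. A minimal sketch, with `add_into` as a made-up example function: the caller promises that the two arrays do not overlap, and calling it with overlapping arrays would be undefined behavior.

```c
#include <stddef.h>

/* The caller promises dst and src do not overlap for the first n elements. */
static void add_into(int *restrict dst, const int *restrict src, size_t n) {
    for (size_t i = 0; i < n; i++)
        dst[i] += src[i];
}
```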

In Rust, it's not just using an invalid reference that causes UB, but their very creation, even if temporary. For example, since references have to always be aligned, the compiler can assume the pointer they were created from was also aligned, and so suddenly some ending bits from the pointer are ignored (since they must've been zero).

And usually the point of unsafe is to make safe wrappers, so unsafe Rust makes or interacts with safe shared/mutable references pretty often.

creata•8mo ago

It's just hard for me to imagine someone accidentally messing up nonnullness or aliasing, because it's really in-your-face that you need to be careful when constructing a reference unsafely. There are even idiomatic methods like ptr::as_ref to avoid accidentally creating null references.

jplusequalt•8mo ago

>which is strictly more dangerous than even C, never mind Zig

No it's not? The Rust borrow checker, the backbone of Rust's memory safety model, doesn't stop working when you drop into an unsafe block. From the Rust Book:

>To switch to unsafe Rust, use the unsafe keyword and then start a new block that holds the unsafe code. You can take five actions in unsafe Rust that you can’t in safe Rust, which we call unsafe superpowers. Those superpowers include the ability to:

    Dereference a raw pointer
    Call an unsafe function or method
    Access or modify a mutable static variable
    Implement an unsafe trait
    Access fields of a union

It’s important to understand that unsafe doesn’t turn off the borrow checker or disable any of Rust’s other safety checks: if you use a reference in unsafe code, it will still be checked. The unsafe keyword only gives you access to these five features that are then not checked by the compiler for memory safety. You’ll still get some degree of safety inside of an unsafe block.

int_19h•8mo ago

The reason why it's more unsafe than C is because Rust makes a lot more assumptions about e.g. lack of aliasing that C does not, which are incredibly easy to violate once you have raw pointers.

Obviously if you can keep using references then it's not less safe, but if what you're doing can be done with references, why would you even be using `unsafe`?

jplusequalt•8mo ago

>using indices instead of pointers which are counter to high performance

Laughs in graphics programmer. You end up using indices to track data in buffers all the time when working with graphics APIs.

int_19h•8mo ago

I'm not disputing that there are circumstances in which indices are as good or even better.

At the same time, if using indices was universally better, then we'd just use indices everywhere, and low-level PLs like Rust would be designed around that from the get go. We don't do that for good reasons.

cornstalks•8mo ago

(This is a reply to multiple sibling comments, not the parent)

For those saying unsafe Rust is strictly safer than C, you're overlooking Rust's extremely strict invariants that users must uphold. These are much stricter than C, and they're extremely easy to accidentally break in unsafe Rust. Breaking them in unsafe Rust is instant UB, even before leaving the unsafe context.

This article has a decent summary in this particular section: https://zackoverflow.dev/writing/unsafe-rust-vs-zig/#unsafe-...

creata•8mo ago

The author seems to mostly be talking about the aliasing rules, but if you don't want to deal with those, can't you use UnsafeCell?

Imo, the more annoying part is dealing with exception safety. You need to ensure that your data structures are all in a valid state if any of your code (especially code in an unsafe block) panics, and it's easy to forget to ensure that.

Ygg2•8mo ago

For those thinking unsafe Rust is harder than C. C standard defined just 216 unsafe rules, that you need to keep in mind at all times.

sapiogram•8mo ago

Haters gonna hate. If you're working on a project that needs performance and correctness, nothing can get the job done like Rust.

LAC-Tech•8mo ago

unless you have to do anything that relies on a C API (such as provided by an OS) with no concept of ownership, then it's a massive task to get that working well with idiomatic rust. You need big glue layers to really make it work.

Rust is a general purpose language that can do systems programming. Zig is a systems programming language.

(Safety Coomers please don't downvote)

saghm•8mo ago

What does it even mean to be able to "do systems programming" but not actually be a "systems programming language"? I would directly disagree with you, but what you're arguing is so vague that I don't even know what you're trying to claim. The only way I can make sense of this is if you literally define a "systems programming language" as C and only other things that are tightly tied to it, which I guess is fine if you like tautologies but kind of makes even having a concept of "systems programming language" pretty useless.

bbkane•8mo ago

And yet Rust is the one in the Linux and Windows kernels, so people must think it's worth the effort. https://threadreaderapp.com/thread/1577667445719912450.html is certainly a glowing recommendation

LAC-Tech•8mo ago

Kernels are big pieces of software. Rust is used for device drivers mainly, right? So in that case you write an idiomatic rust lib and wrap it in a C interface and load it in.

Actually interfacing with idiomatic C APIs provided by an OS is something else entirely. You can see this when you compare the Rust ecosystem to Zig; i.e. Zig has a fantastic io-uring library in the std lib, whereas Rust has a few scattered crates, none of which come close to Zig's ease of use and integration.

One thing I'd like to see is an OS built with rust that could provide its own rusty interface to kernel stuff.

ArtixFox•8mo ago

Hello, can you point me to more information about Zig's and Rust's io-uring implementations?

LAC-Tech•8mo ago

Hey Artix!

Zig's is in the standard library. From the commits it was started by Joran from Tigerbeetle, and now maintained by mlugg who is a very talented zig programmer.

https://ziglang.org/documentation/master/std/#std.os.linux.I...

The popular Rust one is tokio's io-uring crate which 1) relies on libc; the zig one just uses their own stdlib which wraps syscalls 2) Requires all sorts of glue between safe and unsafe rust.

github.com/tokio-rs/io-uring

ArtixFox•8mo ago

thank you!

WD-42•8mo ago

The OS is called Redox.

LAC-Tech•8mo ago

It actually provides rust APIs to dev systems software against that run on it?

I know it's written in rust, but I am talking more specifically than that.

dgb23•8mo ago

Both are great languages. To me there's a philosophical difference, which can lead one to prefer one over the other:

Rust makes doing the wrong thing hard, Zig makes doing the right thing easy.

raincole•8mo ago

I wonder how zig works on consoles. Usually consoles hate anything that's not C/C++. But since zig can be transpiled to C, perhaps it's not completely ruled out?

jeroenhd•8mo ago

Consoles will run anything you compile for them. There are stable compilers for most languages for just about any console I know of, because modern consoles are pretty much either amd64 or aarch64 like phones and computers are.

Language limitations are more on the SDK side of things. SDKs are available under NDAs and even publicly available APIs are often proprietary. "Real" test hardware (as in developer kits) is expensive and subject to NDAs too.

If you don't pick the language the native SDK comes with (which is often C(++)), you'll have to write the language wrappers yourself, because practically no free, open, upstream project can maintain those bindings for you. Alternatively, you can pay a company that specializes in the process, like the developers behind Godot will tell you to do: https://docs.godotengine.org/en/stable/tutorials/platform/co...

I think Zig's easy C interop will make integration for Zig into gamedev quite attractive, but as the compiler still has bugs and the language itself is ever changing, I don't think any big companies will start developing games in Zig until the language stabilizes. Maybe some indie devs will use it, but it's still a risk to take.

haberman•8mo ago

> I feel like I can write powerful code in any language, but the goal is to write code for a framework that is most future proof, so that you can maintain modular stuff for decades.

I like Zig a lot, but long-term maintainability and modularity is one of its weakest points IMHO.

Zig is hostile to encapsulation. You cannot make struct members private: https://github.com/ziglang/zig/issues/9909#issuecomment-9426...

Key quote:

> The idea of private fields and getter/setter methods was popularized by Java, but it is an anti-pattern. Fields are there; they exist. They are the data that underpins any abstraction. My recommendation is to name fields carefully and leave them as part of the public API, carefully documenting what they do.

You cannot reasonably form API contracts (which are the foundation of software modularity) unless you can hide the internal representation. You need to be able to change the internal representation without breaking users.

Zig's position is that there should be no such thing as internal representation; you should publicly expose, document, and guarantee the behavior of your representation to all users.

I hope Zig reverses this decision someday and supports private fields.

eddd-ddde•8mo ago

Just prefix internal fields with underscore and be a big boy and don't access them from the outside.

If you really need to you can always use opaque pointers for the REALLY critical public APIs.

haberman•8mo ago

I am not the only user of my API, and I cannot control what users do.

My experience is that users who are trying to get work done will bypass every speed bump you put in the way and just access your internals directly.

If you "just" rely on them not to do that, then your internals will effectively be frozen forever.

nicoburns•8mo ago

> If you "just" rely on them not to do that, then your internals will effectively be frozen forever.

Or they will be broken when you change them and they upgrade. The JavaScript ecosystem uses this convention and generally if a field is prefixed by an underscore and/or documented as being non-public then you can expect it to break in future versions (and this happens frequently in practice).

Not necessarily saying that's better, but it is another choice that's available.

lll-o-lll•8mo ago

Or you change it and respond with “You were warned”.

I seriously do not get this take. People use reflection and all kinds of hacks to get at internals, this should not stop you from changing said internals.

There will always be users who do the wrong thing.

jjmarr•8mo ago

Let's say I'm in a large company. Someone on some other team decided to rely on my implementation internals for a key revenue driver, and snuck it through code review.

I can't break their app without them complaining to my boss's boss's boss who will take their side because their app creates money for the company.

Having actual private fields doesn't 100% prevent this scenario, but it makes it less likely to sneak through code review before it becomes business-critical.

lll-o-lll•8mo ago

You can still create modules in zig, just use the standard handle pattern as you might in c/c++. I think that many of us have worked in “large company”, and the issue you describe is not resolved with the “private” keyword. You need to make your “component/module” with a well defined boundary (normally dll/library), a “public interface” and the internals not visible as symbols.

That doesn’t save you in languages that support reflection, but it will with zig. Inside a module, all private does is declare intent.

In languages with code inheritance, I think inheritance across module boundaries is now widely viewed as the anti-pattern that it is.

9d•8mo ago

[flagged]

lll-o-lll•8mo ago

Not everyone has to follow the MS approach of not breaking clients that rely on “undocumented” behavior. Document what will not be broken in future, change the rest and ignore the wailing.

It’s antithetical to what Zig is all about to hide the implementation. The whole idea is you can read the entire program without having to jump through abstractions 10 layers deep.

dustbunny•8mo ago

I don't care about public/private.

9d•8mo ago

Andrew has so many wrong takes. Unused variables is another.

Such a smart guy though, so I'm hesitant to say he's wrong. And maybe in the embedded space he's not, and if that's all Zig is for then fine. But internal code is a necessity of abstraction. I'm not saying it has to be C++ levels of abstraction. But there is a line between interface and implementation that ought to be kept. C headers are nearly perfect for this, letting you hide and rename and recast stuff differently than your .c file has, allowing you to change how stuff works internally.

Imagine if the Lua team wasn't free to make it significantly faster in recent 5.4 releases because they were tied to every internal field. We all benefited from their freedom to change how stuff works inside. Sorry Andrew but you're wrong here. Or at least you were 4 years ago. Hopefully you've changed your mind since.

haberman•8mo ago

I agree with almost all of this, including the point about c header files, except that code has to be in headers to be inlined (unless you use LTO), which in practice forces code into headers even if you’d prefer to keep it private.

keldaris•8mo ago

There's nothing wrong with using LTO, but I prefer simply compiling everything as a single translation unit ("unity builds"), which gets you all of the LTO benefits for free (in the sense that you still get fast compile times too).

philwelch•8mo ago

> I'm not saying it has to be C++ levels of abstraction. But there is a line between interface and implementation that ought to be kept. C headers are nearly perfect for this, letting you hide and rename and recast stuff differently than your .c file has, allowing you to change how stuff works internally.

Can’t you do this in Zig with modules? I thought that’s what the ‘pub’ keyword was for.

You can’t have private fields in a struct that’s publicly available but the same is sort of true in C too. OO style encapsulation isn’t the only way to skin a cat, or to send the cat a message to skin itself as the case may be.

9d•8mo ago

I don't know Zig so I dunno maybe

girvo•8mo ago

> But internal code is a necessity of abstraction

I just fundamentally disagree with this. Not having "proper" private methods/members has not once become a problem for me, but overuse of them absolutely has.

unclad5968•8mo ago

I disagree with plenty of Andrew's takes as well but I'm with him on private fields. I've never once in 10 years had an issue with a public field that should have been private, however I have had to hack/reimplement entire data structures because some library author thought that no user should touch some private field.

> You cannot reasonably form API contracts (which are the foundation of software modularity) unless you can hide the internal representation. You need to be able to change the internal representation without breaking users.

You never need to hide internal representations to form an "API contract". That doesn't even make sense. If you need to be able to change the internal representation without breaking user code, you're looking for opaque pointers, which have been the solution to this problem since at least C89, I assume earlier.
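For anyone who hasn't seen the opaque-pointer pattern, a minimal sketch follows. The `Counter` type and its functions are made up for illustration, and what would normally be split across a header and a source file is shown in one block with comments marking the boundary.

```c
#include <stdlib.h>

/* ---- what would live in counter.h: the public interface ---- */
typedef struct Counter Counter;   /* incomplete type; callers cannot see fields */
Counter *counter_create(void);
void     counter_destroy(Counter *c);
void     counter_increment(Counter *c);
int      counter_value(const Counter *c);

/* ---- what would live in counter.c: free to change between releases ---- */
struct Counter {
    int value;
};

Counter *counter_create(void) {
    return calloc(1, sizeof(Counter));   /* NULL on allocation failure */
}

void counter_destroy(Counter *c) { free(c); }

void counter_increment(Counter *c) { c->value++; }

int counter_value(const Counter *c) { return c->value; }
```

Code that only includes the header part can hold a `Counter *` but cannot name `value` or take `sizeof(Counter)`, so the layout can change without recompiling callers.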

If you change your data structures or the procedures that operate on them, you're almost certain to break someone's code somewhere, regardless of whether or not you hide the implementation.

dgb23•8mo ago

> I've never once in 10 years had an issue with a public field that should have been private, however I have had to hack/reimplement entire data structures because some library author thought that no user should touch some private field.

Very similar experience here. Also just recently I really _had_ to use and extend the "internal" part of a legacy library. So potentially days or more than a week of work turned into a couple of hours.

haberman•8mo ago

Most data structures have invariants that must hold for the data structure to behave correctly. If users can directly read and write members, there's no way for the public APIs to guarantee that they will uphold their documented API behaviors.

Take something as simple as a vector (eg. std::vector in C++). If a user directly sets the size or capacity, the calls to methods like push_back() will behave incorrectly, or may even crash.
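The same point in a small C sketch, since the fields there are always public. `IntVec`, `intvec_push` and `intvec_invariant_ok` are hypothetical names. The push routine relies on `len <= cap`; if a caller writes to `len` directly and breaks that, the next push skips the growth check and writes past the allocation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct {
    int   *data;
    size_t len;   /* number of elements in use */
    size_t cap;   /* number of elements allocated */
} IntVec;

/* The invariant every operation assumes on entry. */
static bool intvec_invariant_ok(const IntVec *v) {
    return v->len <= v->cap && (v->cap == 0 || v->data != NULL);
}

/* Append x, doubling the allocation when full. Returns false on OOM. */
static bool intvec_push(IntVec *v, int x) {
    if (v->len == v->cap) {
        size_t new_cap = v->cap ? v->cap * 2 : 4;
        int *p = realloc(v->data, new_cap * sizeof *v->data);
        if (p == NULL)
            return false;
        v->data = p;
        v->cap = new_cap;
    }
    v->data[v->len++] = x;   /* out of bounds if len > cap on entry */
    return true;
}
```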

Opaque pointers are one way of hiding representation, but they also eliminate the possibility of inlining, unless LTO is in use. If you have members that need to be accessible in inline functions, it's impossible to use opaque pointers.

There is certainly a risk of "implicit interfaces" (Hyrum's Law), where users break even when you're changing the internals, but we can lessen the risk by encapsulating data structures as much as possible. There are other strategies for lessening this risk, like randomizing unspecified behaviors, so that people cannot take dependencies on behaviors that are not guaranteed.

xboxnolifes•8mo ago

> Most data structures have invariants that must hold for the data structure to behave correctly. If users can directly read and write members, there's no way for the public APIs to guarantee that they will uphold their documented API behaviors.

You can, just not in the "strictly technical" sense. You add a "warranty void if these fields are touched" documentation string.

Ygg2•8mo ago

That's honestly horrible. It's like finding your job is guaranteed by a pinkie promise, or the equivalent.

xboxnolifes•8mo ago

Most of the world runs on a handshake.

Ygg2•8mo ago

That's not a valid argument. For most of human existence there was cannibalism and/or human sacrifices. This doesn't mean we should go back to it.

jcelerier•8mo ago

isn't that the norm in many places on earth?

pjmlp•8mo ago

I prefer liability when devs misuse software with consequences for society infrastructure.

xboxnolifes•8mo ago

A language adding private fields does not add liability.

pjmlp•8mo ago

Indeed, misusing the library and causing software faults does, so every stone in the way preventing misuse helps.

the__alchemist•8mo ago

Like unclad, I disagree that not having private fields is a problem. I think this comes down to programming style. For an OOP style (Just one example), I can see how that would be irritating. Here's my anecdote:

I write a lot of rust. By default, fields are private. It's rare to see a field in my code that omits the `pub` prefix. I sometimes start with private because I forget `pub`, but inevitably I need to make it public!

I like in principle that they're there, but in practice, `pub` feels like syntactic clutter, because it's on all my fields! I think this is because I use structs as abstract bags of data, vice patterns with getters/setters.

When using libraries that rely on private fields, I sometimes have to fork them so I can get at the data. If they do provide a way, it makes the (auto-generated) docs less usable than if the fields were public.

I suspect this might come down to the perspective of application/firmware development vice lib development. The few times I do use private fields have been in libs. E.g. if you have a matrix you generate from pub fields and similar.

pjmlp•8mo ago

One of the key principles for modular software is encapsulation, it predates OOP by decades, and at least even C got that correct.

Majora320•8mo ago

This is only a problem if you can't modify the library you're using for whatever reason (usually a bad one). If you have the source of all your dependencies, you can just fork and add methods as needed in the rare cases where you need to do this.

mwkaufma•8mo ago

> You need to be able to change the internal representation without breaking users.

Unless the user only links an opaque pointer, then just changing the sizeof() is breaking, even if the fields in question are hidden. A simple doc comment indicating that "fields starting with _ are not guaranteed to be minor-version-stable" or somesuch is a perfectly "reasonable" API.

Dylan16807•8mo ago

The chance of someone relying on the size at an API level is extremely small. That's far less risky than exposing every field.

nevi-me•8mo ago

I'd imagine semantic versioning to be more subjective with a language that relies on a social contract, because if a user chooses to use those private fields, a minor update or patch could break their code.

It does feel regressive to me. I've seen people easily reach for underscored fields in Python. We can discourage them if the code is reviewed, but then again there's also people who write everything without underscores.

dgb23•8mo ago

Some years ago I started to just not care about setting things to "private" (in any language). And I care _a lot_ about long term maintainability and breakage. I haven't regretted it since.

> You cannot reasonably form API contracts (...) unless you can hide the internal representation.

Yes you can: the intended use can be communicated with comments/docstrings, examples etc.

One thing I learned from the Clojure world, is to have a separate namespace/package or just section of code, that represents an API that is well documented, nice to use and more importantly stable. That's really all that is needed.

(Also, there are cases where you actually need to use a thing in a way that was not intended. That obviously comes with risk, but when you need it, you're _extremely_ glad that you can.)

haberman•8mo ago

I have the opposite experience. Several years ago I didn't worry too much about people using private variables.

Then I noticed people were using them, preventing me from making important changes. So I created a pseudo-"private" facility using macros, where people had to write FOOLIB_PRIVATE(var) to get at the internal var.

Then I noticed (I kid you not) people started writing FOOLIB_PRIVATE(var) in their own code. Completely circumventing my attempt to hide these internal members. And I can't entirely blame them, they were trying to get something done, and they felt it was the fastest way to do it.
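The actual macro isn't shown here, so the following is only a guess at the general shape of such a facility: a name-mangling macro in C, with `FooHandle`, `foo_retain` and `client_peek_refcount` invented for illustration. It shows why the approach is a speed bump rather than a barrier: the field is still an ordinary public member, and any file that can see the macro can spell it too.

```c
/* Hypothetical name-mangling macro: fields stay public, just awkward to name. */
#define FOOLIB_PRIVATE(name) foolib_private_##name##_do_not_use

typedef struct {
    int id;                         /* documented, stable */
    int FOOLIB_PRIVATE(refcount);   /* internal; expands to a mangled name */
} FooHandle;

/* Library code uses the macro to reach its own internals... */
static void foo_retain(FooHandle *h) {
    h->FOOLIB_PRIVATE(refcount)++;
}

/* ...but nothing stops client code from using the same macro. */
static int client_peek_refcount(const FooHandle *h) {
    return h->FOOLIB_PRIVATE(refcount);
}
```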

After this experience, I consider it an absolute requirement to have a real "private" struct member facility in a language.

I respect Andrew and I think he's done a hell of a job with Zig. I also understand the concern with the Java precedent and lots of wordy getters/setters around trivial variables. But I feel like Rust (and even C++) is a great counterexample that private struct variables can be done in a reasonable way. Most of the time there's no need to have getters/setters for every individual struct member.

raincole•8mo ago

> And I can't entirely blame them

You can't blame them, but they can't blame you if you break their code.

tayo42•8mo ago

That's pretty much why I never bother with the underscore prefix convention when using python. If someone wants to use it they'll do it anyway.

geysersam•8mo ago

It's about the contract with the users. I don't think you should worry about breaking someone using the private fields of your classes. Making a field private, for example by prefixing an underscore in Python, tells the users "for future maintainability of the software I allow myself the right to change this field without warning, use at your own peril".

If you hesitate changing it because you worry about users using it anyway you are hurting the fraction of your users who are not using it.

haberman•8mo ago

This is company code in a monorepo. If a change breaks users, it will simply be rolled back.

Everyone is brainstorming ways to work around Zig's lack of "private". But nobody has a good answer for why Zig can't just add "private" to the language. If we agree that users shouldn't touch the private variables, why not just have the language enforce it?

SpaghettiCthulu•8mo ago

Because sometimes the user really wants to access those fields, and if the language enforces them being private, the user will either copy-paste your code into their project, or fork your project and make the fields public there. And now they have a lot of extra work to stay up-to-date when compared to just making the necessary changes if those fields ever change had they been public.

haberman•8mo ago

I would be satisfied if the language supported this use case by offering a “void my warranty” annotation that let a given source file access the privates of a given import.

Companies with monorepos could easily just ban the annotation. OSS projects could easily close any user complaint if the repro requires the annotation.

This seems like a great compromise to me. It would let you unambiguously mark which parts of the api are private, in a machine checkable way, which is undoubtedly better than putting it into comments. But it would offer an escape hatch for people who don’t mind voiding their warranty.

pjmlp•8mo ago

That is the beauty of binary libraries, they enforce encapsulation.

the8472•8mo ago

> or fork your project

If they want to ignore the API contract then that's the right response. The maintainer chose one thing to preserve their ability to provide non-breaking updates. The user doesn't care about that, now it's on them to maintain that code which they're sinking their probes into.

geysersam•8mo ago

> If we agree that users shouldn't touch the private variables, why not just have the language enforce it?

Thing is, I don't have an opinion about what users should do. That's entirely up to them and the trade offs they make in their contexts. There are scenarios where you might want to access a private field.

But it's also a question about simplicity, adding private to the language makes it bigger without imo contributing anything of practical value that can't be achieved with convention.

magicalhippo•8mo ago

I started using Boost's approach, that is keep those things public but in their own clearly-named internal namespace (be it an actual namespace or otherwise).

This way users can get to them if they really need to, say for a crucial bug fix, but they're also clearly an implementation detail so you're free to change it without users getting surprised when things break etc.
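A minimal sketch of that layout in C++, assuming a made-up library called `foolib` with a `Ring` type (both names are hypothetical, chosen only for illustration). The supported API lives in the outer namespace, while the raw state sits in a `detail` namespace that stays reachable but signals "no stability promise":

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

namespace foolib {  // hypothetical library name

// Implementation details: reachable on purpose, but may change at any time.
namespace detail {
struct RingState {
    std::vector<int> slots;
    std::size_t head = 0;
    std::size_t count = 0;
};
}  // namespace detail

// Supported API.
class Ring {
public:
    explicit Ring(std::size_t capacity) { state.slots.resize(capacity); }

    // Returns false when the ring is full (including capacity 0).
    bool push(int value) {
        if (state.count == state.slots.size()) return false;
        state.slots[(state.head + state.count) % state.slots.size()] = value;
        ++state.count;
        return true;
    }

    std::size_t size() const { return state.count; }

    // Deliberately public escape hatch for callers who accept the risk.
    detail::RingState state;
};

}  // namespace foolib

// Usage: the normal path goes through push/size; the escape hatch is explicit.
inline void ring_demo() {
    foolib::Ring ring(2);
    bool first = ring.push(1);
    bool second = ring.push(2);
    bool third = ring.push(3);  // ring is full here
    assert(first && second && !third);

    ring.state.count = 0;  // unsupported: reaches into detail state directly
    assert(ring.size() == 0);
}
```

Anyone who writes `ring.state` can see from the type name `foolib::detail::RingState` that they have left the supported surface, which is the whole point of the convention.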

pjmlp•8mo ago

C++ precedent though, getters and setters were widely adopted in C++ frameworks before Java was even an idea.

josephg•8mo ago

> Then I noticed (I kid you not) people started writing FOOLIB_PRIVATE(var) in their own code.

If it’s in an internal monorepo, this should be super easy to fix using grep.

Honestly it sounds like a great opportunity to improve your API. If people are going out of their way to access something that you consider private, it’s probably because your public APIs aren’t covering some use case that people care about. That or you need better documentation. Sometimes even a comment helps:

    int _foo; // private. See getFoo() to read.

I get that it’s annoying, but finding and fixing internal code like this should be a 15 minute job.

jcelerier•8mo ago

> After this experience, I consider it an absolute requirement to have a real "private" struct member facility in a language.

I think that's the wrong take to have. Life is much easier when you accept the reality of a world where people will do whatever they want with what you give them.

C++ has private, and so what? I've seen #define private public or even -Dprivate=public, I've seen classes with private implementation detail reimplemented with another name and all fields public & then casted, I've seen accessing types as char arrays and binary operations to circumvent this, I've seen accessing the process raw memory pages. If someone other than you can call the code, it's not yours anymore to decide what can be done with it.

What you don't owe anyone is the guarantee of things working if people stray from the happy path you outline - they want help after going astray, give them your hourly rate on fixing their mistakes.

ants_everywhere•8mo ago

You're getting a lot of responses with very strong opinions from people who talk as if they've never had to care about customers relying on their APIs.

josephg•8mo ago

It’s a trust thing.

If you can trust that downstream users of your api won’t misuse private-by-convention fields (or won’t punish you for doing so), it’s not a problem. That works a lot of the time: You can trust yourself. You can usually trust your team. In the opensource world, you can just break compatibility with no repercussions.

But yes, sometimes that trust isn’t there. Sometimes you have customers who will misuse your code and blame you for it. But that isn’t the case for all code. Or even most code.

LAC-Tech•8mo ago

The solution to this is to simply put an underscore before the variables you don't think others should rely on, then move on with your life.

jenadine•8mo ago

From my understanding, making stable API is impossible in Zig anyway, since Zig itself is still making breaking changes at the language level

flohofwoe•8mo ago

> Zig is hostile to encapsulation. You cannot make struct members private

In Zig (and plenty of other non-OOP languages) modules are the mechanism for encapsulation, not structs. E.g. don't make the public/private boundary inside a struct, that's a silly thing anyway if you think about it - why would one ever hand out data to a module user which is not public - just to tunnel it back into that same module later?

Instead keep your private data and code inside a module by not declaring it public, or alternatively: don't try to carry over bad ideas from C++/Java, sometimes it's better to unlearn things ;)

the__alchemist•8mo ago

Concur. Or, the in-between: Set the structs to be private if you need. I make heavy use of private structs and modules, but rarely private fields.

jandrewrogers•8mo ago

I think the bigger issue with "public" and "private" is that it is insufficiently granular, being essentially all or nothing. The use of those APIs in various parts of the code base is not self-documenting. Hyrum's Law is undefeated.

C++ has the PassKey idiom that allows you to whitelist what objects are allowed to access each part of the public API at compile-time. This is a significant improvement but a pain to manage for complex whitelists because the language wasn't designed with this in mind. C++26 has added language features specifically to make this idiom scale more naturally.
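For readers who have not seen the idiom, here is a minimal sketch. The names `Task`, `Scheduler`, and `SchedulerKey` are invented for illustration. The key type has a private constructor and exactly one friend, so a method that takes the key by value can only be called by that friend:

```cpp
#include <cassert>

class Scheduler;  // the only type allowed to call Task::run

// The passkey. Its constructor is private and user-provided (not "= default"),
// so code outside Scheduler cannot create one, not even with SchedulerKey{}.
class SchedulerKey {
    friend class Scheduler;
    SchedulerKey() {}
};

class Task {
public:
    // Public by access specifier, but uncallable without a SchedulerKey.
    void run(SchedulerKey) { ++runs_; }
    int runs() const { return runs_; }

private:
    int runs_ = 0;
};

class Scheduler {
public:
    void dispatch(Task& task) { task.run(SchedulerKey{}); }
};

inline void passkey_demo() {
    Task task;
    Scheduler scheduler;
    scheduler.dispatch(task);
    scheduler.dispatch(task);
    // task.run(SchedulerKey{});  // would not compile here: constructor is private
    assert(task.runs() == 2);
}
```

The whitelist is the friend list of the key type, which is why it gets awkward once several callers each need access to a different subset of methods: each subset needs its own key.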

I'd love to see more explicit ACLs on APIs as a general programming language feature.

flohofwoe•8mo ago

> I'd love to see more explicit ACLs on APIs as a general programming language feature.

In that I agree, but per-member public/private/protected is a dead end.

I'd like a high level language which explores organizing all application data in a single, globally accessible nested struct and filesystem-like access rights into 'paths' of this global struct (read-only, read-write or opaque) for specific parts of the code.

Probably a bit too radical to ever become mainstream (because there's still this "global state == bad" meme - it doesn't have to be evil with proper access control - and it would radically simplify a lot of programs because you don't need to control access by passing 'secret pointers' around).

cobbal•8mo ago

Why would you hand out data that gets tunneled back in?

There are lots of use cases for this exact pattern. An acceleration structure to speed up searching complex geometry. The internal state of a streaming parser. A lazy cache of an expensive property that has a convenient accessor. An unsafe pointer that the struct provides consistent, threadsafe access patterns for. I've used this pattern for all these things, and there are many more uses for encapsulation. It's not just an OO concern.
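As one concrete instance, here is a sketch of the lazy-cache case in C++. The `Polyline` type and its members are hypothetical. The cached total and its validity flag travel with the object the caller holds, but only the type's own methods may touch them:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class Polyline {
public:
    void add_segment(double length) {
        lengths_.push_back(length);
        cache_valid_ = false;  // every mutation invalidates the cache
    }

    // Recomputes only when something changed since the last call.
    double total_length() const {
        if (!cache_valid_) {
            double sum = 0.0;
            for (double l : lengths_) sum += l;
            cached_total_ = sum;
            cache_valid_ = true;
            ++recomputations_;
        }
        return cached_total_;
    }

    // Exposed only so the demo can observe the caching behaviour.
    std::size_t recomputations() const { return recomputations_; }

private:
    std::vector<double> lengths_;
    mutable double cached_total_ = 0.0;
    mutable bool cache_valid_ = false;
    mutable std::size_t recomputations_ = 0;
};

inline void polyline_demo() {
    Polyline line;
    line.add_segment(1.5);
    line.add_segment(2.5);
    assert(line.total_length() == 4.0);
    assert(line.total_length() == 4.0);   // served from the cache
    assert(line.recomputations() == 1);
}
```

If `cache_valid_` were writable from outside, a caller could set it to true after a mutation and get stale totals with no error anywhere.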

pdpi•8mo ago

> The idea of private fields and getter/setter methods was popularized by Java, but it is an anti-pattern.

I agree with this part with no reservations. The idea that getters/setters provide any sort of abstraction or encapsulation at all is sheer nonsense, and is at the root of many of the absurdities you see in Java.

The issue, of course, is that Zig throws out the baby with the bath water. If I want, say, my linked list to have an O(1) length operation, I need to maintain a length field, but the invariant that list.length actually lines up with the length of the list is something that all of the other operations need to maintain. Having that field be writable from the outside is just begging for mistakes. All it takes is list.length = 0 instead of list.length == 0 to screw things up badly.
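To make the invariant concrete, here is a sketch in C++ of a list whose length is readable in O(1) but not writable from outside. The `IntList` type is hypothetical, and the `assert` calls are the debug-only cross-check that a language without private fields would have to rely on instead:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>

class IntList {
public:
    void push_front(int value) {
        auto node = std::make_unique<Node>();
        node->value = value;
        node->next = std::move(head_);
        head_ = std::move(node);
        ++length_;
        assert(length_ == count_nodes());  // debug-only invariant check, O(n)
    }

    // Returns false if the list was already empty.
    bool pop_front() {
        if (!head_) return false;
        head_ = std::move(head_->next);
        --length_;
        assert(length_ == count_nodes());
        return true;
    }

    // O(1) and read-only: "list.length() = 0" does not compile.
    std::size_t length() const { return length_; }

private:
    struct Node {
        int value = 0;
        std::unique_ptr<Node> next;
    };

    // Walks the list; used only to verify the cached length.
    std::size_t count_nodes() const {
        std::size_t n = 0;
        for (const Node* p = head_.get(); p != nullptr; p = p->next.get()) ++n;
        return n;
    }

    std::unique_ptr<Node> head_;
    std::size_t length_ = 0;
};
```

With the field private, the `=` versus `==` typo from the example is a compile error rather than silent corruption.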

ArtixFox•8mo ago

You can have a debug time check.

sramsay64•8mo ago

I think I mostly agree, but I do have one war story of using a C++ library (Apache Avro) that parsed data and exposed a "get next std::string" method. When parsing a file, all the data was set to the last string in the file. I could see each string being returned correctly in a debugger, but once the next call to that method was made, all previous local variables were now set to the new string. Never looked too far into it but it seemed pretty clear that there was a bug in that library that was messing with the internals of std::string (which, if I understand correctly, is just a pointer to data). It was likely re-using the same data buffer to store the data for different std::string objects, which shouldn't be possible (under the std::string "API contract"). It was a pain to debug because of how "private" std::string's internals are.

In other words, we can at best form API contracts in C++ that work 99% of the time.

jandrewrogers•8mo ago

FWIW, the std::string buffer is directly accessible for (re-)writing via the public API. You don't need to use any private access to do this.
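A small sketch of what that looks like, using only standard calls. `&s[0]` yields a mutable pointer to the string's own storage in every standard since C++11, and `s.data()` returns a non-const pointer since C++17. The function names are made up for the example:

```cpp
#include <cassert>
#include <string>

// Overwrites the first character through a raw pointer into the buffer.
inline std::string overwrite_first(std::string s, char replacement) {
    if (s.empty()) return s;
    char* buf = &s[0];     // public API, no private access involved
    buf[0] = replacement;  // stays within [0, size()), so this is well defined
    return s;
}

// Copies own their storage: writing to one does not affect the other.
inline bool copies_are_independent() {
    std::string a = "hello";
    std::string b = a;
    b[0] = 'j';
    return a == "hello" && b == "jello";
}

inline void string_buffer_demo() {
    assert(overwrite_first("hello", 'j') == "jello");
    assert(copies_are_independent());
}
```

This only shows that the buffer is writable through the public interface; it says nothing about what the library in the story above was actually doing.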

pif•8mo ago

You are right. Don't listen to the idiots!

voidfunc•8mo ago

How is this any different than Python or Ruby? You can access internals easily and people don't have a problem writing maintainable modular software in those languages.

Not to mention just about every language offers runtime reflection that lets you do bad stuff.

IMO, the Python adage of "We are all consenting adults here" applies.

Galanwe•8mo ago

> You cannot reasonably form API contracts (which are the foundation of software modularity) unless you can hide the internal representation

Python is a good counter example IMHO, the simple convention of having private fields prefixed with _/__ is enough of a deterrent, you don't need language support.

gf000•8mo ago

I believe private fields are a feature that actually increases the expressivity of a language, as per the formal definition. This one can't be replaced by some trivial, local syntactic sugar.

Of course increasing expressivity is not the end goal in itself for a PL, but I do agree with you that this design choice (and some others, like no unused variables - that one drives me up a wall) makes me less excited about the language than I would otherwise be.

wg0•8mo ago

Zig seems to be simpler Rust and better Go.

Off topic - One tool built on top of Zig that I really really admire is bun.

I cannot tell how much simpler my life is after using bun.

Similar things can be said for uv which is built in Rust.

FlyingSnake•8mo ago

Zig is nothing like Go. Go uses GC and a runtime while Zig has neither. While Zig’s functions aren’t coloured, it lacks CSP-style primitives like goroutines and channels.

9d•8mo ago

Zig is like a highly opinionated modern C

Rust is like a highly opinionated modern C++

Go is like a highly opinionated pre-modern C with GC

cgh•8mo ago

In a previous comment, you remarked you don’t even know Zig.

9d•8mo ago

I don't.

LexiMax•8mo ago

I do. I find his osmosis-based summation accurate.

9d•8mo ago

Yay!

gf000•8mo ago

Go should be as much in this discussion as JavaScript.

9d•8mo ago

> C/C++ has been the default

You're not really going to make something better than C. If you try, it will most likely become C++ anyway. But do try anyway. Rust and Zig are evidence that we still dream that we can do better than C and C++.

Anyway I'm gonna go learn C++.

flohofwoe•8mo ago

C++ has been piling more new problems on top of C than it inherited from C in the first place (and C++ is now caught in a cycle of trying to fix problems it introduced a couple of versions ago).

Creating a better C successor than C++ is really not a high bar.

timewizard•8mo ago

That for loop syntax is horrendous.

So I have two lists, side by side, and the position of items in one list matches positions of items in the other? That just makes my eyes hurt.

I think modern languages took a wrong turn by adding all this "magic" in the parser and all these little sigils dotted all around the code. This is not something I would want to look at for hours at a time.

int_19h•8mo ago

Such arrays are an extremely common pattern in low-level code regardless of language, and so is iterating them in parallel, so it's natural for Zig to provide a convenient syntax to do exactly that in a way that makes it clear what's going on (which IMO it does very well). Why does it make your eyes hurt?

timewizard•8mo ago

It looks to me like:

    for (one, two, three) |uno, dos, tres| { ... }

My eyes have to bounce back and forth between the two lists. When the identifiers are longer than this example it increases eye strain. Maybe it's better when you wrote it and understand it, but trying to grok someone else's code, it feels like an obstacle to me.

csjh•8mo ago

> High level languages lack something that low level languages have in great adundance - intent.

Is this line really true? I feel like expressing intent isn't really a factor in the high level / low level spectrum. If anything, more ways of expressing intent in more detail should contribute towards them being higher level.

wk_end•8mo ago

I agree with you and would go further: the fundamental difference between high-level and low-level languages is that in high-level languages you express intent whereas in low-level languages you are stuck resorting to expressing underlying mechanisms.

jeroenhd•8mo ago

I think this isn't referring to intent as in "calculate the tax rate for this purchase" but rather "shift this byte three positions to the left". Less about what you're trying to accomplish, and more about what you're trying to make the machine do.

Something like purchase.calculate_tax().await.map_err(|e| TaxCalculationError { source: e })?; is full of intent, but you have no idea what kind of machine code you're going to end up with.

csjh•8mo ago

Maybe, but from the author's description, it seems like the interpretation of intent that they want is to generally give the most information possible to the compiler, so it can do its thing. I don't see why the right high level language couldn't give the compiler plenty of leeway to optimize.

raincole•8mo ago

In other words, high-level languages express high-level intents, while low-level languages express low-level intents.

In yet other words, tautology.

9d•8mo ago

> People will still mistakenly say "C is faster than Python", when the language isn't what they are benchmarking.

Yeah but some language features are disproportionately more difficult to optimize. It can be done, but with the right language, the right concept is expressed very quickly and elegantly, both by the programmer and the compiler.

WalterBright•8mo ago

> Rust's memory model allows the compiler to always assume that function arguments never alias. You must manually specify this in Zig.

I've avoided such manual specification of aliasing because:

1. few people understand it

2. using it erroneously can result in baffling bugs in your code
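A sketch of why this matters, written in standard C++ with no aliasing annotation at all, so both calls below are well defined. The function name `add_twice` is invented for the example. Because `dst` and `src` may point at the same int, the compiler has to reload `*src` after the first store:

```cpp
#include <cassert>

// Adds *src to *dst twice. Nothing here promises that dst and src differ.
inline void add_twice(int* dst, const int* src) {
    *dst += *src;
    *dst += *src;  // *src must be re-read: the store above may have changed it
}

inline void aliasing_demo() {
    int a = 1;
    int b = 1;
    add_twice(&a, &b);  // distinct objects: 1 + 1 + 1
    assert(a == 3);

    int x = 1;
    add_twice(&x, &x);  // same object: 1 -> 2 -> 4
    assert(x == 4);
}
```

A no-alias promise (such as `restrict` in C or `noalias` on a Zig parameter) would let the compiler load `*src` once, but then the second call would break that promise, and the result is the kind of baffling bug point 2 describes.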

WalterBright•8mo ago

> The flexibility of Zig's comptime has resulted in some rather nice improvements in other programming languages.

Compile time function execution and functions with constant arguments were introduced in D in 2007, and resulted in many other languages adopting something similar.

https://dlang.org/spec/function.html#interpretation
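For comparison, here is the same idea sketched with C++ `constexpr`, which is an analogue of the feature rather than D's CTFE itself. An ordinary-looking function is evaluated by the compiler when its arguments are constants, and the same function still works at run time:

```cpp
#include <cassert>

// One definition, usable both at compile time and at run time.
constexpr unsigned long long factorial(unsigned n) {
    return n <= 1 ? 1ULL : n * factorial(n - 1);
}

// Evaluated during compilation; a wrong value would be a build error.
static_assert(factorial(10) == 3628800ULL, "10! should be 3628800");

// The argument is only known at run time here, so this is a normal call.
inline unsigned long long factorial_at_runtime(unsigned n) {
    return factorial(n);
}

inline void constexpr_demo() {
    assert(factorial_at_runtime(5) == 120ULL);
    assert(factorial_at_runtime(0) == 1ULL);
}
```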

kamma4434•8mo ago

I know nothing of Zig, but I worked long enough in lisp to know that the best macros are the ones you don’t write. They are wonderful but they have just as many drawbacks, and don’t compose nicely.
