> With this last improvement Zig has completely defeated function coloring.
I disagree with this. Let's look at the 5 rules from the famous "What color is your function?" article referenced here.
> 1. Every function has a color
Well, you don't have async/sync/red/blue anymore, but you now have IO and non-IO functions.
> 2. The way you call a function depends on its color.
Now, technically this seems to be solved, but you still need to provide IO as a parameter. Non-IO functions don't need/take it.
It looks like a regular function call, but there's no real difference from the colored case.
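Here's a minimal sketch of what that looks like (the function names are made up, and the exact `std.Io` surface is still in flux, so treat the details as assumptions):

```zig
const std = @import("std");

// "Red": touches the outside world, so it needs an Io.
// (fetchGreeting is hypothetical; how `io` is actually used depends
// on the final std.Io API.)
fn fetchGreeting(io: std.Io, allocator: std.mem.Allocator) ![]u8 {
    _ = io; // would be used here for the network/file call
    return allocator.dupe(u8, "hello");
}

// "Blue": pure computation, no Io parameter needed.
fn checksum(data: []const u8) u32 {
    var sum: u32 = 0;
    for (data) |byte| sum +%= byte;
    return sum;
}
```

The call sites look identical, but only one of these can be written without an `io` in scope.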
> 3. You can only call a red function from within another red function
This still applies. You can only call IO functions from within other IO functions.
Technically you could pass in a new executor, but is that really what you want? Not to mention that you can also do this in languages that don't claim to solve the coloring problem.
> 4. Red functions are more painful to call
I think the spirit still applies here.
> 5. Some core library functions are red
This one is really about some things being only possible to implement in the language and/or stdlib. I don't think this applies to Zig, but it doesn't apply to Rust either for instance.
Now, I think these rules need some tweaking, but the general problem behind function coloring is that of context. Your function needs some context (an async executor, auth information, an allocator, ...). In order to call such a function you also need to provide the context. Zig hasn't really solved this.
That being said, I don't think Zig's implementation here is bad. If anything, it does a great job at abstracting the usage from the implementation. This is something Rust fails at spectacularly.
However, the coloring problem hasn't really been defeated.
Given an `io` you can, technically, build another one from it with the same interface.
For example, given an async IO runtime, you could create an `io` object that is blocking (awaits every command eagerly). That's not too special - you can call sync functions from async functions. (But in JavaScript you'd have trouble synchronously calling a function that relies on `await` inside, so that's still something.)
Another interesting point: given a blocking POSIX I/O implementation that also allows creating processes or threads, you could build a truly asynchronous `io` object in userspace on top of that blocking one. It wouldn't be as efficient as one based directly on io_uring, and it would be old school, but it would basically work.
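Both directions amount to implementing the same interface on top of a different mechanism. Here's a toy sketch of the wrapping idea, using a made-up interface rather than the real `std.Io` (whose exact shape isn't pinned down here):

```zig
// A made-up, drastically simplified stand-in for an io-style interface.
const ToyIo = struct {
    ptr: *anyopaque,
    readFn: *const fn (ptr: *anyopaque, buf: []u8) anyerror!usize,

    fn read(self: ToyIo, buf: []u8) anyerror!usize {
        return self.readFn(self.ptr, buf);
    }
};

// Wraps an existing ToyIo and exposes the same interface. An
// "eager/blocking" wrapper would make sure each inner operation has
// fully completed before returning, whatever the inner io does.
const BlockingWrapper = struct {
    inner: ToyIo,

    fn readImpl(ptr: *anyopaque, buf: []u8) anyerror!usize {
        const self: *BlockingWrapper = @ptrCast(@alignCast(ptr));
        return self.inner.read(buf); // finish the operation before returning
    }

    fn io(self: *BlockingWrapper) ToyIo {
        return .{ .ptr = self, .readFn = readImpl };
    }
};
```

The caller only ever sees a `ToyIo`, which is exactly the point below.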
Going either way (changing `io` to sync or async) the caller doesn't actually care. Yes the caller needs a context, but most modern apps rely on some form of dependency injection. Most well-factored apps would probably benefit from a more refined and domain-specific "environment" (or set of platform effects, perhaps to use the Roc terminology), not Zig's posix-flavoured standard library `io` thing.
Yes, Rust achieves this to some extent; you can swap one async runtime for another and your app might still compile and run fine.
Overall I like this a lot - I am wondering if Richard Feldman managed to convince Andrew Kelley that "platforms" are cool and some ideas were borrowed from Roc?
Does Zig actually do anything here? If anything, this seems to be anti-Zig, where everything must be explicit.
* It's quite rare for a function to unexpectedly gain a dependency on "doing IO" in general. In practice, most of your codebase will have access to an `Io`, and only leaf functions doing pure computation will not need them.
* If a function does start needing to do IO, it almost certainly doesn't need to actually take it as a parameter. As in many languages, it's typical in Zig code to have one type which manages a bunch of core state, and which the whole codebase has easy access to (e.g. in the Zig compiler itself, this is the `Compilation` type). Because of this, despite the perception, Zig code doesn't usually pass (for instance) allocators explicitly all the way down the function call graph! Instead, your "general purpose allocator" is available on that "application state" type, so you can fetch it from essentially wherever. IO will work just the same in practice. So, if you discover that a code path you previously thought was pure actually does need to perform IO, then you don't need to apply some nasty viral change; you just grab `my_thing.io`.
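Rough sketch of that pattern (the type and function names are invented for illustration, not taken from the compiler):

```zig
const std = @import("std");

// Hypothetical "application state" type in the style described above.
const App = struct {
    gpa: std.mem.Allocator,
    io: std.Io,
    // ... the rest of the core state ...
};

fn rebuildIndex(app: *App) !void {
    // This used to be pure computation. Now that it needs IO, it just
    // reaches for app.io; no signatures up the call graph change.
    const scratch = try app.gpa.alloc(u8, 4096);
    defer app.gpa.free(scratch);
    _ = app.io; // e.g. read files through app.io here
}
```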
I do agree that in principle, there's still a form of function coloring going on. Arguably, our solution to the problem is just to color every function async-colored (by giving most or all of them access to an `Io`). But it's much like the "coloring" of having to pass `Allocator`s around: it's not a problem in practice, because you'll basically always have easy access to one even if you didn't previously think you'd need it. I think seasoned Zig developers will pretty invariably agree with the statement that explicitly passing `Allocator`s around really does not introduce function coloring annoyances in practice, and I see no reason that `Io` would be particularly different.
If this were true in general, the function coloring problem wouldn't be talked about.
However, the second point is more interesting. I think there's a bit of a Stockholm syndrome thing here with Zig programmers and Allocator. It's likely that Zig programmers won't mind passing around an extra param.
If anything, it would make sense to me to have IO contain an allocator too. Allocation is a kind of IO too. But I guess it's going to be 2 params from now on.
>> So, if you discover that a code path you previously thought was pure actually does need to perform IO, then you don't need to apply some nasty viral change; you just grab `my_thing.io`
From the code sample it looks like printing to stdio will now require an Io param. So won’t you now have to pass that down to wherever you want to do a quick debug printf?
As for the second thing:
You can do that, but... You can also do this in Rust. Yet nobody would say Rust has solved function coloring.
Also, check this part of the article:
> In the less common case when a program instantiates more than one Io implementation, virtual calls done through the Io interface will not be de-virtualized, ...
Doing that is an instant performance hit. Not to mention annoying to do.
add io to a struct and let the struct keep track of its own io.
You can't pass around "async/await" as a value attached to another object. You can do that with the IO param. That is very different.
you can call a function that requires an io parameter from a function that doesn't have one by passing in a global io instance?
As a trivial example, the `fn main` entrypoint in Zig will never take an io parameter... how do you suppose you'd bootstrap the io parameter that you'd eventually need? This is unlike other languages, where main might or might not be async.
How will that work with code mixing different Io implementations? Say a library pulled in uses a global Io instance while the calling code is using another.
I guess this can just be shot down with "don't do that" but it feels like a new kind of pitfall to get into.
it'll probably carry a stigma like using unsafe does.
For Zig users, adopting this same mindset for Io is not really anything new. It's just another parameter that occasionally needs to be passed into an API.
If you’re working with goroutines, you would always pass in a context parameter to handle cancellation. Many library functions also require context, which poisons the rest of your functions.
Technically, you don’t have to use context for a goroutine and could stub every dependency with context.Background, but that’s very discouraged.
I don't really think it is fully the coloring problem, because you can easily call non-context functions from context functions (but not the other way around, so it's a one-way coloring issue); you just need to be aware that the cancellation chain of course stops at that point.
Another approach is special messages over a side channel.
```zig
var io: std.Io = undefined;

pub fn main() !void {
    var impl = ...;
    io = impl.io();
}
```
Just put io in a global variable and you won't have to worry about coloring in your application. Are your functions blue, red or green now? Jokes aside, I agree that there's obviously a non-zero amount of friction to using the `Io` interface, but it's something qualitatively very different from what causes actual real-world friction around the use of async/await.
> but the general problem behind function coloring is that of context
I would disagree. To me, from a practical perspective, the problems seem to be:
1. Code can't be reused because the async keyword statically colors a function as red (e.g. Python's blocking redis client and asyncio-redis). In Zig, any function that wants to do Io, be it blue (non-async) or red (async), still has to take in that parameter, so from that perspective the Io argument is irrelevant.
2. Using async and await opts you automatically into stackless coroutines with no way of preventing that. With this new I/O system, even if you decide to use a library that internally uses async, you can still do blocking I/O if you want.
To me these seem to be the real problems of function coloring.
> use of two different implementations of io
functionally rare situation.
> 1. Code can't be reused because the async keyword statically colors a function
This is fair. And it's also a real pain point with Rust. However, it's funny that the "What color is your function?" article doesn't even really mention this.
> 2. Using async and await opts you automatically into stackless coroutines with no way of preventing that
This however I don't think is true. Async/await is mostly syntax sugar.
In Rust and C# it uses stackless coroutines.
In JS it uses callbacks.
There's nothing preventing you from making await suspend a green thread.
Which means that if I want to use a dependency that uses async await, it's stackless coroutines for me too whether I like it or not.
> However, the coloring problem hasn't really been defeated.
Well, yes, but if the only way to do I/O were through an Io instance, then Io would infect all but pure(ish, non-Io) functions, and calling Io functions would be possible everywhere except in those contexts where you explicitly don't want it to be possible.
So in a way the color problem is lessened.
And on top of that you get something like Haskell's IO monad (ok, no monad, but an IO interface). Not too shabby, though you're right of course.
Next Zig will want monadic interfaces so that functions only have to have one special argument that can then be hidden.
why does it have to be new? just use one executor, set it as const in some file, and use that one at every entrypoint that needs io! now your io doesn't propagate downwards.
It was an interesting read, but I guess I came away confused about why "coloring" functions is a problem. Isn't "coloring" just another form of static typing? By giving the compiler (or interpreter) more meta data about your code, it can help you avoid mistakes. But instead of the usual "first argument is an integer" type meta data, "coloring" provides useful information like: "this function behaves in this special way" or "this function can be called in these kinds of contexts." Seems reasonable?
Like the author seems very perturbed that there can be different "colors" of functions, but a function that merely calculates (without any IO or side-effects) is different than one that does perform IO. A function with only synchronous code behaves very differently than one that runs code inside another thread or in a different tick of the event loop. Why is it bad to have functions annotated with this meta data? The functions behave in a fundamentally different way whether you give them special annotations/syntax or not. Shouldn't different things look different?
He mentions 2015 era Java as being ok, but as someone that’s written a lot of multithreaded Java code, it’s easy to mess up and people spam the “synchronized” keyword/“color” everywhere as a result. I don’t feel the lack of colors in Java makes it particularly intuitive or conceptually simpler.
https://www.open-std.org/JTC1/SC22/WG21/docs/papers/2018/p13...
> While fibers may have looked like an attractive approach to write scalable concurrent code in the 90s, the experience of using fibers, the advances in operating systems, hardware and compiler technology (stackless coroutines), made them no longer a recommended facility.
If they go through with this, Zig will probably top out at "only as fast as Go", instead of being a true performance competitor. I at least hope the old std.fs sticks around for cases where performance matters.
And would Rust be "all-in" if tokio was in std, so you could use its tasks everywhere? That would be a very similar level of "all-in" to Zig's current plan, but with a seemingly better API.
I understand the benefit of not being in std, but really not a fundamental issue, IMO.
And do green threads exist in the language?
The point here is that "async stuff is IO stuff is async stuff". So rather than thinking of having pluggable async runtimes (tokio, etc) Zig is going with pluggable IO runtimes (which is kinda the equivalent of "which subset of libc do you want to use?").
But in both moves the idea is to remove the runtime out of the language and into userspace, while still providing a common pluggable interface so everyone shares some common ground.
Performance matters; we're not planning to forget that. If fibers turn out to have unacceptable performance characteristics, then they won't become a widely used implementation. Nothing discussed in this article precludes stackless coroutines from backing the "general purpose" `Io` implementation if that turns out to be the best approach.
e.g. I'd feel a lot more confident if he had made the coroutine language feature a hard dependency of the writergate refactor.
Andrew, I know you read these threads sometimes, give us a sign so I can go down the mountain with my stone tablets and tell the people whether we'll have coroutines
While Andrew has the final say, as Loris points out, we always work to reach a consensus internally. The article lists this as an implementation that will probably exist, because we agree that it probably will; nobody is promising it, because we also agree that it isn't guaranteed.
Also, bear in mind that even if stackless coroutines don't make it into Zig, you can always use a single-threaded blocking implementation of `Io`, so you need not be negatively affected by any potential downsides to fibers either way.
This new `Io` approach has made it strictly more likely than it previously was that stackless coroutines become a part of Zig's final design.
Can't wait for 0.15 coming out soon.
(And before anyone mentions it, devirtualization is a myth, sorry)
In Zig it's going to be a language feature, thanks to its single unit compilation model.
> In the less common case when a program instantiates more than one Io implementation, virtual calls done through the Io interface will not be de-virtualized, as that would imply doubling the amount of machine code generated, creating massive code bloat.
From the article
in practice how often are people using more than one io in a program?
Just templating everything doesn’t mean it will be faster every time
pretty sure the intent is for systems that only use one io to have a compiler optimization that elides the cost of double indirection... but also, you're doing IO! so usually something else is the bottleneck, one extra indirection is likely to be peanuts.
This is the key point for me. Regardless of whether you’re under an async event loop, you can specify that the order of your io calls does not imply sequencing. Brilliant. Separate what async means from what the io calls do.
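Rough sketch of the shape (paraphrasing the article's example; `saveFileA`/`saveFileB` are placeholder helpers and the exact future/`await` API may still change):

```zig
const std = @import("std");

// Placeholder helpers standing in for real file writes.
fn saveFileA(io: std.Io, data: []const u8) !void {
    _ = io; // would write "a.txt" through io here
    _ = data;
}

fn saveFileB(io: std.Io, data: []const u8) !void {
    _ = io; // would write "b.txt" through io here
    _ = data;
}

fn saveBoth(io: std.Io, data: []const u8) !void {
    // Start both operations; no ordering between them is implied.
    var a = io.async(saveFileA, .{ io, data });
    var b = io.async(saveFileB, .{ io, data });

    // Only here must both be finished. A blocking Io implementation just
    // ran them one after the other; an async one may have overlapped them.
    try a.await(io);
    try b.await(io);
}
```

Whether the two writes actually overlap is decided by whichever `Io` the caller picked, not by `saveBoth` itself.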
I see that blocking I/O is an option:
> The most basic implementation of `Io` is one that maps to blocking I/O operations.
So far, so good, but blocking I/O is not async.
There is a thread pool that uses blocking I/O. Still good so far, but blocking I/O is still not async.
Then there's green threads:
> This implementation uses `io_uring` on Linux and similar APIs on other OSs for performing I/O combined with a thread pool. The key difference is that in this implementation OS threads will juggle multiple async tasks in the form of green threads.
Okay, they went the Go route on this one. Still (sort of) not async, but there is an important limitation:
> This implementation requires having the ability to perform stack swapping on the target platform, meaning that it will not support WASM, for example.
But still no function colors, right?
Unfortunately not:
> This implementation [stackless coroutines] won’t be available immediately like the previous ones because it depends on reintroducing *a special function calling convention* and rewriting function bodies into state machines that don’t require an explicit stack to run.
(Emphasis added.)
And the function colors appear again.
Now, to be fair, since there are multiple implementation options, you can avoid function colors, especially since `Io` is a value. But those options are either:
* Use blocking I/O.
* Use threads with blocking I/O.
* Use green threads, which Rust removed [2] for good reasons [3]. It only works in Go because of the garbage collector.
In short, the real options are:
* Block (not async).
* Use green threads (with their problems).
* Function colors.
It doesn't appear that the function colors problem has been defeated. Also, it appears to me that the Zig team decided to have every concurrency technique in the hope that it would appear innovative.
[1]: https://gavinhoward.com/2022/04/i-believe-zig-has-function-c...
[2]: https://github.com/aturon/rfcs/blob/remove-runtime/active/00...
[3]: https://www.open-std.org/JTC1/SC22/WG21/docs/papers/2018/p13...
Overall, I think it's a win. Especially if there is a stdlib implementation that is a no-overhead, bog-standard, synchronous, blocking io implementation. It follows the "don't pay for things you don't use" attitude of the rest of Zig.