The cause of an error can be upstream or downstream. If a function fails because the network is down, this is a downstream error. The user has not done anything wrong (unless they are also responsible for the network infrastructure). In that case, a retry after a few moments might be the right approach. However, if the user provides bad function arguments, then the user needs to be told that they are the one who has to make corrections. It is not always clear which case applies, though. If a user requests a nonexistent file, there might be different reasons why the file does not exist (yet).
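A minimal sketch of that distinction in Rust, assuming the failure arrives as a `std::io::Error`. The `Remedy` enum and `remedy_for` function are hypothetical names for illustration, and the grouping of error kinds is a judgment call, not a rule:

```rust
use std::io;

/// Who is in a position to fix the failure (illustrative categories).
#[derive(Debug, PartialEq, Eq)]
enum Remedy {
    /// Downstream, probably transient: waiting a moment and retrying may help.
    RetryLater,
    /// The caller passed something wrong and has to correct it.
    FixInput,
    /// Cannot be decided from the error alone.
    Unclear,
}

fn remedy_for(err: &io::Error) -> Remedy {
    match err.kind() {
        io::ErrorKind::TimedOut
        | io::ErrorKind::ConnectionRefused
        | io::ErrorKind::ConnectionReset
        | io::ErrorKind::ConnectionAborted => Remedy::RetryLater,
        io::ErrorKind::InvalidInput => Remedy::FixInput,
        // A missing file may be a typo by the caller, or a file that
        // simply does not exist yet: the error alone cannot tell.
        io::ErrorKind::NotFound => Remedy::Unclear,
        _ => Remedy::Unclear,
    }
}
```

The explicit `NotFound` arm is there only to document the ambiguous case; it behaves the same as the fallback.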
I like errors that are unique and trivially greppable in a codebase. They should be stack efficient and word sized. Maybe via a new calling convention where one register is reserved for the error code and another holds a pointer to the source location string stored in a data segment.
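Something close to this can be approximated in today's Rust without a new calling convention. The sketch below is an assumption-laden illustration: `CodedError`, `E_PARSE_PORT`, and `parse_port` are made-up names, and whether the two fields actually travel in registers is up to the compiler and ABI, not guaranteed by the language.

```rust
use std::panic::Location;

/// An error that is a numeric code plus a pointer to static location data.
#[derive(Debug, Clone, Copy)]
struct CodedError {
    /// Unique, greppable code (hypothetical numbering scheme).
    code: u32,
    /// Points at compiler-generated static data; nothing is allocated.
    at: &'static Location<'static>,
}

impl CodedError {
    /// `#[track_caller]` makes `Location::caller()` report the place where
    /// `new` was called rather than this line.
    #[track_caller]
    fn new(code: u32) -> Self {
        CodedError { code, at: Location::caller() }
    }
}

/// Grepping for `E_PARSE_PORT` finds every site that raises this error.
const E_PARSE_PORT: u32 = 0x0101;

fn parse_port(s: &str) -> Result<u16, CodedError> {
    s.parse::<u16>().map_err(|_| CodedError::new(E_PARSE_PORT))
}
```

On common targets the struct fits in two machine words, which is the size budget the comment asks for.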
The FP fanboy side of me likes the idea of algebraic effects and ADTs but not at the expense of stack efficiency.
Just a few minutes ago, while copying 63 GB worth of pics and videos from my phone to my laptop, KDE forwarded me the error "File <hard to retain name.jpg> could not be opened. Retry, Ignore, Ignore all, Cancel".
This was around file 7000 out of 15000. The file transfer stopped until I made a choice.
As a user, what am I supposed to do with such a popup?
It seems like a very good example of "Error Handling Without Purpose" as the article describes, but at user level.
Except that here, the audience is "a plain user who just dragged a folder to make a copy" and none of the four options (or even the act of stopping the file transfer until an answer is chosen) is actually meaningful for the user.
The "Putting It Together" for this scenario should look like: a non-modal section populates with "file <hard to retain name.jpg> failed due to reason; at the end of the file transfer you'll get a list with all the files that failed, and you'll have an option to retry them, navigate to their source position to double-check, and/or ignore".
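A rough sketch of the "keep going and collect the failures" loop in Rust. `Failure`, `Report`, and `copy_all` are hypothetical names; the per-file copy is passed in as a closure so the sketch stays independent of any real file manager:

```rust
use std::io;
use std::path::{Path, PathBuf};

/// One file that could not be copied, with the reason it failed.
struct Failure {
    path: PathBuf,
    reason: io::Error,
}

/// What the user gets to see once the whole transfer has finished.
struct Report {
    copied: Vec<PathBuf>,
    failed: Vec<Failure>,
}

/// Attempts every file; a failure is recorded and never stops the transfer.
fn copy_all<F>(files: &[PathBuf], mut copy_one: F) -> Report
where
    F: FnMut(&Path) -> io::Result<()>,
{
    let mut report = Report { copied: Vec::new(), failed: Vec::new() };
    for path in files {
        match copy_one(path) {
            Ok(()) => report.copied.push(path.clone()),
            Err(reason) => report.failed.push(Failure { path: path.clone(), reason }),
        }
    }
    report
}
```

In a real tool, `copy_one` could wrap `std::fs::copy`, and `report.failed` would feed the final "retry / show source / ignore" list.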
I.e. you need to write the report to a file. In fact, you should allocate a decently large file up front to make sure you can still write the report and the error message (when the disk is full, for example).
You just can't defend against everything, but an imperfect solution can still be an improvement over the status quo.
We have shared workstations, for example, where this would be a typical use case for non-technical users across multiple user logins: being able to check a few hours later that the big data transfer completed would be very useful. But if you only do a fraction of the work for completeness, then again, it's of no benefit.
KDE even got an entire notifications application and discovered that it's bad to make notifications modal. But it didn't move away from the idea of dismissing them on any interaction, so it still acts as if they were modal.
Of course not.
The litmus test IMO should be "what would a normal intelligent human do in this situation?"
A human would copy every file it could, maintaining a list of issues. When you were available to address concerns, it'd present the options to you. The human would give up if the US Army showed up, but a human would restart a TCP connection automatically without asking for permission again (or more analogously, redial a phone call). A human would save their work automatically, and when you showed back up, would find that work for you.
(In 2026, things like "retry" should be automatic outside some very specific limitations too, because of course a human would try again if they failed).
The problem is that this requires testing what an actual "normal intelligent human" would do, because very often the programmer has other ideas and the UI/UX people have other ideas.
> A human would copy every file it could, maintaining a list of issues.
How do you know? Is that based on your idea of what should be done instead of the current behavior? I would not do it the way you describe.
Also, there are many reasons a transfer can fail, and depending on the reason you should make different decisions. Sometimes the reasons are not predictable by a program (a new file transfer method over pigeons was transparently added to the system, and "carrier attacked by predator" was not included in "how to handle this reason").
Please no. I want my computer to be a dumb tool that really only does what I told it to. I do not want it to have its own agenda.
> In 2026, things like "retry" should be automatic outside some very specific limitations too
No. I can tell the computer to retry; when I didn't, it is because I didn't want it to.
A file transfer should remain active even if both devices (source, destination) are physically disconnected, or in network partitions, or when devices are full, need media change, etc.
The only valid states for a file transfer are: ongoing, fully completed with 100% success, or explicitly cancelled by the user with a full usable report of what got copied, fully or partially, and what did not get copied.
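Those three states map naturally onto an enum. The following is a sketch under assumptions: `TransferState`, `FileOutcome`, and `unfinished` are hypothetical names, and the per-file outcomes are one guess at what a "full usable report" would need to contain.

```rust
use std::path::PathBuf;

/// What happened to a single file by the time the user cancelled.
#[derive(Debug, PartialEq, Eq)]
enum FileOutcome {
    Copied,
    Partial { bytes_written: u64 },
    NotCopied,
}

/// The only states a transfer is allowed to be in.
#[derive(Debug)]
enum TransferState {
    /// Still running, possibly waiting for a device or network to return.
    Ongoing { done: usize, total: usize },
    /// Everything was copied successfully.
    Completed,
    /// The user cancelled; the report covers every file in the transfer.
    Cancelled { report: Vec<(PathBuf, FileOutcome)> },
}

impl TransferState {
    /// Files that still need attention after a cancellation.
    fn unfinished(&self) -> Vec<&PathBuf> {
        match self {
            TransferState::Cancelled { report } => report
                .iter()
                .filter(|(_, outcome)| *outcome != FileOutcome::Copied)
                .map(|(path, _)| path)
                .collect(),
            TransferState::Ongoing { .. } | TransferState::Completed => Vec::new(),
        }
    }
}
```

Note that there is deliberately no `Failed` variant: a disconnected device keeps the transfer in `Ongoing` until it comes back or the user cancels.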
The file transfer dialogs and tooling of today's mainstream computing are stuck in the nineties.
Change the floppy disk. In the MSDOS days those messages were useful, as read errors might be caused by having the wrong floppy in the drive. The OS had no way to know when the floppy was changed and "Retry" allowed you to swap the disks back and try again. In modern days it is less useful, the behavior just got carried over.
Windows addresses this issue somewhat by scanning the directory tree before the actual copying starts. This can catch some errors before they happen and gives you better progress reporting on top.
But a single dialog that keeps track of the whole copy/move operations, not a modal dialog attached to individual read/write calls, would be the way to go here. This is a case of the GUI sticking too close to what the OS is doing instead of what the user intended to do.
Which really sucks, because now you need to wait for minutes before it actually starts moving or deleting. I generally just abort and start Midnight Commander or just invoke mv/del directly.
> But a single dialog that keeps track of the whole copy/move operations
Which is what is the case here? The question and buttons appear in that dialog.
The error/retry dialog is for the failure of moving an individual file, not for a failure of the move operation as a whole. Those individual error dialogs provide no means to deal with cascading errors. All you can do is "Skip All", but that means you get no further information on errors anymore.
The error reporting should be part of the Moving dialog itself and provide a list of everything that failed in the move, along with potential ways to resolve it. More detailed reporting than "Could not read" would also be welcome (io, permission, ...).
`thiserror` helps you define the error type. That error type can then be used with `anyhow` or `exn`. Actually, we had been using thiserror + exn for a long time, and it worked well. Later we realized that `struct ModuleError(String)` can easily implement Error without thiserror, so we removed the thiserror dependency for conciseness.
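For reference, a hand-written version of such a newtype error needs only the standard library. This is a minimal sketch, not the commenter's actual code; `load_config` is a hypothetical caller:

```rust
use std::error::Error;
use std::fmt;

/// A module-level error that carries only a human-readable message.
#[derive(Debug)]
struct ModuleError(String);

impl fmt::Display for ModuleError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(&self.0)
    }
}

/// All methods of `Error` have defaults, so an empty impl is enough.
impl Error for ModuleError {}

/// Hypothetical function that always fails, to show how the type is raised.
fn load_config(path: &str) -> Result<(), ModuleError> {
    Err(ModuleError(format!("failed to load config from {path}")))
}
```

The price compared to a derive macro is the few lines of `Display` boilerplate per type.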
`exn` can use `anyhow::Error` as its inner Error. However, one may use `Exn::as_error` to retrieve the outermost error layer to populate anyhow.
I once considered `impl std::error::Error` for `exn::Exn`, but it would lose some information, especially if the error has multiple children.
`error-stack` did that, at the cost of no longer providing a source:
* https://docs.rs/error-stack/0.6.0/src/error_stack/report.rs....
* https://docs.rs/error-stack/0.6.0/src/error_stack/error.rs.h...
This seems akin to complaining that the CPU core has only one instruction pointer. There is nothing preventing a struct implementing `Error` from aggregating other errors (such as validation results) and still exposing them via the `Error` trait. The fact of the matter is that the call stack is linear, so the interior node in the tree the author wants still needs to provide the aggregate error reporting that reflects the call stack that was lost with the various returns. Nothing about that error type implementing `Error` prevents it from also implementing another error reporting trait that reflects the aggregate errors in all of the underlying richness with which they were collected.
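A sketch of that argument in Rust, using only the standard library. `FieldError`, `ValidationErrors`, and the `Aggregate` trait are hypothetical names invented for illustration; returning the first child from `source` is an arbitrary choice forced by the single-source signature:

```rust
use std::error::Error;
use std::fmt;

#[derive(Debug)]
struct FieldError {
    field: &'static str,
    problem: &'static str,
}

impl fmt::Display for FieldError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}: {}", self.field, self.problem)
    }
}

impl Error for FieldError {}

/// An aggregate error: several children collected under one node.
#[derive(Debug)]
struct ValidationErrors {
    errors: Vec<FieldError>,
}

impl fmt::Display for ValidationErrors {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{} validation error(s)", self.errors.len())
    }
}

impl Error for ValidationErrors {
    /// The linear view: `source` can expose only one child.
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        self.errors.first().map(|e| e as &(dyn Error + 'static))
    }
}

/// A second, richer reporting trait that exposes every child.
trait Aggregate: Error {
    fn children(&self) -> Vec<&(dyn Error + 'static)>;
}

impl Aggregate for ValidationErrors {
    fn children(&self) -> Vec<&(dyn Error + 'static)> {
        self.errors.iter().map(|e| e as &(dyn Error + 'static)).collect()
    }
}
```

Generic tooling that only knows `Error` still works, while callers aware of `Aggregate` can walk the full set.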
Weirdly, the last time I saw an error in production I couldn’t investigate was because of a go service with no error wrapping… funny coincidence
And because it's standardised, it's easy to create tooling to flag mishandled errors.
In fact, the easiest thing to do in Go is to ignore the error; the next easiest is to early-return the same error with no additional context.
Technically speaking, Rust has way better tools for adding context to errors. See for example https://docs.rs/color-eyre/latest/color_eyre/
It does expect you to use `wrap_err` to get the benefits, though. Which is easier to do than what Go requires you to do for good contextual errors, and even easier if you want reasonable-looking formatting from the Go version.
https://github.com/upspin/upspin/blob/master/errors/errors.g...
    type Error struct {
        // Path is the Upspin path name of the item being accessed.
        Path upspin.PathName
        // User is the Upspin name of the user attempting the operation.
        User upspin.UserName
        // Op is the operation being performed, usually the name of the method
        // being invoked (Get, Put, etc.). It should not contain an at sign @.
        Op Op
        // Kind is the class of error, such as permission failure,
        // or "Other" if its class is unknown or irrelevant.
        Kind Kind
        // The underlying error that triggered this one, if any.
        Err error
        // Stack information; used only when the 'debug' build tag is set.
        stack
    }
"Wherever exceptions are thrown, add as much contextual information to the exceptions as possible. Use class RichException<Exception> to store the extra information". Etc. etc.
See comments like https://github.com/fast/fast.github.io/pull/12#discussion_r2...
Quote my comment in the other thread:
> That said, exn benefits something from anyhow: https://github.com/fast/exn/pull/18, and we feed back our practices to error-stack where we come from: https://github.com/hashintel/hash/issues/667#issuecomment-33...
> While I have my opinions on existing crates, I believe we can share experiences and finally converge on a common good solution, no matter who made it.
- the ? operator is replaced either by runtime exceptions, which pass through every function that doesn't catch them, or by simply declaring the raised exception in the signature
- message can be overloaded for humans
- the exception type itself is the structured data, but in practice it seldom contains structured data and most logic depends on the exception type.
Make of this what you will, but I didn’t say it’s great.
With Rust, having a generic error bubble up without nesting means you don't even know where it went wrong. The error could be from any generic error source.
    Err(report) => {
        // For machines: find and handle the structured error
        if let Some(err) = find_error::<StorageError>(&report) {
            if err.status == ErrorStatus::Temporary {
                return queue_for_retry(report);
            }
            return Err(map_to_http_status(err.kind));
        }
They get it right elsewhere when they describe errors for machines as being "flat and actionable." `StorageError` is that, but the outer `Err(report)` is not. You shouldn't be guessing which types of error you might run into; you should be exhaustively enumerating them.
I'd rather have something like this:
    use std::panic::Location;

    // `Trace` (with an `add_context` method) is assumed to be defined elsewhere.
    struct Exn<T> {
        trace: Trace,
        err: T,
    }

    impl<T> Exn<T> {
        #[track_caller]
        fn wrap<U: From<T>>(self, msg: String) -> Exn<U> {
            Exn {
                trace: self.trace.add_context(Location::caller(), msg),
                err: self.err.into(),
            }
        }
    }
That way your `err` field is always a structured error, but you still get a context trace. With a bit more tweaking, you can make the trace tree-shaped rather than linear, too, if you want.
I think actionable error types need to be exhaustively matchable, at least for any Rust error that you expect a machine to be handling. Details a human is interested in can be preserved at each layer by the trace, while details the machine cares about will be pruned and reinterpreted at every layer, so the machine-readable info is kept flat, relevant, and matchable.
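To make the idea concrete, here is one way the pieces could fit together. Everything beyond the `Exn<T>` shape above is an assumption made for illustration: the linear `Trace`, the `Exn::new` constructor, the two error enums, and the `read_blob` / `handle_request` functions are all hypothetical.

```rust
use std::panic::Location;

/// Human-facing context: a linear list of (location, message) frames.
#[derive(Debug, Default)]
struct Trace {
    frames: Vec<(&'static Location<'static>, String)>,
}

impl Trace {
    fn add_context(mut self, at: &'static Location<'static>, msg: String) -> Trace {
        self.frames.push((at, msg));
        self
    }
}

#[derive(Debug)]
struct Exn<T> {
    trace: Trace,
    err: T,
}

impl<T> Exn<T> {
    fn new(err: T) -> Exn<T> {
        Exn { trace: Trace::default(), err }
    }

    /// Adds a frame for humans and converts the machine-readable part
    /// into the error type of the calling layer.
    #[track_caller]
    fn wrap<U: From<T>>(self, msg: String) -> Exn<U> {
        Exn {
            trace: self.trace.add_context(Location::caller(), msg),
            err: self.err.into(),
        }
    }
}

#[derive(Debug, PartialEq)]
enum StorageError {
    NotFound,
    Timeout,
}

#[derive(Debug, PartialEq)]
enum ApiError {
    Missing,
    Retryable,
}

/// Each layer decides how the lower layer's errors are reinterpreted.
impl From<StorageError> for ApiError {
    fn from(e: StorageError) -> ApiError {
        match e {
            StorageError::NotFound => ApiError::Missing,
            StorageError::Timeout => ApiError::Retryable,
        }
    }
}

fn read_blob() -> Result<Vec<u8>, Exn<StorageError>> {
    Err(Exn::new(StorageError::Timeout))
}

fn handle_request() -> Result<Vec<u8>, Exn<ApiError>> {
    read_blob().map_err(|e| e.wrap("reading blob for request".to_string()))
}
```

A caller of `handle_request` can `match` on `err` exhaustively over `ApiError` without searching a report for a type it hopes is in there, while the trace still records where the context was attached.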
Traversing the error tree is the worst case, where the structured error has bubbled up through the layers until it reaches the one that is able to recover from it.
With regards to context for the programmer, I still think ultimately tracing and color_eyre (see https://docs.rs/color-eyre/latest/color_eyre/) form a good-enough pair for service style applications, with tracing providing the missing additional context. But it's nice to see a simpler approach to actionability.
bheadmaster•18h ago
It may be easier to just add the "?" operator everywhere (and we are lazy and will mostly do what is easier), but it often leads to the problem explained in the article.
jayknight•18h ago
Doesn't Rust's Result type(s) force you to do the same? Sure, you can pass them on with the ? operator, but it's still a choice you have to make.
alembic_fumes•18h ago
At least with Rust's enums it is possible to make errors automatically actionable. If one skips that part and opts for anyhow because it's too much work, that's really a user problem.
I like the author's idea of "designing" errors by exposing their actionability in the interface a lot. I'm not overall sold on whether that should be the primary categorization, but at least including a docstring to each enum variant about what can be done about the matter sounds like a nice way to improve most code a little bit.
Fizzadar•17h ago
formerly_proven•16h ago
morshu9001•15h ago
tcfhgj•15h ago
morshu9001•14h ago
Typically I'll only have a couple of exception types that my own code throws, like user error vs system error. If I want more detail than that, it goes into the exception payload rather than defining many different types of exceptions.
bheadmaster•15h ago
If a language makes this more convenient than doing it right, one could argue that the language design is at fault.
Thaxll•15h ago
akdor1154•16h ago
As a Go dev, I'm looking at this article with great interest. I would very much like to apply this approach to Go as well, I think the author has got a very strong design there.
tison•12h ago
That said, I can live with "if err != nil", but the fact that every type has a zero value is quite a headache to handle: you have to fight with nil, typed nil, and zero values.
For example, you need something like:
.. to handle a nullable value, while `Valid = false && String = something` is invalid by definition but .. quite hard to explain. (Go has no sum types for this.)
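For contrast, a small Rust sketch of how a sum type removes that contradictory state. `FlaggedString` is a hypothetical stand-in for a "value plus validity flag" pair as described above, not any particular library type; `to_option` and `display_name` are likewise made up for illustration:

```rust
/// A value paired with a validity flag (hypothetical type). Nothing stops
/// `valid == false` from coexisting with a non-empty `string`.
struct FlaggedString {
    string: String,
    valid: bool,
}

/// Converting to `Option` discards whatever was stored alongside `valid == false`.
fn to_option(raw: FlaggedString) -> Option<String> {
    if raw.valid {
        Some(raw.string)
    } else {
        None
    }
}

/// With `Option`, "absent but carrying a value" cannot be constructed,
/// and the compiler requires both cases to be handled.
fn display_name(name: Option<String>) -> String {
    match name {
        Some(s) => s,
        None => "(unknown)".to_string(),
    }
}
```

The flag-based struct needs a convention to say which field wins; the `Option` version needs none.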