Edit: looking more carefully at the lib I assume that "tag" is the concept that is supposed to cover this?
func (myError) Is(err error) bool
and it can match different sentinel errors.
Or you can make your own wrapper to have the error chain match.
> An error is considered to match a target if it is equal to that target or if it implements a method Is(error) bool such that Is(target) returns true.
by default errors.Is matches the exact error variable, but you can use it to match other errors as well.
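A minimal sketch of that behaviour, assuming a hypothetical lookupError type and ErrNotFound sentinel (neither comes from the library being discussed). The Is method lets one error value match several sentinels, and errors.Is still follows the wrap chain to reach it:

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"io/fs"
)

// ErrNotFound is a hypothetical application-level sentinel error.
var ErrNotFound = errors.New("not found")

// lookupError is a hypothetical error type that should match more than one sentinel.
type lookupError struct {
	key string
}

func (e lookupError) Error() string { return "lookup " + e.key + ": no such entry" }

// Is reports whether target is one of the sentinels this error stands for.
// errors.Is calls this method on each error in the chain.
func (e lookupError) Is(target error) bool {
	return target == ErrNotFound || target == fs.ErrNotExist
}

// find always fails, wrapping lookupError once so the chain has two links.
func find(key string) error {
	return fmt.Errorf("find: %w", lookupError{key: key})
}

// isNotFound is the check a caller would typically write.
func isNotFound(err error) bool {
	return errors.Is(err, ErrNotFound)
}

func main() {
	err := find("user-42")
	fmt.Println(isNotFound(err))                // true
	fmt.Println(errors.Is(err, fs.ErrNotExist)) // true
	fmt.Println(errors.Is(err, io.EOF))         // false
}
```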
As antithetical as it might be, I tend to just stuff sentry in (no affiliation just a happy user) when I’m setting up the scaffolding, and insert rich context at the edges (in the router, at a DB/serialization/messagebus layer) and the rest usually just works itself out.
Even for small projects, this is a small thing to introduce but it will pay you dividends in the future. The earlier you start the more you'll thank yourself (it's not very helpful to frantically try and refactor this into a codebase after you've already been bitten!)
An additional thing that is useful here would be a stack trace. So even when you catch, wrap & rethrow the error, you'll be able to see exactly where the error came from. The alternative is searching in the code for the string.
For the hate they seem to get, checked exceptions with error classes do give you a lot of stuff for free.
If you find yourself needing to branch on error classes it may mean error handling is too high up.
ps. personally I always prefer string error codes, i.e. "not-found", as opposed to numeric ones, i.e. 404.
> I always prefer string error codes
My parent company provides an API for us to use for “admin-y” things, but they use stringy errors in the response payload of the body. Except they’re in Mandarin, and you’d be surprised (or maybe not) at how many tools barf at it. Getting them to localise the error codes is as likely to happen as us fixing the referer header. The really nice thing about status codes is that there’s only a fixed number of them so you don’t get two slightly different responses from two different endpoints (not-found vs not_found), and there are no locale issues involved.
Error _code_ is code, shouldn't be localized.
Error codes contain only the type of error that occurred and cannot contain any more data. With an error class you can provide context - a 400 happened when making a request, which URL was hit? What did the server say? Which fields in our request were incorrect? From a code perspective, if an error happens I want to know as much detail as possible about it, and that simply cannot be summarised by an error code.
If I want to know the type of an error and do different things based on its type, I can think of no better tool to use than my language's type system handling error classes. I could invent ways to switch on error codes (I hope I'm using a language like Rust that would assert that my handling of the enum of errors is exhaustive), but that doesn't seem very well-founded. For example, using error enums, how do I describe that an HTTP_404 is a type of REQUEST_ERROR, but not a type of NETWORK_CONN_ERROR? It's important to know if the problem is with us or the network. I could write some one-off code to do it, or I could use error classes and have my language's typing system handle the polymorphism for me.
Not that error codes are not useful. You can include an error code within an error class. Error codes are useful for presenting to users so they can reference an operator manual or provide it to customer support. Present the user with a small code that describes the exact scenario instead of an incomprehensible stack trace, and they have a better support experience.
Side note: please don't use strings for things that have discrete values that you switch on. Use enums.
All this flexibility comes for free when you use your language's type system, whereas with plain error codes you would have to implement grouping yourself manually with some kind of lookup table.
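Since the rest of the thread is about Go, here is a rough sketch of that grouping using interfaces rather than class inheritance. RequestError, NetworkError, NotFound and ConnReset are all hypothetical names invented for the illustration:

```go
package main

import (
	"errors"
	"fmt"
)

// RequestError groups errors caused by the request itself (hypothetical interface).
type RequestError interface {
	error
	StatusCode() int
}

// NetworkError groups errors caused by the connection (hypothetical interface).
type NetworkError interface {
	error
	Temporary() bool
}

// NotFound is a request error: it carries the URL that was hit.
type NotFound struct{ URL string }

func (e NotFound) Error() string   { return "404 for " + e.URL }
func (e NotFound) StatusCode() int { return 404 }

// ConnReset is a network error: it carries the host that dropped the connection.
type ConnReset struct{ Host string }

func (e ConnReset) Error() string   { return "connection reset by " + e.Host }
func (e ConnReset) Temporary() bool { return true }

// classify reports which group an error belongs to, looking through any wrapping.
func classify(err error) string {
	var reqErr RequestError
	var netErr NetworkError
	switch {
	case errors.As(err, &reqErr):
		return fmt.Sprintf("request problem (status %d)", reqErr.StatusCode())
	case errors.As(err, &netErr):
		return "network problem"
	default:
		return "unknown"
	}
}

func main() {
	fmt.Println(classify(fmt.Errorf("fetch: %w", NotFound{URL: "/users/1"}))) // request problem (status 404)
	fmt.Println(classify(ConnReset{Host: "db-1"}))                            // network problem
}
```

The type system decides that NotFound is a RequestError and not a NetworkError; no lookup table is involved.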
The way to do this in a safe and performant manner is to structure the metadata as a tree, with a parent pointing to the previous metadata. You'd probably want to do some pooling and other optimizations to avoid allocating a map every time. Then all the maps can be immutable and therefore not require any locks. To construct the final map at error time, you simply traverse the map depth-first, building a merged map.
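One way that structure could look, as a sketch only. Meta, With and Flatten are illustrative names, and this version stores a single key per node instead of a map per node, which is a simplification of what is described above:

```go
package main

import "fmt"

// Meta is one immutable node in a chain of metadata; parent points at the
// enclosing context. The type is illustrative, not taken from any library.
type Meta struct {
	parent *Meta
	key    string
	value  any
}

// With returns a new child node. The receiver is never modified, so nodes can
// be shared between goroutines without locks. A nil receiver is an empty root.
func (m *Meta) With(key string, value any) *Meta {
	return &Meta{parent: m, key: key, value: value}
}

// Flatten walks from the leaf up to the root and builds the merged map.
// Values closer to the leaf win when a key repeats. This is the only place
// where a map is allocated.
func (m *Meta) Flatten() map[string]any {
	out := make(map[string]any)
	for n := m; n != nil; n = n.parent {
		if _, seen := out[n.key]; !seen {
			out[n.key] = n.value
		}
	}
	return out
}

func main() {
	var root *Meta
	request := root.With("request_id", "r-1")
	query := request.With("table", "users").With("request_id", "r-1-retry")

	fmt.Println(query.Flatten())   // map[request_id:r-1-retry table:users]
	fmt.Println(request.Flatten()) // map[request_id:r-1]
}
```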
I'm not sure I agree with the approach, however. This system will incur a performance and memory penalty every time you descend into a new metadata context, even when no errors are occurring. Building up this contextual data (which presumably already exists on the call stack in the form of local variables) will be constantly going on and causing trouble in hot paths.
A better approach is to return a structured error describing the failed action that includes data known to the returner, which should have enough data to be meaningful. Then, every time you pass an error up the stack, you augment it with additional data so that everything can be gleaned from it. Rather than:
val, err := GetStuff()
if err != nil {
    return err
}
You do:
val, err := GetStuff()
if err != nil {
    return fmt.Errorf("getting stuff: %w", err)
}
Or maybe:
val, err := GetStuff()
if err != nil {
    return wrapWithMetadata(err, meta.KV("database", db.Name))
}
Here, wrapWithMetadata() can construct an efficient error value that implements Unwrap(). This pays the performance cost only at error time, and the contextual information travels up the stack with a tree of error causes that can be gotten with `errors.Unwrap()`. The point is that Go errors already are a tree of causes.
Sometimes tracking contextual information in a context is useful, of course. But I think the benefit of my approach is that a function returning an error only needs to provide what it knows about the failing error. Any "ambient" contextual information can be added by the caller at no extra cost when following the happy path.
The idea is to only add information that the caller isn't already aware of. Error messages shouldn't include the function name or any of its arguments, because the caller will include those in its own wrapping of that error.
This is done with fmt.Errorf():
userId := "A0101"
err := database.Store(userId)
if err != nil {
    return fmt.Errorf("database.Store({userId: %q}): %w", userId, err)
}
If this is done consistently across all layers, and finally logged in the outermost layer, the end result will be nice error messages with all the context needed to understand the exact call chain that failed:
fmt.Printf("ERROR %v\n", err)
Output:
ERROR app.run(): room.start({name: "participant5"}): UseStorage({type: "sqlite"}): Store({userId: "A0101"}): the transaction was interrupted
This message shows at a quick glance which participant, which database selection, and which integer value were used when the call failed. Much more useful than Stack Traces, which don't show argument values. Of course, longer error messages could be written, but it seems optimal to just convey a minimal expression of what function call and argument was being called when the error happened.
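To make the layering concrete, here is a small self-contained sketch that produces a message of that shape. The function names and the errInterrupted sentinel are made up for the example; only the fmt.Errorf("...: %w") pattern is the point:

```go
package main

import (
	"errors"
	"fmt"
)

// errInterrupted stands in for the low-level failure (hypothetical sentinel).
var errInterrupted = errors.New("the transaction was interrupted")

// store is the innermost call; it reports only what it alone knows.
func store(userID string) error {
	return errInterrupted
}

// useStorage adds the call it made and the argument it passed.
func useStorage(kind string) error {
	userID := "A0101"
	if err := store(userID); err != nil {
		return fmt.Errorf("Store({userId: %q}): %w", userID, err)
	}
	return nil
}

func startRoom(name string) error {
	kind := "sqlite"
	if err := useStorage(kind); err != nil {
		return fmt.Errorf("UseStorage({type: %q}): %w", kind, err)
	}
	return nil
}

func run() error {
	name := "participant5"
	if err := startRoom(name); err != nil {
		return fmt.Errorf("room.start({name: %q}): %w", name, err)
	}
	return nil
}

func main() {
	if err := run(); err != nil {
		err = fmt.Errorf("app.run(): %w", err)
		fmt.Printf("ERROR %v\n", err)
		// The sentinel is still reachable through four layers of wrapping.
		fmt.Println(errors.Is(err, errInterrupted)) // true
	}
}
```

Because every layer uses %w, the original cause stays matchable with errors.Is even though the text has grown at each level.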
Adding to this, the Go code linter forbids writing error messages that start with Upper Case, precisely because it assumes that all this will be done and error messages are just parts of a longer sentence:
And it's completely useless for looking up the errors linked to a participant in an aggregator, which is pretty much the first issue the article talks about, unless you add an entire parsing and extraction layer overtop.
> Much more useful than Stack Traces, which don't show argument values.
No idea how universal it is, but there are at least some languages where you can get full stackframes out of the stacktrace.
That's how pytest can show the locals of the leaf function out of the box in case of traceback:
def test_a():
> a(0)
test.py:11:
test.py:2: in a
b(f)
test.py:5: in b
c(f, 5/g)
f = 0, x = 5.0
def c(f, x):
> assert f
E assert 0
test.py:8: AssertionError
and can do so in every function of the trace if requested:
def test_a():
> a(0)
test.py:11:
f = 0
def a(f):
> b(f)
test.py:2:
f = 0, g = 1
def b(f, g=1):
> c(f, 5/g)
test.py:5:
f = 0, x = 5.0
def c(f, x):
> assert f
E assert 0
test.py:8: AssertionError
So this is just a matter of formatting.
return CouldNotStoreUser{
    UserID: userId,
}
and now this struct is available to anyone looking up the chain of wrapped errors.
> With custom error structs however, it's a lot of writing to create your own error type and thus it becomes more of a burden to encourage your team members to do this.
Because you need a type per layer, and that type needs to implement both error and unwrap.
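For reference, this is roughly what one such type costs to write. The Cause field, the storeUser function and the failedUserID helper are hypothetical; the struct name is the one used above:

```go
package main

import (
	"errors"
	"fmt"
)

// CouldNotStoreUser carries the user ID as a typed field. Cause is a
// hypothetical field name for the wrapped lower-level error.
type CouldNotStoreUser struct {
	UserID string
	Cause  error
}

// Error makes the struct satisfy the error interface.
func (e CouldNotStoreUser) Error() string {
	return fmt.Sprintf("could not store user %q: %v", e.UserID, e.Cause)
}

// Unwrap lets errors.Is and errors.As continue down the chain.
func (e CouldNotStoreUser) Unwrap() error { return e.Cause }

var errTxInterrupted = errors.New("the transaction was interrupted")

// storeUser always fails so the example has something to inspect.
func storeUser(userID string) error {
	return CouldNotStoreUser{UserID: userID, Cause: errTxInterrupted}
}

// failedUserID returns the user ID attached anywhere in err's chain, if any.
func failedUserID(err error) (string, bool) {
	var target CouldNotStoreUser
	if errors.As(err, &target) {
		return target.UserID, true
	}
	return "", false
}

func main() {
	err := fmt.Errorf("handling signup: %w", storeUser("A0101"))
	id, ok := failedUserID(err)
	fmt.Println(id, ok)                           // A0101 true
	fmt.Println(errors.Is(err, errTxInterrupted)) // true
}
```

That is two one-line methods per type, which is the overhead being weighed against a generic key/value wrapper.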
As a concrete example, it means you can target types with precision in the API layer:
switch e := err.(type) {
case UserNotFound:
    writeJSONResponse(w, 404, "User not found")
case interface { Timeout() bool }:
    if e.Timeout() {
        writeJSONResponse(w, 503, "Timeout")
    }
}
I skimmed the article and didn't see the author proposing a way to do that with their arbitrary key/value map. Of course, you could use something else like error codes to translate groups of errors. But then why not just use types?
But as I suggested in my other comment, you could also generalize it. For example:
return meta.Wrap(err, "storing user", "userID", userID)
Here, Wrap() is something like:
func Wrap(err error, msg string, kvs ...any) error {
    return &KV{
        KV:    kvs,
        cause: err,
        msg:   msg,
    }
}
This is the inverse of the context solution. The point is to provide data at the point of error, not at every call site. You can always merge these later into a single map and pay the allocation cost there:
// Walk the cause chain and merge every KV layer into one map.
fields := make(map[string]any)
for err != nil {
    if e, ok := err.(*KV); ok {
        for i := 0; i+1 < len(e.KV); i += 2 {
            fields[e.KV[i].(string)] = e.KV[i+1]
        }
    }
    err = errors.Unwrap(err)
}
Horses for courses.
for err != nil {
    switch e := err.(type) {
    case UserNotFound:
        writeJSONResponse(w, 404, "User not found")
        return
    case interface { Timeout() bool }:
        if e.Timeout() {
            writeJSONResponse(w, 503, "Timeout")
            return
        }
    }
    err = errors.Unwrap(err)
}
> it means you can target types with precision in the API layer
The only situation where you need to get precise error types is when you need to provide specific details from those specific types to the consumer, which is rare. And even in those rare cases, user code does that work via errors.As, not this manual Unwrap loop process you're suggesting here.
The documentation is clear that comparing an error value or casting it without following the Unwrap() chain is only an antipattern because it would not work with wrapped errors.
Is() and As() are merely convenience functions, and the documentation is clear that all they're doing is calling Unwrap(), which you can do yourself.
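A small sketch that puts the three approaches from this exchange side by side. UserNotFound is the hypothetical type from the switch earlier in the thread and the helper names are invented. The one case where the results differ is an error built with errors.Join (Go 1.20+), because errors.Unwrap only calls a method of the form Unwrap() error and returns nil for joined errors:

```go
package main

import (
	"errors"
	"fmt"
)

// UserNotFound is a hypothetical typed error.
type UserNotFound struct{ ID string }

func (e UserNotFound) Error() string { return "user " + e.ID + " not found" }

// viaAssertion checks only the outermost error.
func viaAssertion(err error) bool {
	_, ok := err.(UserNotFound)
	return ok
}

// viaUnwrapLoop follows single-error Unwrap() links by hand.
func viaUnwrapLoop(err error) bool {
	for ; err != nil; err = errors.Unwrap(err) {
		if _, ok := err.(UserNotFound); ok {
			return true
		}
	}
	return false
}

// viaAs lets the standard library walk the chain, including joined errors.
func viaAs(err error) bool {
	var target UserNotFound
	return errors.As(err, &target)
}

var (
	// wrapped has the typed error one level down.
	wrapped = fmt.Errorf("loading profile: %w", UserNotFound{ID: "A0101"})
	// joined holds two errors side by side via Unwrap() []error.
	joined = errors.Join(errors.New("cache miss"), wrapped)
)

func main() {
	fmt.Println(viaAssertion(wrapped), viaUnwrapLoop(wrapped), viaAs(wrapped)) // false true true
	fmt.Println(viaAssertion(joined), viaUnwrapLoop(joined), viaAs(joined))    // false false true
}
```

So for plain single-cause chains the manual loop and errors.As agree; they part ways once an error in the chain wraps more than one cause.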
It's not rare in my experience. All the apps I work on have a central unhandled error handler in the API that converts Go errors to HTTP or gRPC error responses, and then falls back to a general "internal error" if no specific error could be mapped. I can think of many other instances where we have a switch over half a dozen error types in order to translate them into other types across RPC or pub/sub boundaries.
> And even in those rare cases, user code does that work via errors.As, not this manual Unwrap loop process you're suggesting here.
As() does not work with switch statements unless you pre-declare a ton (in our case, often a couple of dozen) error variables. Secondly, it is deeply inefficient. As() traverses the cause tree recursively for every single error, so if you have 30 possible error types to compare, and an error typically wraps 3 layers deep, that's a worst case of 30 loop iterations with 90 cases, as opposed to my method, which is 3 loops.
I have no idea how you came to this conclusion. It's certainly not what happens when you call errors.As in your application code.
There's no situation where your application code would ever have 30 error types to compare against, if that were ever the case you have seriously fucked up!
var (
    a error1
    b error2
    c error3
)
switch {
case errors.As(err, &a):
    ...
case errors.As(err, &b):
    ...
case errors.As(err, &c):
    ...
}
…then yes, you will be doing 3 searches, each of which will do a loop (sometimes recursively if Unwrap() returns []error) over the chain of causes.
> There's no situation where your application code would ever have 30 error types to compare against, if that were ever the case you have seriously fucked up!
That is your opinion. In my experience, that is not the case, because there are lots of cases where you want to centrally translate a canonical set of errors into another canonical set.
> It literally is not. Is and As are not merely convenience functions, they're canonical …
This is just your opinion. If you actually read the documentation, you will see that it merely says Is() and As() are "preferable" to checking.
As an example, say we have an API implemented on top of a complex data store. Every data store implementation can return errors like ObjectNotFound, InsufficientPermissions, and a dozen others. Every data store call can potentially return these, as well as, of course, standard Go errors like DeadlineExceeded or internal errors that cannot be exposed as user-facing API responses. However, some translation layer has to translate those errors into API responses.
This cannot conveniently and consistently be done in each API handler, as it would repeat the same error translation for the same errors. An InsufficientPermissions error may happen in a "create" route as well as in an "update" route, but also in any other route that deals with objects not being accessible.
Therefore it must be done in a central error translator. By definition. And this translation must either do a dozen+ Is() and As() calls, or it can be done efficiently, as I've described.
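For readers following along, a sketch of what the first of those two options (a series of Is() checks) might look like in a central translator. The sentinel names and the status mapping are assumptions made for the example; whether this is fast enough with dozens of cases is exactly what is being disputed here:

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net/http"
)

// Hypothetical canonical data-store errors.
var (
	ErrObjectNotFound          = errors.New("object not found")
	ErrInsufficientPermissions = errors.New("insufficient permissions")
)

// statusFor maps an error chain to an HTTP status in one central place,
// falling back to 500 for anything it does not recognise.
func statusFor(err error) int {
	switch {
	case err == nil:
		return http.StatusOK
	case errors.Is(err, ErrObjectNotFound):
		return http.StatusNotFound
	case errors.Is(err, ErrInsufficientPermissions):
		return http.StatusForbidden
	case errors.Is(err, context.DeadlineExceeded):
		return http.StatusGatewayTimeout
	default:
		return http.StatusInternalServerError
	}
}

func main() {
	err := fmt.Errorf("update object %q: %w", "o-1", ErrInsufficientPermissions)
	fmt.Println(statusFor(err))                        // 403
	fmt.Println(statusFor(errors.New("disk on fire"))) // 500
}
```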
Anyway, I've said all I have needed to say and won't respond any further.
These claims are, bluntly, incorrect. There are no widely-used modules that work this way, and there are no properties of the language or its conventions that would suggest that this is a viable way to design an API. errors.Is and errors.As provide capabilities that type assertions -- as you've described -- factually do not provide. They're not equivalent, they're not normally used, they're not anything other than red flags in bad code that should be eliminated.
I'm not trying to pick a fight with you, I'm honestly just trying to prevent other people, reading this comment thread, from making the kinds of design mistakes that you're describing here as viable and efficient. They truly aren't.
I've implemented something similar in my errors library relying on log/slog.Attr.
I wrote my Go version of this same error wrapping utility for the same reasons: https://github.com/sethgrid/kverr
My current work uses python and I am hoping to change us over to structured logs that play well with structured exceptions.
The approach I ended up taking is to use slog attributes. It allows for reuse of existing logging attributes.
This is explained here (skip to the “adding metadata” portion). https://blog.gregweber.info/blog/go-errors-library/
Go package: https://pkg.go.dev/github.com/gregwebs/errors/slogerr
This approach has been successful in major projects at work and on the side for several years now.
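To give a feel for the idea without reproducing the linked package, here is a sketch built only on the standard log/slog package (Go 1.21+). attrError, withAttrs, collectAttrs and demoErr are hypothetical names and are not the API of the library above:

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"log/slog"
	"os"
)

// attrError is a hypothetical wrapper that attaches slog attributes to an error.
type attrError struct {
	err   error
	attrs []slog.Attr
}

func (e *attrError) Error() string { return e.err.Error() }
func (e *attrError) Unwrap() error { return e.err }

// withAttrs wraps err with attributes; it returns nil for a nil err.
func withAttrs(err error, attrs ...slog.Attr) error {
	if err == nil {
		return nil
	}
	return &attrError{err: err, attrs: attrs}
}

// collectAttrs gathers the attributes of every attrError in a single-cause
// chain, outermost first.
func collectAttrs(err error) []slog.Attr {
	var out []slog.Attr
	for err != nil {
		var ae *attrError
		if !errors.As(err, &ae) {
			break
		}
		out = append(out, ae.attrs...)
		err = ae.err
	}
	return out
}

// demoErr builds a chain with attributes attached at two different layers.
func demoErr() error {
	err := withAttrs(errors.New("permission denied"), slog.String("user_id", "A0101"))
	err = fmt.Errorf("creating post: %w", err)
	return withAttrs(err, slog.String("route", "/posts"))
}

func main() {
	err := demoErr()
	logger := slog.New(slog.NewTextHandler(os.Stdout, nil))
	// The attributes ride along with the error and are emitted as structured fields.
	logger.LogAttrs(context.Background(), slog.LevelError, err.Error(), collectAttrs(err)...)
}
```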
Example:
err := oops.
    Code("iam_missing_permission").
    In("authz").
    Tags("authz").
    Time(time.Now()).
    With("user_id", 1234).
    With("permission", "post.create").
    Hint("Runbook: https://doc.acme.org/doc/abcd.md").
    User("user-123", "firstname", "john", "lastname", "doe").
    Errorf("permission denied")
For easier debugging, the error contains the full stacktrace.
altbdoor•8mo ago
pjmlp•8mo ago
candiddevmike•8mo ago
Doesn't this contradict the first part of your post? Kubernetes for instance was ported from Java to Go (albeit, poorly). Is Java worse than Go?
pjmlp•8mo ago
Also it is quite ironic how, given the Java bashing in the Go community, so little was learned from Java's evolution and design mistakes.
They even ended up having to reach for the same folks that helped design Java generics.
As for your remark, the actual rewrite history as told at FOSDEM, is that the rewrite only happened as two strong minded Go devs joined the Kubernetes team and heavily pushed for the rewrite.
jamesrr39•8mo ago
9rx•8mo ago
To be fair, it was made abundantly clear when it was first released unto the world that it was intended to feel like a dynamically-typed language, but with performance characteristics closer to statically-typed languages. What little type system it has is there merely to support the performance goals. If they had figured out how to deliver on the performance end as a strictly dynamically-typed language, it is likely it would have gone without a static type system entirely.
Call it disdain if you will, but it is not like there weren't already a million other languages with modern, advanced type systems. That market was already, and continues to be, flooded with many lovely languages to choose from. Go becoming yet another just like all the rest would have been rather pointless. "Like Python, but faster" was the untapped market at the time – and serving that market is why it is now a household name instead of being added to the long list of obscure languages that, while technically interesting, all do the same thing.
gregors•8mo ago
Concurrency is also something they got more or less right. The most important thing is that they invented their (Google) language that they could exert complete control over. From a business perspective specifically, that was much better than Java.
TheDong•8mo ago
Walking on burning coals is a step up if you're coming from C.
We shouldn't grade languages on that much of a curve by comparing them to garbage.
> Concurrency is also something they got more or less right
Except data-races are an incredibly common bug in Go, and an incredibly rare bug in Rust.
Data-races are way more common in Go than in modern C++ or Java, if only because mutex semantics in Go are awful, with many programmers opting for manual "lock" and "unlock" calls, and mentally reasoning about the correct scope for the critical section with no compiler assistance.
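As an illustration of the scoping problem, one convention some Go code uses is to hide the mutex behind a helper so the critical section is a closure rather than a pair of manual calls. Guarded, With and countTo are hypothetical names, and this is only a convention: the compiler still does not stop the closure from leaking the pointer.

```go
package main

import (
	"fmt"
	"sync"
)

// Guarded is a hypothetical wrapper that only hands out its value while the
// mutex is held.
type Guarded[T any] struct {
	mu    sync.Mutex
	value T
}

// With runs fn with exclusive access. The deferred Unlock keeps lock and
// unlock paired even if fn panics.
func (g *Guarded[T]) With(fn func(v *T)) {
	g.mu.Lock()
	defer g.mu.Unlock()
	fn(&g.value)
}

// countTo increments a shared counter from n goroutines and returns the total.
func countTo(n int) int {
	var counter Guarded[int]
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			counter.With(func(v *int) { *v++ })
		}()
	}
	wg.Wait()

	total := 0
	counter.With(func(v *int) { total = *v })
	return total
}

func main() {
	fmt.Println(countTo(1000)) // 1000
}
```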
I will give you that they made concurrency very easy.