Edit: looking more carefully at the lib I assume that ”tag” is the concept that is supposed to cover this?
func (myError) Is(err error) bool
and it can match different sentinel errors.
Or you can make your own wrapper to have the error chain match.
> An error is considered to match a target if it is equal to that target or if it implements a method Is(error) bool such that Is(target) returns true.
by default errors.Is matches the exact error variable, but you can use it to match other errors as well.
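To make the Is(error) bool idea concrete, here is a minimal sketch. The type missingError, the sentinels, and the lookup function are made up for illustration; only errors.Is and the Is method convention come from the standard library.

```go
package main

import (
	"errors"
	"fmt"
)

// Sentinel errors that callers compare against with errors.Is.
var (
	ErrNotFound = errors.New("not found")
	ErrGone     = errors.New("gone")
	ErrTimeout  = errors.New("timeout")
)

// missingError is a hypothetical error type used only for illustration.
type missingError struct {
	resource string
}

func (e missingError) Error() string {
	return e.resource + " is missing"
}

// Is reports a match for two different sentinels, so errors.Is succeeds
// for either ErrNotFound or ErrGone when a missingError is in the chain.
func (e missingError) Is(target error) bool {
	return target == ErrNotFound || target == ErrGone
}

// lookup always fails; it wraps the custom error so the chain has two links.
func lookup(id string) error {
	return fmt.Errorf("lookup %q: %w", id, missingError{resource: "user"})
}

// matches reports whether err matches target anywhere in its chain.
func matches(err, target error) bool {
	return errors.Is(err, target)
}

func main() {
	err := lookup("A0101")
	fmt.Println(matches(err, ErrNotFound)) // matched through the Is method
	fmt.Println(matches(err, ErrGone))     // matched through the Is method
	fmt.Println(matches(err, ErrTimeout))  // not matched
}
```

The wrapper created by fmt.Errorf does not match on its own; errors.Is unwraps it and then consults the Is method of missingError.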
As antithetical as it might be, I tend to just stuff Sentry in (no affiliation, just a happy user) when I’m setting up the scaffolding, and insert rich context at the edges (in the router, at a DB/serialization/messagebus layer) and the rest usually just works itself out.
Even for small projects, this is a small thing to introduce but it will pay you dividends in the future. The earlier you start the more you'll thank yourself (it's not very helpful to frantically try and refactor this into a codebase after you've already been bitten!)
An additional thing that is useful here would be a stack trace. So even when you catch, wrap & rethrow the error, you'll be able to see exactly where the error came from. The alternative is searching in the code for the string.
For the hate they seem to get, checked exceptions with error classes do give you a lot of stuff for free.
If you find yourself needing to branch on error classes it may mean error handling is too high up.
P.S. Personally, I always prefer string error codes, i.e. "not-found", as opposed to numeric ones, i.e. 404.
> I always prefer string error codes
My parent company provides an API for us to use for “admin-y” things, but they use stringy errors in the response payload of the body. Except they’re in Mandarin, and you’d be surprised (or maybe not) at how many tools barf at it. Getting them to localise the error codes is as likely to happen as us fixing the Referer header. The really nice thing about status codes is that there’s only a fixed number of them, so you don’t get two slightly different responses from two different endpoints (not-found vs not_found), and there are no locale issues involved.
Error _code_ is code, shouldn't be localized.
Error codes contain only the type of error that occurred and cannot contain any more data. With an error class you can provide context - a 400 happened when making a request, which URL was hit? What did the server say? Which fields in our request were incorrect? From a code perspective, if an error happens I want to know as much detail as possible about it, and that simply cannot be summarised by an error code.
If I want to know the type of an error and do different things based on its type, I can think of no better tool to use than my language's type system handling error classes. I could invent ways to switch on error codes (I hope I'm using a language like Rust that would assert that my handling of the enum of errors is exhaustive), but that doesn't seem very well-founded. For example, using error enums, how do I describe that an HTTP_404 is a type of REQUEST_ERROR, but not a type of NETWORK_CONN_ERROR? It's important to know if the problem is with us or the network. I could write some one-off code to do it, or I could use error classes and have my language's typing system handle the polymorphism for me.
Not that error codes are not useful. You can include an error code within an error class. Error codes are useful for presenting to users so they can reference an operator manual or provide it to customer support. Present the user with a small code that describes the exact scenario instead of an incomprehensible stack trace, and they have a better support experience.
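A rough Go sketch of "an error code inside an error class". The types RequestError and NetworkConnError, their fields, the "REQ-404" code and the fetch function are all invented for this example; errors.As is the only real API involved.

```go
package main

import (
	"errors"
	"fmt"
)

// RequestError is a hypothetical error class: the server answered but
// rejected the request. It carries context plus a short support code.
type RequestError struct {
	Code   string // short code a user can quote to support
	URL    string
	Status int
}

func (e *RequestError) Error() string {
	return fmt.Sprintf("%s: request to %s failed with status %d", e.Code, e.URL, e.Status)
}

// NetworkConnError is a separate class: the request never got an answer.
type NetworkConnError struct {
	Host string
}

func (e *NetworkConnError) Error() string {
	return "cannot connect to " + e.Host
}

// fetch is a stand-in that always returns a wrapped 404.
func fetch(url string) error {
	return fmt.Errorf("fetching profile: %w", &RequestError{Code: "REQ-404", URL: url, Status: 404})
}

// classify branches on the error's type rather than on a bare code.
func classify(err error) string {
	var reqErr *RequestError
	var netErr *NetworkConnError
	switch {
	case errors.As(err, &reqErr):
		return "our request was wrong (" + reqErr.Code + ")"
	case errors.As(err, &netErr):
		return "network problem reaching " + netErr.Host
	default:
		return "unknown"
	}
}

func main() {
	fmt.Println(classify(fetch("https://example.com/users/42")))
	fmt.Println(classify(&NetworkConnError{Host: "db.internal"}))
}
```

The type tells the code whether the problem is with us or with the network, while the code field stays available for showing to a user.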
Side note: please don't use strings for things that have discrete values that you switch on. Use enums.
The way to do this in a safe and performant manner is to structure the metadata as a tree, with each node pointing to its parent metadata. You'd probably want to do some pooling and other optimizations to avoid allocating a map every time. Then all the maps can be immutable and therefore not require any locks. To construct the final map at error time, you simply traverse the tree depth-first, building a merged map.
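Roughly what I have in mind, as a simplified sketch. The Meta type and its methods are hypothetical, each node stores a single key/value pair instead of a map, and the merge walks from the leaf up through its parents rather than doing a full depth-first traversal.

```go
package main

import "fmt"

// Meta is a hypothetical immutable metadata node. Each node holds one
// key/value pair and points at its parent.
type Meta struct {
	parent *Meta
	key    string
	val    any
}

// With returns a child node. The receiver is never modified, so nodes can
// be shared across goroutines without locks. A nil receiver acts as the root.
func (m *Meta) With(key string, val any) *Meta {
	return &Meta{parent: m, key: key, val: val}
}

// Merge walks from this node up to the root and builds a single map.
// Values set closer to the leaf win over values set by ancestors.
func (m *Meta) Merge() map[string]any {
	out := make(map[string]any)
	for n := m; n != nil; n = n.parent {
		if _, seen := out[n.key]; !seen {
			out[n.key] = n.val
		}
	}
	return out
}

func main() {
	var root *Meta
	request := root.With("request_id", "r-1")
	user := request.With("user_id", "A0101")

	// The map is only built here, at "error time".
	fmt.Println(user.Merge())
	// The parent is unaffected by the child.
	fmt.Println(request.Merge())
}
```

Descending into a new context still costs one small allocation per With call, which is the overhead on the happy path.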
I'm not sure I agree with the approach, however. This system will incur a performance and memory penalty every time you descend into a new metadata context, even when no errors are occurring. Building up this contextual data (which presumably already exists on the call stack in the form of local variables) will be constantly going on and causing trouble in hot paths.
A better approach is to return a structured error describing the failed action that includes data known to the returner, which should have enough data to be meaningful. Then, every time you pass an error up the stack, you augment it with additional data so that everything can be gleaned from it. Rather than:
val, err := GetStuff()
if err != nil {
    return err
}
You do:
val, err := GetStuff()
if err != nil {
    return fmt.Errorf("getting stuff: %w", err)
}
Or maybe:
val, err := GetStuff()
if err != nil {
    return wrapWithMetadata(err, meta.KV("database", db.Name))
}
Here, wrapWithMetadata() can construct an efficient error value that implements Unwrap().
This pays the performance cost only at error time, and the contextual information travels up the stack with a tree of error causes that can be gotten with `errors.Unwrap()`. The point is that Go errors already are a tree of causes.
Sometimes tracking contextual information in a context is useful, of course. But I think the benefit of my approach is that a function returning an error only needs to provide what it knows about the failing error. Any "ambient" contextual information can be added by the caller at no extra cost when following the happy path.
The idea is to only add information that the caller isn't already aware of. Error messages shouldn't include the function name or any of its arguments, because the caller will include those in its own wrapping of that error.
This is done with fmt.Errorf():
userId := "A0101"
err := database.Store(userId)
if err != nil {
    return fmt.Errorf("database.Store({userId: %q}): %w", userId, err)
}
If this is done consistently across all layers, and finally logged in the outermost layer, the end result will be nice error messages with all the context needed to understand the exact call chain that failed:
fmt.Printf("ERROR %v\n", err)
Output: ERROR app.run(): room.start({name: "participant5"}): UseStorage({type: "sqlite"}): Store({userId: "A0101"}): the transaction was interrupted
This message shows at a quick glance which participant, which database selection, and which integer value were used when the call failed. Much more useful than Stack Traces, which don't show argument values.
Of course, longer error messages could be written, but it seems optimal to just convey a minimal expression of what function call and argument was being called when the error happened.
Adding to this, the Go code linter forbids writing error messages that start with Upper Case, precisely because it assumes that all this will be done and error messages are just parts of a longer sentence:
And it's completely useless for looking up the errors linked to a participant in an aggregator, which is pretty much the first issue the article talks about, unless you add an entire parsing and extraction layer on top.
> Much more useful than Stack Traces, which don't show argument values.
No idea how universal it is, but there are at least some languages where you can get full stackframes out of the stacktrace.
That's how pytest can show the locals of the leaf function out of the box in case of traceback:
def test_a():
> a(0)
test.py:11:
test.py:2: in a
b(f)
test.py:5: in b
c(f, 5/g)
f = 0, x = 5.0
def c(f, x):
> assert f
E assert 0
test.py:8: AssertionError
and can do so in every function of the trace if requested:
def test_a():
> a(0)
test.py:11:
f = 0
def a(f):
> b(f)
test.py:2:
f = 0, g = 1
def b(f, g=1):
> c(f, 5/g)
test.py:5:
f = 0, x = 5.0
def c(f, x):
> assert f
E assert 0
test.py:8: AssertionError
So this is just a matter of formatting.
return CouldNotStoreUser{
    UserID: userId,
}
and now this struct is available to anyone looking up the chain of wrapped errors.
> With custom error structs however, it's a lot of writing to create your own error type and thus it becomes more of a burden to encourage your team members to do this.
Because you need a type per layer, and that type needs to implement both error and unwrap.
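For a sense of how much writing that is, here is a minimal sketch of one such type. The Cause field, the store function and the errTxInterrupted sentinel are assumptions added for the example; only the CouldNotStoreUser name and its UserID field come from the snippet above.

```go
package main

import (
	"errors"
	"fmt"
)

// CouldNotStoreUser mirrors the struct named above. The Cause field is an
// assumption made for this sketch.
type CouldNotStoreUser struct {
	UserID string
	Cause  error
}

// Error makes the struct satisfy the error interface.
func (e CouldNotStoreUser) Error() string {
	return fmt.Sprintf("could not store user %q: %v", e.UserID, e.Cause)
}

// Unwrap exposes the cause so errors.Is and errors.As can keep walking.
func (e CouldNotStoreUser) Unwrap() error { return e.Cause }

var errTxInterrupted = errors.New("the transaction was interrupted")

// store is a stand-in for a storage layer that always fails.
func store(userID string) error {
	return CouldNotStoreUser{UserID: userID, Cause: errTxInterrupted}
}

// failedUserID extracts the user ID from anywhere in the error chain.
func failedUserID(err error) (string, bool) {
	var e CouldNotStoreUser
	if errors.As(err, &e) {
		return e.UserID, true
	}
	return "", false
}

func main() {
	err := fmt.Errorf("handling request: %w", store("A0101"))
	id, ok := failedUserID(err)
	fmt.Println(id, ok)
	fmt.Println(errors.Is(err, errTxInterrupted))
}
```

That is two short methods per type, and the payoff is that the user ID is a real field instead of something to parse out of a message.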
As a concrete example, it means you can target types with precision in the API layer:
switch e := err.(type) {
case UserNotFound:
    writeJSONResponse(w, 404, "User not found")
case interface { Timeout() bool }:
    if e.Timeout() {
        writeJSONResponse(w, 503, "Timeout")
    }
}
I skimmed the article and didn't see the author proposing a way to do that with their arbitrary key/value map.
Of course, you could use something else like error codes to translate groups of errors. But then why not just use types?
But as I suggested in my other comment, you could also generalize it. For example:
return meta.Wrap(err, "storing user", "userID", userID)
Here, Wrap() is something like:
func Wrap(err error, msg string, kvs ...any) error {
    return &KV{
        KV:    kvs,
        cause: err,
        msg:   msg,
    }
}
This is the inverse of the context solution. The point is to provide data at the point of error, not at every call site.
You can always merge these later into a single map and pay the allocation cost there:
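fields := make(map[string]any) // a nil map would panic on the first write
for err != nil {
    if e, ok := err.(*KV); ok {
        // Keys and values alternate, so step two at a time.
        for i := 0; i+1 < len(e.KV); i += 2 {
            fields[e.KV[i].(string)] = e.KV[i+1]
        }
    }
    err = errors.Unwrap(err)
}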
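Putting those two pieces together into something that compiles, as a sketch only. The field names, the Fields helper and the errDiskFull sentinel are my own choices here, and non-string keys are skipped rather than asserted.

```go
package main

import (
	"errors"
	"fmt"
)

// KV is a sketch of the wrapper type: a message, alternating key/value
// pairs, and the wrapped cause.
type KV struct {
	msg   string
	kvs   []any
	cause error
}

func (e *KV) Error() string { return e.msg + ": " + e.cause.Error() }
func (e *KV) Unwrap() error { return e.cause }

// Wrap attaches a message and key/value pairs to a non-nil err. No map is
// built here; the pairs are just stored.
func Wrap(err error, msg string, kvs ...any) error {
	return &KV{msg: msg, kvs: kvs, cause: err}
}

// Fields merges the pairs from every KV in the chain. Outer wrappers are
// visited first, so an inner value overwrites an outer one for the same key.
func Fields(err error) map[string]any {
	fields := make(map[string]any)
	for err != nil {
		if e, ok := err.(*KV); ok {
			for i := 0; i+1 < len(e.kvs); i += 2 {
				if k, ok := e.kvs[i].(string); ok {
					fields[k] = e.kvs[i+1]
				}
			}
		}
		err = errors.Unwrap(err)
	}
	return fields
}

var errDiskFull = errors.New("disk full")

func main() {
	err := Wrap(errDiskFull, "storing user", "userID", "A0101")
	err = Wrap(err, "handling request", "route", "/users")
	fmt.Println(err)
	fmt.Println(Fields(err))
}
```

The map allocation happens once, in Fields, and only when somebody actually wants the merged view for logging.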
Horses for courses.
for err != nil {
    switch e := err.(type) {
    case UserNotFound:
        writeJSONResponse(w, 404, "User not found")
        return
    case interface { Timeout() bool }:
        if e.Timeout() {
            writeJSONResponse(w, 503, "Timeout")
            return
        }
    }
    err = errors.Unwrap(err)
}
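The same lookup can also be written with errors.As, which does the unwrapping itself. This is a sketch: UserNotFound and timeoutError are stand-in types, statusFor and wrapErr are hypothetical helpers, and the 500 fallback is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// UserNotFound is a stand-in for the type used in the switch above.
type UserNotFound struct{ ID string }

func (e UserNotFound) Error() string { return "user " + e.ID + " not found" }

// timeoutError is a hypothetical error exposing Timeout(), as net.Error does.
type timeoutError struct{}

func (timeoutError) Error() string { return "deadline exceeded" }
func (timeoutError) Timeout() bool { return true }

var errOther = errors.New("boom")

// wrapErr adds one layer of context, as an API layer typically would.
func wrapErr(err error) error {
	return fmt.Errorf("api: %w", err)
}

// statusFor maps an error chain to an HTTP status using errors.As.
func statusFor(err error) int {
	var notFound UserNotFound
	if errors.As(err, &notFound) {
		return 404
	}
	// errors.As also accepts a pointer to any interface type.
	var t interface{ Timeout() bool }
	if errors.As(err, &t) && t.Timeout() {
		return 503
	}
	return 500 // assumed fallback for everything else
}

func main() {
	fmt.Println(statusFor(wrapErr(UserNotFound{ID: "A0101"})))
	fmt.Println(statusFor(wrapErr(timeoutError{})))
	fmt.Println(statusFor(wrapErr(errOther)))
}
```

One difference from the manual loop: errors.As stops at the first error in the chain that has a Timeout() method, so if that one returns false, deeper errors are not consulted.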
I wrote my Go version of this same error wrapping utility for the same reasons: https://github.com/sethgrid/kverr
My current work uses Python and I am hoping to change us over to structured logs that play well with structured exceptions.
The approach I ended up taking is to use slog attributes. It allows for reuse of existing logging attributes.
This is explained here (skip to the “adding metadata” portion). https://blog.gregweber.info/blog/go-errors-library/
Go package: https://pkg.go.dev/github.com/gregwebs/errors/slogerr
This approach has been successful in major projects at work and on the side for several years now.
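To illustrate the general idea with only the standard library (Go 1.21+ for log/slog): this is not the slogerr API, and attrError, withAttrs, attrsOf and example are hypothetical names for this sketch.

```go
package main

import (
	"context"
	"errors"
	"log/slog"
	"os"
)

// attrError is a hypothetical wrapper that carries slog attributes.
type attrError struct {
	cause error
	attrs []slog.Attr
}

func (e *attrError) Error() string { return e.cause.Error() }
func (e *attrError) Unwrap() error { return e.cause }

// withAttrs attaches attributes to err without changing its message.
func withAttrs(err error, attrs ...slog.Attr) error {
	return &attrError{cause: err, attrs: attrs}
}

// attrsOf gathers attributes from every attrError in the chain,
// outermost wrapper first.
func attrsOf(err error) []slog.Attr {
	var out []slog.Attr
	for err != nil {
		if e, ok := err.(*attrError); ok {
			out = append(out, e.attrs...)
		}
		err = errors.Unwrap(err)
	}
	return out
}

var errDenied = errors.New("permission denied")

// example builds an error that picked up attributes at two layers.
func example() error {
	err := withAttrs(errDenied, slog.String("user_id", "A0101"))
	return withAttrs(err, slog.String("route", "/posts"))
}

func main() {
	err := example()
	logger := slog.New(slog.NewJSONHandler(os.Stderr, nil))
	attrs := append([]slog.Attr{slog.String("error", err.Error())}, attrsOf(err)...)
	logger.LogAttrs(context.Background(), slog.LevelError, "request failed", attrs...)
}
```

Because the attributes are plain slog.Attr values, the same helpers that build logging attributes elsewhere can be reused when annotating errors.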
Example:
err := oops.
    Code("iam_missing_permission").
    In("authz").
    Tags("authz").
    Time(time.Now()).
    With("user_id", 1234).
    With("permission", "post.create").
    Hint("Runbook: https://doc.acme.org/doc/abcd.md").
    User("user-123", "firstname", "john", "lastname", "doe").
    Errorf("permission denied")
For easier debugging, the error contains the full stacktrace.
altbdoor•1d ago
pjmlp•1d ago
candiddevmike•1d ago
Doesn't this contradict the first part of your post? Kubernetes for instance was ported from Java to Go (albeit, poorly). Is Java worse than Go?
pjmlp•1d ago
Also, it is quite ironic that, given the Java bashing in the Go community, so little was learned from Java's evolution and design mistakes.
They even ended up having to reach out to the same folks who helped design Java generics.
As for your remark, the actual rewrite history, as told at FOSDEM, is that the rewrite only happened because two strong-minded Go devs joined the Kubernetes team and pushed heavily for it.
jamesrr39•23h ago
9rx•15h ago
To be fair, it was made abundantly clear when it was first released unto the world that it was intended to feel like a dynamically-typed language, but with performance characteristics closer to statically-typed languages. What little type system it has is there merely to support the performance goals. If they had figured out how to deliver on the performance end as a strictly dynamically-typed language, it is likely it would have gone without a static type system entirely.
Call it disdain if you will, but it is not like there weren't already a million other languages with modern, advanced type systems. That market was already, and continues to be, flooded with many lovely languages to choose from. Go becoming yet another just like all the rest would have been rather pointless. "Like Python, but faster" was the untapped market at the time – and serving that market is why it is now a household name instead of being added to the long list of obscure languages that, while technically interesting, all do the same thing.
gregors•19h ago
Concurrency is also something they got more or less right. The most important thing is that they invented their (Google) language that they could exert complete control over. From a business perspective specifically, that was much better than Java.
TheDong•15h ago
Walking on burning coals is a step up if you're coming from C.
We shouldn't grade languages on that much of a curve by comparing them to garbage.
> Concurrency is also something they got more or less right
Except data-races are an incredibly common bug in Go, and an incredibly rare bug in Rust.
Data-races are way more common in Go than in modern C++ or Java, if only because mutex semantics in Go are awful, with many programmers opting for manual "lock" and "unlock" calls, and mentally reasoning about the correct scope for the critical section with no compiler assistance.
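For reference, the usual convention looks something like this sketch. Counter and countTo are made-up names; the point is that nothing in the type system links mu to n, so the pairing is purely by discipline.

```go
package main

import (
	"fmt"
	"sync"
)

// Counter guards n with a mutex. The compiler does not know that mu
// protects n; that association is a convention the programmer maintains.
type Counter struct {
	mu sync.Mutex
	n  int
}

// Inc pairs Lock with a deferred Unlock so the critical section ends when
// the function returns, even on an early return or a panic.
func (c *Counter) Inc() {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.n++
}

// Value reads n under the same lock.
func (c *Counter) Value() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.n
}

// countTo starts one goroutine per worker, each incrementing the counter once.
func countTo(workers int) int {
	var c Counter
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.Inc()
		}()
	}
	wg.Wait()
	return c.Value()
}

func main() {
	fmt.Println(countTo(100))
}
```

Accessing c.n directly from a goroutine without taking the lock would still compile; the race detector (go run -race) can catch it at run time, but only on code paths that actually execute.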
I will give you that they made concurrency very easy.