Sentinel errors and errors.Is() slow your code down by 500%

In this blog post, we benchmark different strategies for handling errors in Go and discuss their relative performance and other tradeoffs. The difference in performance between different strategies was very surprising to us, and we'd like to share the results.

The original publication of this blog overstated the difference in performance between error handling strategies due to poorly configured benchmarks. The author regrets the error.

In particular, we were shocked to learn that naively using the sentinel error pattern combined with errors.Is() slows your code down by over 5x.

Methodology

I wrote several different fake object stores with similar methods to retrieve an object, with different method signatures and different ways to represent the value being not found. Here's one of them, which follows a common recommendation in the Go community to use a sentinel error to represent the "value not found" condition.

var notFoundErr = errors.New("not found")

type resultType struct {}

type errStore struct {}

//go:noinline
func (b *errStore) GetValue(found bool) (*resultType, error) {
	if found {
		return &resultType{}, nil
	} else {
		return nil, notFoundErr
	}
}

Then I set up a benchmark that calls this function over and over, testing both the case where the value is found and where it's not found, in the same way a client calling this method would need to do.

func BenchmarkNotFoundErrEqual(b *testing.B) {
	var es errStore
	for i := 0; i < b.N; i++ {
		val, err := es.GetValue(i < 0)
		if err == notFoundErr {
			// nothing to do
		} else if err != nil {
			b.Fatal(err)
		}
		if val != nil {
			b.Fatal("expected nil")
		}
	}
}

func BenchmarkFoundErrEqual(b *testing.B) {
	var es errStore
	for i := 0; i < b.N; i++ {
		val, err := es.GetValue(i >= 0)
		if err == notFoundErr {
			b.Fatal("expected found")
		} else if err != nil {
			b.Fatal(err)
		}
		if val == nil {
			b.Fatal("expected not nil")
		}
	}
}

Let's look at the results.

Benchmark results

Here's the raw benchmark output:

$ go test not_found_test.go  -run='.*' -bench=. -count=10 > benchresults.txt
$ benchstat benchresults.txt
goos: linux
goarch: amd64
cpu: AMD EPYC 7571
                              │ benchresults.txt │
                              │      sec/op      │
NotFoundBool-16                      3.423n ± 0%
NotFoundErrorsIs-16                  19.35n ± 0%
NotFoundErrorsIsNilCheck-16          19.34n ± 0%
NotFoundErrEqual-16                  7.366n ± 1%
NotFoundErrEqualNilCheck-16          8.293n ± 0%
NotFoundWrappedErr-16                1.374µ ± 0%
NotFoundWrappedErrNilCheck-16        1.375µ ± 0%
NotFoundWrappedBool-16               11.69n ± 0%
NotFoundPanic-16                     241.5n ± 1%
FoundBool-16                         3.050n ± 2%
FoundErrorsIs-16                     18.04n ± 1%
FoundErrorsIsNilCheck-16             2.939n ± 2%
FoundErrEqual-16                     3.240n ± 2%
FoundErrEqualNilCheck-16             2.994n ± 1%
FoundWrappedErr-16                   23.77n ± 2%
FoundWrappedErrNilCheck-16           9.877n ± 0%
FoundWrappedBool-16                  9.082n ± 0%
FoundPanic-16                        12.77n ± 0%
geomean                              17.22n

We'll look in depth at each of these strategies in a moment. The tables below list the error handling strategies from fastest to slowest. We break the results down into the "found" and "not found" scenarios, since their relative performance is different in each one.

First, here's the speed and relative performance of each strategy when the value is not found:

Strategy            Speed (less is better)    Multiple of fastest strategy
Bool                3.423 ns/op               1.00
ErrEqual            7.366 ns/op               2.15
ErrEqualNilCheck    8.293 ns/op               2.42
ErrorsIs            19.35 ns/op               5.65
ErrorsIsNilCheck    19.34 ns/op               5.65
Panic               241.5 ns/op               70.55

And here's the same performance data when the value is found:

Strategy            Speed (less is better)    Multiple of fastest strategy
ErrorsIsNilCheck    2.939 ns/op               1.00
ErrEqualNilCheck    2.994 ns/op               1.02
Bool                3.05 ns/op                1.04
ErrEqual            3.24 ns/op                1.10
Panic               12.77 ns/op               4.35
ErrorsIs            18.04 ns/op               6.14

As you can see, how you design your APIs and handle errors in Go matters quite a lot for performance.

Let's examine each strategy, going in order from fastest to slowest.

Bool: don't return an error

The fastest way to handle a "not found" condition in Go is to not represent it as an error. This is labeled as the Bool strategy above. Here's how it's implemented:

type boolStore struct {}

//go:noinline
func (b *boolStore) GetValue(found bool) (*resultType, bool, error) {
	if found {
		return &resultType{}, true, nil
	} else {
		return nil, false, nil
	}
}

As you can see, this method does not return an error when the value isn't found, instead returning a boolean found result. Here's how it's benchmarked:

func BenchmarkNotFoundBool(b *testing.B) {
	var bs boolStore
	for i := 0; i < b.N; i++ {
		val, found, err := bs.GetValue(i < 0)
		if err != nil {
			b.Fatal(err)
		} else if found {
			b.Fatal("expected not found")
		}
		if val != nil {
			b.Fatal("expected nil")
		}
	}
}

func BenchmarkFoundBool(b *testing.B) {
	var bs boolStore
	for i := 0; i < b.N; i++ {
		val, found, err := bs.GetValue(i >= 0)
		if err != nil {
			b.Fatal(err)
		} else if !found {
			b.Fatal("expected found")
		}
		if val == nil {
			b.Fatal("expected not nil")
		}
	}
}

This strategy performs the best in both scenarios, where the value is found and where it's not. Technically speaking this isn't even an "error handling" strategy, since no error is returned. Rather, it's a baseline to serve as a point of comparison for other strategies that do return errors.

ErrEqual: sentinel errors with direct equality check

This strategy uses classic Go sentinel errors to represent the "value not found" condition. It looks like this (duplicated from the "Methodology" section above).

type errStore struct {}

//go:noinline
func (b *errStore) GetValue(found bool) (*resultType, error) {
	if found {
		return &resultType{}, nil
	} else {
		return nil, notFoundErr
	}
}

When we benchmark this strategy, we use direct == comparison with an error constant to detect the not found condition.

func BenchmarkNotFoundErrEqual(b *testing.B) {
	var es errStore
	for i := 0; i < b.N; i++ {
		val, err := es.GetValue(i < 0)
		if err == notFoundErr {
			// nothing to do
		} else if err != nil {
			b.Fatal(err)
		}
		if val != nil {
			b.Fatal("expected nil")
		}
	}
}

func BenchmarkFoundErrEqual(b *testing.B) {
	var es errStore
	for i := 0; i < b.N; i++ {
		val, err := es.GetValue(i >= 0)
		if err == notFoundErr {
			b.Fatal("expected found")
		} else if err != nil {
			b.Fatal(err)
		}
		if val == nil {
			b.Fatal("expected not nil")
		}
	}
}

In the "not found" scenario, this approach is about 2x slower than using boolean existence checks. For the case where the value is found, it performs the same.

Note that checking for sentinel errors with == is no longer recommended for correctness reasons. The following sections explain why.

ErrEqualNilCheck: sentinel errors wrapped with an err != nil check

There are two roughly equally idiomatic ways to check for a sentinel error in Go. The first is more common in our experience, and is what's done in the ErrEqual strategy:

		if err == notFoundErr {
			// nothing to do
		} else if err != nil {
			b.Fatal(err)
		}

So you begin by checking for all sentinel error values you want to handle, either in a switch or an if / else chain, and then if the error isn't one of the handled types, pass the error up the stack, panic, or otherwise handle it.

Alternatively, you can perform the above logic only after first checking that the error is non-nil. This code is slightly longer and more indented, and is what the ErrEqualNilCheck strategy does.

func BenchmarkNotFoundErrEqualNilCheck(b *testing.B) {
	var es errStore
	for i := 0; i < b.N; i++ {
		val, err := es.GetValue(i < 0)
		if err != nil {
			if err == notFoundErr {
				// nothing to do
			} else {
				b.Fatal(err)
			}
		}
		if val != nil {
			b.Fatal("expected nil")
		}
	}
}

func BenchmarkFoundErrEqualNilCheck(b *testing.B) {
	var es errStore
	for i := 0; i < b.N; i++ {
		val, err := es.GetValue(i >= 0)
		if err != nil {
			if err == notFoundErr {
				b.Fatal("expected found")
			} else {
				b.Fatal(err)
			}
		}
		if val == nil {
			b.Fatal("expected not nil")
		}
	}
}

This strategy is slightly worse than the previous one in the "not found" case because of the added cost of the nil check, weighing in at about 2.5x slower than boolean existence checks. But in the (usually more common) case when the value is found, it performs just as well as the boolean strategy.

ErrorsIs: more robust sentinel error detection

The original sentinel error pattern using == causes problems when other parts of a library wrap the original error in their own error type to add information, because the wrapped error no longer compares equal to the sentinel. Many linters will warn about using == for sentinel errors, and my JetBrains IDE puts a yellow squiggly under it for the same reason.

Modern best practice is to instead use errors.Is() for sentinel error checks, which correctly detects wrapped errors.

In our benchmarks, this strategy looks like this:

func BenchmarkNotFoundErrorsIs(b *testing.B) {
	var es errStore
	for i := 0; i < b.N; i++ {
		val, err := es.GetValue(i < 0)
		if errors.Is(err, notFoundErr) {
			// nothing to do
		} else if err != nil {
			b.Fatal(err)
		}
		if val != nil {
			b.Fatal("expected nil")
		}
	}
}

func BenchmarkFoundErrorsIs(b *testing.B) {
	var es errStore
	for i := 0; i < b.N; i++ {
		val, err := es.GetValue(i >= 0)
		if errors.Is(err, notFoundErr) {
			b.Fatal("expected found")
		} else if err != nil {
			b.Fatal(err)
		}
		if val == nil {
			b.Fatal("expected not nil")
		}
	}
}

Unfortunately, this strategy has bad performance, at 5.5x slower than the Bool strategy when the value is not found, and 6x slower when it is.

ErrorsIsNilCheck: better performance on the happy path

Just as we saw with ErrEqualNilCheck, we can avoid a large performance penalty in the case that the value is found by only calling errors.Is() after first determining the error is non-nil. Here's the benchmark code:

func BenchmarkNotFoundErrorsIsNilCheck(b *testing.B) {
	var es errStore
	for i := 0; i < b.N; i++ {
		val, err := es.GetValue(i < 0)
		if err != nil {
			if errors.Is(err, notFoundErr) {
				// nothing to do
			} else {
				b.Fatal(err)
			}
		}
		if val != nil {
			b.Fatal("expected nil")
		}
	}
}

func BenchmarkFoundErrorsIsNilCheck(b *testing.B) {
	var es errStore
	for i := 0; i < b.N; i++ {
		val, err := es.GetValue(i >= 0)
		if err != nil {
			if errors.Is(err, notFoundErr) {
				b.Fatal("expected found")
			} else {
				b.Fatal(err)
			}
		}
		if val == nil {
			b.Fatal("expected not nil")
		}
	}
}

In the not found scenario the extra nil check makes no measurable difference here: this strategy performs about the same as ErrorsIs, at about 5.5x slower than Bool. But it is just as fast as Bool when the value is found.

Panic: the slowest error handling technique

Usually errors in Go are returned as values, but there is also another technique: you can panic instead, then recover that panic at a higher layer of the stack. Most people don't recommend doing this as an error handling technique, and we don't either. But there are some limited cases where it can be more performant than returning errors. That's not the case for this benchmark though. Panic performs the worst of any strategy. Here's what it looks like:

type panicStore struct{}

//go:noinline
func (b *panicStore) GetValue(found bool) *resultType {
	if found {
		return &resultType{}
	} else {
		panic(notFoundErr)
	}
}

func BenchmarkNotFoundPanic(b *testing.B) {
	var es panicStore
	for i := 0; i < b.N; i++ {
		var val *resultType
		func() {
			defer func() {
				// recover panic
				err := recover()
				if err == nil {
					b.Fatal("expected panic")
				}
			}()
			val = es.GetValue(i < 0)
		}()
		if val != nil {
			b.Fatal("expected nil")
		}
	}
}

func BenchmarkFoundPanic(b *testing.B) {
	var es panicStore
	for i := 0; i < b.N; i++ {
		var val *resultType
		func() {
			defer func() {
				// recover panic
				err := recover()
				if err != nil {
					b.Fatal("unexpected panic")
				}
			}()
			val = es.GetValue(i >= 0)
		}()
		if val == nil {
			b.Fatal("expected not nil")
		}
	}
}

This code is about 70x slower than boolean return results in the not found case (241.5 ns/op versus 3.423 ns/op), and about 4x slower in the found case. So in addition to being risky (a panic can halt your program if you forget to recover), panicking is far slower than returning an error value when the condition actually occurs.

What about error wrapping?

These are microbenchmarks specifically engineered to zero in on the differences between the error handling strategies. But real application code looks very different. One of the biggest differences is that errors are typically passed through several layers of the stack in between other pieces of business logic. I was curious to know how this difference impacted the benchmarks. Specifically, I wanted to know how the practice of error wrapping impacted the performance of sentinel errors, and how it related to the Boolean return strategy.

So I wrote two additional implementations of object stores that simulate an error deeper in the stack, getting wrapped at every layer. Here's what it looks like:

type wrappedErrStore struct{}

//go:noinline
func (b *wrappedErrStore) GetValue(found bool) (*resultType, error) {
	result, err := b.queryValueStore(found)
	if err != nil {
		return nil, fmt.Errorf("GetValue couldn't get a value: %w", err)
	}
	return result, nil
}

//go:noinline
func (b *wrappedErrStore) queryValueStore(found bool) (*resultType, error) {
	result, err := b.queryDisk(found)
	if err != nil {
		return nil, fmt.Errorf("queryValueStore couldn't get a value: %w", err)
	}
	return result, nil
}

//go:noinline
func (b *wrappedErrStore) queryDisk(found bool) (*resultType, error) {
	result, err := b.readValueFromDiskFake(found)
	if err != nil {
		return nil, fmt.Errorf("queryDisk couldn't get a value: %w", err)
	}
	return result, nil
}

//go:noinline
func (b *wrappedErrStore) readValueFromDiskFake(found bool) (*resultType, error) {
	if found {
		return &resultType{}, nil
	} else {
		return nil, notFoundErr
	}
}

I also have another identical implementation that returns a boolean value to indicate presence or absence of the value, rather than notFoundErr (omitted for brevity), symmetrical to the Bool strategy. When I run the same benchmarks on these two implementations, I get the following results.

When the value is not found:

Strategy              Speed (less is better)    Multiple of fastest strategy
WrappedBool           11.69 ns/op               1.00
WrappedErr            1374 ns/op                117.54
WrappedErrNilCheck    1375 ns/op                117.62

When the value is found:

Strategy              Speed (less is better)    Multiple of fastest strategy
WrappedBool           9.082 ns/op               1.00
WrappedErrNilCheck    9.877 ns/op               1.09
WrappedErr            23.77 ns/op               2.62

As you can see, wrapped errors severely degrade the performance of this strategy, making sentinel error checks 120x slower than the boolean check when the error is non-nil. That's bad! Much of the difference in time on this benchmark doesn't come directly from errors.Is() (although it is slower for wrapped errors, that's to be expected). Instead, this difference primarily reflects the fact that creating the wrapped errors is itself expensive. But this is a fair comparison, since in practice your code must also wrap errors to use this strategy.

On the other hand, the difference in strategies is partially smeared by the introduction of more stack layers in the case the object exists, to the point that the nil-guarded errors.Is() check is only 10% slower than boolean checks.
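For readers who want to picture the boolean variant that was omitted above, here is a guess at its shape. This is an assumption, not the code that was benchmarked: the type name wrappedBoolStore and the exact signatures are hypothetical, mirroring wrappedErrStore layer for layer.

```go
package main

import "fmt"

type resultType struct{}

// wrappedBoolStore is a hypothetical reconstruction of the omitted store.
type wrappedBoolStore struct{}

//go:noinline
func (b *wrappedBoolStore) GetValue(found bool) (*resultType, bool, error) {
	result, exists, err := b.queryValueStore(found)
	if err != nil {
		return nil, false, fmt.Errorf("GetValue couldn't get a value: %w", err)
	}
	return result, exists, nil
}

//go:noinline
func (b *wrappedBoolStore) queryValueStore(found bool) (*resultType, bool, error) {
	result, exists, err := b.queryDisk(found)
	if err != nil {
		return nil, false, fmt.Errorf("queryValueStore couldn't get a value: %w", err)
	}
	return result, exists, nil
}

//go:noinline
func (b *wrappedBoolStore) queryDisk(found bool) (*resultType, bool, error) {
	result, exists, err := b.readValueFromDiskFake(found)
	if err != nil {
		return nil, false, fmt.Errorf("queryDisk couldn't get a value: %w", err)
	}
	return result, exists, nil
}

//go:noinline
func (b *wrappedBoolStore) readValueFromDiskFake(found bool) (*resultType, bool, error) {
	if found {
		return &resultType{}, true, nil
	}
	// Not found is reported through the bool, so no error is constructed.
	return nil, false, nil
}

func main() {
	var s wrappedBoolStore
	val, exists, err := s.GetValue(false)
	fmt.Println(val == nil, exists, err) // true false <nil>
}
```

In a design like this, the not found result passes through every layer as a plain false, so the wrapping branches never run on that path.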

Takeaways and discussion

If you want to reproduce this result or variations of it yourself, you can find the full source here.

So what's the takeaway from all this? There are four main points in my view:

  1. errors.Is() is expensive. If you use it, check the error is non-nil first to avoid a pretty big performance penalty on the happy path.
  2. Using == to check for sentinel errors is likewise expensive, but less so. If you do this, check the error is non-nil first to make it cheaper on the happy path. But because of error wrapping, you probably shouldn't do this at all.
  3. Error wrapping makes using sentinel errors much more expensive, including making errors.Is() more expensive when the error is non-nil.
  4. Using sentinel errors is as performant as other techniques on the happy path if you take the above precautions, but unavoidably much more expensive on the error path.

So the main takeaway is really that sentinel errors can be expensive, and you should consider this when deciding whether to use them.

The standard caveat for performance-based arguments applies here: measure what difference it actually makes in your case, and don't sweat stuff that isn't on the hot path most of the time. We are talking about nanoseconds here, and it takes a lot of those to add up to something noticeable. If sentinel errors are measurably slow for you in practice, it's probably because the ones you're using are some combination of commonly triggered and expensive to construct. YMMV. That said: we have found expensive-to-construct, commonly triggered sentinel errors when profiling our database, and we changed the code to remove them, squeezing out several percentage points of improvement in the process.

But there are also non-performance reasons you might want to avoid sentinel errors, which are more philosophical or aesthetic in nature.

Earlier this month I kicked the hornet's nest by suggesting that you shouldn't name boolean map check variables ok. It got picked up a few places in the Golang world and generated a lot of discussion on Reddit and elsewhere, which I had expected. But I hadn't expected that so much of the conversation would focus on an almost tangential point I made about API design, specifically about an API returning a NotFound sentinel error:

Separating out an existence check from an error condition is a good thing, actually. You never want to force clients to check for a particular error type in business logic.

...

An error should by default be considered non-recoverable, to be returned when something goes very wrong. Semantically, a table not existing isn't really an error, it's something you expect to happen all the time. An interface that returns an error in the course of normal operation forces clients to understand a lot of details to use the interface correctly (looking at you, io.EOF). And errors can be expensive to construct and examine, so you don't want them constructed or interpreted on your hot path.

I had thought this was a commonly understood nugget of best practice wisdom, but this opinion got a lot of pushback. People said some very hurtful, arguably accurate things about my character. Here's one of the more thoughtful examples from the discussion on Lobste.rs:

lobste.rs discussion

Now, obviously I disagree. And I think these numbers pretty clearly refute the classical view of sentinel error handling in Go, summarized by the comment above:

Errors are normal outcomes of any operation. They aren’t special and aren’t any more or less expensive to deal with than any other type. They are, classically, just values.

The problem is that sentinel errors, as typically and idiomatically used, in fact are special, and are more expensive to deal with than other values. My suggestion to use boolean values outperforms them by a lot, 6x in fairly common idiomatic usage and potentially much more if they're expensive to construct.

And performance considerations aside, sentinel errors have a lot of other issues.

I'm not the first person to note this. Here's Dave Cheney back in 2016 on this topic:

My advice is to avoid using sentinel error values in the code you write. There are a few cases where they are used in the standard library, but this is not a pattern that you should emulate.

Instead, he recommends never inspecting the details of an error:

Now we come to the third category of error handling. In my opinion this is the most flexible error handling strategy as it requires the least coupling between your code and caller.

I call this style opaque error handling, because while you know an error occurred, you don’t have the ability to see inside the error. As the caller, all you know about the result of the operation is that it worked, or it didn’t.

This is all there is to opaque error handling – just return the error without assuming anything about its contents.

Dave didn't invent this advice. Going back further, lots of luminaries and philosophers in the field have given the same advice in other contexts and for other languages: don't use errors for control flow. I'm old enough to have seen Josh Bloch in person at a Java conference, where he was something of a celebrity. He gave this advice on error handling in Effective Java way back in 2001:

Exceptions are, as their name implies, to be used only for exceptional conditions; they should never be used for ordinary control flow.

...

This principle also has implications for API design. A well-designed API must not force its clients to use exceptions for ordinary control flow. A class with a “state-dependent” method that can be invoked only under certain unpredictable conditions should generally have a separate “state-testing” method indicating whether it is appropriate to invoke the state-dependent method.

Now obviously Exceptions in Java are not the same as errors in Go. But just as obviously, they serve the same purpose. Functionally and semantically, there is not much difference between these two code snippets:

try {
    val = store.getValue();
} catch (NotFoundException e) {
    // handle not found
}

val, err := store.GetValue()
if errors.Is(err, notFoundErr) {
    // handle not found
}

They're both workable and idiomatic. But they also both have serious drawbacks, for different but related reasons. And they're both improved by changing the API to eliminate the need for the client to interpret the not found case as an error during normal operation. Here the two languages diverge: because Java lacks multiple return values, it's not ergonomic to add additional metadata like a found boolean into the return result in most cases, so you tend to do something like this instead:

if (store.hasValue()) {
    val = store.getValue();
}

Bloch anticipates that people might be tempted to use exceptions for control flow for performance reasons:

More generally, use standard, easily recognizable idioms in preference to overly clever techniques that purport to offer better performance.

After all, it really is more expensive to call two methods (Has(), Get()) than one (Get()), even with caching. But because Go does have multiple return values, you don't have to make this tradeoff the way you do in Java. You can just return more information and save yourself the extra method call, while not forcing clients to examine the error value.

val, found, err := store.GetValue()
if err != nil {
    return err // opaque error handling
}
if !found {
    // handle not found
}

Some people are bothered by the fact that a method named GetValue() may not succeed in getting a value. If that sounds like you, then name it MaybeGetValue() instead. Other people are bothered by the code overhead of a found boolean in the return. If that sounds like you, you might be able to get away with using a nil return value instead (as long as nil isn't a valid object in your API). Either way, understand that inspecting errors for business logic can come at a substantial performance cost.
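Here's a minimal sketch of the nil-return variant, assuming nil is never a valid stored object. The nilStore type, its map field, and the key parameter are hypothetical and exist only for illustration.

```go
package main

import "fmt"

type resultType struct{}

// nilStore is a hypothetical store backed by a map.
type nilStore struct {
	values map[string]*resultType
}

// MaybeGetValue returns (nil, nil) when the key is absent. The error
// return is reserved for real failures, which this toy store never has.
func (s *nilStore) MaybeGetValue(key string) (*resultType, error) {
	// Indexing a map with a missing key yields the zero value: a nil pointer.
	return s.values[key], nil
}

func main() {
	s := &nilStore{values: map[string]*resultType{"a": &resultType{}}}

	val, err := s.MaybeGetValue("b")
	if err != nil {
		fmt.Println("error:", err) // opaque error handling
		return
	}
	if val == nil {
		fmt.Println("not found")
	}
}
```

The tradeoff is that absence is implicit in the return value, so callers have to remember the nil check. The explicit found boolean makes that harder to forget.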

Conclusion

We're building Dolt, the world's first version-controlled SQL database. We have been writing it for over five years now, and our codebase has quite a few sentinel errors, many of which we inherited from other libraries, but some of which we wrote ourselves. These days we avoid sentinel errors and generally treat all errors as opaque. We also wrap errors very sparingly, preferring to use stack traces instead where appropriate.

Have questions or comments about Go error handling? Or maybe you are curious about the world's first version-controlled SQL database? Join us on Discord to talk to our engineering team and other Dolt users.
