Go range iterators demystified

July 12, 2024


Introduction

We're using Go to write Dolt, the world's first version-controlled SQL database. Like most large Go codebases, we have a lot of collection types that we iterate over. New in Go 1.23, you can now use the range keyword to iterate over custom collection types.

How does that work? Is it a good idea? Let's dive in.

If you want to run any of the code in this tutorial, you'll need to install the Go 1.23 release candidate, or run Go 1.22 with this in your environment:

export GOEXPERIMENT=rangefunc

What are range iterators?

I don't know about you, but I found the explanation and examples from the experiment documentation very confusing. But the release notes for 1.23 do a much better job summarizing the feature (although not of providing examples).

Range iterators are function types you can use with the built-in range keyword starting in Go 1.23. From the release notes:

The “range” clause in a “for-range” loop now accepts iterator functions of the following types:

func(func() bool)
func(func(K) bool)
func(func(K, V) bool)

These correspond to the three forms of loop you can write with range (ignoring channels), with zero, one, or two loop variables:

// No loop variables; ranging over an integer is new in Go 1.22
for range 10 { ... }

// Just the indexes (or just the keys for a map)
for i := range mySlice { ... }

// Indexes and values (or keys and values for a map)
for i, s := range mySlice { ... }

Prior to Go 1.23, the range keyword only worked with built-in types: slices, arrays, strings, maps, and (since Go 1.22) integers (let's keep ignoring channels). Now it works with these special function types as well.

Here's a simple example:

func iter1(yield func(i int) bool) {
	for i := range 3 {
		if !yield(i) {
			return
		}
	}
}

iter1 is a range iterator that will run three times. You use it like this:

func testFuncRange1() {
	for i := range iter1 {
		fmt.Println("iter1", i)
	}
}

When you run this code, it prints the following:

iter1 0
iter1 1
iter1 2

Types of range iterators

There are three types of range iterator functions, one for each form of the range loop. Their yield functions take 0, 1, or 2 arguments. The one above takes 1 argument. Here's one that takes 0 arguments:

func iter0(yield func() bool) {
	for range 3 {
		if !yield() {
			return
		}
	}
}

func testFuncRange0() {
	for range iter0 {
		fmt.Println("iter0")
	}
}

If you run this, it prints:

iter0
iter0
iter0

Here's an example that takes 2 arguments:

func iter2(yield func(int, int) bool) {
	for i := range 3 {
		if !yield(i, i+1) {
			return
		}
	}
}

func testFuncRange2() {
	for i, e := range iter2 {
		fmt.Println("iter2", i, e)
	}
}

If you run this, it prints:

iter2 0 1
iter2 1 2
iter2 2 3

What does `yield()` do?

The yield function accepted by a range iterator is what invokes the body of the loop. When you write a range loop with an iterator, the Go compiler converts it into function calls for you. So this code:

func testFuncRange2() {
	for i, e := range iter2 {
		fmt.Println("iter2", i, e)
	}
}

Gets implicitly converted by the compiler to something like this code:

func testFuncRange2() {
	iter2(func(i int, e int) bool {
		fmt.Println("iter2", i, e)
		return true
	})
}

When you call yield() in your range iter function, that's you invoking the body of the loop. When you check the return value of yield, that's you checking to see if the loop should continue or not -- there might have been a break or return statement.

What if you don't check the result of yield()? Then your program will panic in the event there is a break:

func brokenIter(yield func(i int) bool) {
	for i := range 3 {
		yield(i+1)
	}
}

func testBrokenIter() {
	for i := range brokenIter {
		fmt.Println("brokenIter", i)
		if i > 1 {
			break
		}
	}
}

When you run this, you get this panic:

brokenIter 1
brokenIter 2
panic: runtime error: range function continued iteration after function for loop body returned false

What if you want to not call yield() for every element, or call it more than once, or call it with different arguments? Well, you can, nothing stops you, and this leads to some interesting use cases.

Use cases

So why would you use a range iterator? Basically: so that you can use the range keyword to iterate over a collection that isn't a map or a slice. If you don't want to do that, there's no reason to use them.

That said, there are some interesting things you can use them for. Let's look at a couple.

For all these examples, we'll define a basic named type just so that we can define methods on it. Since the main use case of range iterators is custom collection types, we expect you'll mostly see them as methods invoked on a collection object.

type Slice []int

For all of our examples, we'll make sure to conditionally break iteration to test that our iterators handle that correctly.

Using the `range` keyword with a custom collection

Maybe you just want to use the range keyword to iterate over every element of your collection. Easy enough.

func (s Slice) All() func(yield func(i int) bool) {
	return func(yield func(i int) bool) {
		for i := range s {
			if !yield(s[i]) {
				return
			}
		}
	}
}

Call it like this:

func iterAll(slice Slice) {
	for i := range slice.All() {
		fmt.Println("all iter:", i)
		if i > 10 {
			break
		}
	}
}

Yes, this is not strictly necessary in the case of our Slice, since its underlying type is []int and you can already range over it directly. But it demonstrates what's possible for all collection types.

Filtering values

I have a collection of elements, and I want to iterate over just the ones that match certain criteria. Let's write a range iterator that iterates over only the prime numbers in a collection:

func (s Slice) Primes() func(yield func(i int) bool) {
	return func(yield func(i int) bool) {
		for i := range s {
			if big.NewInt(int64(s[i])).ProbablyPrime(0) {
				if !yield(s[i]) {
					return
				}
			}
		}
	}
}

I can call it like this:

func iterPrimes(slice Slice) {
	for i := range slice.Primes() {
		fmt.Println("prime number:", i)
		if i > 10 {
			break
		}
	}
}

We can generalize this filtering approach by making a method that accepts a predicate function:

func (s Slice) FilteredIter(predicate func(i int) bool) func(yield func(i int) bool) {
	return func(yield func(i int) bool) {
		for i := range s {
			if predicate(s[i]) {
				if !yield(s[i]) {
					return
				}
			}
		}
	}
}

Now I can call this iterator with any predicate I want to select which elements in the collection to iterate over. Here's one that iterates over only even numbers.

func iterEvens(slice Slice) {
	for i := range slice.FilteredIter(func(i int) bool {
		return i%2 == 0
	}) {
		fmt.Println("even number:", i)
		if i > 10 {
			break
		}
	}
}

Iterating while handling errors

For some collections, during iteration you might need to perform I/O or some other operation that could fail in order to produce the next element. Range iterators give a very succinct way to express this. Just define your iterator like this:

func (s Slice) ErrorIter() func(yield func(i int, e error) bool) {
	return func(yield func(i int, e error) bool) {
		for _, i := range s {
			// If there's an error getting the next element,
			// pass it into the yield function as the second parameter
			if !yield(i, nil) {
				return
			}
		}
	}
}

This uses the 2-parameter variation of a range loop, which in the case of a slice or a map returns the index and element or the key and value, respectively. We're slightly abusing that convention so that our iterator returns either the element if it can be gotten, or an error.

func iterWithErr(slice Slice) error {
	for i, err := range slice.ErrorIter() {
		if err != nil {
			return err
		}
		fmt.Printf("error iter got value: %d\n", i)
		if i > 10 {
			break
		}
	}
	return nil
}

Maybe this makes you feel uncomfortable, which is understandable. It may not be what the Go authors had in mind for these iterators, because it breaks the idiom of what the values in range mean. But because it's both easy to do and useful, I predict this use case will emerge as a common pattern for custom collection types.

If your iteration can return an error but you don't like the 2-parameter version, you can always create a little wrapper struct for your results like so:

type IterResult struct {
    Result int
    Err error
}

Handling sentinel errors

One iteration pattern we've seen in a few places is for traditional iterators to use a sentinel error, often io.EOF, to signal there are no more values. So you have an iteration loop that looks like this:

for {
    val, err := iter.Next()
    if err == io.EOF {
        break
    } else if err != nil {
        return nil, err
    }

    // business logic here
}

Wouldn't it be nice if we didn't have to handle end-of-loop control flow by checking for a specific sentinel error? Well, with range iters you can convert any iterator to do just that.

type iterEof struct {
	slice Slice
	i int
}

func (iter *iterEof) Next() (int, error) {
	defer func() {
		iter.i++
	}()

	if iter.i >= len(iter.slice) {
		return 0, io.EOF
	} else if rand.Float32() > .9 {
		return 0, fmt.Errorf("failed to fetch next element")
	}

	return iter.slice[iter.i], nil
}

func (s Slice) Iter() *iterEof {
	return &iterEof{slice: s}
}

func (s Slice) RangeCompatibleIter() func(yield func(int, error) bool) {
	iter := s.Iter()
	return func(yield func(i int, e error) bool) {
		for {
			next, err := iter.Next()
			if err == io.EOF {
				return
			}

			if !yield(next, err) {
				return
			}
		}
	}
}

This range iterator wraps the underlying, traditional iterator and handles the io.EOF control flow for us. Now when we use it, any error we get is a real error, not a sentinel value.

func iterTraditionalWithRange(slice Slice) {
	for i, err := range slice.RangeCompatibleIter() {
		if err != nil {
			fmt.Printf("error: %s\n", err.Error())
			// Skip the zero value that accompanies an error
			continue
		}

		fmt.Println("iter got value: ", i)
	}
}

A small quality-of-life improvement, but maybe one you care about. You can also use the same strategy, minus the error handling, to convert any traditional iterator object into one compatible with the range keyword.

`Pull` and `Pull2`

The Go authors also included two convenience functions in the new iter package, Pull and Pull2, that transform range iterator functions into traditional iterators. This is for people working with libraries that use range iterators, but who don't want to use the range keyword, and would rather iterate with a Next() method instead. Here's how they work.

func iterTraditionalWithRangeRoundTrip(slice Slice) {
	next, stop := iter.Pull2(slice.RangeCompatibleIter())
	defer stop()
	i := 0
	for {
		result, err, valid := next()
		if !valid {
			break
		}

		if i > 10 {
			break
		}
		i++

		if err != nil {
			fmt.Printf("error: %s\n", err.Error())
		} else {
			fmt.Println("iter got value: ", result)
		}
	}
}

So the Pull2 function returns an iterator function, next. When you call it, you get back the 2 result params from the range iterator (only 1 result for the Pull function), as well as a boolean telling you whether those results are valid; it's false once iteration is done. You must dispose of the iterator with the provided stop function.

What we're doing above is deeply silly: we have a traditional iterator with a Next() method that we're wrapping to turn into a range iterator, then using the Pull2 function to turn it back into a traditional iterator. But it all works, and demonstrates that these techniques can be composed in arbitrary ways. Fun!

Readability and convention considerations

The Go convention for range is that the one-var version returns an index or a key, and the two-var version returns those plus their values. If you want only the values, not the keys or indexes, you do this:

for _, val := range slice { ... }

With range iterators, we're free to write iterator functions that break this convention. So in some of the above examples, we have a single var returned, and it's the value, not the index.

func iterPrimes(slice Slice) {
	for i := range slice.Primes() {
		fmt.Println("prime number:", i)
		if i > 10 {
			break
		}
	}
}

Is this a good thing? Is it confusing that the single-var version of the range keyword might work completely differently for range iterators than maps and slices? Well, this feature is still new, and the conventions for its use don't really exist yet, so I'll leave it to the community to fight over. However, I will point out that in a majority of the cases when I use the range keyword with a slice, I'm only interested in the value, not the index. And for some collection types, there isn't a meaningful index or key to return anyway. I'm sure we'll figure it out over time.

Conclusion

You can find all the examples above, plus some I cut for space, here.

Have questions about Go range iterators? Or maybe you are curious about the world's first version-controlled SQL database? Join us on Discord to talk to our engineering team and other Dolt users.
