Writing generic collection types in Go: the missing documentation

GOLANG
12 min read

Introduction

Go generics were released in Go 1.18, over two years ago. We're using Go to write Dolt, the world's first version-controlled SQL database, and while we have hundreds of thousands of lines of Go code, we haven't used generics very much. There are a couple places we use them to make high-traffic parts of our code faster, but for the most part, we haven't really found a good reason for them, outside of the useful library methods in the slices and maps packages.

But recently, I was faced with a problem where writing a generic collection seemed like a good solution. Only one problem: I couldn't get it to work. I spent hours spinning my wheels on solutions that didn't quite work. Only after an embarrassing amount of trial and error, searching, and reading did I finally figure out the right way to implement a generic collection in Go.

Because the search results and documentation for this important use case are so poor, I imagine that many people attempting this made the same mistakes and encountered the same errors I did. So I'm going to present this in the order I tried things: two false starts and their symptoms, then the actual solution. You can skip to the the final solution and takeaways if that's what you're looking for.

What we want to build: a sortable Set of any type

I want to make a generic Set collection that can also return its contents in a sorted order. I want the type safety features of generics, which means no type assertions should be necessary to write it.

Let's start by defining a generic interface that each set member must implement, and one concrete implementation of a set member.

type Sortable[T comparable] interface {
	Less(member T) bool
}

type Name struct {
	First string
	Last string
}

func (n Name) Less(member Name) bool {
	return n.First < member.First || n.First == member.First && n.Last < member.Last
}

var _ Sortable[Name] = Name{}

So far, so good. We have an interface that will let us sort the contents of our collection for any arbitrary type that implements a Less() method. And the Less method takes the parameterized type, not a Sortable interface pointer that we would need to do a type assertion on to do the actual comparisons with.

All this looks great. Now let's define a Set interface that uses the Sortable interface as its member elements.

False start #1: Use two generic type parameters, copying slices.Index

My first attempt to define the generic collection interface was this:

type SortableSet[T Sortable] interface {
	Add(member T)
	Size() int
	Contains(member T) bool
	Sorted() []T
}

Unfortunately, it doesn't compile, because the Sortable interface is itself generic. It needs its own type parameter.

cannot use generic type Sortable[T comparable] without instantiation

So let's add a second type parameter like this:

type SortableSet[T Sortable[E], E comparable] interface {
	Add(member T)
	Size() int
	Contains(member T) bool
	Sorted() []T
}

This looks a little weird, but it's what the slices package does, like the slices.Index function below:

func Index[S ~[]E, E comparable](s S, v E) int {
    for i := range s {
        if v == s[i] {
            return i
        }
    }
    return -1
}

Now let's try to implement our new interface. We'll use a map for storage since that gives us set semantics for free. My first attempt looks like this.

type MapSet[T Sortable[E], E comparable] struct {
	members map[T]struct{}
}

Looks good, doesn't compile:

invalid map key type T (missing comparable constraint)

Ok, since the map key needs to be comparable, I guess we need to use the comparable E type for it, rather than the T type.

type MapSet[T Sortable[E], E comparable] struct {
	members map[E]struct{}
}

func NewMapSet[T Sortable[E], E comparable]() SortableSet[T, E] {
	return MapSet[T, E]{
		members: make(map[E]struct{}),
	}
}

That compiles, great! Let's flesh out the rest of the methods. Oh no, it's full of errors.

func (s MapSet[T, E]) Add(member T) {
    // error: cannot use member (variable of type T constrained by Sortable[E]) as E value in map index
	s.members[member] = struct{}{}
}

func (s MapSet[T, E]) Size() int {
	return len(s.members)
}

func (s MapSet[T, E]) Contains(member T) bool {
    // error: cannot use member (variable of type T constrained by Sortable[E]) as E value in map index
	_, found := s.members[member]
	return found
}

func (s MapSet[T, E]) Sorted() []T {
	sorted := make([]T, 0, len(s.members))
	for member := range s.members {
        // error: cannot use member (variable of type T constrained by Sortable[E]) as E value in map index
		sorted = append(sorted, member)
	}

	sort.Slice(sorted, func(i, j int) bool {
        // cannot use sorted[j] (variable of type T constrained by Sortable[E]) as E value in argument to sorted[i].Less
		return sorted[i].Less(sorted[j])
	})

	return sorted
}

Can I just do a type assertion to fix the errors? I don't want to do that, but maybe it's not possible to do better. But nope, that doesn't work either.

func (s MapSet[T, E]) Add(member T) {
    // invalid operation: cannot use type assertion on type parameter value member (variable of type T constrained by So
rtable[E])
	s.members[member.(T)] = struct{}{}
}

I got stuck here for a long time, swapping out references to T and E to try to get it to compile everywhere. But this solution just doesn't work. The map must be keyed by comparable E, which means you can't return the Sortable type where required by the Sortable interface. I also experimented with the Sortable interface to return the underlying comparable E instead, but that's also not what I wanted. Back to the drawing board.

False start #2: self-referential type definition without comparable

It turns out that using two type parameters like the functions in the slices package was a dead-end. Instead, I just needed to realize that type definitions can be self-referential, like this:

type SortableSet[T Sortable[T]] interface {
	Add(member T)
	Size() int
	Contains(member T) bool
	Sorted() []T
}

That looks good, but it doesn't quite compile:

T does not satisfy comparable

Let's get rid of the comparable constraint on Sortable to fix the problem.

type Sortable[T any] interface {
	Less(member T) bool
}

Now the SortableSet interface compiles fine. Maybe we don't need comparable after all. On to the concrete collection type. This looks good, but also doesn't compile:

type MapSet[T Sortable[T]] struct {
    // invalid map key type T (missing comparable constraint)
	members map[T]struct{}
}

It's the same map key problem again. For some reason, T is not considered a valid key, even though it's an interface type, and interfaces are always valid map keys. But this does work:

type MapSet[T Sortable[T]] struct {
	members map[Sortable[T]]struct{}
}

This made me feel weird, but it compiled, so I kept on with the implementation. Here are the rest of the methods:

func NewMapSet[T Sortable[T]]() SortableSet[T] {
	return MapSet[T]{
		members: make(map[Sortable[T]]struct{}),
	}
}

func (s MapSet[T]) Add(member T) {
	s.members[member] = struct{}{}
}

func (s MapSet[T]) Size() int {
	return len(s.members)
}

func (s MapSet[T]) Contains(member T) bool {
	_, found := s.members[member]
	return found
}

func (s MapSet[T]) Sorted() []T {
	sorted := make([]T, 0, len(s.members))
	for member := range s.members {
		// cast required here?
		sorted = append(sorted, member.(T))
	}

	sort.Slice(sorted, func(i, j int) bool {
		return sorted[i].Less(sorted[j])
	})

	return sorted
}

This all looks great, except for this line in the Sorted method:

	for member := range s.members {
		// cast required here?
		sorted = append(sorted, member.(T))
	}

What's going on here? I thought that type assertions with parameterized types don't work? But not only does this work, it's necessary. Leaving it off gives me this very specific error message.

cannot use member (variable of type Sortable[T]) as T value in argument to append: need type assertion

Thinking through this, it makes a certain amount of sense: we know that T is constrained to be a Sortable[T], which means that this assertion must be true. But since Go is statically typed, we can't use Sortable[T] when the function signature for append requires T. It needs the cast.

I was happy that I had gotten something working at all, but I was still frustrated by the necessity of the type assertion. A good generic type system shouldn't require that. I decided to dig a bit more.

More digging: let's try with a slice

Maybe I was just getting caught up on a weird edge case related to map keys. I definitely noticed that all the generics tutorials I had read online steered clear of maps entirely, just focusing on slices. So maybe this would work better with a slice?

Here's a SortableSet implementation that uses a slice for storage instead.

type SliceSet[T Sortable[T]] struct {
	members []T
}

func NewSliceSet[T Sortable[T]]() SortableSet[T] {
	return &SliceSet[T]{
		members: make([]T, 0),
	}
}

func (s *SliceSet[T]) Add(member T) {
	if !s.Contains(member) {
		s.members = append(s.members, member)
	}
}

func (s SliceSet[T]) Size() int {
	return len(s.members)
}

func (s SliceSet[T]) Contains(member T) bool {
	for _, m := range s.members {
        if m == member {
			return true
		}
	}
	return false
}

func (s SliceSet[T]) Sorted() []T {
	sorted := make([]T, len(s.members))
	copy(sorted, s.members)

	sort.Slice(sorted, func(i, j int) bool {
		return sorted[i].Less(sorted[j])
	})

	return sorted
}

Another error, this time in the Contains method.

invalid operation: m == member (incomparable types in type set)

It's complaining about this code:

	for _, m := range s.members {
        if m == member { // error here
			return true
		}
	}

OK, I can work around this, let's redefine our own equality operation using Less: two elements must be equal if neither is less than the other.

func (s SliceSet[T]) Contains(member T) bool {
	for _, m := range s.members {
		// can't use == here
		if !m.Less(member) && !member.Less(m) {
			return true
		}
	}
	return false
}

This works as expected, but it left me feeling deeply unsettled. Something wasn't right here. Either I was doing something wrong and I didn't understand what, or the Go generic system was kind of busted. Long experience led me to strongly suspect the former conclusion was the correct one.

The secret syntax of generic type constraints

I had a deep suspicion that I couldn't get my code to work because of how I was expressing the type constraints. The issue was: I needed a way to declare that the elements in the Set were of type Sortable that also were comparable. If I could figure out how to do that, then it would fix both problems: I could use them as map keys, and == would work. But how?

I had already tried to add the comparable constraint to my Sortable interface, like this:

type Sortable[T any] interface {
	comparable
	Less(member T) bool
}

But when I did this, any place a var of Sortable type is declared got this error:

cannot use type Sortable[Name] outside a type constraint: interface is (or embeds) comparable

This error message is tantalizing: does it mean that there is a type constraint to express this?

Very determined googling eventually led me to this blog about a detail in the release of Go 1.20 that involves the comparable constraint. It defines an interface that looks like exactly what I wanted: it combines a type constraint with normal interface method declaration.

interface {
    ~int | ~string
    io.Writer
}

The blog explains what this means (emphasis mine):

This interface defines the set of all types whose underlying types are either int or string and that also implement io.Writer’s Write method.

Such generalized interfaces can't be used as variable types. But because they describe type sets they are used as type constraints, which are sets of types.

For instance, we can write a generic min function

func min[P interface{ ~int64 | ~float64 }](x, y P) P

which accepts any int64 or float64 argument. (Of course, a more realistic implementation would use a constraint that enumerates all basic types with an < operator.)

As an aside, because enumerating explicit types without methods is common, a little bit of syntactic sugar allows us to omit the enclosing interface{}, leading to the compact and more idiomatic

func min[P ~int64 | ~float64](x, y P) P { … }

This last bit blew my mind a little bit. Every tutorial I read used the latter syntax, but these two forms are equivalent. Here they are side by side.

func min[P interface{ ~int64 | ~float64 }](x, y P) P

func min[P ~int64 | ~float64](x, y P) P

This was the key bit of insight I needed to understand what was going on: type constraints are declared as interfaces. The idiomatic version hides this from you with syntactic sugar, but that's what's happening. It's actually a little more complex than that: the interface declaration above doesn't declare a type, it declares a type set.

The blog goes on:

A type T satisfies a constraint C if

  • T implements C; or
  • C can be written in the form interface{ comparable; E }, where E is a basic interface and T is comparable and implements E.

More new syntax! Well, not really new -- it's the inline form of this:

interface {
    comparable
    E
}

The ; is necessary because it's all on one line. Does this mean that I can define a type constraint in the same way to solve my problem? Yes!

The solution

With this knowledge in hand, I was now able to write the full generic collection solution I had wanted to all along. This solution has everything I want: no type assertions anywhere, and I can use the generic type elements as map keys and compare them with ==.

type Sortable[T comparable] interface {
	Less(member T) bool
}

type SortableSet[T interface{Sortable[T]; comparable}] interface {
	Add(member T)
	Size() int
	Contains(member T) bool
	Sorted() []T
}

type MapSet[T interface{Sortable[T]; comparable}] struct {
	members map[T]struct{}
}

func NewMapSet[T interface{Sortable[T]; comparable}]() SortableSet[T] {
	return MapSet[T]{
		members: make(map[T]struct{}),
	}
}

type SliceSet[T interface{Sortable[T]; comparable}] struct {
	members []T
}

func NewSliceSet[T interface{Sortable[T]; comparable}]() SortableSet[T] {
	return &SliceSet[T]{
		members: make([]T, 0),
	}
}

I've omitted everything but the type definitions and constructor funcs in the snippet above, but you can find the entire working implementation, along with a main method that tests it, here.

It's also possible to declare the type constraint as its own type, like so:

type SortableConstraint[T comparable] interface {
	comparable
	Sortable[T]
}

Then you can use that interface name in method signatures, instead of the more verbose interface{} syntax:

func NewMapSet[T SortableConstraint[T]]() SortableSet[T] {
	return MapSet[T]{
		members: make(map[T]struct{}),
	}
}

Whether you do this or not is probably mostly a matter of style and how often the constraint is needed, but I like it better.

It's also possible to declare the type set for the constraint in the function declaration using multiple lines, without the use of ;:

func NewSliceSet[T interface{
	Sortable[T]
	comparable}]() SortableSet[T] {
	return &SliceSet[T]{
		members: make([]T, 0),
	}
}

The Go compiler is kind of picky about where the newlines can go, though.

Takeaways

Here are the key takeaways you need to understand to create your own generic collection types in Go:

  • Don't introduce a second type parameter for a collection of a single type! Methods that use your generic collection element type should have a self-referential type constraint, e.g. T Sortable[T].
  • To use a generic type as a map key or in == comparisons, its type constraint must include comparable.
  • Declare a type set for your generic collection type and use it as the type constraint for methods. This must be a separate interface{} declaration from the generic collection element interface itself, and cannot be used as the type of any var. Type sets can themselves be parameterized, and they probably need to be for this use case.

Commentary

I'm glad I went through this exercise, because it taught me a lot about generics that I had not known. Things which none of the top 20 search results for "Golang generics" mention, at all. Which is kind of ridiculous, isn't it? Isn't defining your own constraint types fundamental to building a generic collection type, which is one of the most prominent uses of generics? I strongly suspect the reason none of these top-ranked tutorials do this, and why they avoid using maps in their examples, is because they couldn't figure out how to make it work. I get it, I couldn't either!

You have to read the 23,000 word change proposal pretty closely to pick up on the syntax necessary to combine a built-in constraint like comparable with your own generic type (and there are no examples of its use in those 23,000 words). Or I guess you could always read the even longer language spec, but honestly,

who does that

I guess I'm the kind of guy who does that, the next time I run into something like this. The spec actually has some very useful info not mentioned in any tutorial article, like this incredibly handy table of constraint satisfaction examples:

type argument      type constraint                // constraint satisfaction

int                interface{ ~int }              // satisfied: int implements interface{ ~int }
string             comparable                     // satisfied: string implements comparable (string is strictly comparable)
[]byte             comparable                     // not satisfied: slices are not comparable
any                interface{ comparable; int }   // not satisfied: any does not implement interface{ int }
any                comparable                     // satisfied: any is comparable and implements the basic interface any
struct{f any}      comparable                     // satisfied: struct{f any} is comparable and implements the basic interface any
any                interface{ comparable; m() }   // not satisfied: any does not implement the basic interface interface{ m() }
interface{ m() }   interface{ comparable; m() }   // satisfied: interface{ m() } is comparable and implements the basic interface interface{ m() }

Sure wish I had found it earlier, and known what I was looking at.

The real lesson here, for me: when something seems like it should be possible in a programming language, but you can't find any working examples anywhere, it's time to suck it up and read the specification. You'll save time in the long run.

Conclusion

Trying to write a generic collection type in Go was more frustrating than it should have been. Google search doesn't turn up any working examples to draw from, and the official Go documentation, including the generics proposal, doesn't either. Hopefully this blog finds somebody who needs it as badly as I did.

Have questions about Go generics? Or maybe you are curious about the world's first version-controlled SQL database? Join us on Discord to talk to our engineering team and other Dolt users.

SHARE

JOIN THE DATA EVOLUTION

Get started with Dolt

Or join our mailing list to get product updates.