Understanding Tags in Go
In Go, tags allow developers to attach metadata to struct fields. These tags can drive features and behaviors in various libraries and tools which access the tags via reflection. This article provides an overview of tags in Go, including their syntax, common use cases, and best practices. Then we will dive into how Hosted Dolt uses them to drive serialization of server configuration that is appropriate for the version of Dolt being run.
Anatomy of Tags
Go tags are attached to struct fields. The common convention is to structure your tags as space-separated key:"value" pairs. A field may have more than one tag, and the value of a tag is a string. Tags are written as a string literal, usually enclosed in backticks, and are placed immediately after the field's type.
type Example struct {
	Field1 int    `tag1:"value1" tag2:"value2"`
	Field2 string `tag1:"value3,value4"`
}
Here, Field1 has two tags, tag1 and tag2, with values "value1" and "value2" respectively. Field2 has a single tag, tag1, with a value of "value3,value4".
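To make the "accessed via reflection" part concrete, here is a minimal sketch of how a library might read the tags on the Example struct above. The tagValue helper is a hypothetical name introduced for this illustration; reflect.TypeOf, FieldByName, and StructTag.Lookup are standard library calls.

```go
package main

import (
	"fmt"
	"reflect"
)

// Example mirrors the struct shown above.
type Example struct {
	Field1 int    `tag1:"value1" tag2:"value2"`
	Field2 string `tag1:"value3,value4"`
}

// tagValue is a hypothetical helper for this sketch. It returns the value
// stored under key in the tag of the named field of struct v (or a pointer
// to a struct), and whether that key was present at all.
func tagValue(v any, fieldName, key string) (string, bool) {
	t := reflect.TypeOf(v)
	if t.Kind() == reflect.Ptr {
		t = t.Elem()
	}
	if t.Kind() != reflect.Struct {
		return "", false
	}
	f, ok := t.FieldByName(fieldName)
	if !ok {
		return "", false
	}
	return f.Tag.Lookup(key)
}

func main() {
	// Walk every field and print the tags we know about.
	t := reflect.TypeOf(Example{})
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		v1, _ := f.Tag.Lookup("tag1")
		v2, has2 := f.Tag.Lookup("tag2")
		fmt.Printf("%s: tag1=%q tag2=%q (tag2 present: %t)\n", f.Name, v1, v2, has2)
	}

	// Look up a single tag by field name.
	v, ok := tagValue(Example{}, "Field2", "tag1")
	fmt.Println(v, ok)
}
```

Lookup is used rather than Get so that a missing key can be told apart from a key with an empty value.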
Common Use Cases
Typically, a tag's name will correspond with a single feature or behavior, and often each field in the struct will have a tag with the same name. As an example, json, yaml, and db are common tag names used to specify how a field should be handled by a JSON encoder, YAML encoder, or database ORM respectively.
Serialization and Deserialization
One of the most common use cases for tags in Go is in serialization and deserialization tasks. Libraries like encoding/json, encoding/xml, and gopkg.in/yaml.v2 use tags to map struct fields to data fields during encoding and decoding operations.
For example, consider a struct representing a person:
type Person struct {
	FirstName    string     `json:"first_name"`
	LastName     string     `json:"last_name"`
	DOB          *time.Time `json:"dob,omitempty"`
	Address      *string    `json:",omitempty"`
	Weight       int        `json:"-"`
	HeightInches int
}
In this example, the json tags specify how the exported struct fields should be serialized and deserialized. How the values of the json tags are interpreted is defined by the encoding/json package (see the package documentation for details). In this case the value of the tag is a comma-separated list. The first list value is the name of the field when serialized as a JSON object. A value of - for the name tells the serializer not to include this field in the serialized output. A second value of omitempty tells the JSON encoder to omit the field if it has an empty value: false, 0, a nil pointer or interface, or an empty array, slice, map, or string. If a field does not specify a name, either because the name part of the tag is empty (as with Address above) or because there is no json tag at all (as with HeightInches), then the struct field name is used as the default. Additional formatting options can be specified in the tag value.
Here is an example serializing an instance of the Person struct and outputting its value.
func main() {
	p := Person{
		FirstName:    "Peter",
		LastName:     "Griffin",
		Weight:       298,
		HeightInches: 67,
	}
	data, err := json.Marshal(p)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(data))
}
Which will output the following (json.Marshal emits it on a single line; it is pretty-printed here for readability):
{
	"first_name": "Peter",
	"last_name": "Griffin",
	"HeightInches": 67
}
You can play with this more here. The details covered here are specifically for json serialization. Please see the documentation of the specific package you are using for details of supported tag features. (As an example, here is the documentation of the yaml tag format used by gopkg.in/yaml.v2).
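The same tags also drive decoding. The sketch below uses a trimmed copy of the Person struct (the DOB field is left out to keep the example short) and two hypothetical helpers, decodePerson and encodePerson, to show that a field tagged with - is ignored in both directions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Person is a trimmed copy of the struct from the article (DOB omitted).
type Person struct {
	FirstName    string  `json:"first_name"`
	LastName     string  `json:"last_name"`
	Address      *string `json:",omitempty"`
	Weight       int     `json:"-"`
	HeightInches int
}

// decodePerson is a hypothetical helper that parses a JSON document into a Person.
func decodePerson(data string) (Person, error) {
	var p Person
	err := json.Unmarshal([]byte(data), &p)
	return p, err
}

// encodePerson is a hypothetical helper that serializes a Person back to JSON.
func encodePerson(p Person) (string, error) {
	data, err := json.Marshal(p)
	return string(data), err
}

func main() {
	// "Weight" appears in the input, but the field is tagged "-" so it is skipped.
	p, err := decodePerson(`{"first_name":"Peter","last_name":"Griffin","Weight":298,"HeightInches":67}`)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s %s, weight=%d, height=%d\n", p.FirstName, p.LastName, p.Weight, p.HeightInches)

	// Address is nil and tagged omitempty, so it is left out of the output.
	out, err := encodePerson(p)
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

Running this prints a weight of 0 even though the input contained 298, and the re-encoded JSON contains only first_name, last_name, and HeightInches.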
Database Mapping
ORMs (Object-Relational Mappers) and database libraries like GORM, dbr, and sqlx use tags to map database rows to struct fields and, in some cases, to generate SQL queries automatically.
type Product struct {
	ID       uint   `gorm:"primary_key"`
	Name     string `gorm:"size:255"`
	Price    float64
	Quantity int
}
In this example, the gorm tags specify that the ID field is the primary key, the Name field should be mapped to a column with a size of 255 characters, and the other fields should be mapped based on their names and types.
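To give a feel for how a database library could turn tags into SQL, here is a deliberately small sketch. It is not how GORM works internally. It reads a db tag, falls back to the lowercased field name when the tag is missing, and skips fields tagged with -. The columnsFor and selectQuery helpers, the Internal field, and the fallback rule are all assumptions made for this illustration.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// Product is an illustrative struct using a db tag for column names.
type Product struct {
	ID       uint    `db:"id"`
	Name     string  `db:"product_name"`
	Price    float64 // no tag: falls back to the lowercased field name
	Internal string  `db:"-"` // never mapped to a column
}

// columnsFor returns the column names for struct v (or a pointer to a struct),
// derived from the db tag on each field.
func columnsFor(v any) []string {
	t := reflect.TypeOf(v)
	if t.Kind() == reflect.Ptr {
		t = t.Elem()
	}
	var cols []string
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		name, ok := f.Tag.Lookup("db")
		switch {
		case name == "-":
			continue // explicitly excluded; continue applies to the for loop
		case !ok || name == "":
			name = strings.ToLower(f.Name)
		}
		cols = append(cols, name)
	}
	return cols
}

// selectQuery builds a simple SELECT statement for the given table.
func selectQuery(table string, v any) string {
	return "SELECT " + strings.Join(columnsFor(v), ", ") + " FROM " + table
}

func main() {
	fmt.Println(selectQuery("products", Product{}))
}
```

Real libraries layer much more on top of this (type mapping, caching of the reflected metadata, quoting of identifiers), but the core idea of walking the fields and reading one tag per field is the same.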
Validation
Tags can also be used for data validation. Validation libraries like validator can interpret tags to enforce validation rules on struct fields. For instance, you can specify constraints such as minimum and maximum values, required fields, and regular expressions using tags.
type User struct {
	Username string `validate:"required,min=5,max=20"`
	Email    string `validate:"required,email"`
	Age      int    `validate:"gte=18"`
}
Here, the validate tag defines rules such as the Username being required and having a length between 5 and 20 characters, the Email being required and in a valid email format, and the Age being greater than or equal to 18. We can then create a new instance of the validator and call Struct to validate the struct.
func main() {
	user := User{
		Username: "peter",
		Email:    "peter@fake.horse",
		Age:      17,
	}
	err := validator.New().Struct(user)
	if err != nil {
		fmt.Println("validation failed:", err)
	} else {
		fmt.Println("validation passed")
	}
}
Which will output:
validation failed: Key: 'User.Age' Error:Field validation for 'Age' failed on the 'gte' tag
You can play with this more here.
A Problem Solved with Custom Tags
Dolt uses a YAML configuration file to configure the dolt sql-server command. The parser for this configuration file uses strict parsing rules: every field that appears in the file must be known to Dolt, or the server fails to start. This ensures that every item an end user puts in the configuration file is actually used, so a user doesn't get different behavior than expected based on their configuration. Without strict parsing, that could happen if the version of Dolt being run is older than the documentation being used to configure it. However, strict parsing caused issues for Hosted Dolt.
Hosted Dolt is a service that runs Dolt for users. Each Dolt cluster is deployed and updated independently. This means that across the Hosted Dolt fleet there can be many different versions of Dolt running, and the features and configuration options available can vary between them. When a user makes a configuration change on the Hosted Dolt website, the configuration file needs to be updated using the serialization format that is appropriate for the version of Dolt that is running.
The application that writes out the configuration file imports the github.com/dolthub/dolt/go/cmd/dolt/commands/sqlserver package and uses the SerializeConfigForVersion function to serialize the configuration. This function takes a sqlserver.YAMLConfig and a version number and returns a string that is the serialized configuration.
Implementing SerializeConfigForVersion
SerializeConfigForVersion uses the package gopkg.in/yaml.v2 to serialize the configuration as YAML. As we saw previously, mapping between Go and our serialized format is done via tags. Here the sqlserver.YAMLConfig struct is annotated with yaml tags like so:
// YAMLConfig is a ServerConfig implementation which is read from a yaml file
type YAMLConfig struct {
	LogLevelStr       *string            `yaml:"log_level,omitempty"`
	MaxQueryLenInLogs *int               `yaml:"max_logged_query_len,omitempty"`
	EncodeLoggedQuery *bool              `yaml:"encode_logged_query,omitempty"`
	BehaviorConfig    BehaviorYAMLConfig `yaml:"behavior"`
	...
	SystemVars_ *engine.SystemVariables `yaml:"system_variables,omitempty" minver:"1.11.1"`
In order to support serialization for specific versions we add a minver tag to new struct fields. This tag specifies the minimum version of Dolt that supports the field. This allows us to add new fields to the configuration struct and have them serialized only when the version of Dolt being run is new enough to support them. During development new fields are added with a minver of TBD and then updated to the correct version when the feature is released. If a field does not have a minver tag, then it is assumed to be supported in all versions.
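The rules in the paragraph above boil down to a small decision function. The sketch below is an illustration only: encodeVersion is a hypothetical stand-in for Dolt's version.Encode (the real encoding may differ), and fieldSupported is a name invented here. It assumes versions are written as major.minor.patch and packs them into a uint32 so they can be compared with ordinary integer operators.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// encodeVersion packs "major.minor.patch" into a single uint32 (8 bits for
// major, 8 for minor, 16 for patch) so that versions compare with < and >=.
// Hypothetical stand-in for Dolt's version.Encode; the real layout may differ.
func encodeVersion(s string) (uint32, error) {
	parts := strings.Split(s, ".")
	if len(parts) != 3 {
		return 0, fmt.Errorf("version %q is not in major.minor.patch form", s)
	}
	major, err := strconv.ParseUint(parts[0], 10, 8)
	if err != nil {
		return 0, fmt.Errorf("invalid major version in %q: %w", s, err)
	}
	minor, err := strconv.ParseUint(parts[1], 10, 8)
	if err != nil {
		return 0, fmt.Errorf("invalid minor version in %q: %w", s, err)
	}
	patch, err := strconv.ParseUint(parts[2], 10, 16)
	if err != nil {
		return 0, fmt.Errorf("invalid patch version in %q: %w", s, err)
	}
	return uint32(major)<<24 | uint32(minor)<<16 | uint32(patch), nil
}

// fieldSupported applies the minver rules described in the article:
// no tag means always supported, TBD means never, otherwise compare versions.
func fieldSupported(minver string, verNum uint32) (bool, error) {
	switch minver {
	case "":
		return true, nil
	case "TBD":
		return false, nil
	}
	min, err := encodeVersion(minver)
	if err != nil {
		return false, err
	}
	return verNum >= min, nil
}

func main() {
	cur, err := encodeVersion("1.11.0")
	if err != nil {
		panic(err)
	}
	for _, minver := range []string{"", "1.10.5", "1.11.1", "TBD"} {
		ok, err := fieldSupported(minver, cur)
		if err != nil {
			panic(err)
		}
		fmt.Printf("minver=%q supported on 1.11.0: %t\n", minver, ok)
	}
}
```

Treating TBD as "never supported" means a field that is still under development cannot leak into any deployed configuration, whatever version it targets.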
In order to take the sqlserver.YAMLConfig instance and serialize it for a specific version we need to null out any fields that are not supported, and then serialize the struct. (This requires that the field also has a valid yaml tag with "omitempty" in the value.)
func nullUnsupported(verNum uint32, st any) error {
	const tagName = "minver"
	// use reflection to loop over all fields in the struct st
	// for each field check the tag "minver" and if the current version is less than that, set the field to nil
	t := reflect.TypeOf(st)
	if t.Kind() == reflect.Ptr {
		t = t.Elem()
	}
	// Iterate over all available fields and read the tag value
	for i := 0; i < t.NumField(); i++ {
		// Get the field, returns https://golang.org/pkg/reflect/#StructField
		field := t.Field(i)
		// Get the field tag value
		tag := field.Tag.Get(tagName)
Here, our nullUnsupported function uses reflection to loop over all fields in the struct and get the minver tag value for each struct field.
		if tag != "" {
			// if it's nullable check to see if it should be set to nil
			if field.Type.Kind() == reflect.Ptr || field.Type.Kind() == reflect.Slice || field.Type.Kind() == reflect.Map {
				var setToNull bool
				if tag == "TBD" {
					setToNull = true
				} else {
					minver, err := version.Encode(tag)
					if err != nil {
						return fmt.Errorf("invalid version tag '%s' on field '%s': %w", tag, field.Name, err)
					}
					setToNull = verNum < minver
				}
				if setToNull {
					// Get the field value
					v := reflect.ValueOf(st).Elem().Field(i)
					v.Set(reflect.Zero(v.Type()))
				}
			} else {
				return fmt.Errorf("non-nullable field '%s' has a version tag '%s'", field.Name, tag)
			}
			var hasOmitEmpty bool
			yamlTag := field.Tag.Get("yaml")
			if yamlTag != "" {
				vals := strings.Split(yamlTag, ",")
				for _, val := range vals {
					if val == "omitempty" {
						hasOmitEmpty = true
						break
					}
				}
			}
			if !hasOmitEmpty {
				return fmt.Errorf("field '%s' has a version tag '%s' but no yaml tag with omitempty", field.Name, tag)
			}
		}
If the value of our minver tag is not empty we check to see if the field is nullable. If it is, we check to see if the field should be set to nil based on the version of Dolt being run (or if the minver is TBD). If it is nullable and should be set to nil we use reflection to update the value. The rest of the code here validates the tags are applied to the fields properly, and if they are not it returns an error.
After checking if a field should be set to nil, we check to see if we need to recurse into the field and perform the same operation. This is done for fields that are pointers to structs, structs, and slices of structs.
		v := reflect.ValueOf(st).Elem().Field(i)
		vIsNullable := v.Type().Kind() == reflect.Ptr || v.Type().Kind() == reflect.Slice || v.Type().Kind() == reflect.Map
		if !vIsNullable || !v.IsNil() {
			// if the field is a pointer to a struct, or a struct, or a slice recurse
			if field.Type.Kind() == reflect.Ptr && field.Type.Elem().Kind() == reflect.Struct {
				err := nullUnsupported(verNum, v.Interface())
				if err != nil {
					return err
				}
			} else if field.Type.Kind() == reflect.Struct {
				err := nullUnsupported(verNum, v.Addr().Interface())
				if err != nil {
					return err
				}
			} else if field.Type.Kind() == reflect.Slice {
				if field.Type.Elem().Kind() == reflect.Ptr && field.Type.Elem().Elem().Kind() == reflect.Struct {
					for i := 0; i < v.Len(); i++ {
						err := nullUnsupported(verNum, v.Index(i).Interface())
						if err != nil {
							return err
						}
					}
				} else if field.Type.Elem().Kind() == reflect.Struct {
					for i := 0; i < v.Len(); i++ {
						err := nullUnsupported(verNum, v.Index(i).Addr().Interface())
						if err != nil {
							return err
						}
					}
				}
			}
		}
The full code can be found here.
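To see the whole idea run end to end, here is a stripped-down sketch on a toy struct. It is not the Dolt implementation: it uses encoding/json instead of gopkg.in/yaml.v2 so that it only needs the standard library, it does not recurse into nested structs, and it does not validate that tagged fields are nullable. The Config struct and the parseVersion, less, clearUnsupported, and serializeFor helpers are all hypothetical names for this illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
)

// Config is a toy stand-in for a versioned configuration struct.
type Config struct {
	LogLevel     *string           `json:"log_level,omitempty"`
	SystemVars   map[string]string `json:"system_variables,omitempty" minver:"1.11.1"`
	Experimental *bool             `json:"experimental,omitempty" minver:"TBD"`
}

// parseVersion reads "major.minor.patch" into three integers.
func parseVersion(s string) ([3]int, error) {
	var v [3]int
	_, err := fmt.Sscanf(s, "%d.%d.%d", &v[0], &v[1], &v[2])
	return v, err
}

// less reports whether version a is older than version b.
func less(a, b [3]int) bool {
	for i := range a {
		if a[i] != b[i] {
			return a[i] < b[i]
		}
	}
	return false
}

// clearUnsupported zeroes every top-level field of the struct pointed to by
// cfg whose minver tag is TBD or newer than cur.
func clearUnsupported(cur [3]int, cfg any) error {
	v := reflect.ValueOf(cfg)
	if v.Kind() != reflect.Ptr || v.Elem().Kind() != reflect.Struct {
		return fmt.Errorf("expected a pointer to a struct, got %T", cfg)
	}
	v = v.Elem()
	t := v.Type()
	for i := 0; i < t.NumField(); i++ {
		tag := t.Field(i).Tag.Get("minver")
		if tag == "" {
			continue // untagged fields are supported everywhere
		}
		drop := tag == "TBD"
		if !drop {
			minVer, err := parseVersion(tag)
			if err != nil {
				return fmt.Errorf("invalid minver %q on field %s: %w", tag, t.Field(i).Name, err)
			}
			drop = less(cur, minVer)
		}
		if drop {
			f := v.Field(i)
			f.Set(reflect.Zero(f.Type())) // omitempty then hides the field
		}
	}
	return nil
}

// serializeFor works on its own copy of cfg (passed by value), so the
// caller's struct is left untouched.
func serializeFor(version string, cfg Config) (string, error) {
	cur, err := parseVersion(version)
	if err != nil {
		return "", err
	}
	if err := clearUnsupported(cur, &cfg); err != nil {
		return "", err
	}
	data, err := json.Marshal(cfg)
	return string(data), err
}

func main() {
	level, exp := "info", true
	cfg := Config{LogLevel: &level, SystemVars: map[string]string{"max_connections": "100"}, Experimental: &exp}
	for _, ver := range []string{"1.11.0", "1.11.1"} {
		out, err := serializeFor(ver, cfg)
		if err != nil {
			panic(err)
		}
		fmt.Println(ver, out)
	}
}
```

For version 1.11.0 only log_level is written, while 1.11.1 also includes system_variables. The experimental field is dropped in both cases because its minver is still TBD.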
Conclusion
Go tags are a powerful feature that can be used to drive features and behaviors in various libraries and tools. Embedding metadata in your code and reading it back through reflection is a practical way to build flexible and extensible systems. In the case of Hosted Dolt it allowed us to add new fields to our configuration struct and serialize them only when the version of Dolt being run was new enough to support them. This allows us to add new features to Dolt without breaking existing deployments.
At DoltHub Inc our primary development language is Go. Dolt is written in Go, as are many of our tools and backend systems. If you are interested in talking about Go, or contributing to our open source Go projects, please reach out to us on Discord. We'd love to hear from you.