Improving Dolt's getting-started experience with config templating

Dolt is the world's first and only version-controlled SQL database. But it didn't start its life as a version-controlled database. It started as a command line tool for sharing datasets. And that history has shaped how you install and get started with Dolt.

Today we're excited to announce changes to Dolt's getting started process to make it easier to configure and run Dolt as you build your application on top of it.

Installing Dolt

Dolt's installation process is really easy: you just download a single binary, put it on your $PATH, and run dolt anywhere you want. It's not an exaggeration to say there isn't an installation process at all; it's just a program you download and run.

To run a database server, you just do this:

dolt sql-server

This is great for getting people started quickly, but it has drawbacks. There's a reason that almost every other database server has an installer process that forces you to make several important decisions (such as your root password, where to put data files, what port to run on, etc.). It also puts your database installation in a standard location (or one you choose) and puts a bunch of config files there too for you to mess with later.

Dolt doesn't do any of that, because it doesn't have an installer. But we want to make Dolt easier to get started with by doing some of the setup work that other databases do at installation time. We're slowly moving in that direction.

Initializing settings automatically when the server starts

As of the last release, Dolt initializes a super-user for the database the first time you run the server. Prior to this change, a new database server wouldn't have any users or permissions persisted until the first time someone ran a CREATE USER statement, at which point the implicit root user would stop working. That was bad behavior, so we fixed the bug.

Similarly, because Dolt has no installer, it never gave you a config file with all the default settings filled in for you to tweak later on. If you've ever installed Postgres or MySQL, you know that there's a config file you can look for somewhere in the installation location. Postgres's looks like this (in part):

# -----------------------------
# PostgreSQL configuration file
# -----------------------------
#
# This file consists of lines of the form:
#
#   name = value
#
# (The "=" is optional.)  Whitespace may be used.  Comments are introduced with
# "#" anywhere on a line.  The complete list of parameter names and allowed
# values can be found in the PostgreSQL documentation.
#
# The commented-out settings shown in this file represent the default values.
# Re-commenting a setting is NOT sufficient to revert it to the default value;
# you need to reload the server.
#
# This file is read on server startup and when the server receives a SIGHUP
# signal.  If you edit the file on a running system, you have to SIGHUP the
# server for the changes to take effect, run "pg_ctl reload", or execute
# "SELECT pg_reload_conf()".  Some parameters, which are marked below,
# require a server shutdown and restart to take effect.
#
# Any parameter can also be given as a command-line option to the server, e.g.,
# "postgres -c log_connections=on".  Some parameters can be changed at run time
# with the "SET" SQL command.
#
# Memory units:  B  = bytes            Time units:  us  = microseconds
#                kB = kilobytes                     ms  = milliseconds
#                MB = megabytes                     s   = seconds
#                GB = gigabytes                     min = minutes
#                TB = terabytes                     h   = hours
#                                                   d   = days


#------------------------------------------------------------------------------
# FILE LOCATIONS
#------------------------------------------------------------------------------

# The default values of these variables are driven from the -D command-line
# option or PGDATA environment variable, represented here as ConfigDir.

#data_directory = 'ConfigDir'		# use data in another directory
					# (change requires restart)
#hba_file = 'ConfigDir/pg_hba.conf'	# host-based authentication file
					# (change requires restart)
#ident_file = 'ConfigDir/pg_ident.conf'	# ident configuration file
					# (change requires restart)


# If external_pid_file is not explicitly set, no extra PID file is written.
#external_pid_file = ''			# write an extra PID file
					# (change requires restart)

The great thing about this file is that it's self-documenting. If you can find the postgresql.conf file somewhere on disk, you can just read it to see what configuration settings are possible and what their default values are. It's great!

Dolt saw what Postgres had, and wanted it. Thanks to a great community contribution, now when you run dolt sql-server for the first time you'll get a config.yaml that looks like this:

# Dolt SQL server configuration
#
# Uncomment and edit lines as necessary to modify your configuration.
# Full documentation: https://docs.dolthub.com/sql-reference/server/configuration
#

# log_level: info

# max_logged_query_len: 0

# encode_logged_query: false

# behavior:
  # read_only: false
  # autocommit: true
  # disable_client_multi_statements: false
  # dolt_transaction_commit: false
  # event_scheduler: "OFF"

# user:
  # name: ""
  # password: ""

# listener:
  # host: localhost
  # port: 3306
  # max_connections: 100
  # read_timeout_millis: 28800000
  # write_timeout_millis: 28800000
  # tls_key: key.pem
  # tls_cert: cert.pem
  # require_secure_transport: false
  # allow_cleartext_passwords: false
  # socket: /tmp/mysql.sock

# data_dir: .

# cfg_dir: .doltcfg

# remotesapi:
  # port: 8000
  # read_only: false

# privilege_file: .doltcfg/privileges.db

# branch_control_file: .doltcfg/branch_control.db

# user_session_vars:
# - name: root
  # vars:
    # dolt_log_level: warn
    
# system_variables:
  # dolt_log_level: info
  # dolt_transaction_commit: 1

# metrics:
  # host: localhost
  # port: 9091

# cluster:
  # standby_remotes:
  # - name: standby_replica_one
    # remote_url_template: https://standby_replica_one.svc.cluster.local:50051/{database}
  # - name: standby_replica_two
    # remote_url_template: https://standby_replica_two.svc.cluster.local:50051/{database}
  # bootstrap_role: primary
  # bootstrap_epoch: 1
  # remotesapi:
    # address: 127.0.0.1
    # port: 50051
    # tls_key: remotesapi_key.pem
    # tls_cert: remotesapi_chain.pem
    # tls_ca: standby_cas.pem
    # server_name_urls:
    # - https://standby_replica_one.svc.cluster.local
    # - https://standby_replica_two.svc.cluster.local
    # server_name_dns:
    # - standby_replica_one.svc.cluster.local
    # - standby_replica_two.svc.cluster.local

This is a huge improvement! Now the process of configuring your Dolt SQL server is easy to discover and mostly self-documenting.
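To make "uncomment and edit" concrete, here is a minimal sketch of a trimmed-down config with two settings changed. The values (debug logging, port 3307) are arbitrary choices for illustration, and the file name demo-config.yaml is hypothetical, picked so the sketch doesn't overwrite the config.yaml that Dolt generates for you. One thing to watch for: when you uncomment a nested setting such as port, you also need to uncomment its parent key (listener:) and keep the indentation intact, or the file won't parse as the YAML you expect.

```shell
# Write a small config file with two settings uncommented and edited.
# demo-config.yaml is a hypothetical name used only for this sketch.
cat > demo-config.yaml <<'EOF'
# Dolt SQL server configuration (trimmed for illustration)

log_level: debug

listener:
  # host: localhost
  port: 3307
EOF

# Print only the active settings: drop comment lines and blank lines.
grep -Ev '^[[:space:]]*(#|$)' demo-config.yaml

# To start the server with this file (requires dolt on your $PATH):
#   dolt sql-server --config demo-config.yaml
```

The grep line is just a quick way to check which settings are actually active in a mostly commented-out file. Everything still commented out keeps its default value.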

Future work

We want every feature of Dolt to be self-documenting and easy to discover, but we have a long way to go. So far we've resisted the impulse to require an installer process for Dolt, but over time it's likely we'll offer that as an option, especially for people who primarily run Dolt as a long-running process (most of our customers).

Conclusion

Dolt is getting more useful and easier to use as time goes on. Try it out today! Or come by our Discord to talk to our engineering team and meet other Dolt users.
