Dolt CLI Discovers the Network

July 21, 2023


Git for Data is at the root of DoltHub's DNA, and we believe that the Dolt Command Line Interface (CLI) should behave like Git. When people familiar with Git sit down to use the Dolt CLI, it's impossible to miss the similarities. Data is not exactly like source code though, and moving large amounts of data across networks is a significant barrier. For Git, the largest repositories are on the order of 10GB. If you go bigger than that, you're probably doing it wrong. When it comes to data, having a TB of data is commonplace, and having much more than that is perfectly reasonable. This runs counter to the Git model of every developer having a copy of the repository. Until yesterday, the only network capability the CLI had was copying storage files using pull and push, and all other commands assumed they operated on local data. Release 1.8.1 changes that and introduces many other Dolt CLI commands to the network!

Last month I wrote about the changes to use SQL for the CLI backing data. The TL;DR is that we are updating the Dolt CLI to use SQL to retrieve its data so that we can improve the Dolt experience for developers who want to both work in SQL and on the command line. Another benefit of migrating to SQL is that with fairly little effort we are able to solve the remote data problem. You can now connect to remote instances of dolt sql-server using the Dolt CLI. This comes with a bunch of caveats since this is still a work in progress - read on for more details!

What it Looks Like

Starting with Dolt Release 1.8.1, new global configuration options are available:

$ dolt --help
[...snip...]
    --host=<host>
      Defines the host to connect to.
    --port=<port>
      Defines the port to connect to. Only used when the --host flag is also provided. Defaults to `3306`.
    --no-tls
      Disables TLS for the connection to remote databases.

These are "global arguments" which come before the subcommand. Here is a screenshot of my terminal running status and diff (look on the right), and I'm running them against dolthub-cli-demo.dbs.hosted.doltdb.com:

Under the covers, the Dolt CLI is using the --host flag to signal that a remote Dolt sql-server should be used, instead of a local Dolt database. Using client credentials, a connection to that remote Dolt sql-server is made and a new session is created. In the example above, I've set the DOLT_CLI_USER and DOLT_CLI_PASSWORD environment variables for credentials, but there are also flags --user and --password. Once the connection is made, the CLI command can run any number of SQL queries, and commit a SQL transaction when appropriate. When the CLI command completes, the session is terminated.

For example, here is what a hosted workbench looks like for the diff shown above.

If we run the add and commit commands in the CLI, the server will immediately update. Each command individually connects to the server and executes the appropriate stored procedure.

My hosted workbench immediately has the commit created by the CLI invocation (page reload required).
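
To make the stored procedure idea concrete, here is a minimal sketch of roughly the same round trip done by hand through the sql command. It assumes that add and commit correspond to the DOLT_ADD and DOLT_COMMIT stored procedures and that sql accepts a query with -q alongside the global arguments. The function name remote_add_commit and the DOLT_BIN override are made up for this example, and credentials are expected to come from the DOLT_CLI_USER and DOLT_CLI_PASSWORD environment variables.

```shell
# Sketch only: remote_add_commit and DOLT_BIN are names invented for this example.
# Assumes `dolt add` / `dolt commit` map to the DOLT_ADD / DOLT_COMMIT procedures.
# Usage: remote_add_commit <host> <database> <table> <message>
remote_add_commit() {
    # Each invocation opens its own connection and session, then exits.
    # Global arguments (--host, --use-db) come before the subcommand.
    "${DOLT_BIN:-dolt}" --host "$1" --use-db "$2" sql -q "CALL DOLT_ADD('$3');" &&
    "${DOLT_BIN:-dolt}" --host "$1" --use-db "$2" sql -q "CALL DOLT_COMMIT('-m', '$4');"
}
```

The table name and message are pasted into the SQL text without escaping, so keep single quotes out of them. Staging survives between the two invocations because it lives on the server, not in the short-lived session.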

Getting Your Server Set Up (FREE)

If you don't have a dolt server to play with, it's pretty easy to start one. In a terminal do the following:

$ dolt sql-server --user youruser --password yourpass
Starting server with Config HP="localhost:3306"|T="28800000"|R="false"|L="info"|S="/tmp/mysql.sock"
2023-07-20T14:45:52-07:00 INFO [no conn] Server ready. Accepting connections. {}

Leave that running in a terminal window so you can watch the logs. The username and password are optional; by default, the server and the Dolt CLI use 'root' with no password. Finally, your local server doesn't have TLS enabled, so pass the global --no-tls flag, before the sql subcommand, when you connect:

$ dolt --host localhost --user youruser --password yourpass --no-tls sql
# Welcome to the DoltSQL shell.
# Statements must be terminated with ';'.
# "exit" or "quit" (or Ctrl-D) to exit.
> CREATE DATABASE demo;
> USE demo;
Database changed
demo> CREATE TABLE tbl (id int primary key, text varchar(255));
demo> exit;
Bye

Getting Your Server Set Up ($$$)

If you would rather use a managed database, head over to Hosted Dolt and create a deployment. Be sure to use a Web PKI Certificate; this ensures that TLS will work correctly when you attempt to connect to your DB. Hosted Dolt does not support unencrypted connections, so having a valid certificate is necessary to make this work.

After the deployment is complete, you'll get all the connection details you need to set up your CLI.

Once you have your host and credentials, you can use the sql command to connect to the DB. The sql command is the only one which does not require a --use-db flag. This allows you to connect to an empty server and create a database:

$ dolt --host <host> --user <user> --password <password> sql
> CREATE DATABASE demo;
> USE demo;
demo> CREATE TABLE tbl (id int primary key, text varchar(255));
demo> exit;
Bye

Glory Time

You've run the sql command, and it's just one more step to run other commands. Replace sql with --use-db demo <command name>. One additional thing you may want to do is set the DOLT_CLI_USER and DOLT_CLI_PASSWORD environment variables.

$ export DOLT_CLI_USER="<user>"
$ export DOLT_CLI_PASSWORD="**************"

Another option is to not specify your password, and in that case you'll be prompted for it.

Now you can use the supported commands to see what is going on in your database!

$ dolt --host <host> --use-db demo status
On branch main
Untracked tables:
  (use "dolt add <table>" to include in what will be committed)
        new table:        tbl
$ dolt --host <host> --use-db demo diff
diff --dolt a/tbl b/tbl
added table
+CREATE TABLE `tbl` (
+  `id` int NOT NULL,
+  `text` varchar(255),
+  PRIMARY KEY (`id`)
+) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_bin;

Commands Migrated

Before you go wild and start trying to run all of your workflows against a remote server, be aware that only about one third of Dolt's commands are enabled to do this. dolt --help lists more than 25 commands. This is the list filtered for the commands which currently are able to run against a remote instance:

              status - Show the working tree status.
                 add - Add table changes to the list of staged table changes.
                diff - Diff a table.
               reset - Remove table changes from the list of staged table changes.
               clean - Remove untracked tables from working set.
              commit - Record changes to the repository.
                 sql - Run a SQL query against tables in repository.
                show - Show information about a specific commit.
              branch - Create, list, edit, delete branches.
           conflicts - Commands for viewing and resolving merge conflicts.
         cherry-pick - Apply the changes introduced by an existing commit.
              revert - Undo the changes introduced in a commit.
                 tag - Create, list, delete tags.
               blame - Show what revision and author last modified each row of a table.

We started with commands which require no session persistence and are critical to basic usage. Notably missing are checkout and merge. These two commands have been migrated to using SQL for the backend, but they both require a persistent session to really be useful. For example, if you checked out a different branch on a remote instance, your next command would connect with a new session and you would be back on the default branch. That's a problem we are going to solve in the next phase of this project. It's also worth noting that cherry-pick and revert would benefit from having sessions because we could build a workflow to resolve conflicts. Do I hear rebase requests, anyone?
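
You can see the session problem for yourself through the sql command. The sketch below assumes that sql -q accepts several statements separated by semicolons and runs them in a single session, and it uses the DOLT_CHECKOUT procedure and the ACTIVE_BRANCH() function from Dolt's SQL interface. The function name checkout_and_report and the DOLT_BIN override are invented for illustration.

```shell
# Sketch only: checkout_and_report and DOLT_BIN are names invented for this example.
# Assumes `sql -q` runs both statements inside one connection/session.
# Usage: checkout_and_report <host> <database> <branch>
checkout_and_report() {
    # The checkout and the query share a session, so the query should
    # report the branch that was just checked out.
    "${DOLT_BIN:-dolt}" --host "$1" --use-db "$2" sql -q \
        "CALL DOLT_CHECKOUT('$3'); SELECT ACTIVE_BRANCH();"
}
```

Running SELECT ACTIVE_BRANCH() again in a separate invocation starts a fresh session, so it should land back on the default branch. That is exactly why a standalone remote checkout would not be useful yet.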

There are many commands which are not enabled yet. If you attempt to use the --host argument with those commands, you will get an error message letting you know. I strongly encourage you to jump on Discord and tell us which commands should be prioritized next. I can't wait for log to be supported!
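
If you script against a remote server, a small guard can keep unsupported commands from being attempted at all. This is a sketch that hard-codes the list of migrated commands above as of Release 1.8.1, so it will go stale as more commands are enabled. The function name is_remote_capable is made up for this example.

```shell
# Sketch only: is_remote_capable is a name invented for this example.
# The command list is a snapshot of the commands migrated in Release 1.8.1.
# Usage: is_remote_capable <subcommand>   (exit status 0 if supported)
is_remote_capable() {
    case "$1" in
        status|add|diff|reset|clean|commit|sql|show|branch|conflicts|cherry-pick|revert|tag|blame)
            return 0 ;;
        *)
            return 1 ;;
    esac
}
```

For example, `is_remote_capable log || echo "log is local-only for now"` prints the message, since log is not in the list yet.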

Rough Edges?!?

Yes! There are some usability issues with this feature as it stands. You can't select a branch yet, which is a blocker for many use cases (Stephanie is working on it!). Forcing you to specify the --host and --use-db flags on every command is also not great. A no-brainer would be to let you configure a default, and possibly add a --profile option to let you pick a different configuration set. Finally, we require the server to have a trusted TLS certificate, which is more secure but not always possible. We can be less strict there too.
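
Until defaults are configurable, a shell function can stand in for them. This is a workaround sketch, not a Dolt feature: the function name rdolt and the DOLT_REMOTE_HOST, DOLT_REMOTE_DB, and DOLT_BIN variables are all invented here. Credentials still come from DOLT_CLI_USER and DOLT_CLI_PASSWORD.

```shell
# Sketch only: rdolt, DOLT_REMOTE_HOST, DOLT_REMOTE_DB and DOLT_BIN are
# names invented for this example; none of them are part of Dolt.
# Usage: rdolt <subcommand> [args...]
rdolt() {
    if [ -z "${DOLT_REMOTE_HOST:-}" ] || [ -z "${DOLT_REMOTE_DB:-}" ]; then
        echo "rdolt: set DOLT_REMOTE_HOST and DOLT_REMOTE_DB first" >&2
        return 1
    fi
    # Global arguments must come before the subcommand.
    "${DOLT_BIN:-dolt}" --host "$DOLT_REMOTE_HOST" --use-db "$DOLT_REMOTE_DB" "$@"
}
```

With both variables exported in your shell profile, `rdolt status` and `rdolt diff` behave like the longer invocations shown earlier.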

We will continue to improve the experience of the CLI. The order in which we do it will be highly influenced by our users. We love hearing from people who are using Dolt! File an issue with us or visit us on Discord to let us know where we should apply our efforts first.
