How To Use Dolt SQL Remotes
TLDR
We recently introduced a new Dolt feature: interacting with remotes from SQL server sessions.
Dolt is a modern relational database that supports Git style fetch, pull, and push semantics.
Although originally developed for the command line, we now support remote commands in
application databases via SQL functions. You can view remotes info with the
dolt_remotes
system table, and interact with your remotes via the SQL
functions dolt_pull()
, dolt_push()
, and dolt_fetch()
. This gives Dolt
applications connected to the database full control of remote
interactions. Imagine the possibilities!
What are Remotes?
Git defines several objects and actions for communicating state between servers.
The objects:
-
Remote: a common communication endpoint for multiple clients to share state (ex:
file:///Users/max-hoffman/Documents/dolthub/scratch/test/remote
) -
Refspec: a namespace identifier that is cognizant of remotes (ex:
refs/heads/master:refs/remotes/origin/master
is a refspec that maps our localmaster
branch to a corresponding remote namedorigin/master
) -
Commit: a unit of persisted work, which in Dolt is comprised of chunk files
The actions:
-
Push: update the remote state with local changes
-
Pull: update a local version from a shared remote state
-
Clone: initialize a new repository from a previously saved shared state
The Dolt CLI duplicates Git's behavior for remote commands. For
example, dolt push origin master
pushes the local master refspec to
the remote named "origin", the same as git push origin master
but with
data.
Dolt can also be used as an application database server, where users
connect to the database with a MySQL client. Before today's change,
users would have to stop a SQL-server to dolt push
or dolt pull
data, and resume the server afterwards.
The most recent update lets users run statements like select dolt_push(‘origin’, ‘master’)
in a SQL client. The entire lifecycle of
a Dolt database can now be controlled through SQL connections.
Demo of Remote Functions
dolt_push
, dolt_fetch
, and dolt_pull
work the same as CLI
counterparts. But before running these functions we need a Dolt
repository and remote folder:
$ mkdir -p test/{sql-repo,remote}
$ cd test/sql-repo
$ dolt init
Successfully initialized dolt data repository.
$ dolt remote add origin file://../remote
We can view the origin
remote using the dolt_remotes
table:
$ dolt sql -q "select * from dolt_remotes"
+--------+-----------------------------------------------------------------+--------------------------------------+--------+
| name | url | fetch_specs | params |
+--------+-----------------------------------------------------------------+--------------------------------------+--------+
| origin | file:///Users/max-hoffman/Documents/dolthub/scratch/test/remote | [refs/heads/*:refs/remotes/origin/*] | map[] |
+--------+-----------------------------------------------------------------+--------------------------------------+--------+
And we can push changes to our remote:
$ dolt sql -q "select dolt_push('origin', 'master')"
+-------------------------------+
| dolt_push('origin', 'master') |
+-------------------------------+
| 1 |
+-------------------------------+
To pull into our first repo, we will have to commit a change from a separate repo:
$ cd ..
$ dolt clone file://./remote cli-repo
cloning file:///Users/max-hoffman/Documents/dolthub/scratch/test/remote
2 of 2 chunks complete. 0 chunks being downloaded currently.
$ cd cli-repo
$ dolt sql -q "create table test (a int primary key)"
$ dolt commit -am “new commit”
$ dolt push origin master
Returning to our first repo, we pull the latest commit and inspect our new table:
$ cd ../sql-repo
$ dolt sql -q "select dolt_pull('origin')"
+---------------------+
| dolt_pull('origin') |
+---------------------+
| 1 |
+---------------------+
$ dolt sql -q "show tables"
+-------+
| Table |
+-------+
| test |
+-------+
You have to use select dolt_pull()
because dolt_pull()
by itself isn't
valid SQL syntax. With this SQL sleight on hand, you now control the
entire lifecycle of you data within a running server!
Why Are SQL Remotes Interesting?
Remotes are a first step towards replication. Transmitting commits to remotes allows production servers to safely recover from disk failures. We are using the same primitives as push and pull to deliver this soon.
Gracefully transitioning from a SQL server to a local CLI client is a related but unique feature. Consider migrating a database between versions. With a typical database you would either take the instance offline during a migration, or run two full instances that alternate between production and development. With a Dolt database, you can migrate your database locally, push changes to a remote, and then have the server pull and merge the migration without downtime. Because Dolt is versioned, the production instance can instantly rollback to old data and schema versions without interruption.
Future work
We are currently working on auto-replicating commits, an extension of this blog’s work. A production Dolt server will soon be able to backup commits, share changes wth collaborators, and recover from disk failures.
Remotes and branches are read-only during an ongoing sql session. The
repo_state.json
holds this configuration in a JSON file, and files
live outside of a SQL session’s purview. For a database to write remote
and branch updates, we would need to change how we encapsulate shared
database contents with private configuration files.
Lastly, Dolt does not parse as broad a range of refspecs as Git yet. Feel free to reach on on Discord if you notice a feature you'd like to see added.
Conclusion
You can now interact with Dolt remotes from SQL server clients,
previously only available in the CLI. Users can select dolt_push()
,
select dolt_pull()
, select dolt_fetch()
, and select * from dolt_remotes
. This behavior is a first step towards asynchronously
replicating Dolt application databases, but also allows for interesting
development workflows. Users can debug and update production servers
from local filesystems using Git style primitives. Those updates are
versioned, reproducible, and shareable, like everything else in Dolt.
If you would like to ask us questions about Dolt, data versioning, remotes, or anything in between reach out to us on our Discord server.