Automatic Dolt Commits with @@dolt_transaction_commit
Dolt is the world's first fully-versioned SQL relational database. It enables you to branch, fork, diff, and merge your tables, and to time travel to any point in your data's history and examine how your data and schema have changed over time. Customers come to Dolt because they want to take advantage of the versioning features Dolt provides. Some concrete examples of what customers are doing with those versioning features include developing video games, providing data/configuration versioning as an application feature, managing ML data for model training, and much, much more.
But... today I actually want to talk about how Dolt supports applications that don't care about Dolt's versioning features!
Hear me out... in the use cases I mentioned above, all of those applications are "version-aware". That means they know that Dolt is creating versions of their data and they use Dolt's data versioning features directly. For example, they execute SQL queries like call dolt_commit()
, or select * from dolt_diff(...)
as part of the application code. However, there's another large class of applications that just need to create data and read the current data. These types of applications don't actually need to directly use Dolt's versioning features, but they can still benefit from having a recorded history of all the data and schema changes happening in the database.
For example, imagine if your Metabase or Wordpress install was using a Dolt database to store all of its configuration data. Even without those applications being "version-aware", they can work with Dolt just like any other MySQL database, and Dolt can automatically create Dolt commits as the data changes. From the application's perspective, Dolt looks just like any other MySQL database, but underneath, Dolt is diligently recording how all the data is changing. If someone accidentally messes up important application data, you can examine the history, see exactly what changed, and take steps to restore your data. Or, maybe you want to run some manual, ad-hoc analytic queries on your database to understand what configuration data is changing the most, or audit who changed what in a certain time period. Dolt makes all of this possible, without any changes to your application code to make it "version-aware".
Introducing @@dolt_transaction_commit
This behavior of implicitly creating Dolt commits can be easily enabled by turning on the @@dolt_transaction_commit
system variable. By default, Dolt assumes that you want to explicitly control when Dolt commits are created – you stage your changes with dolt_add()
, then call dolt_commit()
when you're ready to record the current state as a new Dolt commit in your database's history. This should sound familiar, since it's the same model used by Git. However... this requires that your application is aware of these Dolt features and uses the commands above to explicitly create Dolt commits. Dolt provides another mode where Dolt commits are created implicitly when you enable @@dolt_transaction_commit
. In this mode, whenever a SQL commit is processed, a Dolt commit is automatically created. There are a lot of advantages for explicitly controlling the creation of Dolt commits (e.g. controlling the exact commit message and scope of the commit), but in cases where you can't easily update the application code to be "version-aware", the @@dolt_transaction_commit
mode is a great way to start integrating with Dolt and getting the benefits of having all your data and schema changes versioned, tracked, and auditable.
Using @@dolt_transaction_commit
Now that you know what the @@dolt_transaction_commit
system variable does, let's show a few easy ways to turn it on...
Explicitly setting @@dolt_transaction_commit
First up is the explicit and obvious route... since @@dolt_transaction_commit
is a system variable, you can turn it on or off directly with SET @@dolt_transaction_commit=1;
just like any other system variable. Like many system variables, @@dolt_transaction_commit
is available both at a global level and at the individual session level. You can change the global setting with SET @@global.dolt_transaction_commit=1;
, or you can omit global
and set it for each session you want to use it in. Note that if you set the global setting, only new sessions started after changing the global variable will have the new setting – existing sessions will not be updated.
set @@dolt_transaction_commit=1;
select @@global.dolt_transaction_commit, @@session.dolt_transaction_commit;
+----------------------------------+-----------------------------------+
| @@GLOBAL.dolt_transaction_commit | @@SESSION.dolt_transaction_commit |
+----------------------------------+-----------------------------------+
| 0 | 1 |
+----------------------------------+-----------------------------------+
1 row in set (0.00 sec)
It's also worth noting that global variables are not persisted across server restarts, so when you restart your sql-server you'll need to apply the setting again. If you're running and managing your own Dolt sql-server
and want to always use this setting for all connections, we recommend setting it in your sql-server configuration file so that you don't have to remember to turn it back on each time you start up your sql-server. The example below shows a sample dolt sql-server configuration file with the dolt_transaction_commit
behavior enabled.
log_level: info
behavior:
dolt_transaction_commit: true
user:
name: "account1"
listener:
host: localhost
port: 3306
max_connections: 100
Configuring in Hosted Dolt
If you're running in Hosted Dolt, there's a very easy behavior setting that controls @@dolt_transaction_commit
globally. This approach is equivalent to setting the system variable globally in your sql-server configuration file, as described above, but the advantage is that it's in the Hosted Dolt UI and provides a really easy way to enable this behavior. The same note above about this value only affecting new sessions applies here, too, so be sure to restart any sessions after turning this on the first time. The screenshot below shows where to find the @@dolt_transaction_commit
setting on the "Configuration" tab of a Hosted Dolt deployment:
Configuring via JDBC connection properties
This last one is Java-specific, but it's a pretty slick feature of the MySQL Connector/J JDBC driver so I thought it was worth calling out. If you are using the MySQL Connector/J JDBC driver to connect to Dolt, the sessionVariables
connection property is another really handy way to enable @@dolt_transaction_commit
(or any other session variable for that matter!). The big advantage here is that you can embed this setting into the JDBC URL you give your application so that every connection it makes will always have @@dolt_transaction_commit
enabled, but other sessions not using that JDBC URL will not have @@dolt_transaction_commit
enabled. This is specific to Java applications that allow you to configure a JDBC URL, but it's particularly useful if one part of your system doesn't need to be "version-aware" and you have other manual or automated processes that do want to explicitly control when Dolt commits are created. The code snippet below shows how to set @@dolt_transaction_commit
through the sessionVariables
property when creating a connection through the MySQL Connector/J JDBC driver:
String url = "jdbc:mysql://127.0.0.1:3306/myDb?sessionVariables=dolt_transaction_commit=1";
conn = DriverManager.getConnection(url, user, password);
Demo
Now that we know what @@dolt_transaction_commit
does, why it's useful, and a few ways to enable it, let's use it and take a look at the Dolt commits that it creates.
For this demo, I've used Hosted Dolt to launch a new Dolt database and after it started up, I enabled the behavior setting for @@dolt_transaction_commit
as described above. Then I grabbed the connection information from the "Connectivity" tab in Hosted Dolt, and used the MySQL client to connect to my Hosted Dolt SQL server. To start, let's confirm a few of our settings:
SELECT dolt_version(), @@autocommit, @@global.dolt_transaction_commit, @@session.dolt_transaction_commit;
+----------------+----------------------+----------------------------------+-----------------------------------+
| dolt_version() | @@SESSION.autocommit | @@GLOBAL.dolt_transaction_commit | @@SESSION.dolt_transaction_commit |
+----------------+----------------------+----------------------------------+-----------------------------------+
| 1.5.0 | 1 | 1 | 1 |
+----------------+----------------------+----------------------------------+-----------------------------------+
1 row in set (0.00 sec)
Perfect; we can see that the Hosted Dolt behavior setting we changed for the @@dolt_transaction_commit
is working and this behavior is enabled globally, including in our current session. Let's check out the current Dolt commit log to verify that we only have the initial commit that all new Dolt databases start with:
SELECT * FROM dolt_log;
+----------------------------------+-----------+-------------------------+-------------------------+----------------------------+
| commit_hash | committer | email | date | message |
+----------------------------------+-----------+-------------------------+-------------------------+----------------------------+
| bml0np84r21bsmrl9eelpuniaanl3195 | jfulghum | jason@dolthub.com | 2023-06-20 23:50:33.623 | Initialize data repository |
+----------------------------------+-----------+-------------------------+-------------------------+----------------------------+
1 row in set (0.01 sec)
Great! A clean slate. Let's create a table and insert some data into it:
CREATE TABLE myTable (id INT PRIMARY KEY, name VARCHAR(255));
INSERT INTO myTable VALUES (1, 'Jerry Maguire');
Now let's check out the Dolt commit log again and see what we've got:
SELECT * FROM dolt_log;
+----------------------------------+-----------+-------------------------+-------------------------+----------------------------+
| commit_hash | committer | email | date | message |
+----------------------------------+-----------+-------------------------+-------------------------+----------------------------+
| ovbl7vvobimer2f5ighrvgpmt9qks82b | jfulghum | jason@dolthub.com | 2023-06-20 23:57:58.647 | Transaction commit |
| 3lst872g0hvc6g8107ntmd0rc3lqfi3f | jfulghum | jason@dolthub.com | 2023-06-20 23:57:52.797 | Transaction commit |
| bml0np84r21bsmrl9eelpuniaanl3195 | jfulghum | jason@dolthub.com | 2023-06-20 23:50:33.623 | Initialize data repository |
+----------------------------------+-----------+-------------------------+-------------------------+----------------------------+
3 rows in set (0.00 sec)
Each of the two SQL statements we executed created their own Dolt commit. This is because we have the @@autocommit
system variable enabled, so every statement is implicitly its own SQL transaction unless we explicitly use START TRANSACTION
to create explicit transactions.
Let's quickly talk about how @@autocommit
interacts with @@dolt_transaction_commit
. These two system variables are closely related – @@autocommit
controls whether SQL commits are automatically created for you after each statement you execute and @@dolt_transaction_commit
controls whether a Dolt commit is automatically created for each SQL commit. By default, @@autocommit
is enabled for each session, so if you turn on @@dolt_transaction_commit
, then every statement you execute that edits data will create a SQL commit and also a Dolt commit. If @@autocommit
is disabled, then just remember that you'll have to explicitly manage SQL commits by executing a SQL commit
statement, and whenever you do, if @@dolt_transaction_commit
is enabled, it'll automatically create a Dolt commit, too.
Let's turn off @@autocommit
and show how explicit SQL transaction management works. Technically, we could still do this with @@autocommit
enabled as long as we're explicitly using START TRANSACTION
, but for the sake of this demo, let's just take @@autocommit
completely out of the picture.
SET @@autocommit = 0;
SELECT @@autocommit;
+----------------------+
| @@SESSION.autocommit |
+----------------------+
| 0 |
+----------------------+
1 row in set (0.00 sec)
START TRANSACTION;
INSERT INTO myTable VALUES (2, 'Dorothy Boyd');
INSERT INTO myTable VALUES (3, 'Rod Tidwell');
INSERT INTO myTable VALUES (4, 'Bob Sugar');
COMMIT;
Now when we look at the Dolt commit log, we should only see one new Dolt commit, since we explicitly included all three of our insert statements into a single SQL transaction.
SELECT * FROM dolt_log;
+----------------------------------+-----------+-------------------------+-------------------------+----------------------------+
| commit_hash | committer | email | date | message |
+----------------------------------+-----------+-------------------------+-------------------------+----------------------------+
| evnjujc7c9cgn8if43eiii76pvavj3fr | jfulghum | jason@dolthub.com | 2023-06-20 23:59:37.196 | Transaction commit |
| ovbl7vvobimer2f5ighrvgpmt9qks82b | jfulghum | jason@dolthub.com | 2023-06-20 23:57:58.647 | Transaction commit |
| 3lst872g0hvc6g8107ntmd0rc3lqfi3f | jfulghum | jason@dolthub.com | 2023-06-20 23:57:52.797 | Transaction commit |
| bml0np84r21bsmrl9eelpuniaanl3195 | jfulghum | jason@dolthub.com | 2023-06-20 23:50:33.623 | Initialize data repository |
+----------------------------------+-----------+-------------------------+-------------------------+----------------------------+
4 rows in set (0.00 sec)
And sure enough, because we grouped our three SQL statements into a single SQL transaction, @@dolt_transaction_commit
created a single Dolt commit, too. We can take a quick look at the diff of that commit using the dolt_diff()
table function and confirm that it contains all three of our new rows. dolt_diff()
takes a starting and ending revision (a commit, branch, or tag) and a table name and returns a result set that describes the rows that changed in that revision range. Note that here, we're using an ancestor spec (~
) to easily refer to the parent commit of the commit we want to diff:
SELECT * FROM dolt_diff('evnjujc7c9cgn8if43eiii76pvavj3fr~', 'evnjujc7c9cgn8if43eiii76pvavj3fr', 'myTable');
+-------+--------------+----------------------------------+-------------------------+---------+-----------+-----------------------------------+-------------------------+-----------+
| to_id | to_name | to_commit | to_commit_date | from_id | from_name | from_commit | from_commit_date | diff_type |
+-------+--------------+----------------------------------+-------------------------+---------+-----------+-----------------------------------+-------------------------+-----------+
| 2 | Dorothy Boyd | evnjujc7c9cgn8if43eiii76pvavj3fr | 2023-06-20 23:59:37.196 | NULL | NULL | evnjujc7c9cgn8if43eiii76pvavj3fr~ | 2023-06-20 23:57:58.647 | added |
| 3 | Rod Tidwell | evnjujc7c9cgn8if43eiii76pvavj3fr | 2023-06-20 23:59:37.196 | NULL | NULL | evnjujc7c9cgn8if43eiii76pvavj3fr~ | 2023-06-20 23:57:58.647 | added |
| 4 | Bob Sugar | evnjujc7c9cgn8if43eiii76pvavj3fr | 2023-06-20 23:59:37.196 | NULL | NULL | evnjujc7c9cgn8if43eiii76pvavj3fr~ | 2023-06-20 23:57:58.647 | added |
+-------+--------------+----------------------------------+-------------------------+---------+-----------+-----------------------------------+-------------------------+-----------+
3 rows in set (0.01 sec)
The results show that in that single commit (evnjujc7c9cgn8if43eiii76pvavj3fr
), three new rows were added to the myTable
table. The diff_type
field shows us that each row was added (not deleted or modified) and the to_id
and to_name
fields show us the values in the new rows. Because these rows were all new, from_id
and from_name
are NULL
, but if this had been a DELETE
or UPDATE
statement, those fields would have the values of the rows before they were deleted or updated.
Conclusion
@@dolt_transaction_commit
is a handy system variable that automatically creates a Dolt commit for every SQL commit, either implicitly through @@autocommit
or by explicitly executing a SQL commit
statement. This enables applications to treat Dolt like any other MySQL database but still have Dolt build a versioned history of your data and schema.
With this setting, even if your system isn't "version-aware", you can still get a lot of benefit from using Dolt as your primary database. With @@dolt_transaction_commit
, any existing application can automatically build a versioned history of its data and schema changes, without having to make any changes to the application code. All of that versioned history of your data is there for you when you need it. Need to log in and run some one-off queries to restore data that was damaged? That's easy with Dolt. Need to audit how all the data managed by a third-party application has changed over time? Just hook it up to Dolt and let Dolt version all the data and schema changes, and you can easily audit how every row of your data has changed since each one was first created.
If you haven't tried out Dolt yet, give it a shot! The @@dolt_transaction_commit
system variable is an easy way to test Dolt with an existing application since you don't have to make any code changes to start building a versioned history. If you have any questions or hit any issues with Dolt or with Dolt's MySQL compatibility, come talk to us on Discord or shoot us an issue on GitHub and we'll be happy to help!