Dolt + Excel
Dolt, DoltHub, and Hosted Dolt can be used for Spreadsheet Version Control. Many of our customers migrated to Dolt from a really complicated, heavily edited spreadsheet. Earlier this year, I wrote a blog about how to use Dolt with Google Sheets.
I've been waiting for a solid chunk of time I could work from home on my Windows machine to produce a companion Excel article because as we all know Windows Excel is the most popular Excel. Well, it being Thanksgiving week here in the US, I was able to work from home and produce this gem. Enjoy!
I first tested Dolt against Excel in fall of 2020 working from home during the pandemic. Dolt has changed a lot since then. Integration with Excel has never been better.
Why Dolt and Excel
There is history in Excel using the "Show Changes" functionality. There is collaborative editing in Excel when combined with Sharepoint or when using the online version. But as I explain in my Spreadsheet Version Control article (with a slight edit), history and collaborative editing is not version control.
Version control allows for branch and merge, the ability to asynchronously collaborate on multiple versions in an organized way. Instead of everyone editing the same
Google SheetExcel Workbook, you can create you own copy, make edits, see the differences (ie. diffs) between your copy and another copy, and then merge two copies together. That workflow is true version control. No spreadsheet has those features.
Moreover, by adding DoltHub or Hosted Dolt to the mix, you add a Pull Request workflow. Pull Requests allow for human review of edits. Human review helps ensure data quality of changes to your spreadsheets.
Goals
To show off how Dolt can work with Excel, the goals are enumerated below.
- Read a Dolt table from the
main
branch into an Excel sheet - Write to Dolt on a branch named
branch
using VBA - Review differences in an Excel Sheet
- Create a pull request on Hosted Dolt from
branch
tomain
- Perform a human review of the pull request
- Merge the changes into the
main
branch - Read the new merged Dolt table into the original Excel Sheet
Setup
Dolt
For this demo, we created a very simple Dolt database named excel
. The database has one table named test
. The test
table has two columns, id
and words
. id
is a varchar(36)
with a UUID default and the primary key. As described in this article, this is the best way to define a auto-generating primary key. words
is a string of max size 255 characters.
Additionally, we create two branches main
and branch
and seeded some test data on both branches.
Hosted Dolt
We create the database on Hosted Dolt because this is the easiest way to show off Excel connectivity. You can find the database named excel
here. You will need to be granted permissions to see it. Either email me at tim@dolthub.com or come by our Discord and I'll give you access. Screenshots of the experience can be found below.
Hosted Dolt is a running Online Transaction Protocol (OLTP) database. You connect to it over the internet using any MySQL client. Hosted Dolt comes with a built in SQL Workbench which can be used to complement Excel for users more comfortable in a GUI environment.
Read using ODBC
The way to connect Excel to SQL databases is through the appropriate Open Database Connectivity (ODBC) Connector. Dolt is a MySQL-compatible database so you use the MySQL ODBC Connector.
Install the MySQL ODBC Connector on your computer. Downloading the appropriate installer and running it opens a standard Windows Installer. Once complete, navigate to Control Panel
> Administrative Tools
> ODBC Data Sources
or just search for ODBC Data Sources
. You'll be greeted by this window:
Click Add...
and you should see MySQL ODBC 8.0 ANSI Driver
and MySQL ODBC 8.0 Unicode Driver
, select the MySQL ODBC 8.0 ANSI Driver
and click Finish
. Now you need to enter your connection information. We call the data source "Dolt Excel", leave the server on TCP and enter timsehn-excel.dbs.hosted.doltdb.com
as the server name, enter the user c6mc26245p2irqzn
, enter the password as DQ0GSINxBtpNdbJlM3CDMoB1UwITXPHA
, and the database as excel
.
After entering that information, clicking Test
should result in a Connection Successful
dialog. Click OK
and the new Dolt Test ODBC connector will appear in the list.
Now it's time to read the data into Excel over that connection. Open a new Excel Workbook and navigate to Data
> Get Data
> From Other Sources
> From ODBC
. Pick Dolt Excel
from the drop down and click OK
. Excel requires you enter the username and password again so enter c6mc26245p2irqzn
as the username and DQ0GSINxBtpNdbJlM3CDMoB1UwITXPHA
as the password. This will bring up the following preview screen where you can navigate to and pick the table you want to import.
I selected the excel
database and test
table. I am greeted with this nice data preview.
From here we click "Load" to import the test
table into a new sheet named test
. I rename the sheet main
.
Now, we want to read the test
table off of a branch. To do this we need to use an as of
query. If you want to return the results of a query instead of a whole table, navigate to Data
> Get Data
> From Other Sources
> From ODBC
. Pick Dolt Excel
from the drop down and click Advanced
instead of OK
. Enter the SQL query select * from test as of 'branch'
in the text box.
This brings up a data preview and you click "Load" to load the data into a new sheet. I rename that sheet "branch".
Finally, we want to show the diff between the two branches on a final sheet. To do this we follow the same steps as above to execute a custom SQL query. Dolt exposes version control read functionality as custom functions and procedures. To produce a diff between two branches we are going to use the dolt_diff() function. This function takes the two branches and table name as arguments. Enter select * from dolt_diff('main', 'branch', 'test')
into the SQL query box, view the preview and click "Load". I called this sheet "diff".
As you can see, there is a new row added on the branch
branch.
Interacting with a running Hosted Dolt with Excel is easy on the read path. Everything is accessible via an manageable Graphical User Interface (GUI). Use SQL queries to make custom sheets and use the "Queries and Connections" menu to refresh the data whenever you would like.
Write Using VBA
Things get a little tricker when you want to write to Hosted Dolt from an Excel sheet. There is no built-in Graphical User Interface to do this. You must use Visual Basic for Applications (VBA).
To use VBA in Excel, you first need to enable the Developer Tab. This can be done on the File
tab by going to Options
> Customize Ribbon
and select the Developer
checkbox. Once you've enabled it, click on it and then click Visual Basic. You'll be greeted with a new Window that looks like a code editor.
Next you must enable the "Microsoft ActiveX Data Objects (ADO) Library". In the VBA window, click on Tools
> References
. There's a bunch of them. The tutorial I worked with used 2.8 so I chose that one.
Now, it's time to write some code. I'm going to start simple because I'm not super familiar with VBA. Let's connect to Hosted Dolt and write a row on the branch
branch. I enter the database as excel/branch
so I connect to the correct Dolt branch to make the write. I also make sure to create a Dolt commit after I perform the insert. This is because my branch sheet only gets committed changes because of my use of as of
.
Dim oConn As ADODB.Connection
Dim rs As ADODB.Recordset
Private Sub writeToDolt()
Set oConn = New ADODB.Connection
oConn.Open "DRIVER={MySQL ODBC 8.0 ANSI Driver};" & _
"SERVER=timsehn-excel.dbs.hosted.doltdb.com;" & _
"DATABASE=excel/branch;" & _
"USER=c6mc26245p2irqzn;" & _
"PASSWORD=DQ0GSINxBtpNdbJlM3CDMoB1UwITXPHA;" & _
"Option=3"
Set rs = New ADODB.Recordset
strSQL = "insert into test(words) values ('VBA can make words')"
rs.Open strSQL, oConn, adOpenDynamic, adLockOptimistic
strSQL = "call dolt_commit('-am', 'VBA first commit')"
rs.Open strSQL, oConn, adOpenDynamic, adLockOptimistic
End Sub
When I click the play button for the writeToDolt()
function, go back to my branch sheet, and refresh using queries and connections, voila the data has been inserted!
We can also see the change of the diff sheet if we refresh. This could be used to review changes locally.
You can imagine making this code arbitrarily complicated, taking edits from the branch sheet itself and, inserting or updating the appropriate rows in Dolt. I have to be honest, I tried to do this for an afternoon but my VBA hacking skills were lacking. I did not want to work on Thanksgiving so I'll leave that as an exercise for the Reader. Major cop out, I know.
Hosted Dolt Pull Requests
I've written a new row and already had a branch row inserted. Let's go to Hosted Dolt, make a Pull Request and have those changes reviewed. I select main
as my base branch and branch
as my from
branch.
I click "Create pull request" and now I can have my changes reviewed. I send a link to this page to a colleague.
My colleague can view the diff on Hosted Dolt to review what I actually changed.
Everything looks fine so I click merge. The merge succeeds. Now, I can go back to Excel and make sure my changes showed up on the main
branch.
As expected, my reviewed changes have been merged.
Conclusion
Dolt's integration with Excel has never been better. Hosted Dolt can be used to add a Pull Request Workflow to your Excel Spreadsheet. Interested in adding version control to your complicated spreadsheet, come by our Discord and we'll help you get started.