Using Perl to Query Dolt Databases
A couple of decades ago, as a brand-new CS grad just entering the world of big tech, I was given the Programming Perl book – also affectionately known as the "Camel Book". 🐪 I was told that I was now officially a Perl programmer and should learn the somewhat eccentric language quickly.
Perl turned out to be a blast to work with, and I'm glad I got the opportunity to learn it. Perl code was fun to write, and it was oddly satisfying to create (or decypher) a compact Perl one-liner. Although Perl's popularity has declined over the years (it's currently ranked #27 on the TIOBE index, just beating out Visual FoxPro), it's still a fun language to use and maintainers continue to evolve and maintain it.
In this blog post, we're going to be hopping on our trusty camel and using Perl to connect to a Dolt database, run queries, and call some of Dolt's version control features. You can find the complete code for this sample online at GitHub if you just want to jump straight to the finished product.
Start up a Dolt Database
Before we get on our camel, let's get a Dolt database server up and running so that we have something to connect to later. Install Dolt, if you don't have it on your system already, and run the commands below to create an empty database and start up a Dolt SQL server.
# This directory name will be used for the database name, so make sure you use "doltdb"
# here, since we'll use that database name in future commands.
mkdir doltdb && cd doltdb
# Just like with git, the "init" subcommand will initialize a directory – in this
# case creating a new Dolt database in the current directory. Bonus points if you
# figure out what the --fun flag does ;-)
dolt init --fun
# Download the sample camel data and load it into our new database
wget https://raw.githubusercontent.com/dolthub/dolt-perl-sample/refs/heads/main/camel_data.sql
dolt sql < camel_data.sql
# Start up a SQL server that we can connect to.
dolt sql-server --loglevel DEBUG
Connect to Dolt from Perl
Now that we've got a running SQL server, let's start writing some Perl code to connect to it. The standard way to access a database using Perl is to use the Perl Database Interface (DBI). DBI provides a common interface for accessing databases and specific databases provide drivers that implement that interface. Since Dolt is MySQL-compatible, we'll be using the DBD::mysql
driver, so make sure you have those dependencies installed on your local system. You can use cpan
to install them:
cpan DBI DBD::mysql
Next, create a file named dolt_with_perl.pl
and put the following code in it to connect to our Dolt database:
use strict;
use warnings;
use DBI;
# Database connection details
my $dsn = "DBI:mysql:database=doltdb;host=localhost;port=3306";
my $username = "root";
my $password = "";
# Connect to the database
my $dbh = DBI->connect($dsn, $username, $password, {
RaiseError => 1, # Automatically die on database errors
AutoCommit => 1, # Automatically commit changes
}) or die "Could not connect to database: $DBI::errstr";
print "Connected to the database successfully.\n";
This code is pretty straightforward. We bring in our DBI dependency and configure some variables with our connection details. Our connection string starts with "DBI:mysql", which tells DBI it needs to find an installed driver that supports that type of database. Since we already installed DBD::mysql
, DBI should find that driver and use it automatically. The most important line in this code is where we call DBI->connect()
, passing in our connection string, username, and password, along with a few other configuration options, such as how to handle any database errors.
You can run this code by executing perl dolt_with_perl.pl
in your terminal. We're passing our script to the perl
interpreter, but you could also add a shebang line and make the script executable if you prefer.
When you run this code, if everything is working correctly, you'll see the following message print out to your screen:
Connected to the database successfully.
Querying Dolt with Perl
Now that we've got a simple Perl script that connects to our Dolt database, let's add some code to query our camel data.
Add the following code to the end of your dolt_with_perl.pl
file:
# Query 1: Fetch some camel data
my $select_query = qq{
SELECT * FROM camels ORDER BY age ASC
};
my $sth_select = $dbh->prepare($select_query);
$sth_select->execute() or die "Select failed: $DBI::errstr";
print "Camels:\n";
while (my @row = $sth_select->fetchrow_array) {
print "CamelID: $row[0], Name: $row[1], Age: $row[2], Species: $row[3], FavoriteFood: $row[4], Trivia: $row[5]\n";
}
This code shows how we take a SQL query, use our database connection ($dbh
) to prepare it, and then execute it. To read the result set, we loop over calling the fetchrow_array
method, which fetches the next row and assigns it to the array @row
. Then inside our loop, we just access each field of the array to print out the field values.
After adding this new code to our script, when you run perl dolt_with_perl.pl
, you should see:
Camels:
CamelID: 7876a999-7a7b-4cc4-80ee-9ac7d062dc05, Name: Chewy, Age: 4, Species: Dromedary, FavoriteFood: Cabbage, Trivia: Camels can walk up to 100 miles without water.
CamelID: a2eee5e2-9e0a-454a-b30f-6630b7d5efd1, Name: Sandy, Age: 5, Species: Wild Bactrian, FavoriteFood: Cactus, Trivia: Wild Bactrian camels are critically endangered.
CamelID: e4a1c858-f5e3-483c-a880-2f1c8bffb5ca, Name: Rajah, Age: 7, Species: Dromedary, FavoriteFood: Spinach, Trivia: Camels have a third eyelid to protect against sand.
CamelID: 5ba48022-8672-43bb-958b-d6950d5ce3c2, Name: Bianca, Age: 8, Species: Bactrian, FavoriteFood: Carrots, Trivia: Bactrian camels have two humps.
CamelID: 7319da72-156c-4c62-85ee-30270f940b41, Name: Dusty, Age: 9, Species: Wild Bactrian, FavoriteFood: Grass, Trivia: Wild Bactrian camels can survive in extreme desert conditions.
CamelID: d8a0e5cf-e384-4e76-881a-e5a638bbe04d, Name: Mona, Age: 10, Species: Bactrian, FavoriteFood: Barley, Trivia: Camels can drink up to 40 gallons of water in one go.
CamelID: 77115938-a39d-467c-bbda-a44655f96ced, Name: Casper, Age: 12, Species: Dromedary, FavoriteFood: Dates, Trivia: Dromedary camels have one hump.
CamelID: 1948df94-e0d0-4008-b0db-383d39af47da, Name: Luna, Age: 14, Species: Bactrian, FavoriteFood: Apples, Trivia: Camels have been domesticated for over 3,000 years.
CamelID: 71237616-9ca7-4617-a8d1-daa90c5b1d2b, Name: Humphrey, Age: 15, Species: Dromedary, FavoriteFood: Alfalfa, Trivia: Camels can close their nostrils during sandstorms.
Inserting More Rows
To insert a new row, we reuse most of the same code from above, except this time we don't have to iterate over a result set. Add the following code to the end of your dolt_with_perl.pl
file to insert a new camel into the database:
# Query 2: Insert a new camel
my $insert_query = qq{
insert into Camels values (DEFAULT, 'Topsy', 100, 'Bactrian', 'Popcorn', 'Topsy retired to LA after a career as a soldier, a builder, a miner, and a movie star.')
};
my $sth_insert = $dbh->prepare($insert_query);
$sth_insert->execute() or die "Insert failed: $DBI::errstr";
print "New camel, Topsy, added\n";
This code should seem very familiar after reading the previous section. We prepare our statement, execute it, and since there are no results returned from an INSERT
statement, we just print out a message stating that the row was inserted.
Version Control with Dolt
So far, we've seen how easy it is to connect to a Dolt database from Perl using a MySQL driver, but everything we've seen so far could be done with a regular MySQL database. Let's take a look at some queries that leverage Dolt's version-control features.
Say we want to look through the history of our database and identify any camels that used to be in the dataset, but aren't anymore. With a regular MySQL database, this just isn't possible. You can see the current state of the database, but you can't query against the history. With Dolt, however, it's easy to query the history of your data, inspect how your data has changed, and audit when data was added, updated, or removed.
To answer this question, we're going to use the dolt_diff_camels
system table. For every user table in the database, Dolt creates a dolt_diff_<tablename>
system table that can be used to query the log of changes to a specific table. The schema of the dolt_diff_<tablename>
system tables includes all the columns from the user table prepended with from_
to represent the before state of the row, and all the columns of the user table prepended with to_
to represent the after state of the row, along with a few other columns that provide metadata about the change, such as whether it was an add, remove, or delete change, and the before and after commit IDs.
To find our canceled camels, we can search dolt_diff_camels
for all the camels that have been removed (i.e. diff_type='removed'
). We can join that query onto the dolt_commits
table to grab the commit message for each camel removal, too. The query we'll run looks like this:
select from_name, message from dolt_diff_camels join dolt_commits on to_commit = commit_hash where diff_type='removed';
Add this code to the end of your dolt_with_perl.pl
file to run this new query as part of our script:
# Query 3: Query camel history
my $history_query = qq{
select from_name, message from dolt_diff_camels join dolt_commits on to_commit = commit_hash where diff_type='removed'
};
my $sth_history = $dbh->prepare($history_query);
$sth_history->execute() or die "Select failed: $DBI::errstr";
print "Removed camels:\n";
while (my @row = $sth_history->fetchrow_array) {
print "Name: $row[0], Reason: $row[1]\n";
}
print "\n";
When you run this code with perl dolt_with_perl.pl
, you should see the same output from the previous sections, followed by the new list of camels that have been removed from the database:
Removed camels:
Name: Blitzen, Reason: Removing a reindeer that tried to sneak in with the camels
Aha! It looks like Blitzen tried to sneak in with the camels, but he was caught and removed from the database. With a regular database, we wouldn't have been able to see this without resorting to scanning through binary query logs and tediously looking for delete queries, but with Dolt, all of this information about our data's history is at our fingertips and easily queryable.
Cleaning Up
Last, but not least, we need to remember to clean up any resources we've allocated as part of querying a remote database. That means releasing the database connection and statements when we're done with them. Add the following code to the end of your dolt_with_perl.pl
file to close the statement handles and disconnect from the database:
# Clean up
$sth_select->finish();
$sth_insert->finish();
$sth_history->finish();
$dbh->disconnect();
print "Disconnected from the database.\n";
Summary
Although Perl may not be as popular as it was back in the mid 1990s, it's still a great language to work with. It's fun to write Perl code, and there's no shortage of great libraries and modules available to use. Perl's Database Interface (DBI) module provides great support for working with databases, and the MySQL driver works seamlessly for connecting to Dolt databases. You can find the full source code for this sample project on GitHub in the dolthub/dolt-perl-sample repository if you want to take Dolt for a spin with Perl.
If you want to talk about Perl, database development, or version control, come join us on the DoltHub Discord server! We're always happy to help people get started with Dolt and talk about how Dolt can make it easier and safer to manage versioned data.