How to Restore a Dropped Database

USE CASE
7 min read

Dolt is the world's first version-controlled relational database. It gives you all the power of Git to branch, fork, merge, diff, and even rebase your relational data. When you use Dolt, you have confidence that your data is safe – even if it is accidentally changed or deleted, you can easily audit to find out what happened and revert the changes. Even if you accidentally call drop database, Dolt lets you undo that and restore your database! In today's blog post, we're sharing how the ability to "undrop" a Dolt database helped a customer recover from an accident.

Restoring a Dropped Database

We added support for the dolt_undrop() stored procedure last year, and we recently had an opportunity to help a customer use this feature to restore a database after an accident. This was exactly why we built the dolt_undrop() stored procedure, so it was satisfying to see it working when a customer needed it and to see how it helped them easily recover from a mistake. In this case, the customer was running some scripts on what was supposed to have been a read-only database account, but the account actually had write permissions and ended up dropping one of the main databases on the Dolt server. Fortunately, this was in a pre-production environment, so there was no direct impact for their customers, although internal teams who relied on that database were impacted.

Accidents like this are bound to happen from time to time in any software system. You don't have to look far on the internet to find cautionary tales about silently failing backup processes that prevented teams from being able to restore lost data. One of my favorites is the tale about how we almost lost Toy Story 2! 🙀

So, instead of trying to prevent any accident from ever happening, it's more important to be prepared to handle one when it does occur. We have two big recommendations for teams on how to be prepared for major accidents like this:

  • Include the process for using dolt_undrop() in your operational runbooks on what to do if a database is accidentally dropped. dolt_undrop() is a custom stored procedure for Dolt and isn't normally used during development, so people may not know that it exists. Documenting it in your team's operational runbook is a good way to help people find it quickly when they need it.
  • Have a regular cadence for testing your database backups and the restoration process. Backups are critical for any database system. Even though Dolt has a ton of safety features and ways to access any previous versions of your data, you still need to use backups as an extra layer of safety in case anything truly catastrophic happens, such as a complete host or disk failure. And in case it's not already obvious, if you aren't testing your backup restore process, then you don't know for sure that your backups are going to work when you need them.

Using dolt_undrop()

Let's take a closer look at the dolt_undrop() stored procedure and see how it works...

To get a database to work with, we're pulling down the somewhat famous post-no-preference/stocks database from DoltHub. This database holds over a decade of daily stock price information for around 16 thousand stock symbols.

Once you have dolt installed locally, you can clone the post-no-preference/stocks database by running the command below:

dolt clone post-no-preference/stocks

Then, move into the new stocks directory and start up a Dolt SQL shell:

cd stocks
dolt sql 

From here, you can imagine how we could add our own data and analysis to our local copy of this database. For example, we could bring in a bunch of other information to correlate with stock activity and start building some reports, but then... disaster strikes and our database gets accidentally dropped by a bug in our data processing script!

Cat is panicing because he dropped the production database

drop database stocks;
Query OK, 1 row affected (0.03 sec)

show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
+--------------------+

With a traditional relational database, there is no built-in, default way to reverse a call to drop database. If you're lucky and you have working backups, then you've got to go find your most recent backup, hope that the backup restoration process is documented and actually works, and then deal with any data that was lost between when that backup was taken and when the database was dropped. It's likely not going to be an easy, quick, or painless experience for you, but if you're lucky, you won't lose too much data.

With Dolt, you only need to run one command to undo the drop database statement and restore your database. You don't need to go look for backups, and you don't have to deal with data missing from the backup.

call dolt_undrop('stocks');
+--------+
| status |
+--------+
| 0      |
+--------+
1 row in set (0.04 sec)

> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| stocks             |
+--------------------+

And presto! The stocks database is back at the exact same commit it was at before it was dropped.

A More Complex Restore

Now let's look at a slightly more complex example and see what actually happened in the real world. When the customer accidentally dropped their database, they weren't aware of the dolt_undrop() stored procedure, so they tried recreating a new database with the same name to try and get internal apps to start working again. In this case, dolt_undrop() can still help us restore our data, but we need to do a little more work to get there.

To understand what happened and how to restore a dropped database after a new database with the same name has already been created, we need to look at the internals of our Dolt database's directory.

Let's start by dropping the stocks database again – don't worry, we're not going to lose it!

drop database stocks;
Query OK, 1 row affected (0.03 sec)

show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
+--------------------+

Now let's create a new database with the same name, stocks:

create database stocks;

show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| stocks             |
+--------------------+

At this point, if we call dolt_undrop('stocks');, we'll get an error, because there's already is a database with the name stocks and Dolt doesn't want to overwrite it. Instead, we need to drop the new stocks database and somehow restore the old one. Dolt is smart enough to keep multiple versions of dropped databases around, but we'll need to dig around on the filesystem to find the right version to restore.

Go ahead and drop the new stocks database:

drop database stocks;

Now let's go look in Dolt's data directory and take a look around. The data directory is typically the same directory where you launch dolt, although this can be configured through the --data-dir parameter, too. Since we launched dolt sql in the stocks directory, that's the data directory that Dolt is using.

When you drop a database with Dolt, it gets moved into a hidden .dolt_dropped_databases directory inside of Dolt's data dir. Your dropped database is kept around here until you decide to permanently delete it using the dolt_purge_dropped_databases() stored procedure. If a database is dropped multiple times, Dolt renames the prior version by adding a backup suffix and a timestamp. Here's what my .dolt_dropped_databases directory looks like after dropping the stocks database twice:

ls -l .dolt_dropped_databases 
drwxr-xr-x  3 jason  staff  96 Sep 26 17:33 stocks
drwxr-xr-x  3 jason  staff  96 Sep 26 14:05 stocks.backup.1727397835485

Notice that there are two stocks databases that have been dropped. The most recently dropped stocks database is simply named stocks, and the older version has been renamed stocks.backup.1727397835485. When we call dolt_undrop(), the name of the directory is used for the restored database name, and since we want our database to be named stocks, we need to rename the directory before we call dolt_undrop().

cd .dolt_dropped_databases
mv stocks stocks.new
mv stocks.backup.1727397835485 stocks

Now, back in our Dolt SQL shell, we can run dolt_undrop('stocks'); to restore the original stocks database:

call dolt_undrop('stocks');
+--------+
| status |
+--------+
| 0      |
+--------+

show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| stocks             |
+--------------------+

We can see we've got a stocks database now, but let's take a closer look and make sure it's the correct, original stocks database, and not the empty stocks database that we created a moment ago. A good way to do this is to look at a couple of the most recent commits in the dolt_log system table and see if the commits and commit dates match up to what you're expecting.

use stocks;
Database changed
         
select * from dolt_log limit 5;
+----------------------------------+--------------------+-----------------------------------+---------------------+--------------------------+
| commit_hash                      | committer          | email                             | date                | message                  |
+----------------------------------+--------------------+-----------------------------------+---------------------+--------------------------+
| mlhdjntsm3mb4pjufrni05khb6j77sh4 | post-no-preference | post.no.preference@protonmail.com | 2024-09-26 05:30:29 | ohlcv 2024-09-25 update  |
| v8egsd5p1c9n7mohh738loni8vq2bnt9 | post-no-preference | post.no.preference@protonmail.com | 2024-09-25 05:30:29 | ohlcv 2024-09-24 update  |
| e93aqoi0njh331bjaqkd4ihc2r2cl5rc | post-no-preference | post.no.preference@protonmail.com | 2024-09-24 05:30:28 | ohlcv 2024-09-23 update  |
| kiaj5k2375s9l43ad0btfvmddgf0avqp | post-no-preference | post.no.preference@protonmail.com | 2024-09-23 09:00:13 | symbol 2024-09-22 update |
| 20g3pf4l7i7ss10u56mm6uu87m7p5jdv | post-no-preference | post.no.preference@protonmail.com | 2024-09-22 10:54:47 | split 2024-09-22 update  |
+----------------------------------+--------------------+-----------------------------------+---------------------+--------------------------+

And sure enough, we see the most recent commit was created today, so we know we've correctly restored the original stocks database.

Summary

Dolt is an *extremely safe place to keep your data. With Dolt, all previous versions of your data are still accessible, so you can easily audit changes and see exactly when every value in your database has changed and who changed it. The ability to restore a dropped database with dolt_undrop() is just one of the many tools Dolt provides to help you keep your data safe. Other features, like the Dolt reflog and the dolt_revert() stored procedure are useful features to help you recover data when things go wrong.

We're always looking for ways to make Dolt safer and easier to use. If you want to talk to us about data safety, version control, or anything else, come by and join us, and other Dolt customers, on the DoltHub Discord server.

SHARE

JOIN THE DATA EVOLUTION

Get started with Dolt

Or join our mailing list to get product updates.