Dolt is a relational database that uses a custom storage layer and SQL engine to version data with Git-semantics. Every layer in the stack relies on thousands of tests to maintain MySQL compatibility and correctness. The surface area of MySQL and Git features is large, and testing is something we pay close attention to.
SQL engine tests are a team favorite. They comprise sets of query strings that validate SQL input/output pairs. They are conceptually easy, convenient to run, coverage-thorough, and provide tight iteration cycles. As a result, we have added new engine tests every day for years. This graph shows the amount of seconds taken to run enginetests over the past 12 months:
The uptick has started become a developer bottleneck. A dev waits 4 minutes for enginetests to pass before committing any changes to go-mysql-server (GMS), our open source SQL engine, or Dolt.
Today we will show how we slashed the engine test runtime in the most recent release by 90% using Dolt versioning.
Dolt Stored procedures
Consider TestInsertInto
, which adds rows to a table before verifying a
new row is added:
{
WriteQuery: "INSERT INTO mytable SET s = 'x', i = 999;",
ExpectedWriteResult: []sql.Row{{sql.NewOkResult(1)}},
SelectQuery: "SELECT i FROM mytable WHERE s = 'x';",
ExpectedSelect: []sql.Row{{int64(999)}},
},
The profile of the test suite is distressing, but obvious in hindsight:
. . 709:func TestInsertInto(t *testing.T, harness Harness) {
. . 710: for _, insertion := range InsertQueries {
. 3.48s 711: e := NewEngine(t, harness)
. 20ms 714: TestQuery(t, harness, e, insertion.WriteQuery, insertion.ExpectedWriteResult, nil)
. 20ms 715: TestQuery(t, harness, e, insertion.SelectQuery, insertion.ExpectedSelect, nil)
. . 716: }
. . 716:}
The test creates a SQL engine populated with data, runs
WriteQuery
, and then runs SelectQuery
. Engine setup dwarfs the
test runtime. The actual test is fast (40 milliseconds), rebuilding the
database every time is slow (3480 milliseconds).
Reduce, Reuse, Recycle
How do Dolt stored procedures help us with enginetests? We can recycle a database with two commands that reset preexisting and new tables instead of running the entire setup steps every time.
First, dolt reset --hard
rewinds uncommitted table changes. Below
we show the table data alongside the dolt_status
describing
working set changes:
-- Observe the new changes
> select * from mytable;
+-----+---+
| i | s |
+-----+---+
| 999 | x |
+-----+---+
> select * from dolt_status;
+------------+--------+----------+
| table_name | staged | status |
+------------+--------+----------+
| mytable | false | modified |
+------------+--------+----------+
-- Rewind changes
> call dolt_reset('--hard');
+--------+
| status |
+--------+
| 0 |
+--------+
-- Database state is equal to the last commit
> select * from mytable;
+---+---+
| i | s |
+---+---+
+---+---+
> select * from dolt_status;
+------------+--------+--------+
| table_name | staged | status |
+------------+--------+--------+
+------------+--------+--------+
Resetting the database does not entirely correct the working set. Reset only affects previously versioned tables.
Consider TestCreateTableSelect
, where we add a new table to the
database by selecting a fraction of an existing table:
{
WriteQuery: "CREATE TABLE newtable AS SELECT * from mytable;",
ExpectedWriteResult: []sql.Row{{sql.NewOkResult(0)}},
SelectQuery: "SELECT i FROM newtable WHERE s = 'first row';",
ExpectedSelect: []sql.Row{{int64(1)}},
},
After running dolt reset --hard
, the new table remains:
-- Observe new table
> show tables;
+---------------+
| Tables_in_tmp |
+---------------+
| mytable |
| newtable |
+---------------+
> select * from dolt_status;
+------------+--------+-----------+
| table_name | staged | status |
+------------+--------+-----------+
| newtable | false | new table |
+------------+--------+-----------+
-- Attempt to remove the new table
> call dolt_reset('--hard');
+--------+
| status |
+--------+
| 0 |
+--------+
-- New table is still there
> show tables;
+---------------+
| Tables_in_tmp |
+---------------+
| mytable |
| newtable |
+---------------+
> select * from dolt_status;
+------------+--------+-----------+
| table_name | staged | status |
+------------+--------+-----------+
| newtable | false | new table |
+------------+--------+-----------+
Reset prevents Dolt from deleting unversioned tables.
In order to remove untracked changes from the working set, I implemented
dolt clean
.
-- Delete the new table
> call dolt_clean('newtable');
+--------+
| status |
+--------+
| 0 |
+--------+
-- Database state is equal to the last commit
> select * from dolt_status;
+------------+--------+--------+
| table_name | staged | status |
+------------+--------+--------+
+------------+--------+--------+
> show tables:
+---------------+
| Tables_in_tmp |
+---------------+
| mytable |
+---------------+
Dolt clean removes new, unstaged, and untracked tables. Or put simply,
dolt clean calls drop table
for new tables, restoring the second half
of the working set that reset
preserves.
Results
The reset and clean combo restores modified tables and deletes
new tables, respectively, recycling a SQL engine. Here is a TestInsertInto
profile with recycling:
. . 637: t.Run(tt.WriteQuery, func(t *testing.T) {
. 270ms 638: e := mustNewEngine(t, harness)
. . 639: ctx := NewContext(harness)
. 20ms 641: TestQueryWithContext(t, ctx, e, tt.WriteQuery, tt.ExpectedWriteResult, nil, nil)
. 30ms 642: TestQueryWithContext(t, ctx, e, tt.SelectQuery, tt.ExpectedSelect, nil, nil)
. . 643: })
A reasonable initial response to this profile might be “This is terrible! 90% of the runtime is still spent on setup.” But remember, we went from 3000 milliseconds to 300 milliseconds. Here is the before and after at the scale developers see for all tests:
> go test github.com/dolthub/dolt/go/libraries/doltcore/sqle/enginetest -count=1
ok github.com/dolthub/dolt/go/libraries/doltcore/sqle/enginetest 246.408s
ok github.com/dolthub/dolt/go/libraries/doltcore/sqle/enginetest 30.254s
There is room for improvement, but overall we achieved a 90% speedup.
A developer might run go test ./...
anywhere from one to hundreds
of times for every PR. 4 minutes * 100 ~= a day, 40 seconds *
100 ~= an hour.
Next time we will talk about our CI shell script tests, bats. Bats are rarely run during active development, but remain a bottleneck for clearing GitHub PRs for merge. Here is a final side-by-side for comparison:
bats tests | engine tests | |
---|---|---|
runtime | 26 minutes | 30 seconds |
Summary
We care about shipping software at Dolt, but every PR accumulates tests. Over time, the weight of tests themselves become a drag on development.
This blog shows how we narrowed the test feedback cycle using Dolt stored procedures. Database native Git-style versioning lets us recycle our test setup so we can do more coding and less waiting on tests. If you need to mock a MySQL database for your test suite. Dolt might be able to speed up your tests as well.
If you have any questions about Dolt, databases, or Golang performance reach out to us on Twitter, Discord, and GitHub!