100% Correctness

February 26, 2024

5 min read

Dolt is a SQL database with Git-like versioning features intended to be a drop-in replacement for MySQL. In order to be a drop-in replacement, Dolt needs to match MySQL behavior as closely as possible.

We chose sqllogictests, a collection of around 6 million sql correctness tests, as our correctness benchmark. Over the course of the last 5 years, we've been steadily increasing Dolt's correctness, and today we're proud to announce that we've achieved 100% correctness. This is a huge milestone for us, and we're excited to share it with you.

A Brief History

On October 22, 2019, we first incorporated sqllogictests in our test suite. Initially, we only had ~20% of the tests passing, with a majority of tests failing due to missing features.

A couple months later, on December 17, 2019, we managed to increase our correctness to 90%. The notable additions here were support for the ALL keyword and tables without Primary Keys.

A year and a half later, on June 28, 2021, we reached 99% correctness. We reached this milestone with support for VIEWS, Secondary Indexes, Window Functions, and support for DISTINCT over functions.

Roughly two years after that, on September 6, 2023, we achieved 99.9% correctness. The changes here revolved around aliasing, specifically over joins and lateral join, involving aggregations and using clauses.

A short month later, on October 11, 2023, we have 99.99% correctness. There were a variety of changes here, such as parsing of extra table creation parameters, support for INTERSECT and EXCEPT, and large join performance improvements.

Which brings us to today, where the remaining 0.01% of tests have been fixed, and we have achieved 100% correctness.

Here's a graph to illustrate our progress:

Decimals

Ever since their inception, Decimals have been a pain point for us. Notably, we've struggled to match MySQL's Decimal behavior during rounding, comparison, and conversion.

Depending on the scale (the number of digits after the decimal point) of the Decimals and the type of operation (addition, subtraction, multiplication, division), the resulting scale of the Decimal result would vary. This isn't the first time we tackled this issue; however, this time we've managed to fix the remaining edge cases.

Additionally, these rounding rules actually change when using these decimals in a comparison. For example:

mysql> SELECT 1/3;
+--------+
| 1/3    |
+--------+
| 0.3333 |
+--------+

mysql> SELECT 1/3 = 0.3333;
+--------------+
| 1/3 = 0.3333 |
+--------------+
|            0 |
+--------------+

This is because the result of 1/3 is actually 0.333333333. We can use a CAST() to reveal the full precision.

mysql> SELECT CAST(1/3 AS DECIMAL(65,30));
+----------------------------------+
| CAST(1/3 AS DECIMAL(65,30))      |
+----------------------------------+
| 0.333333333000000000000000000000 |
+----------------------------------+

Furthermore, the scale of the divisor and dividend can affect the scale of the result.

mysql> SELECT 1.0/3.0;
+---------+
| 1.0/3.0 |
+---------+
| 0.33333 |
+---------+

mysql> SELECT CAST(1.0/3.0 AS DECIMAL(65,30));
+----------------------------------+
| CAST(1.0/3.0 AS DECIMAL(65,30))  |
+----------------------------------+
| 0.333333333333333333000000000000 |
+----------------------------------+

After a dividing a bunch of numbers with various scales, we've uncover MySQL's scale comparison rules.

Here's a table to illustrate the rules:

(The bolded numbers are the scales of the divisor and dividend.)

This table boiled down to this code snippet:

...
lScale, rScale := -1*l.Exponent(), -1*r.Exponent()
inc := int32(math.Ceil(float64(lScale+rScale+divPrecInc) / divIntPrecInc))
if lScale != 0 && rScale != 0 {
    lInc := int32(math.Ceil(float64(lScale) / divIntPrecInc))
    rInc := int32(math.Ceil(float64(rScale) / divIntPrecInc))
    inc2 := lInc + rInc
    if inc2 > inc {
        inc = inc2
    }
}
scale := inc * divIntPrecInc
...

In the past, we've had issues with Decimal division performance as well. The solution was to use floating point division under the hood, and then convert the result to a Decimal. This was a performance improvement, but it also introduced a new set of correctness issues. Fortunately, the fixes here actually led to a performance improvement as well.

Here are some benchmarks to illustrate the performance improvements:

Before

BenchmarkDivInt-16                10000  695805 ns/op
BenchmarkDivFloat-16              10000  695044 ns/op
BenchmarkDivHighScaleDecimals-16  10000  694577 ns/op

BenchmarkDivManyInts-16           10000  1151316 ns/op
BenchmarkManyFloats-16            4322   618849  ns/op
BenchmarkDivManyDecimals-16       5721   699095 ns/op

After:

BenchmarkDivInt-16                365416   3117  ns/op
BenchmarkDivFloat-16              1521937  787.7 ns/op
BenchmarkDivHighScaleDecimals-16  294921   3901  ns/op

BenchmarkDivManyInts-16           40711    29372 ns/op
BenchmarkManyFloats-16            174555   6666  ns/op
BenchmarkDivManyDecimals-16       52053    23134 ns/op

In the worst case, we see a 30x improvement, and in the best case we see a 880x improvement!

`GROUP BY` and `HAVING` Aliasing

MySQL's aliasing rules for GROUP BY and HAVING are unique. GROUP BY and HAVING clauses can reference aliases, even if they are duplicates of column names. One would expect these to throw an error, but MySQL lets you off with a warning.

Here's an example:

mysql> SELECT * FROM tab;
+------+------+------+
| col0 | col1 | col2 |
+------+------+------+
|   15 |   61 |   87 |
|   91 |   59 |   79 |
|   92 |   41 |   58 |
+------+------+------+
3 rows in set (0.0005 sec)

mysql> SELECT col2 AS col0 FROM tab GROUP BY col0, col2 HAVING col2 > col0;
+------+
| col0 |
+------+
|   87 |
+------+
1 row in set, 2 warnings (0.0006 sec)
Warning (code 1052): Column 'col0' in group statement is ambiguous
Warning (code 1052): Column 'col0' in having clause is ambiguous

Here col2 is aliased as col0, but the col0 in the GROUP BY and HAVING clauses are still referencing the original col0. Typically, I wouldn't expect people to run into this issue, but it was a part of the sqllogictests suite, so we fixed it.

Joins

A handful of tests were failing due to some troublesome join behavior. Although there were only a few tests failing because of this, tracking down the root cause took a significant amount of time.

Here's one of the smaller examples:

SELECT
    x46, x3, x1, x5, x61, x47, x30, x52, x49, x63, x28, x53
FROM
    t46, t3, t1, t5, t61, t47, t30, t52, t49, t63, t28, t53
WHERE
    1   = a5  AND
    b28 = a5  AND
    b1  = a28 AND
    b53 = a1  AND
    b30 = a53 AND
    b49 = a30 AND
    b47 = a49 AND
    b46 = a47 AND
    b61 = a46 AND
    b3  = a61 AND
    b63 = a3  AND
    b52 = a63

All the failing queries followed this pattern: multiple tables with a single primary key, and a long chain of equality conditions, resulting in one row. Somewhere in our join reorder optimization, we were losing track of one of the AND conditions, resulting in way too many rows being returned.

What's Next?

While there are roughly 6 million queries in sqllogictests, it is still not a comprehensive test suite; there are many edge cases that are not covered. Within our enginetests in the go-mysql-server repo, we have quite a few skipped tests that we need to address. Additionally, sqllogictests were designed to be a correctness benchmark for generic SQL engines, so there are many MySQL specific features that are not covered. Looking at our documentation, there are still quite a few missing functions and operators. Furthermore, there is Dolt specific functionality that isn't covered in sqllogictests that are not covered. You can see our ever growing github issues.

Conclusion

We're proud to have achieved 100% correctness, but we're not stopping here. As always, we're working diligently to improve Dolt's correctness, performance, and feature set. Want to help out? Check out our github and make a contribution. Feel free to reach out to us on Discord.

Blog