Correctness Update

SQL
5 min read

About a month ago, we provided an update to Dolt's correctness. Since then, we've continued to make improvements; here's what we've been up to.

SQLAcidTests

A user on our Discord pointed our attention towards SQLAcidTests, which is essentially a collection of tests for SQL databases. The tests center around compliance and convention, ensuring that the database adheres to the SQL standard. Apparently, MySQL barely passes any of these tests. Naturally, in our quest for correctness, we decided to run these tests against Dolt...

It was not great.

$ python test_binary.py --program dolt --extra "sql"
tests/compliance/test008.sql   -   Error
tests/compliance/test014.sql   -   Success
tests/compliance/test015.sql   -   Error
tests/compliance/test016.sql   -   Error
tests/compliance/test017.sql   -   Error
tests/compliance/test018.sql   -   Error
tests/compliance/test019.sql   -   Error
tests/compliance/test020.sql   -   Error
tests/compliance/test021.sql   -   Error
tests/compliance/test022.sql   -   Error
tests/compliance/test023.sql   -   Error
tests/compliance/test024.sql   -   Error
tests/compliance/test025.sql   -   Error
tests/compliance/test027.sql   -   Error
tests/convention/test001.sql   -   Error
tests/convention/test002.sql   -   Error
tests/convention/test003.sql   -   Error
tests/convention/test004.sql   -   Success
tests/convention/test005.sql   -   Error
tests/convention/test006.sql   -   Error
tests/convention/test007.sql   -   Error
tests/convention/test009.sql   -   Error
tests/convention/test010.sql   -   Error
tests/convention/test011.sql   -   Error
tests/convention/test012.sql   -   Error
tests/convention/test013.sql   -   Error
tests/convention/test026.sql   -   Error

Syntax Errors

Upon further investigation, a majority of the tests failed due to syntax errors. More specifically, the tests were using an inline table constructor that was specific to PostgreSQL.

... (VALUES (...), (...), ...) <tblAlias> ...

Dolt and MySQL support a very similar variant of this syntax.

... (VALUES ROW(...), ROW(...), ...) <tblAlias> ...

Since the only difference is the ROW keyword, we decided to add support for the PostgreSQL variant. This significantly increased the number of tests we passed.

$ python test_binary.py --program dolt --extra "sql"
tests/convention\test001.sql   -   Success
tests/convention\test002.sql   -   Success
tests/convention\test003.sql   -   Error
tests/convention\test004.sql   -   Success
tests/convention\test005.sql   -   Success
tests/convention\test006.sql   -   Success
tests/convention\test007.sql   -   Success
tests/convention\test009.sql   -   Success
tests/convention\test010.sql   -   Fail
tests/convention\test011.sql   -   Success
tests/convention\test012.sql   -   Success
tests/convention\test013.sql   -   Error
tests/convention\test026.sql   -   Success
tests/compliance\test008.sql   -   Success
tests/compliance\test014.sql   -   Success
tests/compliance\test015.sql   -   Success
tests/compliance\test016.sql   -   Error
tests/compliance\test017.sql   -   Success
tests/compliance\test018.sql   -   Success
tests/compliance\test019.sql   -   Success
tests/compliance\test020.sql   -   Error
tests/compliance\test021.sql   -   Success
tests/compliance\test022.sql   -   Error
tests/compliance\test023.sql   -   Error
tests/compliance\test024.sql   -   Success
tests/compliance\test025.sql   -   Error
tests/compliance\test027.sql   -   Error

The remaining tests were rewritten to valid MySQL queries (the the best of out abilities), and then reran on Dolt. A few of these tests were definitely bugs in Dolt, while a couple were just differences in behavior between MySQL and PostgreSQL. Here a quick patch notes summary of the fixes we made.

Fixes:

Known Bugs:

Won't Fixes (not supported in MySQL):

  • Support for FULL OUTER JOIN
  • Recursive CTE in subquery
  • || operator is concatenation instead of logical OR (supported in MySQL through SQL_MODE)

Final Results:

$ python test_binary.py --program dolt --extra "sql"
tests/compliance\test008.sql   -   Skip
tests/compliance\test014.sql   -   Success
tests/compliance\test015.sql   -   Success
tests/compliance\test016.sql   -   Success
tests/compliance\test017.sql   -   Success
tests/compliance\test018.sql   -   Success
tests/compliance\test019.sql   -   Success
tests/compliance\test020.sql   -   Skip
tests/compliance\test021.sql   -   Success
tests/compliance\test022.sql   -   Success
tests/compliance\test023.sql   -   Skip
tests/compliance\test024.sql   -   Success
tests/compliance\test025.sql   -   Success
tests/compliance\test027.sql   -   Success
tests/convention\test001.sql   -   Success
tests/convention\test002.sql   -   Success
tests/convention\test003.sql   -   Error
tests/convention\test004.sql   -   Success
tests/convention\test005.sql   -   Skip
tests/convention\test006.sql   -   Success
tests/convention\test007.sql   -   Success
tests/convention\test009.sql   -   Success
tests/convention\test010.sql   -   Fail
tests/convention\test011.sql   -   Success
tests/convention\test012.sql   -   Success
tests/convention\test013.sql   -   Error
tests/convention\test026.sql   -   Success

SELECT INTO with "incompatible" schema

A customer using WooCommerce, ran into an issue when trying to use SELECT INTO to create a new table.

cannot convert type longtext COLLATE utf8mb4_0900_ai_ci to datetime

It turns out we were unnecessarily strict when checking the schema compatibility between the source and destination tables. MySQL doesn't check the schema compatibility, but rather checks if individual values can be coerced into the destination columns. This specific error was thrown, because we were attempting to convert an empty string to a DATETIME column, which is invalid. This has been fixed in the latest version of dolt:

tmp/main> create table t1 (i int primary key, t text);
tmp/main*> insert into t1 values (1, '2001-01-01'), (2, 'badtime'), (3, '');
Query OK, 3 rows affected (0.01 sec)
tmp/main*> create table t2 (d datetime);
tmp/main*> insert into t2(d) select t from t1 where i = 3;
Incorrect datetime value: ''
tmp/main*> insert into t2(d) select t from t1 where i = 2;
Incorrect datetime value: 'badtime'
tmp/main*> insert into t2(d) select t from t1 where i = 1;
Query OK, 1 row affected (0.03 sec)

Metrics Summary

We had 195 skipped tests out of 42347 total enginetests. Now, we have 204 skipped tests out of 43215 total enginetests. This yields a 99.53 percent pass rate.

This time around, we haven't added any new functions to our supported functions list. We're still at 308 out of 438 functions, so 70 percent coverage.

Conclusion

Dolt continues to strive for correctness, and we're always looking for ways to improve. Have any features or bugs you want us to prioritize? Make an issue on our Github Issues page. Feel free to make suggestions on our Discord.

SHARE

JOIN THE DATA EVOLUTION

Get started with Dolt

Or join our mailing list to get product updates.