Dolt is a version-controlled database that works as a drop-in MySQL replacement. Consequently, we aim to match MySQL’s performance as closely as possible. In some cases, Dolt actually outperforms MySQL, as in our Sysbench Read/Write metrics.
TPC-C is an industry-standard benchmark for OLTP databases, and one where Dolt has struggled to keep pace with MySQL. You can read more about TPC-C and our benchmarking in our previous blog.
Back in 2021, we got our first apples-to-apples comparison working.
Our initial results weren’t very impressive; we mainly focus on transactions per second (tps), and by that measure we were 71 times slower than MySQL.
In that blog, we set “the ultimate goal of being no worse than 2x slower than MySQL”.
Today, we’re happy to announce that Dolt is now well within that goal; as of Dolt v2.0.2, our TPC-C throughput is within 1.8x of MySQL’s!
This blog will discuss the major improvements that got us there.
Nice AutoGC Scheduler
When writing to tables, Dolt produces a ton of garbage.
As a result, the .dolt directory can grow unnecessarily large.
To combat this, we wrote AutoGC, which runs occasionally to clear out this garbage.
However, this process can intermittently lock writes to the database, slowing down queries. To address this issue, we implemented a scheduler for AutoGC. It’s not very clever, but it’s “nicer” to busy computers; AutoGC now only runs when CPU load is low. Later, we added a follow-up change to prevent AutoGC from being delayed for too long.
This scheduler brought us from 38.32tps to 42.15tps, which is about a 10% increase, bringing us within 2.2x of MySQL.
Individual Prolly Table Flusher
The next optimization revolves around a partial rewrite of the prollyWriteSession and prollyTableWriter.
This section of the code is in charge of writing changes to tables to disk.
The original implementation was pretty messy.
Both prollyWriteSession and prollyTableWriter relied on logic from each other to materialize table(s), keep the working set up-to-date, and handle auto-increment, with no clear division of which struct was in charge of what.
Now, these have been rewritten so that prollyWriteSession is in charge of managing each of the prollyTableWriters and updating the working set, while prollyTableWriter is in charge of materializing itself and handling the auto-increment values.
With a clear distinction of responsibilities, we were able to allow the prollyWriteSession to flush a single prollyTableWriter at a time, avoiding concurrency overhead when it isn’t needed.
Additionally, the concurrency logic in prollyWriteSession.flushAll() was rewritten from a map guarded by a mutex to a few goroutines communicating over a channel.
This refactor brought us from 45.51tps to 50.42tps, which is another improvement of roughly 10%, bringing us within 1.8x of MySQL.
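The general pattern can be sketched as follows. This is an illustration of "goroutines plus a channel instead of a mutex-guarded map", not the real prollyWriteSession code; `tableWriter`, `flushResult`, `flushOne`, and `flushAll` are hypothetical names, and the flush itself is a stub.

```go
package main

import "sync"

// tableWriter is a hypothetical stand-in for a per-table writer.
type tableWriter struct {
	name  string
	dirty bool
}

// flush pretends to materialize the table's pending changes.
func (w *tableWriter) flush() error {
	w.dirty = false
	return nil
}

// flushResult carries one writer's outcome back to the collector.
type flushResult struct {
	table string
	err   error
}

// flushOne flushes a single writer on the calling goroutine, so no
// goroutines, channels, or locks are involved.
func flushOne(w *tableWriter) error {
	return w.flush()
}

// flushAll flushes every writer concurrently. Each goroutine sends its
// result over a channel, and only the caller touches the result map, so the
// map needs no mutex.
func flushAll(writers []*tableWriter) (map[string]bool, error) {
	results := make(chan flushResult, len(writers)) // buffered: sends never block
	var wg sync.WaitGroup
	for _, w := range writers {
		wg.Add(1)
		go func(w *tableWriter) {
			defer wg.Done()
			results <- flushResult{table: w.name, err: w.flush()}
		}(w)
	}
	wg.Wait()
	close(results)

	flushed := make(map[string]bool, len(writers))
	var firstErr error
	for r := range results {
		if r.err != nil {
			if firstErr == nil {
				firstErr = r.err
			}
			continue
		}
		flushed[r.table] = true
	}
	return flushed, firstErr
}
```

When only one table has pending changes, calling `flushOne` directly avoids spawning goroutines at all, which is the kind of overhead the single-writer flush path is meant to skip.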
Other Improvements
We also made a few much smaller changes to our TPC-C performance that together account for roughly 3 more tps.
These are two of the more notable ones.
Skip No-Op Updates
Rather than running a Put and Delete on the mutable maps for Updates that don’t do anything, we check for that and ignore them.
This change was made in this PR.
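As a rough illustration of the check, the sketch below compares the old and new row before touching the map. The `mutableMap` type and `update` function are hypothetical stand-ins with counters added for demonstration; Dolt's real mutable maps and row encoding differ.

```go
package main

import "bytes"

// mutableMap is a hypothetical stand-in for a mutable key-value map.
// It counts operations so the effect of skipping no-ops is visible.
type mutableMap struct {
	rows    map[string][]byte
	puts    int
	deletes int
}

// Put stores val under key and records the operation.
func (m *mutableMap) Put(key string, val []byte) {
	m.rows[key] = val
	m.puts++
}

// Delete removes key and records the operation.
func (m *mutableMap) Delete(key string) {
	delete(m.rows, key)
	m.deletes++
}

// update applies an UPDATE that turns (oldKey, oldVal) into (newKey, newVal).
// It returns false and does no work when the row is unchanged.
func update(m *mutableMap, oldKey string, oldVal []byte, newKey string, newVal []byte) bool {
	if oldKey == newKey && bytes.Equal(oldVal, newVal) {
		return false // no-op update: skip both the Delete and the Put
	}
	m.Delete(oldKey)
	m.Put(newKey, newVal)
	return true
}
```

The comparison is cheap relative to two map mutations, so workloads that frequently rewrite a row with identical values stand to benefit most.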
Simplified Weibull
To maintain a 4KB chunk size, we use a Weibull distribution to determine whether to split a chunk on write.
A costly part of the math is math.Pow(x, 4), which is avoidable by just expanding the power to x * x * x * x.
This change was made in this PR.
Conclusion
Dolt has been making huge improvements to TPC-C over the years, and we’ve finally hit the goal we set for ourselves back in 2021. As usual, we are going to continue to improve Dolt performance in as many areas as we can. Notice any performance issues? Cut a bug on our GitHub issues page. Want to discuss performance improvements with people on our team? Join our Discord.