Dolt Roadmap update

Introduction

At DoltHub, we ship features, and we ship them often. We fix bugs within 24 hours, but we also have a lot of long-term projects in flight at any given time. We keep track of these goals on our roadmap, which just got its periodic update covering what we launched recently and what's coming next.

Our big recent milestone was the Beta release of Doltgres, which signals that we believe Doltgres is ready to start handling production application development. We'll continue working toward the Doltgres 1.0 release, which will signal that its performance, interfaces, and storage are stable.

Here are some notable features that shipped recently, and what else we're working on while we get Doltgres ready for its 1.0 release.

Vector indexes

Vector databases are hot because of AI. A vector is a fixed-length array of floating-point numbers, representing a point in a multi-dimensional space, that you can search for approximate nearest neighbors. You can put vectors generated using LLMs into a database, and querying those vectors then provides similarity search on text, images, and video. You can convert your private data into vectors, search it when a user enters a prompt meant for a chat AI, and then augment the prompt with the matching private data to improve the LLM response. This technique is called Retrieval Augmented Generation.
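To make the retrieval step concrete, here is a minimal sketch in plain Python. It is not how Dolt's vector indexes work: it does an exact, brute-force cosine-similarity scan, which is the result an approximate nearest neighbor index tries to reproduce much faster. The function names, the three-dimensional vectors, and the sample documents are all made up for illustration. Real embeddings come from a model and have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_neighbors(query_vec, documents, k=2):
    """Return the text of the k documents most similar to query_vec.

    This scans every document (exact search). A vector index avoids
    the full scan by accepting approximate results.
    """
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


def augment_prompt(prompt, query_vec, documents, k=2):
    """Prepend the most relevant private data to the user's prompt."""
    context = nearest_neighbors(query_vec, documents, k)
    return "Context:\n" + "\n".join(context) + "\n\nQuestion: " + prompt


# Hypothetical private data with hand-written stand-in embeddings.
DOCUMENTS = [
    ("Refunds are processed within 5 days.", [0.9, 0.1, 0.0]),
    ("Our office is closed on Sundays.", [0.0, 0.2, 0.9]),
    ("Refund requests need an order number.", [0.8, 0.3, 0.1]),
]

# Stand-in embedding for the question "How do I get a refund?"
QUERY_VEC = [1.0, 0.2, 0.0]

if __name__ == "__main__":
    print(augment_prompt("How do I get a refund?", QUERY_VEC, DOCUMENTS))
```

With these sample vectors, the two refund documents rank ahead of the office-hours one, so only they are added to the prompt.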

Vector index support launched at the beginning of the year. Try it out and let us know what you think!

Hosted GCP private cloud

Dolt is free and open source, so anyone can run it on their own hardware for free. But many customers prefer to pay us to run their databases for them via our hosting service. The hosted offering started out AWS only, then got support for Google Cloud Platform (GCP) in 2024. Then we added support for virtual private cloud (VPC) for AWS instances in July of last year.

Now GCP customers can choose the equivalent functionality for their hosted deployments. GCP private cloud support launched in March.

Stored procedure rewrite

Dolt's initial stored procedure implementation worked for most customers, but had some fatal flaws for advanced use cases. Namely: you couldn't create tables or do other schema modifications as part of a procedure, and calling other procedures that altered session state (like dolt_commit) wouldn't work correctly in all cases. Fixing this required a total rewrite, which we launched in April.

Doltgres triggers

Most of the SQL engine features we built for Dolt work out of the box for Postgres as well, since different SQL vendors are more similar than different when it comes to things specified by the SQL standard. An exception to this is triggers, which work quite differently in Postgres than in MySQL. Initial support for triggers in Doltgres launched shortly after the Beta release, in April.

Automatic garbage collection

Dolt generates a lot of garbage when it processes updates or inserts to a database, in the form of intermediate values that never make it into the commit graph. The dolt_gc stored procedure cleans up this garbage to reclaim disk space, but it's disruptive to a running server: all active sessions disconnect while garbage collection runs. As of March, Dolt supports configuration flags that enable concurrent garbage collection, which can be used on a running server with no disruption to connected clients. With this setting enabled, garbage collection also runs periodically on its own, reclaiming disk space whenever the database starts getting too large. This setting will become the default in a future release. Try it out now!

Archival storage

For about a year now, we've been experimenting with different techniques to save storage space by compressing data more efficiently. We call the result archival storage, and it's been available as an experimental option since late 2024. As of April, we've achieved a consistent 25% reduction in storage size, but we think we can do quite a bit better. Archival storage will become the default in Dolt 2.0, when databases will be archived by default during garbage collection.

Doltgres extension support

Postgres extensions add new data types, functions, and procedures to a Postgres installation, and the Postgres community has a lot of extensions that are widely deployed, such as PostGIS. To get wide adoption from existing Postgres customers, Doltgres needs to support these extensions too. Other Postgres-compatible products have opted to re-implement the most common extensions manually, but for Doltgres we took a very different approach. Doltgres alpha support for loading extensions natively launched last month. This means that any Postgres extension should be usable in Doltgres using the native library. But note this is a work in progress: there are some C-binding functions we need to implement before some extensions will work out of the box. Give it a try with your favorite and let us know what doesn't work!

Other cool things, yet unscheduled

On our roadmap, we keep a list of things we think would be cool to implement but haven't committed to yet, so that a prospective customer who really wants one of them can tell us. Paying customers move to the front of the roadmap. Do any of these look interesting?

  • Encryption at rest. Dolt is already tamper-evident, but some compliance processes require data also be encrypted at rest. The best way to do this today is to place Dolt's data on an encrypted partition, but native support might be nice in some contexts.
  • History compression. Dolt takes more space than a typical database because it stores snapshots of every commit in history. If you only need data from a certain time in the past, or at a certain granularity, it would be nice if Dolt could compress the data, rewriting the commit history to eliminate all the unneeded commits and their storage.
  • Customizable merge rules. Dolt is the only database with merges. But Dolt's merge logic is pretty rigid, and doesn't do quite what some customers would want. There are workarounds, but they require writing a lot of application code. Dolt could instead let you specify merge logic as a stored procedure to make merges work better in your domain without application-layer cleanup.
  • Other frontend support. Dolt is a drop-in replacement for MySQL. Doltgres is a drop-in replacement for Postgres. Should we support other database front-ends, like MongoDB or Microsoft SQL Server?
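To make the customizable merge rules idea above more concrete, here is a rough Python sketch of a generic three-way merge with a pluggable conflict resolver. It is purely illustrative: it is not Dolt's merge implementation, and the feature described in the list is unscheduled and has no defined interface. The function name, the `resolve` callback, and the sample row are all hypothetical.

```python
def three_way_merge(base, ours, theirs, resolve=None):
    """Merge two edited copies of a row (dicts) against their common ancestor.

    A missing key is treated as None. If both sides changed the same key
    to different values, the optional resolve(key, base, ours, theirs)
    callback decides the result. Without a callback, the key is reported
    as a conflict and our value is kept as a placeholder.
    """
    merged = {}
    conflicts = []
    for key in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(key), ours.get(key), theirs.get(key)
        if o == t:
            value = o  # both sides agree (or neither changed it)
        elif o == b:
            value = t  # only their side changed it
        elif t == b:
            value = o  # only our side changed it
        elif resolve is not None:
            value = resolve(key, b, o, t)  # domain-specific rule
        else:
            conflicts.append(key)
            value = o
        if value is not None:
            merged[key] = value
    return merged, conflicts


# Hypothetical row edited on two branches.
BASE = {"price": 10, "stock": 5}
OURS = {"price": 12, "stock": 5}
THEIRS = {"price": 15, "stock": 4}


def take_highest(key, base, ours, theirs):
    """Example domain rule: when both sides changed a number, keep the larger."""
    return max(ours, theirs)


if __name__ == "__main__":
    print(three_way_merge(BASE, OURS, THEIRS))
    print(three_way_merge(BASE, OURS, THEIRS, resolve=take_highest))
```

Without a resolver, the `price` edit is reported as a conflict that the application has to clean up. With `take_highest`, the merge completes on its own, which is the kind of application-layer work a custom merge rule would remove.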

Conclusion

It's an exciting time at DoltHub. Dolt went 1.0 just over two years ago, and we've been amazed at everything our customers have built with it in that time. Doltgres is even newer, and it's been very gratifying to see new customers pick it up as well.

But we're just getting started. Do you need something from Dolt that it can't do yet? Let us know! Come by our Discord and we'd be happy to help you out.
