Cursor Says Agents Need Database Branches

We contend that Dolt is the only database with branches. Other databases support forks or schema branches but Dolt is the only database to support branches on both schema and data at the scale of millions of rows, versions, and branches.

Why does this matter? Branching is getting hot with the rise of agentic AI. I wrote about this in detail a couple of months ago, and then Sualeh Asif of Cursor talked about it on the Lex Fridman podcast a couple of weeks later.

Agentic Workflows

The gist is that you don't want AI agents making changes directly on the main branch. You want them writing on branches so a human can work with the agent to perfect its output. This is why code editors with built-in AI agents, like Visual Studio Code with GitHub Copilot, Cursor, and Windsurf, are having a moment. The files they edit are in version control, so you're not worried about messing up your main branch. The problem is that databases don't have branches, so you're always at risk of messing up the main branch...unless you use Dolt.

This article will dive into what Sualeh said on Lex Fridman and provide our response.

Transcript

The full Lex Fridman transcript with Sualeh and other Cursor folks is available here. I've screenshotted the relevant section below:

Cursor on Database Branches

Let's respond to each individual section.

Set Up

Lex
How much interaction is there between the terminal and the code? How much information is gained from if you run the code in the terminal? Can you do a loop where it runs the code and suggests how to change the code? If the code and runtime gets an error? Is right now there’s separate worlds completely? I know you can do control K inside the terminal to help you write the code.
Aman
You can use terminal context as well inside of check command K kind of everything. We don’t have the looping part yet, so we suspect something like this could make a lot of sense. There’s a question of whether it happens in the foreground too or if it happens in the background like what we’ve been discussing.

This is just setup for the meat of the topic later. But you get the immediate idea: for agents to be effective, they need to interact with both the human and the running software to get feedback.

Enter Databases

Lex
Sure. The background’s pretty cool. I could be running the code in different ways. Plus there’s a database side to this, which how do you protect it from not modifying the database, but okay.

Thank you, Lex. Indeed, most applications are backed by databases, so you want to make sure the AI agent isn't modifying production data.

PlanetScale

Sualeh
I mean, there’s certainly cool solutions there. There’s this new API that is being developed for… It’s not in AWS, but it certainly… I think it’s in PlanetScale. I don’t know if PlanetScale was the first one to you add it.

PlanetScale is a very popular MySQL-compatible database. Both Dolt and PlanetScale use the same open source library, Vitess, as a SQL engine. PlanetScale does support schema branches, but its data branches cannot be merged. Data branches in PlanetScale are forks.

Given what the Cursor folks are looking for with their agentic use case, I don't think PlanetScale will fit the bill. What happens when two agents making updates in parallel both produce good data changes? Without merge and conflict detection on data, managing the merge process is going to require a lot of custom code and will potentially be very slow. Dolt is the only database to support merge on data.
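To show what "merge and conflict detection on data" means in practice, here is a minimal Python sketch of a row-level three-way merge. It is not Dolt's implementation. The function name `three_way_merge`, the dictionary-per-table representation, and the sample rows are all assumptions made up for illustration.

```python
from typing import Dict, List, Tuple

Row = Tuple[str, int]      # hypothetical row shape: (name, score)
Table = Dict[int, Row]     # primary key -> row


def three_way_merge(base: Table, ours: Table, theirs: Table) -> Tuple[Table, List[int]]:
    """Merge two branches of a table against their common ancestor.

    Returns (merged, conflicts). A conflict is a primary key that both
    branches changed in different ways. Conflicted rows keep "ours"
    provisionally so a human or agent can resolve them afterwards.
    """
    merged: Table = {}
    conflicts: List[int] = []
    for key in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(key), ours.get(key), theirs.get(key)
        if o == t:        # both sides agree (this includes "both deleted")
            result = o
        elif o == b:      # only theirs changed this row
            result = t
        elif t == b:      # only ours changed this row
            result = o
        else:             # both changed the row, and differently
            conflicts.append(key)
            result = o
        if result is not None:
            merged[key] = result
    return merged, conflicts


base = {1: ("alice", 10), 2: ("bob", 20)}
agent_a = {1: ("alice", 11), 2: ("bob", 20)}                    # updates row 1
agent_b = {1: ("alice", 10), 2: ("bob", 20), 3: ("carol", 30)}  # inserts row 3
agent_c = {1: ("alice", 12), 2: ("bob", 20)}                    # also updates row 1

clean_merge, clean_conflicts = three_way_merge(base, agent_a, agent_b)
messy_merge, messy_conflicts = three_way_merge(base, agent_a, agent_c)
```

Agents A and B touch different rows, so their work merges cleanly. Agents A and C both edit row 1, so the merge reports a conflict on that key instead of silently picking a winner. With forks and no common ancestor to compare against, you would have to write and maintain this kind of logic yourself.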

Moreover, it's unclear how PlanetScale shares common data between branches in storage. If each data branch is a fork, storage will balloon with each fork. Dolt shares common storage between branches.
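As a rough illustration of how branches can share storage, here is a small Python sketch of a content-addressed chunk store. This is a simplification and not Dolt's storage engine: it uses fixed-size chunks for brevity, and the names `ChunkStore` and `write_table` are hypothetical.

```python
import hashlib
import json
from typing import Dict, List


class ChunkStore:
    """Stores each chunk once, keyed by the hash of its contents."""

    def __init__(self) -> None:
        self.chunks: Dict[str, bytes] = {}

    def put(self, rows: List[list]) -> str:
        data = json.dumps(rows).encode("utf-8")
        digest = hashlib.sha256(data).hexdigest()
        self.chunks[digest] = data   # identical content lands on the same key
        return digest


def write_table(store: ChunkStore, rows: List[list], chunk_size: int = 2) -> List[str]:
    """Split rows into chunks and return the list of chunk hashes.

    Fixed-size chunking is an assumption made to keep the sketch short.
    """
    return [store.put(rows[i:i + chunk_size]) for i in range(0, len(rows), chunk_size)]


store = ChunkStore()
main_rows = [[1, "alice"], [2, "bob"], [3, "carol"], [4, "dave"]]
branch_rows = [[1, "alice"], [2, "bob"], [3, "carol"], [4, "dan"]]  # one row edited

main = write_table(store, main_rows)
branch = write_table(store, branch_rows)

shared = set(main) & set(branch)
print(f"{len(store.chunks)} chunks stored, {len(shared)} shared between branches")
```

Each table version references two chunks, but the store only holds three in total because the unchanged chunk is written once and referenced by both branches. A fork that copies the whole table would store all four.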

PlanetScale does seem to be doing a better job of marketing itself as a database with branches, which makes me sad. More blog articles about Dolt, I guess.

Database Branching

Sualeh (continues)
It’s this ability sort of add branches to a database, which is like if you’re working on a feature and you want to test against the broad database, but you don’t actually want to test against the broad database, you could sort of add a branch to the database.

Here at DoltHub, we've been working on a version controlled database for almost seven years now. We built on an open source project called Noms that was over three years old when we started using it. Look at Dolt's commit log: it turned 10 on June 2nd.

$ pwd
/Users/timsehn/dolthub/git/dolt
$ git log --reverse | head -6
commit 68c3ac02058e559367534aeeb7d9f8f483a4db1b
Author: Aaron Boodman <aaron@aaronboodman.com>
Date:   Tue Jun 2 20:45:33 2015 -0700

    first commit

This is all to say, we think we're the world's experts on database branches. I wrote a whole article outlining the databases we've found that claim to support branches: Turso, Neon, LakeFS, and PlanetScale. Dolt is the only database to support true branching on data and schema.

Dolt implements Git's commit graph on a novel content-addressed B-tree called a Prolly Tree. Practically, this means you get Git-style branch and merge on tables instead of files. Want a new branch? Run dolt branch or dolt checkout -b. Want to see what has changed on your branch? Run dolt diff. The full suite of Git commands is implemented, including rebase.
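To make "Git-style branch and diff on tables" concrete, here is a toy Python model of a commit graph over table snapshots. It is a sketch under heavy assumptions, not Dolt's code or API: `Repo` and `Commit` are hypothetical names, and each commit stores a full snapshot rather than a Prolly Tree. The methods loosely mirror what dolt branch, dolt checkout, and dolt diff do conceptually.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Commit:
    parent: Optional["Commit"]
    message: str
    rows: Tuple[Tuple[int, str], ...]   # immutable snapshot: (primary key, value)


class Repo:
    def __init__(self) -> None:
        self.branches: Dict[str, Commit] = {"main": Commit(None, "init", ())}
        self.head = "main"

    def branch(self, name: str) -> None:
        # A branch is just a named pointer to an existing commit.
        self.branches[name] = self.branches[self.head]

    def checkout(self, name: str) -> None:
        self.head = name

    def commit(self, message: str, rows: Dict[int, str]) -> None:
        snapshot = tuple(sorted(rows.items()))
        self.branches[self.head] = Commit(self.branches[self.head], message, snapshot)

    def diff(self, a: str, b: str) -> dict:
        old, new = dict(self.branches[a].rows), dict(self.branches[b].rows)
        return {
            "added": {k: new[k] for k in new.keys() - old.keys()},
            "removed": {k: old[k] for k in old.keys() - new.keys()},
            "modified": {k: (old[k], new[k])
                         for k in old.keys() & new.keys() if old[k] != new[k]},
        }


repo = Repo()
repo.commit("seed data", {1: "alice", 2: "bob"})
repo.branch("agent")
repo.checkout("agent")
repo.commit("agent edits", {1: "alice", 2: "robert", 3: "carol"})

changes = repo.diff("main", "agent")
```

The agent's commit moves only the agent branch pointer. The main branch still points at the seed commit, and the diff between the two branches lists exactly what a human would need to review before merging.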

Write Ahead Log

Sualeh (continues)
And the way they do that is they add a branch to the write-ahead log.

This is a bit of a tangent by Sualeh. I'm not aware of any database branching solution that uses the write-ahead log to facilitate branching. Neon uses a copy-on-write file system for its branching features.

SQL databases support the concept of a query log. The query log stores all the write queries you make to your database in order, keyed by a transaction identifier (id). This query log facilitates the backup, restore, and replication features of SQL databases. It is also leveraged for change data capture. In Postgres, this query log is called the "write-ahead log" and in MySQL it is called the binlog. We don't think the query log can be used effectively to support branching.
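One way to see the concern is to model the log as an ordered list of writes and rebuild table state by replaying it. The Python sketch below is a toy and does not reflect the actual record format of the Postgres write-ahead log or the MySQL binlog. The `LogEntry` shape and the `replay` function are assumptions for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical log entry: (transaction id, operation, primary key, value)
LogEntry = Tuple[int, str, int, Optional[str]]


def replay(log: List[LogEntry], up_to_txn: Optional[int] = None) -> Dict[int, Optional[str]]:
    """Rebuild table state by applying log entries in order.

    Assumes the log is sorted by transaction id. Stops after `up_to_txn`
    when given, which is how point-in-time restore works in this toy model.
    """
    state: Dict[int, Optional[str]] = {}
    for txn_id, op, key, value in log:
        if up_to_txn is not None and txn_id > up_to_txn:
            break
        if op == "put":
            state[key] = value
        elif op == "delete":
            state.pop(key, None)
    return state


log: List[LogEntry] = [
    (1, "put", 1, "alice"),
    (2, "put", 2, "bob"),
    (3, "put", 2, "robert"),
    (4, "delete", 1, None),
]

latest = replay(log)
as_of_txn_2 = replay(log, up_to_txn=2)
```

In this model, "branching" at transaction 2 means replaying the log prefix to materialize a copy and then appending to a separate log from there. The cost of materializing or comparing branches grows with the length of the log, and the log by itself says nothing about how two diverged logs should be merged back together.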

Hard Problem

Sualeh (continues)
And there’s obviously a lot of technical complexity in doing it correctly. I guess database companies need new things to do. They have good databases now.

Sualeh caught us here. Guilty as charged. Traditional databases are boring. Version control is technically complex. Databases are good but they could be better if they were version controlled.

As I said, we've been working on our version controlled database, Dolt, for seven years. It's taken many innovations, including a novel storage engine, to do it correctly.

Turbopuffer

Sualeh (continues)
And I think turbopuffer, which is one of the databases we use, is going to add maybe branching to the write-ahead log.

I had never heard of Turbopuffer before Sualeh mentioned it. Turbopuffer is a hosted, closed source document database with vector search capabilities, built on top of AWS S3. Here at Dolt, we're in the SQL database business, so I tend to keep track of developments there, but I always like to learn about new databases. There's no news on Turbopuffer's blog about branch support, so maybe Sualeh scooped them?

Everything Needs Branches

Sualeh (continues)
So maybe the AI agents will use branching, they’ll test against some branch, and it’s sort of going to be a requirement for the database to support branching or something.
Aman
It would be really interesting if you could branch a file system, right?
Sualeh
Yeah. I feel like everything needs branching. It’s like-
Aman
Yeah.
Lex
Yeah. The problem with the multiverse, right? If you branch on everything that’s like a lot.

Sualeh, Aman, and Lex all agree that everything an AI agent operates on is going to need branches, likening the problem to that of the multiverse. We're biased, but we're confident Dolt is the best branching database out there for this use case. If you don't believe us, just try it out. It's free and open source. It's a drop-in replacement for either MySQL or Postgres.

Clever Algorithms

Sualeh
There’s obviously these super clever algorithms to make sure that you don’t actually use a lot of space or CPU or whatever.

The algorithms Dolt uses to achieve version control at database scale are a Git-style commit graph combined with a novel content-addressed B-tree called a Prolly Tree. Clever is in the eye of the beholder, so read about them or look at the code and you tell us.

Conclusion

After listening to this interview, I tried to reach out to Sualeh via email, but Cursor's mail server rejects external email. They must be overwhelmed. But we want to talk to Sualeh and Aman about version controlled databases! If you know either of them or anyone at Cursor, send them a link to this blog. Our mail server accepts inbound mail. I'm tim@dolthub.com. If you just want to chat about database branches, stop by our Discord and we're happy to talk your ear off.
