Vibe Coding with Cursor
All the technology world is abuzz about AI. In particular, how it's going to take my job as a software developer. I like my job, and I don't really want to give it to a robot, so I thought I'd try out some of this AI hotness myself. I wanted to see what all the fuss is about.
I write code for the Dolt Database - the world's first SQL database which supports version control features like branching and merging. We're a small team of 15, and virtually all of our time is spent writing code. We are not oblivious to the AI hype train. When you hear over and over that AI-empowered developers are 10X more productive, it would be foolish not to give it a hard look.
One of the tools which has been getting a lot of attention is Cursor: "The AI Code Editor". I've been using it for a couple months, and I have thoughts. There are two questions I'd like to explore:
- Can I use Cursor to help me write code faster?
- Can someone use Cursor to build an application on Dolt?
It's Saturday, and Dolt usually only does blog posts on the weekdays. Since I have a lot to say about the first question, and it's not really Dolt related, I thought I'd do a double blog post. Today's post is about the first question. Monday, I'll talk about the second. Stay tuned!
Cursor False Start
I installed Cursor (trial version) about 2 months ago and tried using it directly on the Dolt codebase. The Dolt codebase is moderately large. The earliest code, from the Noms project, is more than 10 years old. The primary IDE we use at Dolt is Goland. So I thought I'd turn off Goland, and point Cursor at the codebase instead.
I was underwhelmed.
I "write software" for a living, which means I spend most of my time reading and debugging code, then a little code gets written. So after I "prompted" through writing a very small amount of code, I was left with the task of debugging. In Goland, it's a few clicks to debug any Go code. In Cursor, I was asked to create a launch.json file. Great, let's prompt that thing into existence. After 20 minutes of prompts, I was still no closer to debugging than I was before. Dolt's codebase is a little more complex than a simple Go program, but it's not that complex. You need to set a build directory, the current working directory, and the arguments to pass to the program. Not rocket science.
Yeah, it told me to go back to Goland. At least it knows its limits. Good Robot. I uninstalled Cursor with no regrets.
Cursor for Fun
A couple weeks ago, I upgraded to the paid version of Cursor. The reason I did this was mainly as a way to yell back at the internet that I really tried this stuff out and I think it's nonsense. Also, when I started Cursor, my trial was over and it didn't work at all. Their little scheme worked, and now I'm $20 poorer!
The main advice I heard from my peers was not to try Cursor on an existing codebase. Instead, I should start a new project with Cursor. So I did. I created a Factorio Mod, which was in a language I didn't know (Lua) and a framework I didn't know. I got something working after a couple hours, and I had some laughs along the way. I was in YOLO mode, and when I asked "Do you know what Dolt Is?", it went ahead and installed Dolt without asking. Whoa! What!?! Also, it would just write over plugin files while Factorio was running. I politely asked it to stop, and it would stop for 5 minutes or so. Then it would overwrite them again. This was all hilarious to watch because there was nothing at stake.
Cursor for Work
Ok, I'm starting to get the hang of this. There was a need to write a small test tool for a work project, so I thought I'd give Cursor a try at work. New code, not production quality, and it probably won't be needed in the future. Also, I decided to use Go, which is what Dolt is written in, so Cursor would not be protected by my ignorance of the language this time. YOLO was out of the question. I wasn't interested in it grabbing my GitHub keys and going to town. After watching it install whatever it wanted on my personal machine, I knew I needed to be more careful.
So, I dove in with an empty Git repo, and got to work. What I built was a tool to help me profile the compression performance of Dolt's Archive Storage format. At DoltHub, we have many public repositories, and I wanted to clone all of them and run some commands on each of them. There were about 6 different ways I wanted to measure runtime and size of databases. The tool would run those programs, and store results in a Dolt database.
Cursor started strong. By giving simple prompts, I got far enough to:
- Connect to a Dolt database
- Initialize the database, and create schema.
- Bootstrap the database with a CSV file that I had already created.
- Clone a specific Dolt repository.
- Perform an archive of the first test repository.
- Write the results to the database.
- Create a Dolt Commit.
This was all without writing a single line of code myself. Pretty cool!
...but the code was not great. All code was in the main() function. I set about cleaning it up into something I could be proud of. I performed what I thought was an obvious operation - I selected a block of code, and asked Cursor to extract it into a function, call it Foo(). What it did was delete all the code I had selected and insert the text "Foo()" in its place. Where did my code go? It was gone. YOLO. It's a good thing I was committing regularly to Git because otherwise I would have been sad.
Also, there were calls to log.Fatalf at every step - which terminates the program. Even if I gave a prompt like "Handle this error by doing XYZ...", it would still put in a log.Fatalf call before the added logic. It's like it didn't understand what log.Fatalf does (sarcasm). So, as I got further along, I started noticing I was writing more of the code myself.
Finally, I needed to switch to the github.com/go-sql-driver/mysql package instead of the default package chosen by Cursor. The reasons aren't important, but it was a pain to do.
This all confirmed my suspicions. Generating code is great for about 5% of the work. Refactoring and consistently filing down rough edges is the other 95%. After getting the basic level of functionality pinned down, I found myself spending more time in Goland refactoring code than I was in Cursor.
It's not all bad. I was pretty impressed that I could say fairly specific things like "I need a unique index on this column" and it would write the gorm code to do that. All in all, I would say that Cursor made me about 50% faster on this project, which is nothing to sneeze at. It's just nowhere close to the 10X productivity I've heard so much about. Also, most of my time is spent in the main codebase, so this was not a representative sample of how fast the tool would make me in my day-to-day work.
Trying So Hard
In one last ditch effort, I got all the packages Dolt developers work on regularly in one directory, and I pointed Cursor at that. My hope was to index the code base and use it as an agent which knows about the entire domain of Dolt. This includes all our code, blogs, operational scripts, documentation, etc. Everything.
It took 24 hours to index. I'm not kidding. I left it running while I went to bed, and it was still indexing when I woke up. Had to walk to work with my laptop open because I wanted it to finish. Once it was indexed, it did meet my goal, to some degree. I could ask questions like "What compression algorithms does Dolt use?" and it actually knew the answer. Neat! Where is that implemented? It opened the appropriate files. Great! And yet, I asked it to generate some documentation about some code I'd already written in the codebase, and it was just flat out wrong. It documented configuration variables that didn't exist, and defaults that were off by a factor of 1000. I kind of expected it to be really good at that task, so this made me doubly sad. That's the part of the job I'd be happy to give to a robot.
Is it Me?
The invariant in these tests has been me. I could just be a bad prompter. My experience thus far has been interacting directly with the tool in a vacuum. Intentionally. I fully confess I have not gone into the interwebs and nerded out on the topic. I conversed with it a single sentence at a time. I generally stayed in the generate prompt and would usually add to the prompt 2 or 3 times before accepting the result. Felt pretty natural honestly. When chatting, I bounced between the Agent, Edit, and Ask modes, but kind of wanted there to be just one mode to rule them all. Then there is the question of which model to use. I dunno, I didn't go deep on that but didn't notice any difference when I changed it.
Ultimately more training is always good, so that's probably the next step for me. Could be this is just too new, and the user experience is a little rough. Cursor is new, and maybe it will give me a tutorial in the future to cure my ignorance.
Hater?
I'm not a hater. Well, maybe I am. I hate the hype. The belief of Anthropic's CEO that no one is going to write code in a year is just insane. That's someone selling a pipe dream. No, actually, that's someone selling an ayahuasca trip. How could anyone genuinely believe this, and why would anyone else give them money?
Developers are practical people. We don't need to be sold on 1000% improvements which don't exist. We'd be happy with 50% improvement on the actual work we do. That would be amazing. Deliver that, and we'll get onboard.
In the meantime, I will keep using Cursor. Knocking out small tools and such comes up enough that it's worth it. It did make functional code for sure. For what it's worth, I'm using it to write this blog post. It's pretty amazing that I can ask the tool:
>> in the text that has changed compared to what has been checked in with git, are there any typos
and it provides a useful response:
That's pretty cool, I'm not gonna lie. I look forward to the improvements everyone is saying are right around the corner, even if it takes longer than we think.
Come tell me how I'm wrong on the Dolt Discord!