- REFERENCE, 9 min read
So you want Database Versioning?
Here at DoltHub, we've had a lot of success with our "So you want..." series of blog posts helping people find Dolt when they are looking for it. Dolt is a lot of things. Dolt is a version controlled database, a Git database, Git for data, data…
- DATASET, 11 min read
FBI Crime Data and the Future of Data Distribution
Dolt is Git for data and DoltHub hosts a growing collection of public open datasets. Recently, we created dolthub/fbi-nibrs reflecting the FBI's National Incident Based Reporting System (NIBRS) crime data. Law enforcement agencies from a...
- WEB, 3 min read
Open Source Cypress Testing Suite
Dolt is Git for data and DoltHub is our web application that hosts Dolt repositories. At the beginning of the year we redesigned DoltHub and decided to try out Cypress as our end-to-end testing solution (similar to how we use Bats tests f...
- DATASET, 5 min read
Collaborative GPT-3 Dataset
Dolt is Git for data. Recently, we've been thinking a lot about what could be Dolt's Linux. A reader of that blog had a suggestion, an open GPT-3 dataset. Dolt really shines as a collaborative database where many users are making dist...
- WEB, 6 min read
Testing DoltHub Using Cypress
Dolt is Git for data and DoltHub is our web application that houses Dolt repositories. At the beginning of Dolt, we adopted Bash Automated Testing System (Bats) for end-to-end testing of the Dolt command-line (check out our blog about Ba...
- 8 min read
Data Integrity for Open Data
Open Data Validation: Recently an article made the rounds at our company about "data integrity" checks. The article advocates that in the absence of perfect code that never corrupts data, it's wise to have "data integrity checks" that ensur...
- FEATURE RELEASE, SQL, 7 min read
Implementing subqueries in go-mysql-server
Dolt is Git for data. Git versions files. Dolt versions SQL tables. Dolt's SQL engine is go-mysql-server, which is an open source project that we adopted a few months ago. Today we're excited to announce better support for subqueries in the en...
- DATASET, 2 min read
July Dataset Spotlight
Every month we highlight some interesting datasets on DoltHub. The focus is on new or updated datasets but sometimes we shed fresh light on a classic. For those new to Dolt and DoltHub, Dolt is Git for data. Git versions files. Dolt versio...
- 5 min read
The Anatomy of Open Data Projects
A core motivation for building DoltHub was to empower organizations to collaboratively create and maintain high quality data assets that they could collectively depend on. This is very much analogous to GitHub. Analogies are powerful ways to…
- DATASET, 5 min read
Scraping LinkedIn
On June 13th, 2016 Microsoft acquired LinkedIn for $26.2 billion due to its ability to successfully monetize the resumes of its users. They have proven the value of a resume database and sell premium services that let recruiters search this databas...
- 10 min read
Data Dependencies Using DoltHub, an Example
Introduction: In the past we have blogged about the IRS Sources of Income (SOI) data that we harvested and published as a Dolt database. We presented a compelling visualization that was relatively straightforward to create using that dat...
- 6 min read
Being a Startup in COVID-19 Times
Today, we're taking a break from our regularly scheduled Dolt and DoltHub content to talk about our experience as a ten person startup in Los Angeles over the past few months as we've all dealt with this pandemic. In the beginning... ...
- 8 min read
In Search of Dolt's Linux...
Dolt is a SQL database with Git-style versioning. DoltHub is a place on the internet to share Dolt databases. In this blog post we discuss our search for Dolt's Linux. Git: Git was built to manage the Linux open source project. Lore ha...
- FEATURE RELEASE, WEB, 1 min read
Announcing Username and Password Login
DoltHub is a web application for hosting and collaborating on Dolt repositories. Until now, DoltHub has only supported creating accounts and signing in with third-party providers - currently Google and GitHub. We're excited to announce t...
- REFERENCE, 7 min read
Cell-level Three-way Merge in Dolt
Dolt is a SQL database with Git-like functionality. It supports version control primitives including commit, branch, merge, clone, push and pull. This is the fourth post in a series exploring how Dolt stores table data and implements these version…
- DATASET, 5 min read
Data Dependencies Using DoltHub
A core motivation for the DoltHub team is a belief that obtaining and distributing data should be seamless and robust. Correctness and power combined with simplicity make for positive user experiences. We want users to think in terms of queries ...
- FEATURE RELEASE, 4 min read
Introducing Foreign Keys
Dolt is a SQL database with Git-style versioning. With each new version of Dolt, we increase the number of supported SQL features, moving toward our goal of being a complete drop-in replacement for MySQL, while adding all of the versioning fe...
- 8 min read
Migrating from Jenkins to GitHub Actions
Dolt is a SQL database with Git-style versioning. DoltHub is the place on the internet to share Dolt databases. For both Dolt and DoltHub, we've always used Jenkins for our continuous integration pipeline but have recently migrated our Dolt…
- DATASET, 6 min read
Open Elections data on DoltHub
DoltHub is a collaboration platform for data stored in Dolt, a relational database and data storage format with Git-like version control features for structured data. The vision of Dolt and DoltHub together is empowering decentralized communit...
- FEATURE RELEASE, SQL, 8 min read
Diffing Queries in Dolt
Dolt is a SQL database built to wrangle datasets. Its tables are versioned, queryable, and shareable. We've recreated Git's functionality in a relational database so you can collaborate on data in the same ways you collaborate on code. One of Dolt'...
- DATASET, 3 min read
June Dataset Spotlight
Every month we highlight some interesting datasets on DoltHub. The focus is on new or updated datasets but sometimes we shed fresh light on a classic. For those new to Dolt and DoltHub, Dolt is Git for data. Git versions files. Dolt versio...