1. REFERENCE
    9 min read

    So you want Database Versioning?

    Tim Sehn

    Here at DoltHub, we've had a lot of success with our "So you want..." series of blog posts helping people find Dolt when they are looking for it. Dolt is a lot of things. Dolt is a version controlled database, a Git database, Git for data, data…

Found 20 matching articles.
  1. DATASET
    5 min read

    NOAA Global Hourly Surface Data

    The National Oceanic and Atmospheric Administration, NOAA, publishes weather measurements taken from stations around the world. It started in 1901 with a handful of stations, and there are more than 35,000 stations today. Most of these stations…

  2. FEATURE RELEASE
    7 min read

    Announcing Saved Queries

    Dolt is Git for data. We built Dolt to help teams collaborate on data sets using the forking, branching, and merging workflows that Git popularized. These workflows are what enable software engineers to collaborate on source code, and they...

  3. 3 min read

    Copyrightable Material

    In our previous blog post, we examined some freely available licensing tools for open data from Creative Commons. To briefly recap, a license specifies the terms under which copyrightable material is made available for public access, sharply dis...

  4. 2 min read

    Data Licensing

    Introduction: Dolt is a data format. DoltHub is a collaboration platform for data stored in the Dolt format. When sharing copyrighted content, the terms of that sharing are governed by a license. In this post we highlight some common licen...

  5. DATASET
    8 min read

    Novel Coronavirus Dataset in Dolt

    Johns Hopkins University Center for Systems Science and Engineering began collecting, tabulating, and publishing Novel Coronavirus (COVID-19) data on January 31, 2020. We started importing this dataset into Dolt on February 5, 2020. This blo...

  6. REFERENCE, WEB
    4 min read

    How We Built DoltHub: Introduction

    Towards the end of last month, we launched a totally reworked and redesigned version of DoltHub, our web application for hosting and collaborating on Dolt repositories. Now that we've had a little while to iron the kinks out, it seems like...

  7. 2 min read

    Dolt and DoltHub Documentation

    Background: We are excited to announce the launch of our documentation site. The goal of Dolt and DoltHub is to enable developers and the data community with radically better data infrastructure. High quality documentation should empo...

  8. SQL
    11 min read

    Implementing indexed joins

    Happy Valentine's Day from all of us at DoltHub! You are the reason we do what we do! In honor of the holiday, we want to talk about how much we love making queries faster. We're going to examine how our...

  9. FEATURE RELEASE
    4 min read

    LICENSE.md and README.md in Dolt

    Dolt and DoltHub strive to be the best data distribution platform on the internet. Having documentation versioned alongside data, and a standard, easy way to read the documentation online are features we admire in Git and GitHub. Following ...

  10. FEATURE RELEASE, SQL
    7 min read

    Introducing SQL VIEW Support in Dolt

    Dolt is a SQL database with Git-style versioning and distribution. The most recent releases of Dolt introduced support for SQL views that are stored as part of, and versioned along with, a Dolt repository. This provides a great way for data sets ...

  11. DATASET
    4 min read

    Mapping Income Inequality using IRS SOI Data

    In a previous blog post I showed how the history of a dataset can be queried using the Dolt history tables, and in the first part of this two-part blog I covered the IRS SOI data. In this second part I use the IRS SOI data along with doltpy ...

  12. REFERENCE
    8 min read

    Dolt and DoltHub: Getting Started

    Dolt is a SQL database with Git-style versioning. In Git the unit of versioning is files. In Dolt, the unit of versioning is SQL tables. Dolt will eventually support 100% of the Git command line and 100% of MySQL SQL. Moreover, anything you can d...

  13. DATASET
    6 min read

    IRS Sources Of Income Dataset

    Every year the IRS publishes a treasure trove of data. It contains over a hundred different metrics which provide insight into the finances of American taxpayers. Even more compelling, they provide this information at ZIP code granularity, which…

  14. FEATURE RELEASE, WEB
    2 min read

    Querying DoltHub Repositories with SQL

    Since its launch in 2008, GitHub has catalyzed the open source software world and accelerated the culture of software collaboration. Source control was an old idea at that point, but GitHub offered a centralized place to discover and collaborate...

  15. SQL
    8 min read

    Access to Everything Through SQL

    When we started developing Dolt, our vision was to deliver Git functionality for data. Where Git versions files, Dolt versions tables. We implemented table-based diff and conflict logic and shipped the initial version. As we started to use Do...

  16. FEATURE RELEASE, WEB
    4 min read

    DoltHub Redesign

    Redesigning DoltHub: Dolt is a database and a data format. DoltHub is a way of hosting and collaborating on Dolt databases. We decided to redesign DoltHub to make it more user-friendly. We are excited to announce that we have released the resu...

  17. SQL
    5 min read

    Getting to one 9 of SQL correctness in Dolt

    A few months ago we finally settled on a good way to measure the correctness of Dolt's SQL engine: the sqllogictest package, first developed for SQLite and since used as a benchmark for lots of other database implementations. SQLite hit u...

  18. 5 min read

    The History of Data Exchange

    IBM and General Electric invented the first databases in the early 1960s. It was only by the early 1970s that enough data had accumulated in databases that the need to transfer data between databases emerged. Enter the Comma Separated Values (CSV…

  19. DATASET
    5 min read

    Maintained Wikipedia ngrams dataset in Dolt

    Wikipedia is the largest and most popular general reference work on the internet, making it a powerful tool for predictive language modeling. Wikipedia releases a dump of all its articles and pages twice a month, and we created a dataset of...

  20. DATASET
    5 min read

    2 billion primes in a Dolt table

    Since releasing Dolt, we have often been asked how it scales. How many rows and how many gigs can you get into a Dolt dataset before things start breaking badly? Answering this question in practice is kind of difficult, simply because it'...
