Customer Per Branch
Dolt is the only SQL database with branches.
A very common Dolt usage pattern is what we call "customer per branch". The main branch of your Dolt database is used for schema and domain data. Each customer's data is stored on a separate customer-specific branch. Merge is used to manage schema migrations and domain table updates. Customer-specific data is isolated, preventing accidental modification and simplifying queries.
This article will explain the pattern and how it's used.
Example
The customer per branch design is best shown through an illustrative example.
Let's imagine you are building a multi-tenant industrial design software application. Your application masters a universe of parts, components that can be assembled to make machines. Your customers then use the application to group various parts in combinations to form machines. In addition, customers can create and specify custom parts, outside of your known inventory which are only visible to the customer that created them. At the database layer, these two concepts are modeled as tables aptly named parts
and machines
. Your application has two customers: comically-named Wayne Enterprises and Stark Industries.
After the application is deployed to production you build an additional manufacturers
table to service a customer feature request.
This type of application fits the "customer per branch" pattern very well.
Traditional Approach
With a traditional SQL database like MySQL or Postgres, customer data would stored in big, shared parts
and machines
tables.
The simplest way to manage who could view and edit which part or machine would be an owner column. Any view in your application would filter on this column. You could also create views for each individual customer expanding the number of tables in your application. If you want to stage new parts before going live with them, you could have an additional visibility column. Any conflicting edits to shared and customer specific parts would have to be detected at the application layer and a customer could not have a version of a shared part and vice versa.
Schema migrations would be deployed to every customer at once. So, in order to add the manufacturers
table for the feature request above, you make the schema change and it is live for every customer at once.
This approach would certainly work and many applications exist today using a traditional database to support this use case.
Customer Per Branch Approach
Let's build a a "customer per branch" database using Dolt to exhibit the pattern.
Dolt uses a Git-style commit graph to manage its versioning features. If you need a refresher on how a Git commit graph supports versioning, check out our Dolt documentation on the subject. The diagrams below represent branches in red and commits in blue.
Schema on main
For the customer per branch pattern, The main
branch is used to store schema and data that should be accessible to all branches.
In other words, you keep your schema and domain data on the main
branch.
Domain data is common data shared across all branches. Domain data is generally look up tables like state code to state name or IP address to country. In our example case, our domain data is the shared set of available parts.
So, for our example, we create a database named designs
which implicitly creates an initial Dolt commit. Then we create a parts
and machines
table. Next, we insert the initial set of parts into the parts table. Finally, we create a Dolt commit so our new branches can use the changes. As you can see this is all done using standard SQL. In Dolt, the Git write operations are exposed as procedures. The Git read operations are exposed as system tables or functions.
Customer Branches
In the customer per branch pattern, each customer gets an individual, isolated branch. For our example, we create two branches wayne
and stark
to hold data specific to each customer, Wayne Enterprises and Stark Industries. In this example we create the customer branches with call dolt_branch()
.
Moving forward, customer specific data is modified on individual branches. In Dolt, you can switch branches in a number of ways. In this example we use call dolt_checkout()
. The point here is that customer data only lives on that customers branch. Periodically, a Dolt commit can optionally be made for backup or audit purposes. A Dolt commit is only required when you want to merge changes from main
.
Note, in this example, we are modifying the parts
table on our customer branch and potentially also main
. You may want to add new parts to the shared parts table in parallel with a customer adding their own custom parts. Dolt can merge changes from the main
parts table without conflicts as long as the branches modify different database "cells" (ie. row, column pairs). If two branches modify the same cell, a conflict will be detected on merge and can be resolved using a conflict resolution workflow.
Adding a new customer is a simple as creating a new branch from main
.
Schema Migrations
A big part of database development is schema migrations. In the customer per branch pattern, schema migrations are first made and tested on main
. Schema migrations are then adopted by customer branches using merge.
Let's say in support of the manufacturers customer feature request above, you must add a manufacturers
table. We add a new table on main
called manufacturers
and make a Dolt commit.
After sufficient testing, we decide to merge the new table into customer branches. This can be done one customer at a time ensuring your changes don't break every customer at once.
Advantages
The customer per branch has a number of advantages over a traditional approach to building an application like the one described above.
Data Isolation
In the branch per customer pattern, customer data is isolated to a branch. This means it is harder to break, leak, or modify data in your application. SQL is run on branches. Switching branches is an extra SQL statement. These extra steps make it far more unlikely for your application or operators to make a mistake that spans multiple customers.
In addition, Dolt has branch permissions to further enforce branch protection rules, making it impossible for certain SQL users to modify certain branches.
Smaller Tables
The customer per branch pattern can result in significantly smaller tables making queries simpler and faster. A table is scoped to one customer instead of many. In a traditional set up you can add indexes on the owner or visibility columns or tables but this will not be as fast as the branch per customer approach. Moreover, queries do not need to be filtered on owner or visibility making queries easier to write and maintain.
Incremental Migrations
In traditional databases, schema migrations must happen for every customer at once. The branch per customer model enables incremental roll out of schema migrations to individual customers. If there is a problem, just use Dolt's powerful rollback capabilities to fix the individual customer. The blast radius of schema migrations is limited to a single customer.
Customer Per Clone
There is an advanced pattern we call "customer per clone" that is similar to "customer per branch" but also leverages Dolt's decentralized capabilities. Just like Git, Dolt supports clone, push, pull, and fetch meaning the same branch per customer pattern can be distributed across many individual physical hosts.
In the customer per branch set up, branches are maintained on the same host. Branches are logically separated but queries to branches share the resources of the single host. In the customer per clone set up, each branch effectively can live on its own host as a clone. Merge is done using pulls. So, if you need to get a domain data or schema migration from main, you pull it from the main server instead of merging it. Effectively, in this approach you can have a customer per host and manage coordination easily using Dolt primitives.
Customer per clone adds even greater data isolation. Your application must connect to a different physical server to see data from another customer. Moreover, customer queries are being run on a different hosts than other customers, meaning a particularly heavy query can only exhaust the resources of a single customer.
Conclusion
The customer per branch pattern is a very popular Dolt use case. In this article we walked through an example and explained the advantages of the approach. Finally, we introduced the customer per clone pattern which offers even greater data isolation than customer per branch. Questions? Come by our Discord and just ask.