Hosted DoltDB now comes with a SQL workbench
Dolt is a MySQL-compatible database with Git like features. In May we launched Hosted DoltDB, a cloud-hosted Dolt database with built-in logs and monitoring. If you're not familiar with Hosted Dolt, here are some blogs to get started:
We recently launched a SQL workbench for your Hosted Dolt deployments, where you can access a read-only DoltHub-like database UI. The writable version of this UI is coming soon.
How to use the Hosted SQL Workbench
You can get started with the Hosted SQL workbench in just a few steps. If you already have a started instance you can skip to step 3.
1. Create a deployment
2. Add some data
We're using the dolt_clone
procedure
to clone DoltHub's museum-collections
database.
% mysql -h"dolthub-museum-collections.dbs.hosted.doltdb.com" -u"[username]" -p"[password]"
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 4
Server version: 5.7.9-Vitess
Copyright (c) 2000, 2022, Oracle and/or its affiliates.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> call dolt_clone("https://doltremoteapi.dolthub.com/dolthub/museum-collections");
+--------+
| status |
+--------+
| 0 |
+--------+
1 row in set (1 min 44.26 sec)
mysql> use museum-collections;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
mysql> show tables;
+------------------------------+
| Tables_in_museum-collections |
+------------------------------+
| objects |
+------------------------------+
1 row in set (0.04 sec)
3. Navigate to workbench tab
4. Add the hosted-ui-reader
user and then click Try again
mysql> CREATE USER 'hosted-ui-reader'@'%' IDENTIFIED WITH authentication_dolt_jwt AS 'jwks=doltdb_hosted,aud=deploymentOwners/dolthub/deployments/museum-collections,sub=hosted-ui-reader';
Query OK, 0 rows affected (0.05 sec)
mysql> GRANT SELECT ON *.* TO 'hosted-ui-reader'@'%';
Query OK, 0 rows affected (0.05 sec)
5. Choose a database
6. View your data
You can run queries, explore schemas, view recent changes and what changed at each commit using this workbench.
How it was built
The Hosted SQL workbench is modeled after the DoltHub database page. Similar to the Hosted website, the DoltHub website is a Next.js application that communicates with a GraphQL server, which provides a translation layer for DoltHub API. DoltHub API is a Golang service providing GRPC endpoints which drive the website.
Before we started this project we discussed whether we should create a React component library that we can share between Hosted and DoltHub. Most of the UI for the Hosted workbench was going to be the same as the DoltHub database page. I researched some options, including Bit and building the library ourselves, but ultimately prioritized time and decided to hard-fork the DoltHub components to use for the Hosted workbench.
To make that process easier, we wanted to maintain the same GraphQL schema for the queries used for DoltHub and Hosted. Unlike DoltHub's GraphQL server which gets database metadata from our DoltHub API services, our Hosted GraphQL server would directly communicate with the SQL server running on the Hosted instance. While the structure of the requests and returned objects would be mostly identical, the rest of logic for the UI needed to be rewritten. The Hosted GraphQL server needed a new service that could manage multiple database connections and the database metadata query logic needed to be written in SQL using Dolt system tables and functions.
Comparing DoltHub and Hosted GraphQL servers
Both DoltHub and the Hosted SQL workbench have this dropdown that lists the database branches. I'm going to use it as a simple example to convey the changes that needed to be made to implement the DoltHub UI against the Hosted SQL servers.
The ListBranchNames
query is straightforward. You give it the owner and database names
and it returns a list of all the branches in the database as strings. On DoltHub,
ListBranchNames
is a GRPC endpoint that communicates with DoltHub API (a Golang service)
to get branch names from the Dolt storage layer, also called a ChunkStore
, which is
stored in AWS S3. This service first validates the request to ensure the calling user has
permission to see the database and that the database exists on DoltHub. It then uses the
read-only chunk store interface to query Dolt for the branch data. This looks something
like:
func (impl *repositoryDataUseCaseImpl) ListBranchRefs(ctx context.Context, owner, repoName string) ([]ref.DoltRef, error) {
cs, sizer, err := impl.csFactory(ctx, owner, repoName, true)
if err != nil {
return nil, err
}
readOnlyCs := dhdolt.NewReadOnlyChunkStore(cs)
doltDatabase := &DoltDatabase{doltdb.DoltDBFromCS(readOnlyCs), readOnlyCs, sizer, owner, repoName}
branchRefs, err := doltDatabase.GetBranches(ctx)
if err != nil {
return nil, err
}
return branchRefs, nil
}
In our DoltHub GraphQL server, we have a branch resolver (we use Apollo to build our GraphQL server in Typescript) that uses this GRPC endpoint to get the branch names:
@Query(_returns => BranchNamesList)
async branchNames(
@Context() context,
@Args() args: RepoArgs,
): Promise<BranchNamesList> {
const api = this.apiProvider.getClient(context);
const req = new ListBranchNamesRequest();
req.setRepositoryName(RepositoryUtils.rn.fromParams(args));
const res = await api.listBranchNames(req);
return { list: res.getBranchNamesList() };
}
Since we need to communicate directly with the SQL server to list the branches for the
Hosted SQL workbench, the resolver looks different (we use typeorm
with Apollo in Hosted GraphQL server). We need to first get the correct database connection,
grab a connection from the connection pool, and then use the dolt_branches
system
table
to get the branch names.
@Query(_returns => BranchNamesList)
async branchNames(
@Context() context,
@Args() args: RepoArgs,
): Promise<BranchNamesList> {
// Gets (or adds if doesn't exist) the database connection
const conn = await this.dbConnect.connection(context, args);
// Takes a single connection from the connection pool
const queryRunner = conn.dataSource.createQueryRunner();
await queryRunner.connect();
const branches = await queryRunner.query("SELECT * FROM dolt_branches");
await queryRunner.release();
return {
list: branches.map(b => b.name),
};
}
This process is similar for querying other database metadata, but unlike branches which are not revision-dependent, metadata like tables and commits need to be queried at a certain point in the commit history. We make heavy use of revision databases to make this work.
For example, to list tables for a branch, we can use an AS OF
clause: SHOW TABLES AS OF 'feature_branch'
. Or for user-run SQL queries we can use a USE
statement to specify the branch before the query is run, like:
USE `museum-collections/main`;
SELECT * FROM objects WHERE category="drawing";
Authentication model
In order to implement this read-only database workbench, we needed some API machinery to access the SQL server running on a deployed hosted instance, and we needed the API to authenticate to the SQL server. Before this feature, Dolt only supported username/password authentication.
One way we could have handled this was to deploy the Hosted instances with a pre-configured username and password, which the API could use to access the database. However, these credentials would have been harder to make short lived and access to the database with the credentials would not have necessarily come with progeny information.
Instead, we built an authentication mechanism into Dolt which allows user accounts to
authenticate using signed JWTs. In turn, our Hosted API was extended to issue signed JWTs
for accessing the read-only account associated with UI (hosted-ui-reader
). When someone
attempts to connect to the Hosted SQL server, the GraphQL server calls Hosted API to get
short-term credentials, and then connects to the Hosted SQL server using them.
The JWT-based authentication scheme has the following properties:
- Credentials are non-forgeable.
- Credentials are short-lived.
- Credential creation and use is auditable – logs can record when they are minted, as a result of which requests, and when they are used.
- Users can configure if these credentials are allowed and for which accounts.
Writes coming soon
The current workbench is read-only. Next on our roadmap is making the workbench writeable, which will include a similar workflow to DoltHub where you can use cell buttons to generate SQL queries that change data and take you through the pull request workflow. This can make your data on your Hosted instance more accessible to people who may not be as familiar with SQL or Git.
Conclusion
Hosted DoltDB now has a SQL workbench that lets you view the data on your SQL server, including version control features like branches, commits, and diffs. If you have any feedback or features you'd like to see sooner rather than later, make a feature request on GitHub or reach out to us on Discord.