Automate Your Database Workflow with the New DoltHub API
We are excited to announce the release of the new DoltHub API, designed to make it easier to manage and collaborate on databases programmatically. The DoltHub API offers a range of features that can help streamline your work and improve productivity.
We had a number of users, especially during bounties, programming against the "unofficial" GraphQL API to automate things like creating Pull Requests. We've designed the new "official" API with those use cases in mind.
Key Features of the DoltHub API
Here are the key features:
- Creating databases: With the new DoltHub API, you can create new databases programmatically using a POST request to the database endpoint. This can help you automate the process of creating and setting up new databases, for example when you are using DoltHub as a replication broker.
- Creating pull requests: You can create pull requests on your databases to automate data insert and update workflows. Open a pull request by making a POST request to the {owner}/{database}/pulls endpoint.
- Adding pull request comments: The API also lets you add comments to pull requests programmatically, so automated tools can post a comment if your data quality control checks fail.
- Merging pull requests: Finally, you can merge pull requests programmatically to streamline integrating changes into your databases after automated tests pass. Merge a pull request by making a POST request to the {owner}/{database}/pulls/{pull_id}/merge endpoint, and get the merge operation status by sending a GET request to the same endpoint with the operation_name.
How to use the DoltHub API
To get started with the DoltHub API, you first need to create a DoltHub account. You'll need an API token to authenticate your API requests, which you can create in your account settings. The API is built using standard REST principles, and all responses are returned in JSON format.
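Every example in this post sends the same authorization header and builds its URL from the same base. As a minimal sketch, the two helpers below factor that out. The names `auth_headers` and `endpoint` and the `DOLTHUB_API_TOKEN` environment variable are assumptions made for illustration; they are not part of the API.

```python
import os

# Base URL taken from the examples in this post.
API_BASE = "https://www.dolthub.com/api/v1alpha1"


def auth_headers(token):
    """Return the authorization header in the 'token <value>' form used in this post."""
    return {"authorization": f"token {token}"}


def endpoint(*parts):
    """Join path segments onto the API base, e.g. endpoint('dolthub', 'museum-collections', 'pulls')."""
    return "/".join([API_BASE, *parts])


# Hypothetical environment variable; reading it keeps the token out of source control.
headers = auth_headers(os.environ.get("DOLTHUB_API_TOKEN", "YOUR_API_TOKEN"))
```

The resulting `headers` dictionary can be passed to `requests.post` or `requests.get` exactly as in the examples that follow.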
Creating Databases
To create a new database, make a POST request to the database endpoint, providing ownerName, repoName and visibility for the new database.
Here's an example of how to create a new database called museum-collections under the organization dolthub using an authorization token.
import requests

url = 'https://www.dolthub.com/api/v1alpha1/database'
headers = {
    'authorization': 'token YOUR_API_TOKEN'
}
data = {
    "description": "Records from museums around the world.",
    "ownerName": "dolthub",
    "repoName": "museum-collections",
    "visibility": "public"
}
response = requests.post(url, headers=headers, json=data)
The JSON response returned:
{
    "status": "Success",
    "description": "Records from museums around the world.",
    "repository_owner": "dolthub",
    "repository_name": "museum-collections",
    "visibility": "public"
}
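A script that creates databases will usually want to stop if the call did not succeed. The sketch below checks the `status` field shown in the response above. This post does not describe how the API reports failures, so treating any other status as an error is an assumption, and `require_success` is a hypothetical helper name.

```python
def require_success(payload):
    """Return the parsed JSON payload if its status is "Success", otherwise raise.

    Assumption: any status other than "Success" (or a missing status) is a failure.
    """
    if payload.get("status") != "Success":
        raise RuntimeError(f"DoltHub API call failed: {payload!r}")
    return payload


# Example using the response body shown above.
created = require_success({
    "status": "Success",
    "description": "Records from museums around the world.",
    "repository_owner": "dolthub",
    "repository_name": "museum-collections",
    "visibility": "public",
})
print(created["repository_owner"] + "/" + created["repository_name"])
```

In a real script the payload would come from `response.json()` rather than a literal dictionary.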
Creating Pull Requests
To create a pull request on a database, make a POST request to the {owner}/{database}/pulls endpoint.
Here is an example of opening a pull request on the museum-collections database with data from the Los Angeles County Museum of Art (LACMA). This data was added to the lacma branch on a fork of the dolthub/museum-collections database, whose owner is now liuliu. We would like to eventually merge the lacma branch from liuliu/museum-collections into the main branch of dolthub/museum-collections.
import requests

from_owner_name, to_owner_name, database_name = "liuliu", "dolthub", "museum-collections"
# The pull request is opened against the destination (upstream) database.
url = "https://www.dolthub.com/api/v1alpha1/{}/{}/pulls".format(to_owner_name, database_name)
headers = {
    'authorization': 'token YOUR_API_TOKEN'
}
data = {
    "title": "LACMA data",
    "description": "Records from the Los Angeles County Museum of Art.",
    "fromBranchOwnerName": from_owner_name,
    "fromBranchRepoName": database_name,
    "fromBranchName": "lacma",
    "toBranchOwnerName": to_owner_name,
    "toBranchRepoName": database_name,
    "toBranchName": "main"
}
response = requests.post(url, headers=headers, json=data)
A successful JSON response includes a pull_id:
{
    "status": "Success",
    "title": "LACMA data",
    "description": "Records from the Los Angeles County Museum of Art.",
    "from_owner_name": "liuliu",
    "from_repository_name": "museum-collections",
    "from_branch_name": "lacma",
    "to_owner_name": "dolthub",
    "to_repository_name": "museum-collections",
    "to_branch_name": "main",
    "pull_id": "66"
}
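The pull_id in this response is what the comment and merge endpoints need next. The sketch below builds the request body from a few arguments and derives the follow-up URLs from the creation response. The helper names are hypothetical, and the payload builder assumes the fork uses the same database name as the upstream database, as it does in the example above.

```python
API_BASE = "https://www.dolthub.com/api/v1alpha1"


def pull_request_payload(title, description, from_owner, to_owner, database,
                         from_branch, to_branch="main"):
    """Build the request body for opening a pull request from a fork's branch."""
    return {
        "title": title,
        "description": description,
        "fromBranchOwnerName": from_owner,
        "fromBranchRepoName": database,  # assumes the fork kept the upstream name
        "fromBranchName": from_branch,
        "toBranchOwnerName": to_owner,
        "toBranchRepoName": database,
        "toBranchName": to_branch,
    }


def pull_urls(created):
    """Derive the comments and merge URLs from a pull request creation response."""
    base = "{}/{}/{}/pulls/{}".format(
        API_BASE,
        created["to_owner_name"],
        created["to_repository_name"],
        created["pull_id"],
    )
    return {"comments": base + "/comments", "merge": base + "/merge"}


payload = pull_request_payload(
    "LACMA data", "Records from LACMA.", "liuliu", "dolthub",
    "museum-collections", "lacma",
)
urls = pull_urls({
    "to_owner_name": "dolthub",
    "to_repository_name": "museum-collections",
    "pull_id": "66",
})
```

Passing `response.json()` from the create call to `pull_urls` avoids hard-coding the pull request number in later steps.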
Creating Pull Request Comments
To make a comment on a pull request, for example when your automated data quality tests pass, make a POST request to the {owner}/{database}/pulls/{pull_id}/comments endpoint.
Here is an example of adding a pull request comment using an authorization token.
import requests

url = 'https://www.dolthub.com/api/v1alpha1/dolthub/museum-collections/pulls/66/comments'
headers = {
    'authorization': 'token YOUR_API_TOKEN'
}
data = {
    "comment": "The pull request looks good!"
}
response = requests.post(url, headers=headers, json=data)
The JSON response:
{
    "status": "Success",
    "repository_owner": "dolthub",
    "repository_name": "museum-collections",
    "pull_id": "66",
    "comment": "The pull request looks good!"
}
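For an automated quality gate, the comment text would normally be generated from the check results. The following is a sketch only: `quality_comment` is a hypothetical helper, the check descriptions are made up for illustration, and whether DoltHub renders line breaks or list markers in comments is not stated in this post.

```python
def quality_comment(failures):
    """Build the comment request body from a list of failed check descriptions."""
    if not failures:
        return {"comment": "All data quality checks passed."}
    lines = ["Data quality checks failed:"]
    lines.extend("- " + failure for failure in failures)
    return {"comment": "\n".join(lines)}


# Illustrative failure messages; a real pipeline would collect these from its own checks.
data = quality_comment(["objects.title contains NULL values", "duplicate object_id 1234"])
print(data["comment"])
```

The returned dictionary has the same shape as the `data` body in the example above, so it can be passed as `json=data` to the comments endpoint.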
Merging Pull Requests
To merge a pull request, make a POST request to the {owner}/{database}/pulls/{pull_id}/merge endpoint.
Here is an example of merging pull request #66 on the museum-collections database using an authorization token. Note that the merge operation is asynchronous and creates an operation that can be polled to get the result.
import requests

url = 'https://www.dolthub.com/api/v1alpha1/dolthub/museum-collections/pulls/66/merge'
headers = {
    'authorization': 'token YOUR_API_TOKEN'
}
response = requests.post(url, headers=headers)
A successful JSON response includes the operation_name:
{
    "status": "Success",
    "repository_owner": "dolthub",
    "repository_name": "museum-collections",
    "pull_id": "66",
    "operation_name": "operations/b09a9221-9dcb-4a15-9ca8-a64656946f12"
}
To poll the operation and check its status, use the operation_name from the merge response to query the API. Once the operation is complete, the response will contain a job_id field indicating the job that's running the merge, as well as other information such as the repository_owner, repository_name, and pull_id.
Keep in mind that the time it takes for the merge operation to complete can vary depending on the size of the pull request and the complexity of the changes being merged.
import requests

url = 'https://www.dolthub.com/api/v1alpha1/dolthub/museum-collections/pulls/66/merge'
headers = {
    'authorization': 'token YOUR_API_TOKEN'
}
# Pass the operation_name returned by the merge request.
data = {
    "operationName": "operations/b09a9221-9dcb-4a15-9ca8-a64656946f12"
}
response = requests.get(url, headers=headers, json=data)
The JSON response includes a job_id field:
{
    "status": "Success",
    "operation_name": "operations/b09a9221-9dcb-4a15-9ca8-a64656946f12",
    "done": true,
    "job_id": "1",
    "repository_owner": "dolthub",
    "repository_name": "museum-collections",
    "pull_id": "66"
}
You can view the job status on the DoltHub website. We will provide an API endpoint for querying jobs in the future.
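Because the merge is asynchronous, a script typically repeats the GET request until the response reports `"done": true`. Here is a minimal polling sketch. The `wait_for_merge` name, the interval, and the timeout are illustrative choices, and the sketch assumes that an unfinished operation returns `done` as false or omits it, which this post does not confirm.

```python
import time


def wait_for_merge(fetch_status, interval=2.0, timeout=60.0, sleep=time.sleep):
    """Call fetch_status() until its payload reports done, then return that payload.

    fetch_status is any zero-argument callable returning the parsed JSON response.
    Raises TimeoutError if the operation is still not done after `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        payload = fetch_status()
        # Assumption: an unfinished operation has "done" set to false or missing.
        if payload.get("done"):
            return payload
        if time.monotonic() >= deadline:
            raise TimeoutError("merge operation did not finish in time")
        sleep(interval)


# Demonstration with canned payloads instead of real API calls.
canned = iter([{"done": False}, {"done": False}, {"done": True, "job_id": "1"}])
result = wait_for_merge(lambda: next(canned), interval=0)
print(result["job_id"])
```

With the `url`, `headers`, and `data` from the polling example above, `fetch_status` could be `lambda: requests.get(url, headers=headers, json=data).json()`. Taking the fetcher as a parameter keeps the loop independent of the HTTP library and easy to test.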
Conclusion
The new DoltHub API makes it easier than ever to work with your databases programmatically. Whether you're a developer, data scientist, or one of our bounty hunters, the DoltHub API has you covered. See our API documentation for more information. Try it out today and let us know what you think on Discord.