How to run DoltLab without egress
DoltLab is the self-hosted version of DoltHub, a web-based remote for your Dolt databases. In recent weeks, we had a DoltLab customer reach out to us looking to run their DoltLab instance from within a closed, internal network where egress traffic is heavily restricted or altogether blocked. This means that their DoltLab instance is prevented from making outbound HTTP requests. Earlier versions of DoltLab would error if egress traffic was restricted.
Most often, DoltLab customers who require their instance to run without egress need it to comply with their company's internal data security policies. And, while neither DoltHub nor DoltLab transports remote data off-instance, when it comes to data security, it's understandable why companies prefer to play it safe by simply blocking all outbound requests.
So, after helpful collaboration with our customers about their requirements and use cases, we released DoltLab v2.3.3, which supports running a DoltLab instance without egress access.
Interestingly, earlier versions of DoltLab made two unintentional egress calls that were removed in DoltLab v2.3.3, but were tricky to track down.
The first of these unintentional egress calls occurred whenever a DoltLab Job was run. When the Job started, it logged the version of the Dolt binary contained in the Job. It did this by executing the dolt version command, and in recent months this command was updated to make an outbound HTTP request to api.github.com. It does this to check whether there's a newer Dolt binary available to download, and notifies the user if there is. But this is not relevant to DoltLab, so we disabled this Dolt feature in DoltLab v2.3.3.
The second set of unintentional egress calls DoltLab made prior to v2.3.3 were to Stripe's API, via the inclusion of an HTML script in the DoltLab DOM. DoltLab does not use Stripe, but DoltHub uses it to process payments. And, since both products share source code, this DoltHub-only script was being added, mistakenly, in DoltLab. Upon further investigation, we found that an import of the stripe/stripe-js NPM package was injecting this script on DoltLab whenever it detected that the stripe-js script was missing from the DOM! This too has been fixed in DoltLab v2.3.3.
In the remainder of today's blog I'll cover how to set up and run DoltLab v2.3.3 in a restricted environment without egress. Please note that, at the time of this writing, DoltLab Enterprise requires egress access. DoltLab Enterprise makes egress calls to a licensing server in order to validate the license of the enterprise instance and authorize the use of its features. However, fully offline DoltLab Enterprise support is under construction and will be available in the coming weeks.
Prerequisites
As of DoltLab v2.3.3, there are only two types of outbound HTTP calls made by DoltLab.
First, a DoltLab instance emits first-party ("phone-home") metrics that let our team know how many instances are running in the wild. We use these metrics to help secure funding for DoltLab's ongoing development and support, so we encourage users to not disable them, though doing so is easy.
Second, a DoltLab instance needs to pull the container images for its various services from a public AWS ECR repository. If you've used DoltLab before, you know that it runs via Docker Compose, and when you start your DoltLab instance, the first thing you'll see is the service images being pulled to the host from public.ecr.aws.
root@ip:/home/ubuntu/doltlab# ./start.sh
0b2a9c59ab673ef79d6eb1fd7c84c16b5fb5d5c952a6efc53f15eeb29c058ff3
Pulling doltlabdb (public.ecr.aws/dolthub/doltlab/dolt-sql-server:v2.3.3)...
v2.3.3: Pulling from dolthub/doltlab/dolt-sql-server
7478e0ac0f23: Pull complete
c013805c2f1c: Pull complete
aace6430adbe: Pull complete
4b1b141afe4e: Pull complete
b6b8e5d82846: Pull complete
84d93b369a57: Pull complete
4f4fb700ef54: Pull complete
3c5f26355296: Pull complete
616210399853: Extracting [=========> ] 7.471MB/38.67MB
3555462ef609: Download complete
Additionally, some of DoltLab's features, like Jobs, will pull service images for the Job that is queued to run. This happens silently under the hood, and is orchestrated by DoltLab's main API service.
What's important to note here is that to run a DoltLab instance on a host without egress, you'll need to perform two steps to prevent the instance from making both types of outbound calls.
Disabling metrics on DoltLab is simple, and we can do this later in the process with a small edit to DoltLab's installer_config.yaml. Avoiding outbound calls to public.ecr.aws, on the other hand, requires you to pre-load service images onto your host before you attempt to start the instance.
Let's walk through an example DoltLab v2.3.3 deployment to better illustrate what this entails.
If you're running an older DoltLab instance and want to upgrade to v2.3.3, be sure to stop your old instance using the ./stop.sh script before continuing.
DoltLab v2.3.3 without egress
Prior to the release of DoltLab v2.3.3, the service images for DoltLab were only available from the public ECR repository. However, starting with v2.3.3, we release all of DoltLab's service images in a single zip file. This enables you to load the images onto the host, so they won't need to be pulled from the public repository.
To do this, download the zip for DoltLab v2.3.3 and the accompanying zip for the service images, and then upload these zip files onto your DoltLab host. You can do this by downloading these files on a host that has egress access, saving the files on a physical drive, then uploading them to the DoltLab host from the drive.
# download zip files on a host with egress access
root@ip:/home/ubuntu# curl -LO https://doltlab-releases.s3.amazonaws.com/linux/amd64/doltlab-v2.3.3.zip
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 19.9M 100 19.9M 0 0 10.4M 0 0:00:01 0:00:01 --:--:-- 10.4M
root@ip:/home/ubuntu# curl -LO https://doltlab-releases.s3.amazonaws.com/linux/amd64/doltlab-service-images-v2.3.3.zip
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 3129M 100 3129M 0 0 20.2M 0 0:02:34 0:02:34 --:--:-- 21.9M
# upload zip files on a host without egress access using a physical drive
root@ip:/home/ubuntu# ls
doltlab-service-images-v2.3.3.zip doltlab-v2.3.3.zip
Please be aware that the service images zip file is quite large. For v2.3.3 it is 3.13 GB.
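Because these files travel between hosts on a physical drive, and one of them is several gigabytes, it can be worth confirming they arrived intact. The following is a minimal sketch, assuming sha256sum is installed on both hosts. The function names and the SHA256SUMS file name are made up for illustration; they are not part of DoltLab.

```shell
# On the host WITH egress: record checksums next to the downloaded zips.
# $1: directory holding the zip files
record_checksums() {
  ( cd "$1" && sha256sum *.zip > SHA256SUMS )
}

# On the host WITHOUT egress: verify the copied zips against that record.
# $1: directory holding the zip files and the copied SHA256SUMS file
verify_checksums() {
  ( cd "$1" && sha256sum -c SHA256SUMS )
}
```

Run record_checksums /home/ubuntu before copying the files to the drive, copy SHA256SUMS along with the zips, then run verify_checksums /home/ubuntu on the DoltLab host. A non-zero exit status means at least one file differs from what was downloaded.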
Once the zip files are uploaded to the DoltLab host, ensure that any older version of DoltLab has been stopped. Next, unzip the service images zip file.
root@ip:/home/ubuntu# unzip doltlab-service-images-v2.3.3.zip -d service-images
Archive: doltlab-service-images-v2.3.3.zip
inflating: service-images/doltremoteapi-server-v2.3.3.tar
inflating: service-images/dolthub-server-v2.3.3.tar
inflating: service-images/file-importer-v2.3.3.tar
inflating: service-images/dolt-sql-server-v2.3.3.tar
inflating: service-images/pull-merge-v2.3.3.tar
inflating: service-images/envoy-v1.28-latest.tar
inflating: service-images/dolthubapi-server-v2.3.3.tar
inflating: service-images/fileserviceapi-server-v2.3.3.tar
inflating: service-images/query-job-v2.3.3.tar
inflating: service-images/dolthubapi-graphql-server-v2.3.3.tar
Inside the unzipped file you'll see tarballs for each service DoltLab depends on. These files will need to be loaded into Docker, so DoltLab can use them.
To load them into Docker, cd into the service-images directory and run the docker image load command (or its shorthand, docker load, used below) for each service file. Be sure not to omit any service file, and do not change the tags of the loaded images.
root@ip:/home/ubuntu/service-images# docker load < doltremoteapi-server-v2.3.3.tar
Loaded image: public.ecr.aws/dolthub/doltlab/doltremoteapi-server:v2.3.3
root@ip:/home/ubuntu/service-images# docker load < dolthub-server-v2.3.3.tar
Loaded image: public.ecr.aws/dolthub/doltlab/dolthub-server:v2.3.3
root@ip:/home/ubuntu/service-images# docker load < file-importer-v2.3.3.tar
Loaded image: public.ecr.aws/dolthub/doltlab/file-importer:v2.3.3
root@ip:/home/ubuntu/service-images# docker load < dolt-sql-server-v2.3.3.tar
Loaded image: public.ecr.aws/dolthub/doltlab/dolt-sql-server:v2.3.3
root@ip:/home/ubuntu/service-images# docker load < pull-merge-v2.3.3.tar
Loaded image: public.ecr.aws/dolthub/doltlab/pull-merge:v2.3.3
root@ip:/home/ubuntu/service-images# docker load < envoy-v1.28-latest.tar
Loaded image: envoyproxy/envoy:v1.28-latest
root@ip:/home/ubuntu/service-images# docker load < dolthubapi-server-v2.3.3.tar
Loaded image: public.ecr.aws/dolthub/doltlab/dolthubapi-server:v2.3.3
root@ip:/home/ubuntu/service-images# docker load < fileserviceapi-server-v2.3.3.tar
Loaded image: public.ecr.aws/dolthub/doltlab/fileserviceapi-server:v2.3.3
root@ip:/home/ubuntu/service-images# docker load < query-job-v2.3.3.tar
Loaded image: public.ecr.aws/dolthub/doltlab/query-job:v2.3.3
root@ip:/home/ubuntu/service-images# docker load < dolthubapi-graphql-server-v2.3.3.tar
Loaded image: public.ecr.aws/dolthub/doltlab/dolthubapi-graphql-server:v2.3.3
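Typing ten docker load commands by hand makes it easy to skip one. As a rough sketch, the same step can be wrapped in a loop. The load_images function name is hypothetical, and the sketch assumes every .tar file in the directory is a DoltLab service image.

```shell
# Load every image tarball found in a directory, stopping at the first failure.
# $1: directory containing the .tar files (defaults to the current directory)
load_images() {
  dir="${1:-.}"
  found=0
  for tarball in "$dir"/*.tar; do
    [ -e "$tarball" ] || continue   # the glob matched nothing
    found=1
    docker load --input "$tarball" || return 1
  done
  [ "$found" -eq 1 ] || { echo "no .tar files found in $dir" >&2; return 1; }
}
```

Calling load_images /home/ubuntu/service-images should print one "Loaded image" line per tarball, just like the manual commands above. Here docker load --input reads the tarball directly instead of taking it on standard input.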
Once the images are loaded into Docker, you should be able to see them with the docker image ls command (although the "created" dates will be incorrect).
root@ip:/home/ubuntu/service-images# docker image ls
REPOSITORY TAG IMAGE ID CREATED SIZE
public.ecr.aws/dolthub/doltlab/dolthub-server v2.3.3 759cfc4e4310 11 days ago 2.94GB
public.ecr.aws/dolthub/doltlab/dolthubapi-graphql-server v2.3.3 4d05fedef003 11 days ago 2.31GB
public.ecr.aws/dolthub/doltlab/dolt-sql-server v2.3.3 eab19a52eef8 11 days ago 243MB
envoyproxy/envoy v1.28-latest a37e999f9612 12 days ago 150MB
public.ecr.aws/dolthub/doltlab/pull-merge v2.3.3 f1059932c427 4 months ago 289MB
public.ecr.aws/dolthub/doltlab/file-importer v2.3.3 c13254c33ccb 4 months ago 296MB
public.ecr.aws/dolthub/doltlab/query-job v2.3.3 6a3e47fe8fd2 4 months ago 289MB
public.ecr.aws/dolthub/doltlab/doltremoteapi-server v2.3.3 d2546e7c90c4 N/A 224MB
public.ecr.aws/dolthub/doltlab/dolthubapi-server v2.3.3 923c5d21ca84 N/A 274MB
public.ecr.aws/dolthub/doltlab/fileserviceapi-server v2.3.3 96265469c401 N/A 156MB
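Rather than comparing that table by eye, you could check the loaded images against the list of references printed by docker load earlier. This is only a sketch: expected_images and missing_images are hypothetical helper names, and the image list is copied from the v2.3.3 output in this post, so it would need updating for any other release.

```shell
# Image references printed by `docker load` in the walkthrough above.
expected_images() {
  registry="public.ecr.aws/dolthub/doltlab"
  for name in doltremoteapi-server dolthub-server file-importer dolt-sql-server \
              pull-merge dolthubapi-server fileserviceapi-server query-job \
              dolthubapi-graphql-server; do
    echo "$registry/$name:v2.3.3"
  done
  echo "envoyproxy/envoy:v1.28-latest"
}

# Print each expected image that is absent from the list read on stdin
# (one "repository:tag" per line). Returns non-zero if any are missing.
missing_images() {
  present=$(cat)
  status=0
  for image in $(expected_images); do
    printf '%s\n' "$present" | grep -Fxq -- "$image" \
      || { echo "missing: $image"; status=1; }
  done
  return "$status"
}
```

To use it, pipe in the image list in repository:tag form, for example docker image ls --format '{{.Repository}}:{{.Tag}}' | missing_images. No output and a zero exit status means every image from the zip is present with its original tag.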
Your DoltLab instance will now no longer attempt to pull any images from the public ECR repository, since they're already loaded on the host.
Now it's time to configure your new DoltLab instance.
Unzip the DoltLab zip file and cd into the doltlab directory.
root@ip:/home/ubuntu/service-images# cd ../
root@ip:/home/ubuntu# ls
doltlab-service-images-v2.3.3.zip doltlab-v2.3.3.zip service-images
root@ip:/home/ubuntu# unzip doltlab-v2.3.3.zip -d doltlab
Archive: doltlab-v2.3.3.zip
inflating: doltlab/smtp_connection_helper
inflating: doltlab/installer
inflating: doltlab/installer_config.yaml
root@ip:/home/ubuntu# cd doltlab
If you have a previous installation of DoltLab on this host, you can simply copy the installer_config.yaml of your previous installation into this doltlab directory, replacing the default one. Just be sure to edit the version field of the old installer_config.yaml to be v2.3.3.
# installer_config.yaml
version: "v2.3.3"
# ...
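If you reuse an old config, the version bump can also be scripted. Here is a small sketch, assuming version is a top-level, unindented key as in the snippet above. The set_config_version helper is hypothetical and is not something the installer provides.

```shell
# Rewrite the top-level `version:` line of an installer config in place,
# keeping the previous file as <config>.bak.
# $1: path to installer_config.yaml
# $2: version string such as v2.3.3 (must not contain "/" characters)
set_config_version() {
  sed -i.bak -E "s/^version:.*/version: \"$2\"/" "$1"
}
```

For example, set_config_version installer_config.yaml v2.3.3 updates the field and leaves the original file at installer_config.yaml.bak in case you need to compare the two.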
Otherwise, if this is your first time installing DoltLab, you can follow the steps on our Start DoltLab documentation page to edit the installer_config.yaml for your particular setup. This entails defining the host and specifying the passwords your instance should use.
Finally, you'll need to make one additional change to the installer_config.yaml file, which will prevent your instance from making the metrics-related egress calls we discussed earlier. Edit installer_config.yaml once more and set the metrics_disabled field to true.
# installer_config.yaml
# ...
## First-party metrics can be disabled by setting `metrics_disabled: true`. Default is `false`.
metrics_disabled: true
# ...
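Since both of these settings matter for a no-egress deployment, a quick sanity check before running the installer can catch a missed edit. The sketch below assumes both fields are top-level, unindented keys, as in the snippets above. check_installer_config is a hypothetical helper, not part of DoltLab.

```shell
# Confirm the two settings this walkthrough depends on are present.
# $1: path to the config file (defaults to ./installer_config.yaml)
check_installer_config() {
  config="${1:-installer_config.yaml}"
  grep -Eq '^version:[[:space:]]*"?v2\.3\.3"?[[:space:]]*$' "$config" \
    || { echo "version is not v2.3.3 in $config" >&2; return 1; }
  grep -Eq '^metrics_disabled:[[:space:]]*true[[:space:]]*$' "$config" \
    || { echo "metrics_disabled is not true in $config" >&2; return 1; }
}
```

Running check_installer_config from the doltlab directory returns zero only when both lines match. Otherwise it prints which setting needs attention.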
Save your edits, and run the installer binary to generate DoltLab's static assets.
root@ip:/home/ubuntu/doltlab# ./installer
2024-10-01T21:22:43.488Z INFO cmd/main.go:554 Successfully configured DoltLab {"version": "v2.3.3"}
2024-10-01T21:22:43.488Z INFO cmd/main.go:560 To start DoltLab, use: {"script": "/home/ubuntu/doltlab/start.sh"}
2024-10-01T21:22:43.488Z INFO cmd/main.go:565 To stop DoltLab, use: {"script": "/home/ubuntu/doltlab/stop.sh"}
You can now start your v2.3.3 DoltLab instance by running the ./start.sh script, and your instance will not be making any egress requests!
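If you'd like to confirm that the host itself really is closed off, one rough approach is to probe an endpoint DoltLab used to contact and expect the request to fail. This sketch assumes curl is installed. The egress_blocked name is made up, and the check only covers a single URL from the host, so it is no substitute for auditing your network's firewall rules.

```shell
# Succeed when an outbound request to the given URL fails, i.e. when
# egress to that endpoint appears to be blocked from this host.
# $1: URL to probe (defaults to the public ECR endpoint mentioned earlier)
egress_blocked() {
  url="${1:-https://public.ecr.aws}"
  ! curl --silent --output /dev/null --max-time 5 "$url"
}
```

For instance, egress_blocked && echo "no route to public.ecr.aws" prints the message only when the request cannot complete within five seconds.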
Conclusion
I want to extend a big thank you to the customer who worked closely with us to get this use case supported. We're always trying to improve our products and are so grateful when the community reaches out to let us know how we can better support them!
If you haven't heard yet, you can reach us anytime on Discord. We'd love to chat with you and learn more about how you want to use Dolt and DoltLab.
Thanks for reading and don't forget to check out each of our cool products below:
- Dolt—it's Git for data.
- Doltgres—it's Dolt + PostgreSQL.
- DoltHub—it's GitHub for data.
- DoltLab—it's GitLab for data.
- Hosted Dolt—it's RDS for Dolt databases.