Running a Production Database
The database is the heart of most applications. It's where the data that drives your web applications lives, and it's where your users' data is stored. The data in your database may be used in countless ways, such as analytics, machine learning, and reporting. It's important that your database is reliable, secure, and performant. Whether you choose a NoSQL store or a traditional relational database, many of the components you need to go to market with a production database are the same. In this post we'll discuss some of the components you need to run a production database and how Hosted Dolt can help you get to market faster.
If you decide to run this on your own hardware, there will be additional steps to procure and set up the hardware. We'll be skipping over that and assuming a cloud deployment, though much of the information here applies to on-premises deployments as well.
The Components of a Production Database
The Database
The first component of a production database is the database itself. There are many different types of databases, but regardless of which database software you choose you will need to set up hosts, install the software, configure it, and manage it. There are a handful of cloud platforms which make it easy to get instances stood up where you want them. You could also use something like Kubernetes to manage your database instances.
With instances stood up, you'll need to install and configure the database software. This step will vary depending on which database software you choose. Some databases can be installed using a package manager like apt, brew, etc. Others will require you to download and build the software from source. See the documentation for your database to learn how to install it.
Connectivity
Now that you have instances running your database software, they need to be reachable from your application and from any other processes that need to access the database. This means you need to configure networking. The steps vary between cloud providers, but in general you'll need a way to get a static address for your database, and a way to get connections routed to that address. In AWS you will need to manage security groups and network access control lists. In GCP you will need to manage firewall rules. In Azure you will need to manage network security groups. In Kubernetes you will need to manage ingress rules. In all cases you will need to manage DNS records as well.
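Once the networking is in place, a quick way to confirm that DNS and the firewall rules line up is to attempt a TCP connection from the machine your application runs on. The following is a minimal sketch using only the Python standard library. The function name is our own, and the hostname and port in the usage note are placeholders, not real endpoints.

```python
import socket


def is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the hostname and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

For example, `is_reachable("db.example.internal", 3306)` would check a hypothetical database host on the MySQL default port. A successful TCP connection only shows that the network path is open. It says nothing about whether authentication or TLS are configured correctly.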
Security
Once you have your database instances stood up and reachable, you need to secure them. This means you need to configure authentication and authorization. Additionally, you need to configure encryption for data in transit and at rest. To set up encryption for data in transit you will need to configure TLS certificates, and you'll need to manage the rotation of those certificates.
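Certificate rotation is easy to forget until a certificate expires. As a rough sketch of the kind of check a rotation job might run, the helper below computes how many days remain before a certificate's notAfter timestamp. It assumes the timestamp is in the string format Python's `ssl` module reports for peer certificates; the function names and the 30-day threshold are illustrative choices, not recommendations from any particular database.

```python
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after: str, now: datetime) -> int:
    """Whole days between `now` (timezone-aware) and a certificate's notAfter time.

    `not_after` uses the format found in ssl.SSLSocket.getpeercert(),
    for example "Jan  5 09:34:43 2030 GMT". The result is negative once
    the certificate has expired.
    """
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc
    )
    return (expires - now).days


def needs_rotation(not_after: str, now: datetime, threshold_days: int = 30) -> bool:
    """True when the certificate expires within `threshold_days` (assumed policy)."""
    return days_until_expiry(not_after, now) <= threshold_days
```

In practice a job like this would run on a schedule against every certificate in the cluster and feed into the same alerting you use for the database itself.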
Monitoring
Now that you have your database stood up, reachable, and secured, you need to monitor it. You will want dashboards with metrics that show what is going on with your database, and you'll want to be alerted when things go wrong. Usually this means sending metrics to something like Prometheus or a cloud service like AWS's CloudWatch. For alerting, you'll need a service like PagerDuty or OpsGenie integrated with your metrics.
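To make the alerting idea concrete, here is a small sketch of how a threshold alert can be evaluated: fire only when the metric has been above the threshold for several samples in a row, so a single spike doesn't page anyone. The `AlertRule` type and `should_fire` function are hypothetical names for illustration. Real systems such as Prometheus express this in their own rule language.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class AlertRule:
    name: str
    threshold: float
    consecutive: int  # breaching samples in a row required before firing

    def __post_init__(self) -> None:
        if self.consecutive < 1:
            raise ValueError("consecutive must be at least 1")


def should_fire(rule: AlertRule, samples: Sequence[float]) -> bool:
    """True if the most recent `rule.consecutive` samples all exceed the threshold."""
    if len(samples) < rule.consecutive:
        return False
    recent = samples[-rule.consecutive:]
    return all(value > rule.threshold for value in recent)
```

With a rule such as `AlertRule("high_cpu", threshold=90.0, consecutive=3)`, a brief spike to 95 followed by a drop to 80 would not fire, while three readings above 90 in a row would.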
Logging
Whether you are debugging issues or just want to see what's going on with your instance, you'll want access to your database's logs. Often this means configuring your database instances to send logs to a central location such as the ELK stack, or a cloud service like CloudWatch. Additionally, you'll need to configure log rotation so the log files on your instances don't grow indefinitely.
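Database servers usually rely on their own settings or a system tool for rotation, so the snippet below is only an illustration of the idea, not how any particular database does it. It uses Python's standard `RotatingFileHandler` to cap a log file at a fixed size and keep a limited number of older files. The function name, path, and sizes are assumptions chosen for the example.

```python
import logging
from logging.handlers import RotatingFileHandler


def build_rotating_logger(path: str, max_bytes: int, backups: int) -> logging.Logger:
    """Create a logger whose file is rotated once it approaches `max_bytes`.

    At most `backups` rotated files (path.1, path.2, ...) are kept;
    the oldest one is deleted on each rotation.
    """
    logger = logging.getLogger(f"rotating:{path}")
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep these records out of the root logger
    if not logger.handlers:  # avoid stacking handlers on repeated calls
        handler = RotatingFileHandler(path, maxBytes=max_bytes, backupCount=backups)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger
```

The total disk usage is bounded by roughly `max_bytes * (backups + 1)`, which is the property you want from any rotation scheme regardless of the tool that implements it.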
Backups
Now that you have your database stood up, reachable, secured, monitored, and pushing logs to a central location, you need to handle irrecoverable failures. This means you need to set up backups. You'll need to configure backups to run on a schedule, and you'll need to configure retention policies for those backups. You'll also need to configure a way to restore from backups.
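A retention policy is ultimately a rule for deciding which backups to delete. The sketch below assumes one possible policy, chosen purely for illustration: keep every backup from the last 7 days, keep the newest backup per ISO week for up to 28 days, and delete the rest. The function name and defaults are our own, and it only selects timestamps. Actually deleting files or snapshots depends on where your backups live.

```python
from datetime import datetime, timedelta
from typing import Dict, Iterable, List, Set, Tuple


def backups_to_delete(
    taken_at: Iterable[datetime],
    now: datetime,
    keep_all_for: timedelta = timedelta(days=7),
    keep_weekly_for: timedelta = timedelta(days=28),
) -> List[datetime]:
    """Return the backup timestamps that fall outside the assumed policy."""
    all_backups = list(taken_at)
    keep: Set[datetime] = set()
    newest_per_week: Dict[Tuple[int, int], datetime] = {}
    for ts in all_backups:
        age = now - ts
        if age <= keep_all_for:
            keep.add(ts)  # recent: keep everything
        elif age <= keep_weekly_for:
            year, week = ts.isocalendar()[:2]
            current = newest_per_week.get((year, week))
            if current is None or ts > current:
                newest_per_week[(year, week)] = ts  # newest backup of that week
    keep.update(newest_per_week.values())
    return sorted(ts for ts in set(all_backups) if ts not in keep)
```

Whatever policy you choose, it is worth testing the restore path regularly as well, since a backup that cannot be restored does not protect you.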
Replication
So far we've stood up a database and configured it to be reachable, secured, monitored, logged, backed up for disaster recovery... but what about high availability? What happens if your database instance goes down? You don't want to have to recover from a backup every time your cloud instance goes down. You need to set up replication. This means you need to do everything we've already talked about above for each replica you want to have. You'll need to configure replication so that your instances can speak with each other. It's possible you want to increase throughput by having read replicas, or you may want to be able to promote one of your replicas to be the primary instance in the event of a failure. You'll need to configure replication to handle these scenarios based on your needs.
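When a primary fails, something has to decide which replica takes over. One common heuristic is to promote the healthy replica that is least far behind the primary. The following sketch shows that selection step only. The `Replica` type, its fields, and the 30-second lag limit are hypothetical, and real failover also has to deal with fencing the old primary and redirecting clients.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Replica:
    name: str
    healthy: bool
    lag_seconds: float  # how far behind the primary this replica is


def pick_promotion_candidate(
    replicas: Sequence[Replica], max_lag_seconds: float = 30.0
) -> Optional[Replica]:
    """Return the healthy replica with the least lag, or None if none qualifies."""
    eligible = [
        r for r in replicas if r.healthy and r.lag_seconds <= max_lag_seconds
    ]
    if not eligible:
        return None
    return min(eligible, key=lambda r: r.lag_seconds)
```

Returning `None` when every replica is unhealthy or too far behind is deliberate: in that situation restoring from a backup may lose less data than promoting a stale replica.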
Updates
You now have a database cluster set up. You've installed and configured numerous pieces of software, but how are you going to upgrade each piece of software? Whose responsibility is it to make sure that the software is up to date, and that you are not running vulnerable versions of software? You'll need to set up a manual or automated process to handle this.
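An automated process usually starts with an inventory check: compare what is installed against the minimum version you consider safe. Below is a minimal sketch that assumes simple dotted numeric versions such as "1.10.0". The function names and component names are made up for illustration, and versions with suffixes like "-rc1" would need a proper version-parsing library.

```python
from typing import Dict, List, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn "1.10.0" into (1, 10, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def outdated_components(
    installed: Dict[str, str], minimum: Dict[str, str]
) -> List[str]:
    """Names of installed components older than their required minimum version."""
    return sorted(
        name
        for name, version in installed.items()
        if name in minimum and parse_version(version) < parse_version(minimum[name])
    )
```

Comparing parsed tuples rather than raw strings matters here: as strings, "1.9.0" sorts after "1.10.0", which would hide an out-of-date component.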
Incidents
Your database is finally ready to go. You've done everything you can to make sure it's reliable, but things will still go wrong. You need to have a process in place to handle incidents, and you need people who can debug and resolve issues when they arise. This means you'll need an on-call rotation and an escalation policy.
The Alternative
As you can see, there are many steps involved in setting up a production database. It is a costly and time-consuming endeavor. It's also a distraction from your core business. You could spend months setting up your database, or you could use a managed database service like Hosted Dolt, Amazon RDS, or Cloud Bigtable and get to market faster.
Hosted Dolt handles all of the above for you. It's a relational database that you can use like any other relational database, but it's also a versioned database that you can use like a Git repository. It's a database that you can use to collaborate with your team, and track changes to your data over time.
Conclusion
We are committed to Hosted Dolt, and it has a long roadmap of features and improvements. In addition to meeting all the requirements of a production database, it offers features that make it easy to collaborate with your team, such as the SQL workbench, the ability to push and pull your database from DoltHub, and pull requests. Hosted Dolt starts at $50 a month and provides a tremendous amount of value for the price. If you are interested in learning more, come talk to us on Discord, and if you would like to try it, visit https://hosted.doltdb.com to get started.