10th April 2024
6 min read
David Hazra
David Hazra is a professional software developer based in London
Caption: Databases on Kubernetes
Running databases on Kubernetes hasn't always been the go-to solution. Stateful applications in general have traditionally been difficult to run smoothly in environments like Kubernetes, where you should expect a node (or a few nodes) to become unavailable without warning.
However, things have changed. Traditional databases (which were initially designed for a single-node setup, so horizontal scaling was not the default) have evolved to perform well in cloud environments. We also have cloud-native databases that have been purpose-built with scalability in mind.
One of the great things about running databases in Kubernetes is that you can pick and choose your database provider with ease. There is no rule that a cluster must have only a single database operator. In fact, it is common for applications with different operational needs to deploy databases specific to their workload, all in the same cluster. The difficulty is that each additional database provider requires domain-specific knowledge to maintain.
Databases on Kubernetes are usually deployed through operators and custom resource definitions (CRDs). You deploy an operator in the cluster, which lets you create database custom resources that the operator then reconciles into a functioning system. It provisions the PVCs, the StatefulSets, and everything else you need.
Another important note is that many databases use terminology that overlaps with Kubernetes. For example, a database "node" is a single instance of the database, whereas a Kubernetes node is a machine in the cluster. Likewise, a database "cluster" is a collection of database nodes, not the Kubernetes cluster itself.
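To see this pattern in a live cluster, you can list the custom resource definitions that an operator has registered. The helper below is a minimal sketch: list_operator_crds is a hypothetical function name, and the API group you pass to it only returns results if the matching operator is already installed.

```shell
# List the CRDs whose name contains the given API group.
# Usage: list_operator_crds <api-group>
list_operator_crds() {
  # "-o name" prints one resource name per line, e.g.
  # customresourcedefinition.apiextensions.k8s.io/<plural>.<group>
  kubectl get crds -o name | grep -F -- "$1"
}
```

For example, list_operator_crds mysql.oracle.com should print the MySQL operator's CRDs once that operator is running.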
MySQL
✔️ GitHub | ✔️ Docs | ✔️ Official Operator | ✔️ Helm Chart
The MySQL operator is an official operator made by Oracle, but it is fully open source. It can be installed most conveniently via Helm:
helm repo add mysql-operator https://mysql.github.io/mysql-operator
helm repo update
helm install mysql-operator mysql-operator/mysql-operator --namespace mysql-operator --create-namespace
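Before creating any database resources, it helps to wait until the operator is actually ready. The following is a small sketch rather than part of the official install steps: wait_for_operator is a hypothetical helper, and it assumes the Helm release creates a Deployment named mysql-operator in the mysql-operator namespace.

```shell
# Block until the operator Deployment reports its rollout as complete,
# or give up after two minutes.
# Usage: wait_for_operator <deployment> <namespace>
wait_for_operator() {
  kubectl rollout status "deployment/$1" --namespace "$2" --timeout=120s
}
```

With the install command above, the call would be wait_for_operator mysql-operator mysql-operator.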
The operator manages MySQL InnoDB Cluster, MySQL's high-availability setup built on the InnoDB storage engine. Below is a basic example of deploying such a cluster to a Kubernetes environment.
To create an InnoDB cluster, you first need to create a secret in the cluster that holds the root user's credentials. Here rootHost is the host (or host pattern) that the created user is allowed to connect from, and % is a wildcard that matches any host.
apiVersion: v1
kind: Secret
metadata:
  name: user-innodb-creds
  namespace: mysql-database
stringData:
  rootUser: "username"
  rootHost: "%"
  rootPassword: "password"
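If you would rather not keep the password in a manifest file, the same secret can be created imperatively. This is a sketch under a few assumptions: create_innodb_secret is a hypothetical helper name, and the target namespace must already exist (for example via kubectl create namespace mysql-database).

```shell
# Create the credentials Secret from literals instead of a YAML file.
# Usage: create_innodb_secret <secret-name> <namespace> <user> <password>
create_innodb_secret() {
  kubectl create secret generic "$1" \
    --namespace "$2" \
    --from-literal=rootUser="$3" \
    --from-literal=rootHost=% \
    --from-literal=rootPassword="$4"
}
```

Calling create_innodb_secret user-innodb-creds mysql-database username password produces a secret with the same three keys as the manifest above.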
Once the operator pod is up and running, we can create the database cluster itself by applying the following resource. Here tlsUseSelfSigned tells the cluster to use self-signed TLS certificates for traffic between database nodes, instances is the number of MySQL server instances to create, and router.instances sets the number of MySQL Router replicas, which are responsible for routing traffic to the correct database instance within the cluster.
apiVersion: mysql.oracle.com/v2
kind: InnoDBCluster
metadata:
  name: mycluster
  # Same namespace as the credentials secret, so the operator can find it
  namespace: mysql-database
spec:
  secretName: user-innodb-creds
  tlsUseSelfSigned: true
  instances: 3
  router:
    instances: 1
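After applying the resource, you can check on the cluster and open a local connection to it. The helpers below are a rough sketch with hypothetical function names. They assume the cluster lives in the same namespace as its credentials secret (mysql-database), and that the operator exposes a Service named after the cluster with a port called mysql in front of MySQL Router, which is how I understand the operator's documentation.

```shell
# Show the cluster's status as reported by the operator.
# Usage: innodb_status <cluster> <namespace>
innodb_status() {
  kubectl get innodbcluster "$1" --namespace "$2"
}

# Forward the Service's "mysql" port to the local machine (runs until stopped).
# Usage: innodb_forward <cluster> <namespace>
innodb_forward() {
  kubectl port-forward "service/$1" mysql --namespace "$2"
}
```

For the example above: innodb_status mycluster mysql-database, followed by innodb_forward mycluster mysql-database once the instances are online.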
PostgreSQL
There are quite a few PostgreSQL operators available for Kubernetes.
We will be focusing on the Zalando operator, as it's open source and has the most GitHub stars by a small margin (I know this isn't the best metric!).
NOTE: We are biased here, as we use this operator the most in our applications and have been happy with its performance.
✔️ GitHub | ✔️ Docs | ✔️ Helm Chart
To install the operator itself, we can use Helm.
helm repo add postgres-operator-charts https://opensource.zalando.com/postgres-operator/charts/postgres-operator
helm repo update
helm install postgres-operator postgres-operator-charts/postgres-operator -n postgres-operator --create-namespace
This operator comes with an optional nifty interface which can help create postgres cluster resources. It doesn't include all of the CRD parameters, but it can be useful:
# Install the optional operator UI
helm repo add postgres-operator-ui-charts https://opensource.zalando.com/postgres-operator/charts/postgres-operator-ui
helm repo update
helm install postgres-operator-ui postgres-operator-ui-charts/postgres-operator-ui -n postgres-operator-ui --create-namespace
# To access the UI locally (http://localhost:8081); the chart's Service listens on port 80 by default
kubectl port-forward svc/postgres-operator-ui -n postgres-operator-ui 8081:80
This PostgreSQL cluster was created with the aid of the UI. It defines a PostgreSQL 15 cluster with a single instance and provisions a persistent volume with 25Gi of storage (using the specified storage class). A user named admin-user is created as the owner of the database db. The operator creates a secret containing the login information for this user.
kind: postgresql
apiVersion: acid.zalan.do/v1
metadata:
  name: postgresql-database
  namespace: database
  labels:
    team: acid
spec:
  teamId: acid
  postgresql:
    version: "15"
  numberOfInstances: 1
  volume:
    size: "25Gi"
    storageClass: "storage-class-name"
  users:
    admin-user: []
  databases:
    db: admin-user
  resources:
    requests:
      cpu: 100m
      memory: 100Mi
    limits:
      cpu: 500m
      memory: 500Mi
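To log in as the created user, you need the password from the secret the operator generates. The sketch below assumes the operator's default secret naming scheme, {username}.{cluster}.credentials.postgresql.acid.zalan.do, which can be changed in the operator configuration. Both function names are hypothetical.

```shell
# Build the name of the credentials Secret for a user of a cluster,
# assuming the operator's default secret name template.
# Usage: pg_secret_name <user> <cluster>
pg_secret_name() {
  printf '%s.%s.credentials.postgresql.acid.zalan.do\n' "$1" "$2"
}

# Print the decoded password stored in that Secret.
# Usage: pg_password <user> <cluster> <namespace>
pg_password() {
  kubectl get secret "$(pg_secret_name "$1" "$2")" \
    --namespace "$3" \
    -o jsonpath='{.data.password}' | base64 -d
}
```

For the manifest above, the call would be pg_password admin-user postgresql-database database.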
CockroachDB
✔️ GitHub | ✔️ Docs | ✔️ Official Operator
CockroachDB can be installed in two ways: with a Kubernetes operator, as we've seen for the other databases (this is the way recommended in their docs), or as a self-contained Helm chart. We will be covering the former.
Be aware that, according to their documentation, the operator has only been tested on Google Kubernetes Engine (GKE). However, they also provide instructions for Amazon's Elastic Kubernetes Service (EKS), so I'd assume it works there too.
Unlike the other operators, the base operator is not installed via a Helm chart but applied directly using kubectl:
# Install the CRDs
kubectl apply -f https://raw.githubusercontent.com/cockroachdb/cockroach-operator/v2.12.0/install/crds.yaml
# Install the operator itself
kubectl apply -f https://raw.githubusercontent.com/cockroachdb/cockroach-operator/v2.12.0/install/operator.yaml
Once the operator pod is up and running, we can create an example cluster by applying the following resource. Here nodes is the number of database nodes to create.
apiVersion: crdb.cockroachlabs.com/v1alpha1
kind: CrdbCluster
metadata:
  name: database
spec:
  dataStore:
    pvc:
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: "25Gi"
        volumeMode: Filesystem
  resources:
    requests:
      cpu: 500m
      memory: 2Gi
    limits:
      cpu: 2
      memory: 8Gi
  tlsEnabled: true
  image:
    name: cockroachdb/cockroach:v23.1.11
  nodes: 3
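Once the resource is applied, you can wait for the database pods to become ready. The helpers below are a sketch with hypothetical names. They assume the operator names its pods after the cluster with a numeric suffix (database-0, database-1, and so on), and that the pods live in the namespace your current kubectl context points at.

```shell
# Print the pod names expected for a cluster, assuming <cluster>-<ordinal> naming.
# Usage: crdb_pod_names <cluster> <nodes>
crdb_pod_names() {
  i=0
  while [ "$i" -lt "$2" ]; do
    echo "$1-$i"
    i=$((i + 1))
  done
}

# Wait for each expected pod to report Ready; stop at the first failure.
# Usage: crdb_wait_ready <cluster> <nodes>
crdb_wait_ready() {
  for pod in $(crdb_pod_names "$1" "$2"); do
    kubectl wait --for=condition=Ready "pod/$pod" --timeout=300s || return 1
  done
}
```

For the example cluster this would be crdb_wait_ready database 3. Note that kubectl wait returns an error if a pod has not been created yet, so you may need to retry shortly after applying the resource.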
© 2024 White Crab Systems LTD. All rights reserved.