Intro
In this article, we will explore how to set up HarperDB clusters with Anthos, and then how to deploy HarperDB on AWS, GCP, and Azure with Terraform. (If you prefer to jump straight to the code, see the repo here.)
HarperDB is a data and application platform designed to be scalable and lightweight. It is a NoSQL database with a unique, flat design that enables it to manage enormous volumes of data with minimum setup and upkeep. HarperDB is also meant to be easy to use, with a RESTful API that allows a variety of programming languages to connect with the database with relative ease.
What is multi-cloud and why?
A multi-cloud architecture is a strategy for using cloud computing services from several providers rather than depending on just one. This allows enterprises to take advantage of the unique strengths and capabilities of each provider, while also ensuring redundancy and business continuity if one provider suffers an outage.
Multi-cloud enables several benefits, including:
- Flexibility: Enables enterprises to use the best services from each cloud provider for a given use case. This can help optimise cost, security, and performance.
- Cost Optimisation: Businesses may take advantage of a variety of pricing structures and services best suited to their workloads.
- Compliance and Security: Businesses are able to comply with a variety of rules that may apply to their data. They can also limit the possibility of data breaches and other security issues.
- Business Continuity: Enables businesses to achieve high availability and disaster recovery by distributing workloads across various providers.
- Innovation: Enables enterprises to experiment with new services and technologies from a variety of providers, hence fostering innovation and digital transformation.
Multi-cloud is a strong solution for businesses seeking to optimize their cloud infrastructure and guarantee that their data is safe, accessible, and compliant.
HarperDB and Anthos
Anthos is a hybrid and multi-cloud platform from Google Cloud that allows users to modernize their existing applications or develop new ones using the same open-source Kubernetes technology on-premises, in various clouds, and at the edge. This enables users to deploy and manage their apps across several environments without being restricted to a single cloud provider. By combining HarperDB with Anthos, you can deploy and maintain your database across several clouds and on-premises systems, while retaining access to HarperDB's unique capabilities. This can be particularly valuable for enterprises that want a highly scalable and easily maintained database, as well as the ability to deploy their applications across multiple environments.
With HarperDB and Anthos, you can deploy your database to any environment that supports Kubernetes, including on-premises, Google Cloud, AWS, and Azure. Additionally, HarperDB's flat design makes it well-suited for use with Kubernetes and containers because it needs minimal setup and maintenance. This enables you to deploy and operate your database in a containerized environment without worrying about the underlying infrastructure.
Overall, HarperDB plus Anthos is a potent combination that enables enterprises to manage and deploy their databases across multiple environments with ease, while still making use of the unique capabilities of HarperDB. Whether you are modernizing current applications or developing new ones, HarperDB and Anthos can help you maximize the value of your data.
Example Use Case: Retail
As an example, consider a retail company that needs a highly scalable and easily manageable database to store its customer data and inventory information in real time. The company has a large number of brick-and-mortar stores, as well as an online store, and wants to manage data in one central location.
To achieve this, the company decides to use HarperDB as its central database management system. HarperDB's distributed architecture makes it well-suited for handling large amounts of data globally and its RESTful API makes it easy to interact with the database from a variety of programming languages.
Next, the company decides to use Anthos to deploy and manage HarperDB across its different environments. This includes deploying the database on-premises (or at the edge) in their brick-and-mortar stores, as well as in the Google Cloud for their online store. Using Anthos, the company can easily manage HarperDB in a containerized environment and take advantage of the unique benefits of each environment.
With HarperDB and Anthos, the retail company is able to easily manage and deploy its customer data and inventory information across all of its different environments in real-time. This allows them to have a single source of truth for their customer data and inventory information, making it easier for them to make data-driven decisions. Additionally, the company is able to take advantage of the scalability and ease of use of HarperDB, while still being able to manage its database in a containerized environment.
In summary, HarperDB and Anthos together are a powerful combination that can help organizations easily manage and deploy their databases across multiple environments, while avoiding the skyrocketing costs that come with other solutions.
Let's set it up
Setting up HarperDB with Anthos involves several steps and configurations. Here is a detailed explanation of the process:
- Create a Kubernetes cluster on Anthos: You can use the GKE On-Prem or GKE on Google Cloud depending on whether you need to run the cluster on-premises or in the cloud. You can use the command gcloud container clusters create to create a cluster on GKE, specifying the name and location of the cluster, the number of nodes, and the version of Kubernetes.
- Install Helm: Helm is a package manager for Kubernetes that allows you to easily install and manage applications on a Kubernetes cluster. You can use the command gcloud components install kubectl to install kubectl and curl https://raw.githubusercontent.com/helm/helm/master/scripts/get-helm-3 | bash to install the Helm client on your workstation (Helm 3 needs no component inside the cluster).
- Deploy HarperDB: Once you have your Kubernetes cluster set up, you can use the HarperDB Helm chart to deploy HarperDB to the cluster. You can use the command helm install harperdb ./harperdb to deploy HarperDB, where harperdb is the name of the release and ./harperdb is the path to the chart. For building the Helm chart, see https://faun.pub/running-harperdb-in-kubernetes-in-one-command-8c87e2788eb6
- Configure the HarperDB pod: Once HarperDB is deployed, you will need to configure the HarperDB pod with the necessary connection information.
- Verify the deployment: Once the pod has restarted, you can use the command kubectl get pods to check the status of the pod and make sure it's running. You can also use the command kubectl logs, followed by the pod name, to check the logs of the pod and look for any errors.
- Scale the HarperDB deployment: HarperDB can be easily scaled up or down as needed. You can use Kubernetes replicas to scale the HarperDB deployment. With replicas, you can specify how many copies of the HarperDB pod you want to run in your cluster. You can use the command kubectl scale deployment to scale the deployment, specifying the name of the deployment and the number of replicas.
- Monitor and manage the HarperDB deployment: Once HarperDB is running on your cluster, you can use Kubernetes tools such as kubectl and Prometheus to monitor and manage your HarperDB deployment. These tools allow you to check the health of your HarperDB pods, see the logs, and troubleshoot any issues that may occur.
- Connect to the HarperDB instance: The ingress endpoint provides the connection point to the HarperDB instance, and the RESTful API allows you to perform CRUD operations. The general steps are:
a) Get the ingress endpoint: You can use the command kubectl get ingress to get the ingress endpoint for your HarperDB deployment. The ingress endpoint is a URL that provides access to the HarperDB instance.
b) Use the RESTful API: Once you have the ingress endpoint, you can use the REST API to interact with your HarperDB instance. This allows you to perform CRUD operations, such as creating, reading, updating, and deleting data in the database. You can use a tool such as Postman or cURL to send HTTP requests to the ingress endpoint and interact with the HarperDB instance.
c) Authenticate: Depending on the security configuration of your HarperDB instance, you may need to authenticate to access the RESTful API. You can use the username and password that you have set in the configuration process.
d) Test the connection: Once you have connected to the HarperDB instance, you can test the connection by sending a simple read-only request, such as one that retrieves data from the database. This will confirm that you are able to connect to the HarperDB instance and interact with the data.
e) Start storing, retrieving, and modifying data: After you have confirmed the connection, you can start storing, retrieving, and modifying data in the HarperDB instance using the REST API.
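Steps a) to d) can be sketched with cURL as follows. This is a minimal sketch, assuming the instance exposes HarperDB's operations API, where each request is an HTTP POST with a JSON body naming an operation, and that HTTP Basic authentication is enabled. The URL, user name, and password are placeholders, and hdb_body and hdb_call are hypothetical helper names.

```shell
# Hypothetical values: replace with your ingress host and the credentials
# you set when configuring HarperDB.
HDB_URL="${HDB_URL:-https://harperdb.example.com}"
HDB_USER="${HDB_USER:-HDB_ADMIN}"
HDB_PASS="${HDB_PASS:-change-me}"

# Build the JSON body for an operation that needs no further fields.
hdb_body() {
  printf '{"operation":"%s"}' "$1"
}

# POST one operation to the instance with HTTP Basic auth and print the reply.
hdb_call() {
  curl --silent --show-error --fail \
    --user "${HDB_USER}:${HDB_PASS}" \
    --header 'Content-Type: application/json' \
    --data "$(hdb_body "$1")" \
    "${HDB_URL}"
}

# Usage, once an instance is reachable:
#   kubectl get ingress      # step a: find the ingress host
#   hdb_call describe_all    # step d: connection test that lists schemas and tables
```

Operations that insert or query data take additional fields in the JSON body, so they would need a richer body builder than the one shown here.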
Now, let's explore the multi-cloud setup with Terraform.
Terraform lets users define infrastructure as code across cloud providers. It's great for multi-cloud management, and enables users to develop reusable modules for popular cloud provider components. Terraform users can use variables, conditionals, and loops to modularize infrastructure code and make it dynamic. Terraform's state management tracks the state of the infrastructure, which helps with rollbacks and disaster recovery.
Terraform automates provisioning, ensures consistency across cloud providers, and improves infrastructure management in multi-cloud systems. (If you prefer to jump straight to the code, see the repo here.)
Setup
To set up a multi-cloud Anthos HarperDB cluster using Terraform, you would need to write Terraform configuration files that define the resources you want to create, and use the provider-specific modules to provision the resources on each cloud provider. Additionally, you would need to set up the necessary networking and security configuration to allow the cluster nodes to communicate across the different clouds.
Prerequisites
Before we begin, there are a few prerequisites you will need to have in place:
- You will need to have accounts set up with GCP, AWS, and Azure and have the necessary credentials to access them.
- You will need to have the Terraform CLI installed on your machine.
- You will need to have basic knowledge of containerization and container orchestration.
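Before starting, it can help to confirm that the command-line tools are installed. The check below is a small sketch; the tool list is an assumption based on the commands used in this article, and missing_tools is a hypothetical helper name.

```shell
# Print every tool from the argument list that is not on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# CLIs this walkthrough relies on.
missing="$(missing_tools terraform gcloud aws az kubectl helm)"
if [ -n "$missing" ]; then
  printf 'Missing tools:\n%s\n' "$missing"
else
  printf 'All required tools found.\n'
fi
```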
Step 1: Setting up the Terraform Configuration
The first step in spinning up a multi-cloud HarperDB cluster is to set up the Terraform configuration. We will create a directory called harperdb-cluster and within that, we will create a file called main.tf. This file will contain the main Terraform configuration that defines the resources we want to create.
First, we will define the providers for GCP, AWS, and Azure.
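A provider section for main.tf might look like the sketch below, written out with a shell heredoc so the directory and file are created in one step. The variable names (gcp_project, gcp_region, aws_region) are hypothetical, and credentials are assumed to come from each provider's usual environment variables or CLI login rather than from this file.

```shell
mkdir -p harperdb-cluster

# The quoted 'EOF' keeps the shell from expanding anything inside the file.
cat > harperdb-cluster/main.tf <<'EOF'
terraform {
  required_providers {
    google  = { source = "hashicorp/google" }
    aws     = { source = "hashicorp/aws" }
    azurerm = { source = "hashicorp/azurerm" }
  }
}

# Hypothetical input variables; set them in terraform.tfvars or with -var.
variable "gcp_project" { type = string }
variable "gcp_region"  { type = string }
variable "aws_region"  { type = string }

provider "google" {
  project = var.gcp_project
  region  = var.gcp_region
}

provider "aws" {
  region = var.aws_region
}

# azurerm requires a features block, even if it is empty.
provider "azurerm" {
  features {}
}
EOF
```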
Next, we will define the HarperDB cluster nodes.
When using Terraform to spin up a multi-cloud HarperDB cluster across GCP, AWS, and Azure using Anthos, you can create and use the HarperDB Helm chart to deploy and manage the HarperDB cluster on each of the cloud providers.
For GCP:
You can use Terraform to create a GKE cluster and then use the helm command to install the HarperDB Helm chart.
For AWS:
You can use Terraform to create an EKS cluster and then use the helm command to install the HarperDB Helm chart.
For Azure:
You can use Terraform to create an AKS cluster and then use the helm command to install the HarperDB Helm chart.
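Once the three clusters exist, the install step is the same on each: fetch credentials for the cluster, then run helm install. The sketch below assumes hypothetical cluster names, regions, and an Azure resource group, and it only prints the commands unless DRY_RUN=0 is set, since they act on real cloud resources.

```shell
# Hypothetical names; replace with the values from your Terraform configuration.
GKE_CLUSTER="${GKE_CLUSTER:-harperdb-gke}"
GCP_REGION="${GCP_REGION:-us-central1}"
EKS_CLUSTER="${EKS_CLUSTER:-harperdb-eks}"
AWS_REGION="${AWS_REGION:-us-east-1}"
AKS_CLUSTER="${AKS_CLUSTER:-harperdb-aks}"
AZ_RESOURCE_GROUP="${AZ_RESOURCE_GROUP:-harperdb-rg}"

# Print commands by default; set DRY_RUN=0 to execute them.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Point kubectl and helm at one cluster, then install the chart there.
install_on() {
  case "$1" in
    gcp)   run gcloud container clusters get-credentials "$GKE_CLUSTER" --region "$GCP_REGION" || return 1 ;;
    aws)   run aws eks update-kubeconfig --name "$EKS_CLUSTER" --region "$AWS_REGION" || return 1 ;;
    azure) run az aks get-credentials --resource-group "$AZ_RESOURCE_GROUP" --name "$AKS_CLUSTER" || return 1 ;;
    *)     printf 'unknown cloud: %s\n' "$1" >&2; return 1 ;;
  esac
  run helm install harperdb ./harperdb
}

for cloud in gcp aws azure; do
  install_on "$cloud"
done
```

Each credentials command switches the current kubectl context to the cluster it names, which is why helm install needs no explicit context flag here.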
Step 2: Setting up the Anthos Configuration
Before you can use Anthos to spin up a multi-cloud HarperDB cluster, you will need to set up the necessary configuration for Anthos. This will include creating a GKE cluster in GCP, which will act as the management cluster. Then, you'll need to install the Anthos Config Management component and configure it to manage the other clusters in AWS and Azure.
You can use Terraform to create the GKE cluster and install the Anthos Config Management component with the following configuration:
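One possible shape for that configuration is sketched below, written to a separate file in the harperdb-cluster directory. It assumes the hashicorp/google provider's GKE Hub resources; the cluster name, location, and Git repository URL are placeholders, and argument names should be checked against the provider documentation for your version. Depending on your setup, the cluster may need further settings (for example Workload Identity) that are left out here.

```shell
mkdir -p harperdb-cluster

cat > harperdb-cluster/anthos.tf <<'EOF'
# Management cluster. Name and location are hypothetical.
resource "google_container_cluster" "management" {
  name               = "anthos-management"
  location           = "us-central1"
  initial_node_count = 1
}

# Register the cluster with the fleet (GKE Hub).
resource "google_gke_hub_membership" "management" {
  membership_id = "anthos-management"
  endpoint {
    gke_cluster {
      resource_link = "//container.googleapis.com/${google_container_cluster.management.id}"
    }
  }
}

# Enable the Config Management feature for the fleet.
resource "google_gke_hub_feature" "acm" {
  name     = "configmanagement"
  location = "global"
}

# Apply Config Sync to the registered cluster. The repository URL is a placeholder.
resource "google_gke_hub_feature_membership" "acm" {
  location   = "global"
  feature    = google_gke_hub_feature.acm.name
  membership = google_gke_hub_membership.management.membership_id
  configmanagement {
    config_sync {
      git {
        sync_repo   = "https://example.com/your-org/config-repo.git"
        sync_branch = "main"
        secret_type = "none"
      }
    }
  }
}
EOF
```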
Step 3: Setting up the Networking
Once the management cluster is set up, you can connect the other clusters in AWS and Azure to it. We will need to set up the necessary networking configuration to allow the HarperDB cluster nodes to communicate across the different clouds.
One option to set up communication between the clusters is Kubernetes Cluster Federation (KubeFed), a Kubernetes community project rather than a built-in feature. Cluster Federation allows you to spread your workloads across different clusters and different cloud providers by creating a single control plane for multiple clusters. By using it, you can create a single logical view of your entire infrastructure, regardless of where the clusters are running. To set this up, you will need to deploy a federation control plane, and configure each cluster to join the federation.
Another option is to use a Kubernetes service mesh like Istio. A service mesh is a configurable infrastructure layer for a microservices application that makes communication between service instances flexible, reliable, and fast. With Istio, you can set up communication between different clusters running in different cloud providers by configuring a service mesh in each cluster and connecting the meshes together.
A third option is to use a service discovery solution like Consul. Service discovery is the process of figuring out how to connect to a service. Consul is a tool that allows services to register themselves and discover other services running in different clusters and cloud providers. By configuring Consul in each cluster, you can set up communication between the clusters by allowing services to discover and connect to each other.
It's important to note that setting up communication across different clusters running in different cloud providers is a complex task and it's strongly recommended to use a managed service like Anthos.
It's important to set up a secure connection between the different cloud providers. One way to do this is by using a Virtual Private Network (VPN) to create a secure connection between the networks.
You can use Google Cloud VPN to create a VPN connection between GCP and AWS. This can be done by creating a VPN gateway on GCP and a VPN customer gateway on AWS. Once the gateways are set up, you can create a VPN tunnel between them.
You can use Terraform to create the VPN gateways and the VPN tunnel with the following configuration:
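The fragment below sketches only the GCP-to-AWS leg, using a Classic VPN gateway on GCP and a static-route VPN connection on AWS. It is deliberately incomplete: the GCP forwarding rules (ESP, UDP 500, UDP 4500) that bind the static IP to the gateway, and the routes on both sides, are omitted, so the tunnel will not carry traffic as written. The variable names, resource names, and the ASN are hypothetical.

```shell
mkdir -p harperdb-cluster

cat > harperdb-cluster/vpn.tf <<'EOF'
variable "vpn_gcp_network" { type = string } # name or self link of the GCP VPC
variable "vpn_aws_vpc_id"  { type = string } # ID of the AWS VPC

# Static public IP for the GCP side of the tunnel.
resource "google_compute_address" "vpn_ip" {
  name = "harperdb-vpn-ip"
}

# Classic VPN gateway on GCP.
resource "google_compute_vpn_gateway" "gcp" {
  name    = "harperdb-vpn-gw"
  network = var.vpn_gcp_network
}

# AWS-side description of the GCP gateway.
resource "aws_customer_gateway" "gcp" {
  bgp_asn    = 65000
  ip_address = google_compute_address.vpn_ip.address
  type       = "ipsec.1"
}

resource "aws_vpn_gateway" "aws" {
  vpc_id = var.vpn_aws_vpc_id
}

resource "aws_vpn_connection" "gcp" {
  vpn_gateway_id      = aws_vpn_gateway.aws.id
  customer_gateway_id = aws_customer_gateway.gcp.id
  type                = "ipsec.1"
  static_routes_only  = true
}

# Tunnel from GCP to the first AWS tunnel endpoint, reusing the key AWS generated.
resource "google_compute_vpn_tunnel" "to_aws" {
  name                    = "harperdb-to-aws"
  peer_ip                 = aws_vpn_connection.gcp.tunnel1_address
  shared_secret           = aws_vpn_connection.gcp.tunnel1_preshared_key
  target_vpn_gateway      = google_compute_vpn_gateway.gcp.id
  local_traffic_selector  = ["0.0.0.0/0"]
  remote_traffic_selector = ["0.0.0.0/0"]
}
EOF
```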
The full Terraform configuration creates VPN connections between GCP and AWS, Azure and GCP, and Azure and AWS, using resources such as google_compute_vpn_tunnel and azurerm_virtual_network_gateway_connection. The VPN connections are established between the VPN gateways of each cloud provider, and the peer IP addresses, shared secrets, and traffic selectors are configured.
You can also use other VPN solutions such as OpenVPN. In that case, you will need a Terraform provider or module that supports OpenVPN, and you will need to configure the VPN connection through it.
Step 4: Connecting GKE, EKS, and AKS Clusters
For GKE, you will need to configure the kubeconfig on your local machine to connect to the management cluster. You can use Terraform to execute a shell provisioner to achieve this:
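A local-exec provisioner for this could look like the sketch below. It assumes your configuration already contains a regional google_container_cluster resource named management (a hypothetical name) and that gcloud is installed and authenticated on the machine running Terraform.

```shell
mkdir -p harperdb-cluster

cat > harperdb-cluster/kubeconfig.tf <<'EOF'
resource "null_resource" "gke_kubeconfig" {
  # Re-run the provisioner when the cluster is replaced.
  triggers = {
    cluster_id = google_container_cluster.management.id
  }

  # Writes credentials for the management cluster into the local kubeconfig.
  provisioner "local-exec" {
    command = "gcloud container clusters get-credentials ${google_container_cluster.management.name} --region ${google_container_cluster.management.location}"
  }
}
EOF
```

For a zonal cluster, the command would take --zone instead of --region.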
For EKS, you will need to create a kubeconfig for the management cluster and use the kubefed command to join the EKS cluster to the management cluster.
For AKS, you will need to create a kubeconfig for the management cluster and use the kubefed command to join the AKS cluster to the management cluster.
Step 5: Deploying the HarperDB Cluster
Once the Terraform configuration and networking and security settings are set up, we can deploy the HarperDB cluster. To do this, we will run the following command in the harperdb-cluster directory:
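The usual Terraform sequence is init, plan, and apply. The sketch below saves the plan to a file named tfplan (an arbitrary name) so that apply runs exactly what was reviewed. It only prints the commands unless DRY_RUN=0 is set, because applying creates billable resources in all three clouds.

```shell
# Print commands by default; set DRY_RUN=0 to execute them.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Initialise providers, write a plan, then apply that saved plan.
deploy_cluster() {
  run terraform init &&
  run terraform plan -out=tfplan &&
  run terraform apply tfplan
}

# Run from inside the harperdb-cluster directory.
deploy_cluster
```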
This will create the HarperDB cluster across GCP, AWS, and Azure, and set up the necessary networking and security to allow the cluster nodes to communicate across the different clouds.
Conclusion
In this article, we explored how to use Terraform to deploy a HarperDB cluster across GCP, AWS, and Azure using Anthos. We have shown how to configure Terraform, as well as the networking and security settings required for cluster nodes to communicate across clouds. With Terraform, we are able to deploy and manage our infrastructure as code, making it simple to replicate and grow our HarperDB cluster across several cloud providers.