Deploy HarperDB & StackPath to Reduce Latency at the Edge




In all of my projects with HarperDB, I’ve always wanted to put it on an edge compute platform to see how fast it is and how much of a difference it could make. Most of the cloud computing platforms I’ve used so far have not had a location near me, and even when they do, they’re not very responsive, which makes my speed tests useless. I stumbled upon StackPath, which takes care of a lot of the problems I face with other cloud computing platforms. It has edge locations near me (and even a future one planned in the next town over!), so if it can reliably deliver the container workload, I’m very much looking forward to using it more.

To test out StackPath, I’d like to launch a container Workload in my local PoP and see what the median request time is as well as launch a HarperDB Instance so I can see how quick the Custom Functions can be when placed on the edge. To act as the test project, I’ll use some movie quotes as the data source and have a route where Users can request a random movie quote.

What is HarperDB

HarperDB is a product that allows you to store, process, or query data locally or in a cloud Instance like we will today. The database uses a dynamic schema that supports both SQL and NoSQL operations, so no matter how you like interacting with your database, HarperDB supports it. One other great feature is Custom Functions, which allow you to add your own routing and controller code to the Fastify API under the hood. Finally, it can also serve a static UI so your entire application stack can be hosted on one instance.

I’ve covered signing up for HarperDB in a previous article in depth if you are interested. HarperDB is one of my favorite technologies, especially as a full-stack developer since it makes deploying and developing projects much easier with its rich feature set. They also have an active community on Slack where you can go to get questions answered if you run into any trouble.

What is StackPath

StackPath provides a way to host applications, data, or whatever else you need in locations geographically close to your Users. Instead of hosting something on the opposite coast, you can use one of StackPath’s many edge locations in and around densely populated areas, so the latency is greatly reduced.

I was thoroughly impressed by StackPath’s response time. Using the edge location closest to me, I was able to achieve a median request time of 46ms, versus the 100ms+ I saw using popular cloud computing platforms. Depending on your geographic location, this could be even lower, as my closest edge location was in the next state over. When I tested this in San Francisco, I was able to achieve a median request time of 26ms to the PoP in San Jose (even using public WiFi).
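If you want to take a similar measurement yourself, here is a minimal sketch of one way to do it in Node. It is not the exact tool I used for the numbers above; the function names, the sample count of 50, and the example URL are all assumptions for illustration. It times a series of sequential GET requests and reports the median.

```javascript
// Sketch: time N sequential GET requests and report the median in milliseconds.
// Function names, the default sample count, and the example URL are illustrative.
const https = require('https');

// Median of a list of numbers (mean of the two middle values for even lengths).
function median(values) {
  if (values.length === 0) throw new Error('median of an empty list');
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Resolve with the elapsed time of a single GET request, in milliseconds.
function timeRequest(url) {
  return new Promise((resolve, reject) => {
    const start = process.hrtime.bigint();
    // rejectUnauthorized: false accepts a self-signed certificate (testing only).
    https
      .get(url, { rejectUnauthorized: false }, (res) => {
        res.resume(); // drain the response body so 'end' fires
        res.on('end', () => resolve(Number(process.hrtime.bigint() - start) / 1e6));
      })
      .on('error', reject);
  });
}

// Run the requests one after another and return the median timing.
async function medianRequestTime(url, samples = 50) {
  const timings = [];
  for (let i = 0; i < samples; i++) {
    timings.push(await timeRequest(url));
  }
  return median(timings);
}

// Example (placeholder address):
// medianRequestTime('https://203.0.113.10:9926/quotes/single')
//   .then((ms) => console.log(`median: ${ms.toFixed(1)} ms`));
```

I prefer the median over the mean here because a single slow request (a cold connection, a WiFi hiccup) would otherwise skew the result.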

While I’ll only use the containers feature of the Workload, StackPath offers numerous services designed to help lower the latency and take a lot of the stress out of running a cluster. They offer everything from object storage, a CDN, to DNS, and much more.

Workloads

Workloads are where you’ll run your application or service, and StackPath currently offers two types of Workloads: container and VM. Workloads are tied to a specific target but can launch Instances in multiple PoPs (Points of Presence) and vary the auto-scaling settings if desired. You could, for example, give USA-based Workloads their own auto-scaling settings while keeping similar settings across the other regions.

Workloads can also differ in Instance size, as not all applications come in the same size. Even the smallest size (SP-1) has 1 vCPU and 2GB of RAM so it should be plenty for most workloads, especially if you are going for a highly available setup where there will be multiple containers per PoP.

Creating our first Workload

See steps on getting started with StackPath here. From the left-hand side menu, select ‘Edge Compute’ and click ‘Create Workload’. I’ll name my Workload ‘HarperDB’, with a slug of ‘hdb-test’. For the Workload type, ensure ‘Container’ is selected and for the image name enter ‘harperdb/harperdb’. This will select the latest version of HarperDB but you could always supply a specific version if you wanted.

Setting Workload settings

StackPath luckily splits up non-secure and secure environment variables so we can keep our secrets a secret. In the non-secure environment variables, we’ll configure our instance and usernames but then pass our passwords in the secret environment variables list.

Last but not least, we need to expose HarperDB’s ports for the API, Custom Functions, and Clustering (if required). Under the environment variables section, click the ‘+’ button a few times and add an entry for each port: 9925, 9926, and 9927.

Selecting a Workload spec

StackPath currently offers five workload sizes, ranging from 1 vCPU with 2GB of RAM up to 8 vCPUs with 32GB of RAM. The smallest spec, SP-1, is more than enough for testing today, so I’ll proceed with that. Finally, select a PoP you would like to use and set its name; I selected Dallas as it was the nearest one to me. We could also mount persistent storage here, which would be useful for production workloads.

After configuring your PoPs and other settings, we can proceed with creating the workload. On the Quick Overview page, we can see our settings and our non-secure environment variables as well as where we deployed our workload.

After the Workload has finished deploying, we can click on the name and see the details for the workload such as the metrics, logs, and network interface details, and grab the public IP to get started configuring.

Configuring HarperDB

Now with our Workload launched, we can configure our HarperDB Instance and start sending data to it, or even connect it to another existing HarperDB Instance via the Clustering feature so they can share data, as I covered in my previous article featuring HarperDB. To make this easy, we can add our new Instance to the HarperDB Studio and manage it that way. From the Instances page, click ‘Register User-Installed Instance’ and proceed forward.

I’ll name my Instance ‘stackpath-node’ and enter the credentials from the environment variables earlier. You can grab the IP address from the StackPath Workload Instance list, and the port will be 9925. You’ll have to accept the self-signed certificate the instance uses by default. Then you can proceed to the next screen, select the free tier size, accept HarperDB’s ToS, and add the instance.

After all is said and done, your Instance list should now contain the StackPath Instance we just set up. From here, we can click on the Instance to add a new schema, set up a Custom Function, or enable Clustering for example.
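If you’d rather confirm the Instance is reachable without going through the Studio, you can also talk to the HarperDB operations API directly on port 9925. The sketch below is my own illustration rather than anything from the project repository: the helper names are hypothetical, and it assumes basic authentication with the same credentials you set in the environment variables. It sends a describe_all operation, which should return the schemas and tables on the Instance.

```javascript
// Sketch: call the HarperDB operations API on port 9925 with basic auth.
// Helper names are hypothetical; fill in your own IP address and credentials.
const https = require('https');

// Build the request options and JSON body for one operation.
function buildOperationRequest({ host, port, username, password }, operation) {
  const body = JSON.stringify(operation);
  const credentials = Buffer.from(`${username}:${password}`).toString('base64');
  return {
    options: {
      host,
      port,
      method: 'POST',
      path: '/',
      rejectUnauthorized: false, // accept the self-signed certificate (testing only)
      headers: {
        'Content-Type': 'application/json',
        'Content-Length': Buffer.byteLength(body),
        Authorization: `Basic ${credentials}`,
      },
    },
    body,
  };
}

// Send the operation and resolve with the parsed JSON response.
function runOperation(instance, operation) {
  const { options, body } = buildOperationRequest(instance, operation);
  return new Promise((resolve, reject) => {
    const req = https.request(options, (res) => {
      let data = '';
      res.on('data', (chunk) => (data += chunk));
      res.on('end', () => {
        try {
          resolve(JSON.parse(data));
        } catch (err) {
          reject(err);
        }
      });
    });
    req.on('error', reject);
    req.end(body);
  });
}

// Example (placeholder values):
// runOperation(
//   { host: '203.0.113.10', port: 9925, username: '...', password: '...' },
//   { operation: 'describe_all' }
// ).then(console.log);
```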

Deploying a Project

Now with the Instance added, we have a few options for loading a project. We can either deploy it manually to the disk, deploy a project from another HarperDB Instance (my personal favorite), or upload the project via the API. Since you may not already have another Instance, I’ve prepared a script you can use to upload a project via the API.

Within the scripts folder in the project repository, you’ll find a Node script I’ve prepared called deploy-custom-function.js. At the top of the script are some hard-coded configuration values we’ll have to adjust before we can run the script. I’ll edit mine to match the information for the container like so:

// Required configuration
const HDB_INSTANCE_IP = '151.139.63.23'
const HDB_INSTANCE_PORT = 9925
const HDB_USERNAME = 'clusteradm'
const HDB_PASSWORD = '...'
const HDB_PROJECT_NAME = 'quotes'
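As an aside, if you would rather not keep a password in the script file, one alternative is to read the same values from environment variables. This is not how the script in the repository works; it is only a sketch of the idea, and the loadConfig helper and its fallback defaults are my own assumptions.

```javascript
// Sketch: read the deploy settings from environment variables instead of
// hard-coding them. loadConfig and its defaults are hypothetical.
function loadConfig(env = process.env) {
  const required = ['HDB_INSTANCE_IP', 'HDB_USERNAME', 'HDB_PASSWORD'];
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`);
  }
  return {
    ip: env.HDB_INSTANCE_IP,
    port: Number(env.HDB_INSTANCE_PORT || 9925), // operations API port
    username: env.HDB_USERNAME,
    password: env.HDB_PASSWORD,
    projectName: env.HDB_PROJECT_NAME || 'quotes',
  };
}

// Example:
// const config = loadConfig();
```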

After you have modified the script to match your Instance details, we can run the script via the command line like so:

$ node deploy-custom-functions.js
Found a total of 2 files to add to the archive.
Deploying the project to HarperDB, please wait...
Deployment has finished - Successfully deployed project: quotes

After the script finishes, the project will be uploaded to the Instance and we can proceed back to HarperDB Studio. Click ‘Functions’ on the top navigation bar and you should see the project we just uploaded. To test out the project, we first need to add some data and the schema required. Click over to ‘Browse’ and add a schema named ‘quotes’ and a table named ‘movies’ with a hash attribute of ‘id’.

After the table is added, we’ll point HarperDB to the CSV file we want to load and it’ll do the rest for us! Click the ‘Bulk Upload’ icon in the top-right, drop in the link to the raw CSV, and click ‘Import from URL’. After the import has finished, you should be redirected back to the Browse screen and see all of the data you just imported.
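The same setup can also be done through the operations API instead of the Studio. The sketch below only builds the three operation bodies (create_schema, create_table, and csv_url_load) and hands them, in order, to a send function. That send function is hypothetical: it stands in for whatever you use to POST a JSON operation to the Instance on port 9925 with basic auth, and the CSV URL is a placeholder.

```javascript
// Sketch: the Studio steps expressed as HarperDB operations.
// `send` is a hypothetical function that POSTs one operation to the
// Instance's operations API and resolves with the parsed response.

// Build the operations in the order they need to run.
function quoteSetupOperations(csvUrl) {
  return [
    { operation: 'create_schema', schema: 'quotes' },
    { operation: 'create_table', schema: 'quotes', table: 'movies', hash_attribute: 'id' },
    { operation: 'csv_url_load', action: 'insert', schema: 'quotes', table: 'movies', csv_url: csvUrl },
  ];
}

// Run the operations one at a time and collect the responses.
async function setUpQuotes(send, csvUrl) {
  const results = [];
  for (const operation of quoteSetupOperations(csvUrl)) {
    results.push(await send(operation));
  }
  return results;
}

// Example (placeholder URL):
// setUpQuotes(send, 'https://example.com/quotes.csv').then(console.log);
```

As far as I know, the CSV load runs as a background job, so the data may take a moment to show up after the call returns.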

Finally, let’s test out the routes and project and see how it works! To do this, you can use Curl or Postman. I’ll use Curl for this simple example:

$ curl -k https://151.139.63.23:9926/quotes/single
{"year":1986,"__createdtime__":1668389235149,"__updatedtime__":1668389235149,"movie":"TOP GUN","id":94,"quote":"I feel the need - the need for speed!"}
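For context, here is roughly what a Custom Functions route behind a URL like this could look like. This is a sketch rather than the actual file from the project: it assumes the route lives in the ‘quotes’ project (HarperDB prefixes route URLs with the project name, which is where /quotes comes from), that hdbCore.requestWithoutAuthentication is used to skip auth for this public route, and the registerQuoteRoute and pickRandom names are my own.

```javascript
'use strict';

// Pick one element uniformly at random; `random` is injectable for testing.
function pickRandom(items, random = Math.random) {
  return items[Math.floor(random() * items.length)];
}

// HarperDB calls the exported function with the Fastify server and its helpers.
// The '/single' route becomes '/quotes/single' once the project prefix is added.
async function registerQuoteRoute(server, { hdbCore }) {
  server.route({
    url: '/single',
    method: 'GET',
    handler: async (request) => {
      // Fetch every quote, then choose one in JavaScript.
      request.body = { operation: 'sql', sql: 'SELECT * FROM quotes.movies' };
      const rows = await hdbCore.requestWithoutAuthentication(request);
      if (rows.length === 0) {
        return { error: 'No quotes have been loaded yet' };
      }
      return pickRandom(rows);
    },
  });
}

module.exports = registerQuoteRoute;
```

Loading the whole table on every request is fine for a small set of quotes, but for a larger table you would want to select a single row in the query instead.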

Conclusion

Overall, I was super happy with the StackPath test. HarperDB made deploying a test project much easier as well, and if this were a production workload, we could enable clustering in a few clicks. After testing the median request times, the speeds I was able to achieve to local PoPs were impressive and truly show the power of edge computing. I am very interested to see future feature plans for StackPath, or new PoP locations. I’ll definitely keep an eye out for when they launch a PoP in my hometown.

Resources