A few months ago, I wrote about how to export HarperDB metrics to Prometheus by querying the `system_information` endpoint and sending those metrics to the Prometheus Pushgateway. While this was a valid solution, it required multiple moving parts (i.e., a separate server to query metrics, the Pushgateway, and the Prometheus server). Recently, the HarperDB team released an official Prometheus Exporter built on custom functions, so I decided to give it a spin.
Before diving into the solution, let’s briefly cover what Prometheus is and why the new exporter matters. Prometheus is a popular time-series database and monitoring solution that is often used in cloud-native applications. Its metric format has largely become a standard that most monitoring solutions support.
Previously, HarperDB did not support Prometheus, and users had to rely on HarperDB Studio for monitoring. This was not ideal for users who use third-party observability tooling or need to integrate with an incident response stack like PagerDuty. The new Prometheus Exporter gives HarperDB users a way to integrate with enterprise-grade observability stacks without relying on HarperDB Studio.
Docker Compose Setup
For simplicity, we’ll be utilizing Docker Compose in this example, but you can run this in any environment (just make sure to modify the Prometheus configuration accordingly). We’ll run HarperDB and Prometheus server like before:
There’s nothing special here besides turning on custom functions and running both containers on the same Docker network to leverage built-in service discovery.
The configuration file for Prometheus looks like this:
Note that in the previous demo, the scrape target was localhost because the exporter ran outside Docker. Here, Prometheus scrapes the custom function on HarperDB directly.
Custom Function Setup
Next, we need to install the HarperDB Prometheus Exporter.
Then navigate to that directory and run `npm install`.
Restart the Docker container to make custom functions take effect.
Looking at the Code
The code for the exporter is pretty straightforward. It uses the `prom-client` library to register the default Node.js metrics as well as some custom gauges:
const Prometheus = require('prom-client');
Prometheus.collectDefaultMetrics();
const puts_gauge = new Prometheus.Gauge({name: 'harperdb_table_puts_total', help: 'Total number of non-delete writes by table', labelNames: ['database', 'table']})
const deletes_gauge = new Prometheus.Gauge({name: 'harperdb_table_deletes_total', help: 'Total number of deletes by table', labelNames: ['database', 'table']})
const txns_gauge = new Prometheus.Gauge({name: 'harperdb_table_txns_total', help: 'Total number of transactions by table', labelNames: ['database', 'table']})
const page_flushes_gauge = new Prometheus.Gauge({name: 'harperdb_table_page_flushes_total', help: 'Total number of times all pages are flushed by table', labelNames: ['database', 'table']})
const writes_gauge = new Prometheus.Gauge({name: 'harperdb_table_writes_total', help: 'Total number of disk write operations by table', labelNames: ['database', 'table']})
const pages_written_gauge = new Prometheus.Gauge({name: 'harperdb_table_pages_written_total', help: 'Total number of pages written to disk by table. This is higher than writes because sequential pages can be written in a single write operation.', labelNames: ['database', 'table']})
const time_during_txns_gauge = new Prometheus.Gauge({name: 'harperdb_table_time_during_txns_total', help: 'Total time from when transaction was started (lock acquired) until finished and all writes have been made (but not necessarily flushed/synced to disk) by table', labelNames: ['database', 'table']})
const time_start_txns_gauge = new Prometheus.Gauge({name: 'harperdb_table_time_start_txns_total', help: 'Total time spent waiting for transaction lock acquisition by table', labelNames: ['database', 'table']})
const time_page_flushes_gauge = new Prometheus.Gauge({name: 'harperdb_table_time_page_flushes_total', help: 'Total time spent on write calls by table', labelNames: ['database', 'table']})
const time_sync_gauge = new Prometheus.Gauge({name: 'harperdb_table_time_sync_total', help: 'Total time spent waiting for writes to sync/flush to disk by table', labelNames: ['database', 'table']})
const thread_count_gauge = new Prometheus.Gauge({name: 'harperdb_process_threads_count', help: 'Number of threads in the HarperDB core process'})
const harperdb_cpu_percentage_gauge = new Prometheus.Gauge({name: 'harperdb_process_cpu_utilization', help: 'CPU utilization of a HarperDB process', labelNames: ['process_name']});
// eslint-disable-next-line no-unused-vars,require-await
module.exports = async (server, { hdbCore, logger }) => {
  server.route({
    url: '/metrics',
    method: 'GET',
    handler: async (request, reply) => {
      request.body = {
        operation: 'system_information',
        attributes: ['database_metrics', 'harperdb_processes', 'threads']
      };
      let system_info = await hdbCore.requestWithoutAuthentication(request);
      thread_count_gauge.set(system_info.threads.length);
      if (system_info.harperdb_processes.core.length > 0) {
        harperdb_cpu_percentage_gauge.set({process_name: 'harperdb_core'}, system_info.harperdb_processes.core[0].cpu);
      }
      system_info.harperdb_processes.clustering.forEach(process_data => {
        if (process_data.params.endsWith('hub.json')) {
          harperdb_cpu_percentage_gauge.set({process_name: 'harperdb_clustering_hub'}, process_data.cpu);
        } else if (process_data.params.endsWith('leaf.json')) {
          harperdb_cpu_percentage_gauge.set({process_name: 'harperdb_clustering_leaf'}, process_data.cpu);
        }
      });
      for (const [database_name, table_object] of Object.entries(system_info.metrics)) {
        for (const [table_name, table_metrics] of Object.entries(table_object)) {
          const labels = { database: database_name, table: table_name };
          puts_gauge.set(labels, table_metrics.puts);
          deletes_gauge.set(labels, table_metrics.deletes);
          txns_gauge.set(labels, table_metrics.txns);
          page_flushes_gauge.set(labels, table_metrics.pageFlushes);
          writes_gauge.set(labels, table_metrics.writes);
          pages_written_gauge.set(labels, table_metrics.pagesWritten);
          time_during_txns_gauge.set(labels, table_metrics.timeDuringTxns);
          time_start_txns_gauge.set(labels, table_metrics.timeStartTxns);
          time_page_flushes_gauge.set(labels, table_metrics.timePageFlushes);
          time_sync_gauge.set(labels, table_metrics.timeSync);
        }
      }
      reply.type(Prometheus.register.contentType);
      return await Prometheus.register.metrics();
    }
  });
};
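To make the nested loop over `system_info.metrics` a bit more concrete, here is a small dependency-free sketch of the same idea: walk database, then table, then emit one labeled sample per metric. It does not use `prom-client`; the sample object, the `FIELD_TO_METRIC` map, and the helper names `flattenTableMetrics` and `formatSample` are all hypothetical and only cover two of the fields shown above.

```javascript
// Hypothetical sample shaped like the `metrics` section the handler iterates
// over; the database name, table names, and values are made up.
const sampleMetrics = {
  dev: {
    dog: { puts: 12, deletes: 3 },
    cat: { puts: 7, deletes: 0 },
  },
};

// Map response field names to the exporter's metric names (subset only).
const FIELD_TO_METRIC = {
  puts: 'harperdb_table_puts_total',
  deletes: 'harperdb_table_deletes_total',
};

// Walk database -> table -> field and emit one sample per combination.
function flattenTableMetrics(metrics) {
  const samples = [];
  for (const [database, tables] of Object.entries(metrics)) {
    for (const [table, tableMetrics] of Object.entries(tables)) {
      for (const [field, name] of Object.entries(FIELD_TO_METRIC)) {
        // Skip fields that are missing from this table's metrics.
        if (typeof tableMetrics[field] === 'number') {
          samples.push({ name, labels: { database, table }, value: tableMetrics[field] });
        }
      }
    }
  }
  return samples;
}

// Render one sample roughly the way it appears in the text exposition format.
// Simplification: label values are not escaped.
function formatSample({ name, labels, value }) {
  const labelText = Object.entries(labels)
    .map(([key, val]) => `${key}="${val}"`)
    .join(',');
  return `${name}{${labelText}} ${value}`;
}

const lines = flattenTableMetrics(sampleMetrics).map(formatSample);
console.log(lines.join('\n'));
```

In the real exporter, `prom-client` handles the rendering and each `gauge.set(labels, value)` call plays the role of one entry in `samples`.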
Testing it Out
To make sure things are working, we can navigate to the Prometheus console at `localhost:9090`.
Under “Status > Targets”, make sure that our HarperDB endpoint is being scraped:
We can then go to the “Graph” tab to run some queries:
We can see some default Node.js metrics being scraped (to check the whole list, go to “localhost:9926/prometheus_exporter/metrics”).
We can also check for our custom HarperDB metrics:
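If you would rather check the endpoint output outside the Prometheus UI, the following is a minimal sketch of parsing the text response and keeping only the `harperdb_`-prefixed series. The `sampleBody` excerpt and its values are hypothetical, `parseSamples` is a made-up helper, and the parser assumes lines carry no trailing timestamp.

```javascript
// Hypothetical excerpt of a /metrics response; values are illustrative only.
const sampleBody = [
  '# HELP harperdb_process_threads_count Number of threads in the HarperDB core process',
  '# TYPE harperdb_process_threads_count gauge',
  'harperdb_process_threads_count 4',
  '# TYPE nodejs_eventloop_lag_seconds gauge',
  'nodejs_eventloop_lag_seconds 0.002',
  'harperdb_table_puts_total{database="dev",table="dog"} 12',
].join('\n');

// Simplified parser: skips comment lines and treats the text after the last
// space as the value (assumes no timestamp follows the value).
function parseSamples(body) {
  const samples = [];
  for (const rawLine of body.split('\n')) {
    const line = rawLine.trim();
    if (line === '' || line.startsWith('#')) continue;
    const splitAt = line.lastIndexOf(' ');
    if (splitAt === -1) continue;
    const series = line.slice(0, splitAt);
    const value = Number(line.slice(splitAt + 1));
    // The metric name is everything before the label block, if there is one.
    const braceAt = series.indexOf('{');
    const name = braceAt === -1 ? series : series.slice(0, braceAt);
    samples.push({ name, series, value });
  }
  return samples;
}

const harperdbSamples = parseSamples(sampleBody)
  .filter((sample) => sample.name.startsWith('harperdb_'));

for (const sample of harperdbSamples) {
  console.log(`${sample.series} = ${sample.value}`);
}
```

On Node.js 18 or later, the body could come from the global `fetch` against `http://localhost:9926/prometheus_exporter/metrics` (followed by `.text()`) instead of the hard-coded sample.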
Wrapping Up
In this tutorial, we saw how to install the HarperDB Prometheus Exporter via custom functions to scrape Node.js and custom HarperDB metrics. Since HarperDB exposes more information via the `system_information` endpoint than the exporter currently uses, we can extend the exporter with other metrics that we care about.
One of the advantages of this approach is that you can leverage an existing framework (i.e., custom functions) to deploy the Prometheus Exporter. However, if you want to use a sidecar approach or avoid tying metrics to the database layer, you can still use the previous approach of running a separate server to collect metrics.