Meet the HarperDB Fully Indexed Database


Yesterday, our engineering team took a quick break from coding to discuss HarperDB. Although it was a shorter webinar, it was great to hear from the guys who not only built HarperDB but have also been working together for over 6 years. Stephen, our CEO, started the webinar off with a little background on HarperDB. Our CTO, Kyle, then took over to run through the basic features and functionality of HarperDB. Finally, Zach, our CPO, finished with some interesting future features and use cases.

It was interesting not only to get background on how the product was born but also to hear the three guys work together live. Five plus years of long days and sometimes long nights create a special team dynamic.

As Stephen stated in the intro, HarperDB originated from the growing pains the team experienced at our last company while working with big data and analytics in the sports and entertainment space. On average, we were processing several billion transactions per month, which required multiple databases, ETL processes, and nearly 100 servers. Infrastructure costs alone were nearly $100k/month, and 8 engineers were required to maintain our back end. Peers in our industry saw the multi-tier database architecture we had implemented as cutting edge, but to us it felt like we were putting a band-aid on the overall problem. As Kyle, Zach, and Stephen looked deeper into this problem, they couldn't locate a solution that they felt met the market's needs. Out of this search, HarperDB was born.

The design philosophy is simple.  Every part of the design revolves around simplicity and ease of use. We want to be the database for all, whether that be an enterprise or a single developer working on a personal project.  The goal is to take the complexity out of databases for the end user and let the team here in Denver manage that complexity so that you can focus on the work that matters for your company. 

Getting started with the HarperDB HTAP database

Kyle provided a live demo of the features and functionality of the community edition of HarperDB. It blows my mind every time I see how quickly the database can be installed and running. HarperDB has a small footprint, with less bloat to organize and configure, so the install and setup process is fast. When I think of databases, I usually think of time-consuming work by a database administrator, analyzing query times and creating indexes on specific attributes, all to achieve the same performance HarperDB obtains natively as a fully indexed database. None of the features Kyle demonstrated (create schema, create table, describe schema, insert, bulk load from CSV, and simple and complex queries) required creating indexes on the back end. This is all native: performance is the same regardless of which attribute is queried and whether the query is SQL or NoSQL, which highlights our natively indexed, schemaless, single-model architecture.
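To give a feel for the operations in the demo, here is a minimal sketch of what the request bodies might look like. It assumes HarperDB accepts one JSON document per request with an `operation` field; the schema `dev`, the table `dog`, the `id` hash attribute, and the helper function names are hypothetical, and the exact field names should be checked against the HarperDB documentation. The snippet only builds the payloads and sends nothing over the network.

```python
import json


def create_schema(schema):
    """Build a request body that creates a schema (assumed payload shape)."""
    return {"operation": "create_schema", "schema": schema}


def create_table(schema, table, hash_attribute):
    """Build a request body that creates a table keyed by one hash attribute."""
    return {
        "operation": "create_table",
        "schema": schema,
        "table": table,
        "hash_attribute": hash_attribute,
    }


def insert(schema, table, records):
    """Build a request body that inserts a list of JSON records."""
    return {
        "operation": "insert",
        "schema": schema,
        "table": table,
        "records": records,
    }


def build_demo_requests():
    """Return the request bodies in the order they would be sent."""
    return [
        create_schema("dev"),
        create_table("dev", "dog", "id"),
        # The two records carry different attributes and no index is declared,
        # which is the schemaless, fully indexed behavior described above.
        insert(
            "dev",
            "dog",
            [
                {"id": 1, "name": "Harper", "breed": "Mutt"},
                {"id": 2, "name": "Penny", "age": 5},
            ],
        ),
    ]


if __name__ == "__main__":
    for body in build_demo_requests():
        print(json.dumps(body))
```

Notice that nothing in the sketch defines an index or a column list; the only structural decision is which attribute acts as the hash.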

One of the most exciting features of HarperDB is its combination of real-time SQL and full NoSQL CRUD capabilities. Kyle demoed update statements, including conditional updates in line with ANSI SQL standards, giving developers tools they already know, such as a REST API, SQL, and JSON, all in one place. He also ran more complex SQL queries, including table joins, multiple conditions, and IN clauses. The complex queries came back as quickly as a search by hash, which is incredibly fast!
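The kinds of statements described above might look like the following sketch. The table and column names (`dev.dog`, `dev.breed`, `age`, `breed_id`, `adopted`) are hypothetical, and wrapping the statement in a JSON body with `"operation": "sql"` is an assumption about the request shape rather than a confirmed detail of the demo.

```python
def sql_request(statement):
    """Wrap a SQL statement in a JSON request body (assumed payload shape)."""
    # Collapse whitespace so the multi-line strings below travel as one line.
    return {"operation": "sql", "sql": " ".join(statement.split())}


# A conditional update: only rows matching both conditions are changed.
CONDITIONAL_UPDATE = """
    UPDATE dev.dog
    SET adopted = true
    WHERE age > 3 AND breed_id = 2
"""

# A join across two tables with multiple conditions and an IN clause.
JOIN_WITH_IN = """
    SELECT d.name, b.name AS breed
    FROM dev.dog AS d
    INNER JOIN dev.breed AS b ON d.breed_id = b.id
    WHERE d.age >= 2 AND b.name IN ('Labrador', 'Beagle')
"""

if __name__ == "__main__":
    print(sql_request(CONDITIONAL_UPDATE))
    print(sql_request(JOIN_WITH_IN))
```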

He also took some time to show what HarperDB looks like in a business intelligence tool, DBviz, using our JDBC driver. With HarperDB's JDBC and ODBC drivers, you can connect HarperDB to most popular data visualization tools, such as Tableau and Qlik.

Our roadmap and looking to the future: IoT and blockchain to name a couple

Before we ended with an open Q&A with the webinar attendees, our Chief Product Officer, Zach, had a chance to speak about our roadmap and the interesting use cases he sees for the future. Zach has a creative vision and often comes up with practical yet forward-thinking ideas for HarperDB.

IoT is an area where HarperDB excels. With an incredibly small footprint, it's lightweight and agile enough to run on IoT devices. You can read our blog about installing HarperDB directly on a Raspberry Pi here. Using HarperDB clustering, you can distribute queries and data loads across a network of devices or replicate data back to a single vertically scaled server for deeper analytics.

Zach also highlighted the coming publish/subscribe functionality, which will let developers cut the bandwidth they spend polling for data. Applications will be able to subscribe to events and then have data published to them as HarperDB is updated. HarperDB will also provide the capability to run as a serverless/headless database in the near future. Since HarperDB is written in JavaScript, an interpreted language, and runs on Node.js, it is straightforward to expose the create, read, update, and delete functionality so that applications can require HarperDB as a module.
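To illustrate why publish/subscribe saves bandwidth compared with polling, here is a toy in-memory sketch of the pattern. It is not HarperDB's API, which had not been released at the time of the webinar; the `TableEvents` class and its method names are invented purely for illustration.

```python
class TableEvents:
    """Toy in-memory publish/subscribe hub keyed by table name."""

    def __init__(self):
        self._subscribers = {}

    def subscribe(self, table, callback):
        """Register a callback to be called for every change to `table`."""
        self._subscribers.setdefault(table, []).append(callback)

    def publish(self, table, record):
        """Push a changed record to every subscriber of `table`."""
        for callback in self._subscribers.get(table, []):
            callback(record)


received = []
events = TableEvents()

# The application subscribes once instead of asking repeatedly for changes.
events.subscribe("dog", received.append)

# A write to the subscribed table is delivered to the application...
events.publish("dog", {"id": 1, "name": "Harper"})
# ...while a write to a table nobody subscribed to generates no traffic.
events.publish("cat", {"id": 7})
```

With polling, the application would pay for a request on every interval whether or not anything changed; here it only receives a message when a matching update occurs.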

Zach wrapped up the conversation by discussing distributed ledger technology (DLT), the final item on our roadmap. DLT is relatively new, and many implementations are sprouting up every week. HarperDB plans to follow the larger, established frameworks and create modules around them. Making HarperDB an interface to these frameworks lets companies build applications on top of HarperDB while the database appends data to the secure, immutable ledger that distributed ledger technology is famous for.

Again, we really appreciate all the interactions we had at the end of the webinar. If you're interested in a recorded version or the slides, email us; we are happy to share!