When we say one of our core values here at HarperDB is transparency, we mean it. The goal of this page is to provide insight into the exciting new features and improvements we’re planning for the future of HarperDB. While the priority of these items may change, this effort at transparency is intended to help you understand where HarperDB is heading.
Feedback and feature requests are welcome: visit feedback.harperdb.io to get involved!
We’re the Border Collie of databases: intelligent, resourceful, and hardworking. We’re focused on ensuring that replacing established enterprise databases (or choosing HarperDB for new initiatives) never means leaving industry-standard functionality on the table:
- SQL Subselects: Long-awaited and much-requested, the workhorse of data science is coming to HarperDB’s SQL operation. In the meantime, HarperDB drivers support subselects, so your BI tools can subquery in peace.
- Backup & Restore: Explicit operations to back up and restore a HarperDB instance, useful for creating clustered replicas, exporting data for analysis or testing, and disaster recovery.
- Data Sharding: Distribute your largest datasets across multiple HarperDB instances with the most efficient distribution algorithm available, affording you unlimited scale without unnecessary cost.
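To make the Data Sharding item above concrete, here is a minimal sketch of the general idea of distributing records across instances. It is not HarperDB's algorithm (which has not been described here); it uses plain hash-modulo placement, and the instance names and the `shard_for` and `partition` helpers are hypothetical, chosen for illustration only.

```python
import hashlib

# Hypothetical instance names, for illustration only.
INSTANCES = ["hdb-node-1", "hdb-node-2", "hdb-node-3"]


def shard_for(key, instances):
    """Pick an instance for a record key using a stable hash.

    Generic hash-modulo placement, assumed for this sketch; it is not
    a description of how HarperDB will distribute data.
    """
    if not instances:
        raise ValueError("at least one instance is required")
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    index = int.from_bytes(digest[:8], "big") % len(instances)
    return instances[index]


def partition(keys, instances):
    """Group record keys by the instance they hash to."""
    placement = {name: [] for name in instances}
    for key in keys:
        placement[shard_for(key, instances)].append(key)
    return placement


if __name__ == "__main__":
    sample_keys = [f"dog-{i}" for i in range(10)]
    for name, keys in partition(sample_keys, INSTANCES).items():
        print(name, keys)
```

Because the hash is deterministic, the same key always lands on the same instance. A production scheme would also need to handle adding or removing instances without reshuffling most keys, which simple modulo placement does not do.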
HarperDB is dedicated to delivering best-in-class performance. With our upcoming 4.0 release, we’ve actually reduced the size of our codebase by 25%. Fewer moving parts means even greater stability, resiliency, and speed. And those efforts aren’t just related to the core product.
- Resource Allocation: Because HarperDB is multi-core/multi-process, the way we attack a workload is always a great candidate for optimization. We’re exploring using machine learning to make the best use of compute resources across the variety of workloads we encounter, whether they’re coming through our Operations API, Custom Functions, or our Clustering engine.
- HarperDB Cloud: We’re continually refining our architecture within our hosted infrastructure to eliminate bottlenecks, limitations, and excessive costs. This includes offering multi-cloud solutions based on network performance, and even moving workloads around the system dynamically based on instance costs.
- Infrastructure Partnerships: Making the product faster is always good, and optimizing the hardware is always worthwhile, but nothing beats proximity when it comes to a distributed system. We’ve recently launched HarperDB on Verizon’s 5G Wavelength network, delivering single-digit millisecond response times to mobile clients, and we’re continuing to find new and innovative partnerships to bring your users closer to your data.
- Custom Functions Project Repository: An open-source repo where Custom Functions projects we’ve created for testing or on behalf of clients (with their permission, of course) are readily available for you to use.
- GitHub Actions Deployment Template: You can always deploy your Custom Functions projects using HarperDB Studio, but that process doesn’t scale to hundreds or thousands of instances, so we’re developing a GitHub Action to allow you to commit once and deploy to many.
- Expansion of hdbCore: We’re going to be adding more methods to the hdbCore object that grant you access to core HarperDB operations, including the ability to spin up long-running and/or restart-proof processes, like sensor data collection from a local port.
We’re planning substantial improvements and additions to our flexible data-plane architecture:
- Distributed Queries: Provide users with the ability to execute query operations (both SQL and NoSQL) across a cluster via a single HarperDB Instance.
- Improved Pub/Sub Performance and Durability: We’re replacing our existing clustering engine with the fastest greyhound we could find: NATS.io’s new JetStream platform, delivering durable subscriptions and a 100x throughput boost.
- Offline Studio: We’re going to upgrade the offline, instance-hosted version of HarperDB Studio to match the feature set of the cloud-hosted version available at studio.harperdb.io. We’ll also allow you to disable it entirely for maximum security.
- Clustering Visualization: Our current clustering configuration is done on an instance-by-instance basis. We’re going to add a single, comprehensive interface for managing and monitoring data flow across your network.
- CSV Export: We’re going to let you export and download the results of a specific query (or entire tables) in CSV format directly from the Studio, rather than requiring that you use our current export operation and fetch the file from the filesystem.
- HarperDB Cloud Goes International: Fire up the yacht, we’re going to Monaco 🚤! Well, maybe not, but we do intend to provide users with the ability to deploy hosted HarperDB Cloud both on AWS instances outside of the US and on other cloud providers.
- Kubernetes Deployments: Templates and guidance for automated Kubernetes deployments.
- Optimized Docker Container: Performance optimizations of our existing HarperDB Docker container.
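Until the CSV Export item above ships in the Studio, query results can be converted to CSV on the client side. The sketch below assumes the results have already been fetched as a list of JSON-style records (Python dicts); the `rows_to_csv` helper and the sample records are hypothetical and not part of any HarperDB API.

```python
import csv
import io


def rows_to_csv(rows):
    """Serialize a list of records (dicts) to CSV text.

    Assumes each record is a flat dict; records may carry different
    attributes, so the header is the union of keys in first-seen order.
    """
    if not rows:
        return ""
    fieldnames = list(dict.fromkeys(key for row in rows for key in row))
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    # Missing attributes are written as empty cells (DictWriter's default).
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical query results, for illustration only.
    results = [
        {"id": 1, "name": "Penny"},
        {"id": 2, "breed": "Border Collie"},
    ]
    print(rows_to_csv(results), end="")
```

Taking the union of keys keeps the output rectangular even when records do not all share the same attributes.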