2018 Predictions for Big Data, Databases, & Analytics





Happy New Year from the HarperDB Team! It’s a new year full of promise, new technology, and so very much data. This year we will see technologies both new and existing mature, along with a partial departure from the cloud. Let’s peer through the shroud of time and discuss what is coming in 2018 for all things data.

Serverless

As we wrote in a previous blog post, the promise of the cloud will become more apparent in serverless compute. For databases specifically, cloud providers are now offering truly on-demand access to relational databases. This means developers can consume an RDBMS much as they consume hosted NoSQL services. Traditional RDBMS services carried a fixed cost: even when the server saw no traffic, you were paying to keep the lights on for a large database instance. With these new offerings, cloud providers put their own expertise in stability and maintenance to work to offer relational databases at elastic scale and on-demand cost.
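To make the fixed versus on-demand trade-off concrete, here is a rough back-of-the-envelope sketch. The hourly rates and the 730-hour month are made-up numbers for illustration, not any provider's actual pricing.

```python
HOURS_PER_MONTH = 730  # assumed average number of hours in a month


def provisioned_cost(hourly_rate):
    """Always-on instance: billed for every hour, busy or idle."""
    return hourly_rate * HOURS_PER_MONTH


def on_demand_cost(hourly_rate, active_hours):
    """On-demand database: billed only for hours it actually serves traffic."""
    return hourly_rate * active_hours


# Hypothetical rates: on-demand costs more per hour but nothing while idle.
fixed = provisioned_cost(0.50)
elastic = on_demand_cost(0.75, active_hours=100)
break_even_hours = fixed / 0.75

print(f"provisioned: ${fixed:.2f}, on-demand: ${elastic:.2f}")
print(f"on-demand stays cheaper below about {break_even_hours:.0f} active hours")
```

Under these assumed numbers, a database that is busy only a few hours a day is far cheaper on demand, while one that is busy around the clock is better off provisioned.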

Distributed Databases

As our edge gets edgier and our need for lower latency grows, we will need to drop out of the cloud and adopt fog computing. Autonomous vehicles are a prime example of edge devices that cannot rely solely on the cloud. Given their obvious mobility and the lack of consistent national and global network connectivity, autonomous vehicles will need to rely on a distributed network of databases. When out of connectivity, the vehicle will need to store its data in a local repository; once it reconnects, it will begin syncing to edge servers that live on cell towers or in a local region, creating an elastic mesh network of data. This requires a database that is natively aware of its peers and has the built-in capacity to query across the entire edge intelligently. This leads directly into the internet of data: from this living, growing mesh database we can understand in real time what is occurring on the roads and in our vehicles, and make actionable decisions with modeling, machine learning, and AI.
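The store-locally-then-sync pattern described above can be sketched in a few lines. This is a toy illustration only: the `Vehicle` and `EdgeServer` classes and their names are hypothetical, and a real system would also need retries, conflict handling, and peer discovery.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class EdgeServer:
    """Hypothetical edge node (for example, on a cell tower)."""
    name: str
    records: list = field(default_factory=list)

    def receive(self, batch):
        self.records.extend(batch)


@dataclass
class Vehicle:
    """Hypothetical vehicle with a local buffer used while offline."""
    vehicle_id: str
    local_store: deque = field(default_factory=deque)

    def record(self, reading, edge=None):
        # Always write locally first so nothing is lost while offline.
        self.local_store.append({"vehicle": self.vehicle_id, **reading})
        if edge is not None:
            self.sync(edge)

    def sync(self, edge):
        # Drain the local buffer to whichever edge server is reachable.
        batch = list(self.local_store)
        edge.receive(batch)
        self.local_store.clear()
        return len(batch)


tower = EdgeServer("tower-17")
car = Vehicle("car-42")
car.record({"speed_kph": 88})         # offline: stays in the local store
car.record({"speed_kph": 91})         # offline: stays in the local store
car.record({"speed_kph": 60}, tower)  # back online: all buffered readings sync
```

Writing locally first and syncing second means a dropped connection delays data rather than losing it.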

Blockchain

Blockchain isn’t just the technology behind the hottest cryptocurrency your nephew is mining to rake in fat stacks of cash every month. It is a technology that relies on consensus across multiple peers to verify transactions. As such, it can be significantly more durable and tamper-resistant than many protocols in use today. Adoption across finance, manufacturing, retail, and other industries has been picking up at a rapid pace. There isn’t just THE blockchain: enterprises maintain their own private blockchains, and there are different flavors and implementations. This means organizations can fragment their data across ledgers, just as they do with other data silos. To maintain a single view across this fragmentation, databases will evolve to talk to multiple ledgers and provide a metadata layer that presents a unified schema across multiple blockchains.
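As a rough idea of what such a metadata layer might look like, the sketch below maps records from two ledgers onto one shared schema. The ledger names, field names, and sample records are all invented for illustration; no real blockchain client is involved.

```python
# Hypothetical records read from two private ledgers with different field names.
finance_ledger = [{"tx": "a1", "amt": 250.0, "ts": 1514764800}]
supply_ledger = [{"id": "b7", "value": 99.5, "time": 1514851200}]

# The metadata layer: maps each ledger's fields onto one shared schema.
FIELD_MAPS = {
    "finance": {"tx": "tx_id", "amt": "amount", "ts": "timestamp"},
    "supply": {"id": "tx_id", "value": "amount", "time": "timestamp"},
}


def unify(ledger_name, records):
    """Yield records renamed to the shared schema, tagged with their source."""
    mapping = FIELD_MAPS[ledger_name]
    for rec in records:
        row = {mapping[key]: value for key, value in rec.items()}
        row["ledger"] = ledger_name
        yield row


# One time-ordered view across both ledgers.
unified = sorted(
    [*unify("finance", finance_ledger), *unify("supply", supply_ledger)],
    key=lambda row: row["timestamp"],
)
```

Once every ledger is expressed in the same fields, a single query can span all of them.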

SQL, NoSQL, No Problem

NoSQL was intended to replace SQL, and relational databases specifically, with more flexibility and scale. Scale improved and high ingest rates were achieved, but the capability for deep analytics was lost. Complex architectures had to be adopted to handle ingest AND reporting. Technologies like Hadoop and Spark rose to fill the gap, coalescing data lakes and making sense of unstructured data, but they are complicated and not accessible to every developer. The realization now is that we have had a tool around for decades that developers know, that works with BI tools, and that is relatively easy. That tool is SQL. The resurgence of SQL will continue through this year as data tools and databases that have shied away from the traditional embrace SQL and give it new life.
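For a sense of why SQL stays attractive for analytics, here is a small sketch using Python's built-in `sqlite3` module. The `readings` table and its sample rows are invented for illustration.

```python
import sqlite3

# In-memory database with a hypothetical table of sensor readings.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (device TEXT, temp_c REAL)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [("a", 20.0), ("a", 22.0), ("b", 30.0)],
)

# One declarative statement handles the grouping, averaging, and counting.
rows = conn.execute(
    "SELECT device, AVG(temp_c) AS avg_temp, COUNT(*) AS n "
    "FROM readings GROUP BY device ORDER BY device"
).fetchall()

for device, avg_temp, n in rows:
    print(f"{device}: avg {avg_temp:.1f} C over {n} readings")
conn.close()
```

The same aggregate in a store without a query language typically means writing and maintaining that grouping logic by hand.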

Getting our Heads Out of the Cloud

The cloud has allowed for rapid development and has empowered developers to go from idea to product with minimal cost and time. The issue is that, at scale, hosting a database or maintaining large cloud-based servers often ends up being significantly more expensive than hosting it yourself. Add in security concerns and a lack of transparency about what is happening with your data, and there is a growing trend of escaping the cloud. The hybrid cloud will continue to grow as cloud service adopters realize that strategically carving parts of their architecture out into their own hosted environments saves capital and allows for a more fluid infrastructure.

TLDR

The main takeaway from what we see this year is that it is becoming clearer where the cloud makes sense for organizations to adopt. Choose on-demand services that help simplify your architecture, while owning the core infrastructure that is costly to host or holds private data. Old becomes new again with the adoption of SQL overlaying new technology. Finally, the new gets embraced with blockchain, fog computing, and distributed databases.