At this point, I don’t think it is necessary to convince anyone that data is becoming mission-critical to organizations of all sizes. We have seen enormous growth in big data projects over the last decade, with companies across almost every vertical trying to get an accurate picture of their customers, and many dollars have been invested in digital transformation projects attempting to make sense of big data. That said, most companies are still struggling to make their data both actionable and monetizable.
Organizations are at many different stages of their big data journey. Some are at the start, figuring out how to begin; others have deep intelligence into their data; but few are able to take that intelligence and make it actionable in real time.
As we have written at length, this is a complex problem, and it was the genesis of HarperDB. We do not claim that HarperDB is a magic pill that solves the entire process. Even after consolidating the data value chain, companies will still have a lot of moving parts to manage: IoT devices, gateways, applications, analytics tools, disaster recovery (DR) solutions, and so on.
While DevOps is a well-known and popular term, DataOps is now emerging as a practice of equal importance. You can read the DataOps Manifesto, which does an excellent job of outlining the practice. To quote directly from the text, their mission is: “Through firsthand experience working with data across organizations, tools, and industries we have uncovered a better way to develop and deliver analytics that we call DataOps.”
DataOps is a blend of data science, DevOps, business intelligence, and data engineering. The goal is to produce agile, actionable, repeatable practices that allow companies to see true value from big data. This thinking is critical: as data grows exponentially year over year, the infrastructure and skill sets required to manage it are becoming more complex.
Fifteen years ago, a simple SQL database feeding a reporting tool like Crystal Reports was sufficient to gain insight into most of an enterprise’s data. Back then we were dealing with run-the-business data: financial reports, call center metrics, sales funnels, and so on. For a lot of companies those record sets measured in the hundreds of thousands, or maybe the low millions. Typically these reporting needs could be serviced by a reporting analyst, often a junior developer or someone with no coding skills at all.
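For a sense of what that era looked like, here is a minimal sketch of the kind of grouped SQL report an analyst might have run; the `orders` table, its columns, and the database file are all hypothetical, not taken from any real system:

```python
# A minimal sketch of the kind of report that once covered most needs.
# The orders table and its columns (region, order_total, order_date)
# are hypothetical, invented for illustration.
import sqlite3

conn = sqlite3.connect("business.db")
rows = conn.execute(
    """
    SELECT region, COUNT(*) AS order_count, SUM(order_total) AS revenue
    FROM orders
    WHERE order_date >= '2005-01-01'
    GROUP BY region
    ORDER BY revenue DESC
    """
).fetchall()

for region, order_count, revenue in rows:
    print(f"{region}: {order_count} orders, ${revenue:,.2f}")
```

A few hundred thousand rows, one query, one printout: that was the whole pipeline.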
Today we are dealing with social media data, Wi-Fi data, IoT data, geospatial data, blockchain data, and much more. These record sets measure in the billions, and we are doing far more complicated things with them.
We are performing machine learning and predictive analytics. We are feeding AI algorithms. As a result, it often takes many well-trained, sophisticated resources to get the most out of these data sets. Business is becoming more fluid, and companies need to respond faster to market demands, so it is critical that they can analyze and model their decision making with the best data possible, as quickly as possible.
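To make the contrast with that earlier era concrete, here is a minimal sketch of a modern predictive workload, training a churn model with scikit-learn; the `customer_events.csv` export and its column names are hypothetical, invented for the example:

```python
# A sketch of the modern workload: training a predictive churn model.
# The CSV file and feature/label column names are hypothetical.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

events = pd.read_csv("customer_events.csv")  # hypothetical export
features = events[["sessions_7d", "avg_order_value", "days_since_last_visit"]]
target = events["churned"]

X_train, X_test, y_train, y_test = train_test_split(
    features, target, test_size=0.2, random_state=42
)

model = RandomForestClassifier(n_estimators=200, random_state=42)
model.fit(X_train, y_train)

# Evaluate how well the model separates churners from non-churners.
auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
print(f"Holdout AUC: {auc:.3f}")
```

Even this toy version implies feature engineering, model selection, and evaluation, which is exactly why a junior analyst with a reporting tool no longer covers the need.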
A single bad decision in the data value chain can delay data’s journey from ingestion to actionability by days or even weeks. Previously, companies focused on building their run-the-business applications and ensuring their customer channels were integrated; once this matured, they added business intelligence into the mix. Their enterprise architectures were not designed with the data value chain in mind, but rather focused on running the business. This made sense when data was a one-way street and dashboards and reports were sufficient.
Today, we live in a different world. To remain competitive and agile, organizations need team members who are thinking about the entire life cycle of their data, and this is where DataOps enters the picture. By building competency in DataOps, companies gain groups that can work alongside existing teams, preemptively augmenting their capabilities.
Perhaps your organization is about to undertake a new marketing initiative. Rather than ask the BI team to analyze the ROI after the project, what if the DataOps team could work alongside the marketing team, augmenting the initiative with predictive analytics and modeling the outcome of the campaign with 100 different variations before spending a single dollar? What if this could be done in a single afternoon rather than a month or a year?
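As a rough illustration of that kind of pre-spend modeling, here is a minimal Monte Carlo sketch that scores 100 hypothetical campaign variations; the budget, revenue, cost-per-click, and conversion-rate figures are all invented for the example:

```python
# A minimal sketch of modeling a campaign's outcome under 100 variations
# before spending anything. All dollar figures and rates are hypothetical.
import math
import random

random.seed(42)
BUDGET = 50_000               # hypothetical campaign budget, in dollars
REVENUE_PER_CONVERSION = 120  # hypothetical average revenue per conversion

def expected_profit(cost_per_click, conversion_rate, trials=1_000):
    """Estimate mean profit for one variation via Monte Carlo, using a
    normal approximation to the binomial count of conversions."""
    clicks = BUDGET / cost_per_click
    mean = clicks * conversion_rate
    std = math.sqrt(clicks * conversion_rate * (1 - conversion_rate))
    profits = [
        max(random.gauss(mean, std), 0) * REVENUE_PER_CONVERSION - BUDGET
        for _ in range(trials)
    ]
    return sum(profits) / trials

# Score 100 variations with different cost and conversion assumptions.
variations = [
    (random.uniform(0.5, 3.0), random.uniform(0.005, 0.05))
    for _ in range(100)
]
best_cpc, best_rate = max(variations, key=lambda v: expected_profit(*v))
print(f"Most promising variation: CPC ${best_cpc:.2f}, "
      f"conversion rate {best_rate:.1%}")
```

A real DataOps team would feed this with historical campaign data rather than invented distributions, but the shape of the exercise, comparing many modeled outcomes before committing budget, is the same.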
The practice of DataOps, combined with advances in data management solutions, makes this fantasy a reality. As more and more companies adopt these practices, their competitors will need to follow suit to remain competitive.