Over the last several years multi-tiered database architectures have become increasingly popular. These architectures typically consist of multiple different database products and can be made up of NoSQL products, SQL products, In-Memory products, Object stores etc. A lot has been written about the benefits and the advantages.
They have become popular patterns due to a few different scenarios. The first and most common reason for this type of architecture is data streaming. The reason for the need for multi-tiered patterns in data steaming is that often the database is processing a huge amount of unstructured data. It needs a persistent storage layer to capture said data. This is often accomplished with a NoSQL database or Object Store. Next the data is transformed and enriched, most commonly in transit, to a SQL store or a data lake. If enrichment and transformation is attempted prior to pushing data into a NoSQL product, data can often be lost. By storing the data in a NoSQL product or Object Store, this can function as a persistent caching layer as NoSQL databases are highly flexible and scalable. This means that the transformation layer can crash and resume with persistent reliable storage from the NoSQL layer.
This pattern makes a lot of sense but presents a lot of risk. It has become so common in fact that the top NoSQL databases like MongoDB, Couchbase, and others have productized the pattern with multi-model. You can read more about the risks inherent in multi-model implementations in my blog – Multi-Model Databases a Mistake. Many of the risks are similar, but are even more exacerbated in a multi-tiered architecture.
Data integrity is the number one issue with multi-tiered database architectures. As these patterns require processes to keep them synchronized, if one of these processes fails or causes an error, it can now lead to two or more of the databases being out of sync. Obviously fault tolerance can be built into these processes, but those can fail as well. Furthermore, this leads to uncertainty when data is being analyzed. Is it up to date? Are there any errors? Imagine you are constructing an IoT data warehouse and real-time reporting is key. Are you certain that the data you are looking at is accurate? Are you sure your transformation process has not failed? Do you have time to check?
Additionally, even if you can eventually trust your data to synchronize due to very solid fault tolerance, if you are processing huge volumes of data, it can often take days or even hours to catch back up. In my previous role as a CTO we were running a similar architecture and after our RDBMS crashed it took us 3 days to sync the RDBMS with our NoSQL data store.
Because of this, our business end users lost significant trust in our reporting. Most often these architectures are mono-directional as it is very difficult to update data in both directions and keep them in sync. This means the majority of reporting is occurring down the line and if the failure occurs upstream the reporting is corrupted.
Complexity & Knowledge
These systems are pretty complex to maintain. Recently I was having lunch with a seasoned big data executive and while he was describing his architecture he said, “ I need to perform an ink test to figure out where my data is actually going.” I loved this imagery because it really sums up the problem. Multi-tiered database architectures are so complex that often no single person knows all components of the system. As a result it is nearly impossible to predict how the creation of a single record at the beginning of the process will look at the end of the process. As a result it is often necessary to perform “ink tests” just to see what is happening. This is not a great feeling, as you don’t want to be performing testing on your production data.
Furthermore, what happens when you lose a key engineer? How do you continue to maintain the system? To give you an example from my last role as CTO, Kyle Bernhardy our current CTO at HarperDB went on a 3 week vacation and we experienced an outage during that period. Before he went I thought I had a pretty good grasp of our architecture. I was wrong. I realized that while I understood many components of the solution, there were a few I did not. Our system was so complex that it took almost an entire day to diagnose the problem as there were so many moving parts and I was uncertain where the outage was even occurring. Due to the fact that we were consuming incredibly high volumes of streaming data, 250k records a second, this made it very stressful. For every second that I couldn’t figure out the problem, we were losing a quarter million records.
As mentioned above, these systems often need to be mono-directional. This is because it is challenging enough to keep them in sync in one direction. If you start making updates in the other direction then things can get even further out of whack. Additionally, it is because some form of transformation and enrichment is occurring on the system during the process. What that means is your data at the beginning of the system doesn’t look the same as the data at the end of the system. Thus, if you try and update it after its been enriched you can cause some pretty drastic corruption issues. This might not matter in some use cases, but what about for something like an IoT data warehouse where you want to use the data you are collecting for real-time reaction? It’s kind of frustrating if you can only swim downstream and not swim back upstream.
The ultimate value in data is its actionability. As a result if you have a mono-directional system you are setting yourself up for failure. You might be able to eventually analyze your data, but you’re going to have to create an entirely separate process and system for reacting to that data. You will have one system for data storage and retrieval and another system for reaction. This is expensive, inherently slow, and potentially out of sync.
Now what? This is the most widely used architecture today and while it is a widely adopted pattern, it is not the best pattern. So if there are all these risks what do you do? The answer is pretty simple. Many newer databases have adopted a Hybrid transactional/analytical processing(HTAP) model which allows for handling operational, transactional, and analytical workloads in a single system. The primary pattern of these multi-tiered database architectures is that a certain set of systems is setup for operational and transactional workloads while another system is setup for analytical. With an HTAP database you can accomplish all of these workloads in a single product eliminating the need for multiple databases, and eliminating the problems mentioned above.