Blog

The Risks of Multi-Tier Database Architecture

This article explores the rise of multi-tier database architectures, where applications use several different database products for streaming and other workloads. It outlines why this pattern adds operational complexity and inconsistent data that can outweigh any benefits.
By Stephen Goldberg
CEO & Co-Founder
September 4, 2018

Over the last several years, multi-tiered database architectures have become increasingly popular. These architectures typically consist of multiple different database products: NoSQL stores, SQL databases, in-memory products, object stores, and so on. A lot has been written about their benefits and advantages.

They have become popular for a few different reasons. The first and most common driver for this type of architecture is data streaming. Streaming workloads often involve huge volumes of unstructured data, and the system needs a persistent storage layer to capture that data as it arrives. This is most often a NoSQL database or object store. Next, the data is transformed and enriched, most commonly in transit, on its way to a SQL store or a data lake. If enrichment and transformation are attempted before the data is persisted in the NoSQL product, data can easily be lost when the transformation layer fails. By landing the data in a NoSQL product or object store first, that tier functions as a persistent caching layer; because NoSQL databases are highly flexible and scalable, the transformation layer can crash and resume against reliable, persistent storage.
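
To make the pattern concrete, here is a minimal sketch of that land-raw-then-transform flow. The store interfaces (rawStore, warehouse) and their methods are hypothetical stand-ins for a NoSQL/object-store driver and a SQL warehouse client, not any specific product's API.

```typescript
// Minimal sketch of the "land raw, then transform" pattern described above.
// The rawStore and warehouse clients are hypothetical stand-ins, not a real driver API.

interface RawEvent {
  id: string;
  payload: unknown;        // unstructured data exactly as it arrived
  receivedAt: string;
}

interface EnrichedRecord {
  id: string;
  deviceId: string;
  value: number;
  enrichedAt: string;
}

// Step 1: persist the raw event immediately so nothing is lost if later steps crash.
async function ingest(rawStore: { put(e: RawEvent): Promise<void> }, event: RawEvent) {
  await rawStore.put(event);
}

// Step 2: transform/enrich "in transit" and load into the SQL tier.
// If this process dies, it can resume by re-reading unprocessed raw events.
async function transformAndLoad(
  rawStore: { nextUnprocessed(): Promise<RawEvent | null>; markDone(id: string): Promise<void> },
  warehouse: { insert(r: EnrichedRecord): Promise<void> }
) {
  let event: RawEvent | null;
  while ((event = await rawStore.nextUnprocessed()) !== null) {
    const record: EnrichedRecord = {
      id: event.id,
      deviceId: String((event.payload as any)?.deviceId ?? "unknown"),
      value: Number((event.payload as any)?.value ?? 0),
      enrichedAt: new Date().toISOString(),
    };
    await warehouse.insert(record);
    await rawStore.markDone(event.id); // only acknowledge after the SQL tier has it
  }
}
```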

This pattern makes a lot of sense, but it also presents a lot of risk. It has become so common, in fact, that top NoSQL databases like MongoDB and Couchbase have productized the pattern as multi-model. You can read more about the risks inherent in multi-model implementations in my blog post, Multi-Model Databases: A Mistake. Many of the risks are similar, but they are exacerbated even further in a multi-tiered architecture.

Data Integrity 

Data integrity is the number one issue with multi-tiered database architectures. These patterns depend on processes to keep the tiers synchronized, and if one of those processes fails or introduces an error, two or more of the databases can end up out of sync. Fault tolerance can be built into these processes, but it can fail as well. All of this creates uncertainty when the data is analyzed. Is it up to date? Are there any errors? Imagine you are building an IoT data warehouse where real-time reporting is key. Are you certain the data you are looking at is accurate? Are you sure your transformation process has not failed? Do you have time to check?
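
Teams often end up writing explicit reconciliation checks just to answer those questions. The sketch below, which assumes a hypothetical countSince method on each tier, only detects drift between the raw tier and the warehouse; it cannot repair it.

```typescript
// A minimal drift check between two tiers. The Countable interface is an assumption
// made for this example; real stores would expose this differently.

interface Countable {
  countSince(isoTimestamp: string): Promise<number>;
}

// Compare how many records each tier has seen for a recent window.
// A persistent gap means the sync process has fallen behind or failed.
async function checkDrift(rawStore: Countable, warehouse: Countable, sinceIso: string) {
  const [rawCount, warehouseCount] = await Promise.all([
    rawStore.countSince(sinceIso),
    warehouse.countSince(sinceIso),
  ]);

  const lag = rawCount - warehouseCount;
  if (lag > 0) {
    console.warn(`Warehouse is ${lag} records behind the raw tier since ${sinceIso}`);
  }
  return lag;
}
```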

Additionally, even if you can eventually trust your data to synchronize thanks to solid fault tolerance, catching back up after a failure can take hours or even days when you are processing huge volumes of data. In my previous role as a CTO, we were running a similar architecture, and after our RDBMS crashed it took us three days to resync it with our NoSQL data store.

Because of this, our business end users lost significant trust in our reporting. Most often these architectures are mono-directional, because it is very difficult to apply updates in both directions and keep everything in sync. That means the majority of reporting happens down the line, so a failure upstream corrupts the reporting downstream.

Complexity & Knowledge

These systems are complex to maintain. Recently I was having lunch with a seasoned big data executive, and while describing his architecture he said, “I need to perform an ink test to figure out where my data is actually going.” I loved this imagery because it sums up the problem. Multi-tiered database architectures are so complex that often no single person knows every component of the system. It becomes nearly impossible to predict how a record created at the beginning of the pipeline will look by the end of it, so you end up performing “ink tests” just to see what is happening. This is not a great feeling, as you don't want to be running tests against your production data.
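
In practice an “ink test” is just a tagged tracer record pushed in at the front of the pipeline and then looked for at each tier. Here is a minimal sketch; the Tier interface, its exists method, and the inject callback are hypothetical, not any particular product's API.

```typescript
// A minimal "ink test": inject a tagged tracer record and poll each downstream tier
// until it shows up or the timeout expires. All interfaces here are assumptions.

interface Tier {
  name: string;
  exists(recordId: string): Promise<boolean>;
}

async function inkTest(
  inject: (recordId: string) => Promise<void>,
  tiers: Tier[],
  timeoutMs = 60_000
) {
  const tracerId = `ink-test-${Date.now()}`;
  await inject(tracerId);

  const deadline = Date.now() + timeoutMs;
  for (const tier of tiers) {
    let found = false;
    while (Date.now() < deadline && !found) {
      found = await tier.exists(tracerId);
      if (!found) await new Promise((r) => setTimeout(r, 1_000)); // wait and re-check
    }
    console.log(`${tier.name}: ${found ? "tracer arrived" : "tracer never arrived"}`);
    if (!found) break; // the data stopped flowing at this tier
  }
}
```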

Furthermore, what happens when you lose a key engineer? How do you continue to maintain the system? To give you an example from my last role as CTO: Kyle Bernhardy, our current CTO at Harper, went on a three-week vacation, and we experienced an outage during that period. Before he left, I thought I had a pretty good grasp of our architecture. I was wrong. I realized that while I understood many components of the solution, there were a few I did not. The system was so complex that it took almost an entire day to diagnose the problem; there were so many moving parts that I wasn't even sure where the outage was occurring. Because we were consuming incredibly high volumes of streaming data, 250k records a second, it was very stressful: for every second I couldn't figure out the problem, we were losing a quarter million records.

Mono-directional 

As mentioned above, these systems often need to be mono-directional. It is challenging enough to keep them in sync in one direction; if you start making updates in the other direction, things can get even further out of whack. Additionally, some form of transformation and enrichment is happening as data moves through the system, which means the data at the beginning of the pipeline doesn't look the same as the data at the end. If you try to update it after it's been enriched, you can cause some pretty drastic corruption issues. This might not matter in some use cases, but what about something like an IoT data warehouse, where you want to use the data you are collecting for real-time reaction? It's frustrating if you can only swim downstream and never back upstream.
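
A tiny sketch, with made-up record shapes, shows why writing enriched data back upstream is so risky: the raw and enriched tiers no longer share a schema, so a naive write-back loses or mangles fields rather than correcting them.

```typescript
// Hypothetical record shapes for illustration only.

interface RawReading {
  payload: { deviceId: string; v: number };   // shape as it arrived from the device
}

interface EnrichedReading {
  deviceId: string;
  value: number;      // renamed and possibly unit-converted during enrichment
  siteName: string;   // added by a lookup; never existed upstream
}

function writeBackUpstream(enriched: EnrichedReading): RawReading {
  // There is no faithful inverse of the enrichment step: siteName has no home in the
  // raw shape, and value may have been converted, so this "update" corrupts the raw
  // record rather than correcting it.
  return { payload: { deviceId: enriched.deviceId, v: enriched.value } };
}
```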

The ultimate value of data is its actionability. If you have a mono-directional system, you are setting yourself up for failure: you might eventually be able to analyze your data, but you will have to build an entirely separate process and system for reacting to it. You end up with one system for data storage and retrieval and another system for reaction, which is expensive, inherently slow, and potentially out of sync.

Conclusion

Now what? This is the most widely used architecture today, but widely adopted does not mean best. So if there are all these risks, what do you do? The answer is fairly simple. Many newer databases have adopted a hybrid transactional/analytical processing (HTAP) model, which handles operational, transactional, and analytical workloads in a single system. The defining pattern of multi-tiered database architectures is that one set of systems is set up for operational and transactional workloads while another is set up for analytics. With an HTAP database you can run all of these workloads in a single product, eliminating the need for multiple databases and the problems described above.
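
As a rough illustration of what that collapses down to, the sketch below runs the transactional write path and the analytical query path against the same store, so there is no second system to keep in sync. The generic SqlClient interface, the table name, and the SQL dialect are assumptions made for this example, not a specific product's API.

```typescript
// HTAP in miniature: the same database serves the write path and the reporting path,
// so there is no ETL process or second store to drift out of sync.

interface SqlClient {
  execute(sql: string, params?: unknown[]): Promise<unknown>;
}

// Transactional path: record a new sensor reading as it arrives.
async function recordReading(db: SqlClient, deviceId: string, value: number) {
  await db.execute(
    "INSERT INTO readings (device_id, value, recorded_at) VALUES (?, ?, ?)",
    [deviceId, value, new Date().toISOString()]
  );
}

// Analytical path: real-time reporting against the very same table.
async function hourlyAverages(db: SqlClient) {
  return db.execute(
    "SELECT device_id, AVG(value) AS avg_value " +
    "FROM readings WHERE recorded_at > NOW() - INTERVAL '1 hour' " +
    "GROUP BY device_id"
  );
}
```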
