Click Below to Get the Code

Browse, clone, and build from real-world templates powered by Harper.
Blog
GitHub Logo

The Risks of Multi-Tier Database Architecture

This article explores the rise of multi-tier database architectures, where applications use several different database products for streaming and other workloads. It outlines why this pattern adds operational complexity and inconsistent data that can outweigh any benefits.
Blog

The Risks of Multi-Tier Database Architecture

Stephen Goldberg
CEO & Co-Founder
at Harper
September 4, 2018
Stephen Goldberg
CEO & Co-Founder
at Harper
September 4, 2018
Stephen Goldberg
CEO & Co-Founder
at Harper
September 4, 2018
September 4, 2018
This article explores the rise of multi-tier database architectures, where applications use several different database products for streaming and other workloads. It outlines why this pattern adds operational complexity and inconsistent data that can outweigh any benefits.
Stephen Goldberg
CEO & Co-Founder

Over the last several years multi-tiered database architectures have become increasingly popular. These architectures typically consist of multiple different database products and can be made up of NoSQL products, SQL products, In-Memory products, Object stores etc.  A lot has been written about the benefits and the advantages.  

They have become popular patterns due to a few different scenarios.  The first and most common reason for this type of architecture is data streaming.  The reason for the need for multi-tiered patterns in data steaming is that often the database is processing a huge amount of unstructured data.  It needs a persistent storage layer to capture said data.  This is often accomplished with a NoSQL database or Object Store.  Next the data is transformed and enriched, most commonly in transit, to a SQL store or a data lake.  If enrichment and transformation is attempted prior to pushing data into a NoSQL product, data can often be lost.  By storing the data in a NoSQL product or Object Store, this can function as a persistent caching layer as NoSQL databases are highly flexible and scalable.  This means that the transformation layer can crash and resume with persistent reliable storage from the NoSQL layer.  

This pattern makes a lot of sense but presents a lot of risk. It has become so common in fact that the top NoSQL databases like MongoDB, Couchbase, and others have productized the pattern with multi-model.  You can read more about the risks inherent in multi-model implementations in my blog - Multi-Model Databases a Mistake.   Many of the risks are similar, but are even more exacerbated in a multi-tiered architecture.  

Data Integrity 

Data integrity is the number one issue with multi-tiered database architectures.  As these patterns require processes to keep them synchronized, if one of these processes fails or causes an error, it can now lead to two or more of the databases being out of sync.  Obviously fault tolerance can be built into these processes, but those can fail as well.  Furthermore, this leads to uncertainty when data is being analyzed.  Is it up to date? Are there any errors?  Imagine you are constructing an IoT data warehouse and real-time reporting is key.  Are you certain that the data you are looking at is accurate?  Are you sure your transformation process has not failed?  Do you have time to check?  

Additionally, even if you can eventually trust your data to synchronize due to very solid fault tolerance, if you are processing huge volumes of data, it can often take days or even hours to catch back up.  In my previous role as a CTO we were running a similar architecture and after our RDBMS crashed it took us 3 days to sync the RDBMS with our NoSQL data store.    

Because of this, our business end users lost significant trust in our reporting.  Most often these architectures are mono-directional as it is very difficult to update data in both directions and keep them in sync.  This means the majority of reporting is occurring down the line and if the failure occurs upstream the reporting is corrupted.   

Complexity & Knowledge

These systems are pretty complex to maintain.   Recently I was having lunch with a seasoned big data executive and while he was describing his architecture he said, “ I need to perform an ink test to figure out where my data is actually going.”  I loved this imagery because it really sums up the problem.  Multi-tiered database architectures are so complex that often no single person knows all components of the system.  As a result it is nearly impossible to predict how the creation of a single record at the beginning of the process will look at the end of the process.  As a result it is often necessary to perform “ink tests” just to see what is happening.  This is not a great feeling, as you don’t want to be performing testing on your production data.   

Furthermore, what happens when you lose a key engineer?  How do you continue to maintain the system?  To give you an example from my last role as CTO, Kyle Bernhardy our current CTO at Harper went on a 3 week vacation and we experienced an outage during that period.  Before he went I thought I had a pretty good grasp of our architecture.  I was wrong.  I realized that while I understood many components of the solution, there were a few I did not.  Our system was so complex that it took almost an entire day to diagnose the problem as there were so many moving parts and I was uncertain where the outage was even occurring.  Due to the fact that we were consuming incredibly high volumes of streaming data, 250k records a second, this made it very stressful.  For every second that I couldn’t figure out the problem, we were losing a quarter million records.   

Mono-directional 

As mentioned above, these systems often need to be mono-directional.  This is because it is challenging enough to keep them in sync in one direction.  If you start making updates in the other direction then things can get even further out of whack.  Additionally, it is because some form of transformation and enrichment is occurring on the system during the process.  What that means is your data at the beginning of the system doesn’t look the same as the data at the end of the system.  Thus, if you try and update it after its been enriched you can cause some pretty drastic corruption issues.  This might not matter in some use cases, but what about for something like an IoT data warehouse where you want to use the data you are collecting for real-time reaction?  It’s kind of frustrating if you can only swim downstream and not swim back upstream.   

The ultimate value in data is its actionability.  As a result if you have a mono-directional system you are setting yourself up for failure.  You might be able to eventually analyze your data, but you’re going to have to create an entirely separate process and system for reacting to that data. You will have one system for data storage and retrieval and another system for reaction.  This is expensive, inherently slow, and potentially out of sync.  

Conclusion

Now what?  This is the most widely used architecture today and while it is a widely adopted pattern, it is not the best pattern.  So if there are all these risks what do you do?  The answer is pretty simple.  Many newer databases have adopted a Hybrid transactional/analytical processing(HTAP) model which allows for handling operational, transactional, and analytical workloads in a single system.  The primary pattern of these multi-tiered database architectures is that a certain set of systems is setup for operational and transactional workloads while another system is setup for analytical.  With an HTAP database you can accomplish all of these workloads in a single product eliminating the need for multiple databases, and eliminating the problems mentioned above.   

Over the last several years multi-tiered database architectures have become increasingly popular. These architectures typically consist of multiple different database products and can be made up of NoSQL products, SQL products, In-Memory products, Object stores etc.  A lot has been written about the benefits and the advantages.  

They have become popular patterns due to a few different scenarios.  The first and most common reason for this type of architecture is data streaming.  The reason for the need for multi-tiered patterns in data steaming is that often the database is processing a huge amount of unstructured data.  It needs a persistent storage layer to capture said data.  This is often accomplished with a NoSQL database or Object Store.  Next the data is transformed and enriched, most commonly in transit, to a SQL store or a data lake.  If enrichment and transformation is attempted prior to pushing data into a NoSQL product, data can often be lost.  By storing the data in a NoSQL product or Object Store, this can function as a persistent caching layer as NoSQL databases are highly flexible and scalable.  This means that the transformation layer can crash and resume with persistent reliable storage from the NoSQL layer.  

This pattern makes a lot of sense but presents a lot of risk. It has become so common in fact that the top NoSQL databases like MongoDB, Couchbase, and others have productized the pattern with multi-model.  You can read more about the risks inherent in multi-model implementations in my blog - Multi-Model Databases a Mistake.   Many of the risks are similar, but are even more exacerbated in a multi-tiered architecture.  

Data Integrity 

Data integrity is the number one issue with multi-tiered database architectures.  As these patterns require processes to keep them synchronized, if one of these processes fails or causes an error, it can now lead to two or more of the databases being out of sync.  Obviously fault tolerance can be built into these processes, but those can fail as well.  Furthermore, this leads to uncertainty when data is being analyzed.  Is it up to date? Are there any errors?  Imagine you are constructing an IoT data warehouse and real-time reporting is key.  Are you certain that the data you are looking at is accurate?  Are you sure your transformation process has not failed?  Do you have time to check?  

Additionally, even if you can eventually trust your data to synchronize due to very solid fault tolerance, if you are processing huge volumes of data, it can often take days or even hours to catch back up.  In my previous role as a CTO we were running a similar architecture and after our RDBMS crashed it took us 3 days to sync the RDBMS with our NoSQL data store.    

Because of this, our business end users lost significant trust in our reporting.  Most often these architectures are mono-directional as it is very difficult to update data in both directions and keep them in sync.  This means the majority of reporting is occurring down the line and if the failure occurs upstream the reporting is corrupted.   

Complexity & Knowledge

These systems are pretty complex to maintain.   Recently I was having lunch with a seasoned big data executive and while he was describing his architecture he said, “ I need to perform an ink test to figure out where my data is actually going.”  I loved this imagery because it really sums up the problem.  Multi-tiered database architectures are so complex that often no single person knows all components of the system.  As a result it is nearly impossible to predict how the creation of a single record at the beginning of the process will look at the end of the process.  As a result it is often necessary to perform “ink tests” just to see what is happening.  This is not a great feeling, as you don’t want to be performing testing on your production data.   

Furthermore, what happens when you lose a key engineer?  How do you continue to maintain the system?  To give you an example from my last role as CTO, Kyle Bernhardy our current CTO at Harper went on a 3 week vacation and we experienced an outage during that period.  Before he went I thought I had a pretty good grasp of our architecture.  I was wrong.  I realized that while I understood many components of the solution, there were a few I did not.  Our system was so complex that it took almost an entire day to diagnose the problem as there were so many moving parts and I was uncertain where the outage was even occurring.  Due to the fact that we were consuming incredibly high volumes of streaming data, 250k records a second, this made it very stressful.  For every second that I couldn’t figure out the problem, we were losing a quarter million records.   

Mono-directional 

As mentioned above, these systems often need to be mono-directional.  This is because it is challenging enough to keep them in sync in one direction.  If you start making updates in the other direction then things can get even further out of whack.  Additionally, it is because some form of transformation and enrichment is occurring on the system during the process.  What that means is your data at the beginning of the system doesn’t look the same as the data at the end of the system.  Thus, if you try and update it after its been enriched you can cause some pretty drastic corruption issues.  This might not matter in some use cases, but what about for something like an IoT data warehouse where you want to use the data you are collecting for real-time reaction?  It’s kind of frustrating if you can only swim downstream and not swim back upstream.   

The ultimate value in data is its actionability.  As a result if you have a mono-directional system you are setting yourself up for failure.  You might be able to eventually analyze your data, but you’re going to have to create an entirely separate process and system for reacting to that data. You will have one system for data storage and retrieval and another system for reaction.  This is expensive, inherently slow, and potentially out of sync.  

Conclusion

Now what?  This is the most widely used architecture today and while it is a widely adopted pattern, it is not the best pattern.  So if there are all these risks what do you do?  The answer is pretty simple.  Many newer databases have adopted a Hybrid transactional/analytical processing(HTAP) model which allows for handling operational, transactional, and analytical workloads in a single system.  The primary pattern of these multi-tiered database architectures is that a certain set of systems is setup for operational and transactional workloads while another system is setup for analytical.  With an HTAP database you can accomplish all of these workloads in a single product eliminating the need for multiple databases, and eliminating the problems mentioned above.   

This article explores the rise of multi-tier database architectures, where applications use several different database products for streaming and other workloads. It outlines why this pattern adds operational complexity and inconsistent data that can outweigh any benefits.

Download

White arrow pointing right
This article explores the rise of multi-tier database architectures, where applications use several different database products for streaming and other workloads. It outlines why this pattern adds operational complexity and inconsistent data that can outweigh any benefits.

Download

White arrow pointing right
This article explores the rise of multi-tier database architectures, where applications use several different database products for streaming and other workloads. It outlines why this pattern adds operational complexity and inconsistent data that can outweigh any benefits.

Download

White arrow pointing right

Explore Recent Resources

Tutorial
GitHub Logo

Introducing Structon: Random-Access Binary Encoding for JavaScript

Deserializing entire records to read one field is a bottleneck at scale. Structon stores objects in a binary format where any field is reachable by byte offset, with lazy getters that never allocate until you access a property. It's the encoding Harper has used internally for years, now a standalone package.
JavaScript
Tutorial
Deserializing entire records to read one field is a bottleneck at scale. Structon stores objects in a binary format where any field is reachable by byte offset, with lazy getters that never allocate until you access a property. It's the encoding Harper has used internally for years, now a standalone package.
Person with very short blonde hair wearing a light gray button‑up shirt, standing with arms crossed and smiling outdoors with foliage behind.
Kris Zyp
SVP of Engineering
Tutorial

Introducing Structon: Random-Access Binary Encoding for JavaScript

Deserializing entire records to read one field is a bottleneck at scale. Structon stores objects in a binary format where any field is reachable by byte offset, with lazy getters that never allocate until you access a property. It's the encoding Harper has used internally for years, now a standalone package.
Kris Zyp
Jun 2026
Tutorial

Introducing Structon: Random-Access Binary Encoding for JavaScript

Deserializing entire records to read one field is a bottleneck at scale. Structon stores objects in a binary format where any field is reachable by byte offset, with lazy getters that never allocate until you access a property. It's the encoding Harper has used internally for years, now a standalone package.
Kris Zyp
Tutorial

Introducing Structon: Random-Access Binary Encoding for JavaScript

Deserializing entire records to read one field is a bottleneck at scale. Structon stores objects in a binary format where any field is reachable by byte offset, with lazy getters that never allocate until you access a property. It's the encoding Harper has used internally for years, now a standalone package.
Kris Zyp
Livestream
GitHub Logo

1.5 Hour Build - Vibe Coding a Full Personal Site: Design to Deployment in One Session

Watch Austin rebuild his personal website live using Claude AI and Harper, including a custom Markdown CMS, GraphQL schema design, React scaffolding, and full deployment. A real-time pair coding session from design to launch.
Livestream
Watch Austin rebuild his personal website live using Claude AI and Harper, including a custom Markdown CMS, GraphQL schema design, React scaffolding, and full deployment. A real-time pair coding session from design to launch.
Person with short hair wearing a light blue patterned shirt, smiling widely outdoors with blurred greenery and trees in the background.
Austin Akers
Head of Developer Relations
Livestream

1.5 Hour Build - Vibe Coding a Full Personal Site: Design to Deployment in One Session

Watch Austin rebuild his personal website live using Claude AI and Harper, including a custom Markdown CMS, GraphQL schema design, React scaffolding, and full deployment. A real-time pair coding session from design to launch.
Austin Akers
May 2026
Livestream

1.5 Hour Build - Vibe Coding a Full Personal Site: Design to Deployment in One Session

Watch Austin rebuild his personal website live using Claude AI and Harper, including a custom Markdown CMS, GraphQL schema design, React scaffolding, and full deployment. A real-time pair coding session from design to launch.
Austin Akers
Livestream

1.5 Hour Build - Vibe Coding a Full Personal Site: Design to Deployment in One Session

Watch Austin rebuild his personal website live using Claude AI and Harper, including a custom Markdown CMS, GraphQL schema design, React scaffolding, and full deployment. A real-time pair coding session from design to launch.
Austin Akers
Blog
GitHub Logo

The Old Product Loop Is the New Bottleneck

AI has made it dramatically cheaper to get software to a working version, but most companies still plan like building is the expensive part. The new bottleneck is the product loop: forming sharp hypotheses, living inside the user experience, fixing friction as it appears, and feeding evidence back into the roadmap faster than ticket-based planning allows.
Blog
AI has made it dramatically cheaper to get software to a working version, but most companies still plan like building is the expensive part. The new bottleneck is the product loop: forming sharp hypotheses, living inside the user experience, fixing friction as it appears, and feeding evidence back into the roadmap faster than ticket-based planning allows.
Person with short dark hair and moustache, wearing a colorful plaid shirt, smiling outdoors in a forested mountain landscape.
Aleks Haugom
Senior Manager of GTM
Blog

The Old Product Loop Is the New Bottleneck

AI has made it dramatically cheaper to get software to a working version, but most companies still plan like building is the expensive part. The new bottleneck is the product loop: forming sharp hypotheses, living inside the user experience, fixing friction as it appears, and feeding evidence back into the roadmap faster than ticket-based planning allows.
Aleks Haugom
May 2026
Blog

The Old Product Loop Is the New Bottleneck

AI has made it dramatically cheaper to get software to a working version, but most companies still plan like building is the expensive part. The new bottleneck is the product loop: forming sharp hypotheses, living inside the user experience, fixing friction as it appears, and feeding evidence back into the roadmap faster than ticket-based planning allows.
Aleks Haugom
Blog

The Old Product Loop Is the New Bottleneck

AI has made it dramatically cheaper to get software to a working version, but most companies still plan like building is the expensive part. The new bottleneck is the product loop: forming sharp hypotheses, living inside the user experience, fixing friction as it appears, and feeding evidence back into the roadmap faster than ticket-based planning allows.
Aleks Haugom
Livestream
GitHub Logo

2 Hour Build - Live Stream for Non-Developers

A non-developer's live stream walkthrough of building Flow State, a Colorado river-flow app for rafters, in two hours using ChatGPT dictation, Claude Code, Claude Design, and Harper. Scaffold with npm create harper@latest and deploy to Harper Fabric. No coding background required.
Livestream
A non-developer's live stream walkthrough of building Flow State, a Colorado river-flow app for rafters, in two hours using ChatGPT dictation, Claude Code, Claude Design, and Harper. Scaffold with npm create harper@latest and deploy to Harper Fabric. No coding background required.
Person with short dark hair and moustache, wearing a colorful plaid shirt, smiling outdoors in a forested mountain landscape.
Aleks Haugom
Senior Manager of GTM
Livestream

2 Hour Build - Live Stream for Non-Developers

A non-developer's live stream walkthrough of building Flow State, a Colorado river-flow app for rafters, in two hours using ChatGPT dictation, Claude Code, Claude Design, and Harper. Scaffold with npm create harper@latest and deploy to Harper Fabric. No coding background required.
Aleks Haugom
May 2026
Livestream

2 Hour Build - Live Stream for Non-Developers

A non-developer's live stream walkthrough of building Flow State, a Colorado river-flow app for rafters, in two hours using ChatGPT dictation, Claude Code, Claude Design, and Harper. Scaffold with npm create harper@latest and deploy to Harper Fabric. No coding background required.
Aleks Haugom
Livestream

2 Hour Build - Live Stream for Non-Developers

A non-developer's live stream walkthrough of building Flow State, a Colorado river-flow app for rafters, in two hours using ChatGPT dictation, Claude Code, Claude Design, and Harper. Scaffold with npm create harper@latest and deploy to Harper Fabric. No coding background required.
Aleks Haugom