Harper vs. Standard Microservices: Performance Comparison Benchmark

A detailed performance benchmark comparing a traditional microservices architecture with Harper’s unified runtime. Using a real, fully functional e-commerce application, this report examines latency, scalability, and architectural overhead across homepage, category, and product pages, highlighting the real-world performance implications between two different styles of distributed systems.

Aleks Haugom
Senior Manager of GTM
at Harper
December 22, 2025
Explore the complete source code and k6 benchmarks used to test both architectures side-by-side: https://github.com/HarperFast/harper-vs-microservices-perf-test 

Executive Summary

This benchmark compares two functionally equivalent e-commerce applications built with different architectural approaches: a Unified Harper Runtime and a Standard Microservices Stack. Both implementations serve identical data models, UI components, and user workflows, differing only in their underlying architecture. Testing was conducted using k6 across three page types—Homepage, Product Listing Page (PLP), and Product Detail Page (PDP)—at concurrency levels of 20, 200, and 2,000 virtual users (VUs).

Key findings: Harper demonstrated sub-millisecond p50 latency (0.73 ms) on the homepage at 2,000 VUs while maintaining a 100% success rate across all tests. The microservices stack failed completely at 200+ VUs in the PLP and PDP scenarios, reaching 100% error rates due to timeouts and connection exhaustion. Even at low concurrency (20 VUs), the microservices architecture showed roughly 2.5-2.8x higher p50 latency on the homepage and PDP, and far more on the PLP, due to network serialization overhead between services.

It's important to note that these tests were conducted on localhost, representing a best-case scenario for microservices. In production environments with physical network infrastructure, routers, switches, and inter-datacenter links, the performance gap would be significantly larger.

Note: Harper is designed as a distributed system. Each node can process requests end-to-end on its own. Distribution comes from running multiple nodes across regions, ensuring users always have low-latency access to data. Harper automatically handles the complexity of keeping all nodes in sync.

What This Repo Contains

  • Two functionally equivalent e-commerce applications: One built on Harper's unified runtime, one on a standard microservices architecture
  • Interactive UI with architecture toggle: Allows visual confirmation of functional parity between implementations
  • Comprehensive k6 benchmark suite: Automated testing across multiple page types and concurrency levels
  • Complete test result artifacts: JSON outputs from all benchmark runs (18 test scenarios)
  • This detailed write-up: Analysis, methodology, and interpretation of results

Architecture Overview

Microservices Stack

The standard microservices implementation follows conventional distributed architecture patterns:

  • Four independent services: Catalog, Inventory, Pricing, and Reviews services, each running as separate Fastify processes
  • Backend-for-Frontend (BFF) Gateway: Orchestrates requests across services, aggregating data from multiple sources
  • Separate MongoDB collections: Each service maintains its own database collection, enforcing data isolation
  • Network hops on every request: Even on localhost, each page load requires HTTP calls from the BFF to 2-4 backend services, with JSON serialization/deserialization at each boundary
  • Connection pooling overhead: Each service maintains its own database connection pool and HTTP client pool

Harper Unified Runtime

Harper's architecture eliminates distributed system overhead through unification:

  • Single runtime process: Database, application logic, and caching coexist in the same memory space
  • Direct memory access: Application code interacts with data via in-process references, not network calls
  • Zero internal serialization: Data flows between layers without JSON encoding/decoding
  • Efficient aggregation: Complex queries that join across entities (products + inventory + pricing + reviews) execute as single operations without network round-trips
  • Unified resource management: Single connection pool, single event loop, single memory heap

Why This Comparison Is Valid

  • Identical data model: Both stacks use the same product catalog (~1,200 items), with identical relationships between products, inventory, pricing, and reviews
  • Identical API contracts: Both expose the same REST endpoints with identical JSON response structures
  • Identical UI: Both serve the same static frontend assets and page templates
  • Same hardware: All tests run on the same Apple M1 Max machine with 32GB RAM
  • Only architecture differs: The sole variable is how the application logic, data persistence, and aggregation are organized—unified vs. distributed

Test Environment

Hardware:

  • Apple M1 Max (10-core CPU)
  • 32GB RAM
  • macOS

Network Context: All tests were conducted on localhost. This represents the best-case scenario for microservices, as it eliminates:

  • Physical network latency (typically 0.1-1ms within a datacenter)
  • Network switch/router overhead
  • Packet loss and retransmission
  • Network congestion
  • Cross-availability-zone or cross-region latency

In production environments, microservices would face additional overhead from physical network infrastructure, load balancers, service meshes, and potential cross-datacenter communication. The performance gaps observed in this localhost benchmark would be significantly larger in real-world deployments.

Methodology

Load Testing Tool: k6 (open-source load testing framework)

Concurrency Levels: Tests were executed at three concurrency levels:

  • 20 VUs: Low load, representing typical daytime traffic
  • 200 VUs: Medium load, simulating peak shopping hours
  • 2,000 VUs: High load, stress-testing architectural limits

Metrics Collected:

  • p50 (median) latency: The latency at which 50% of requests complete
  • p95 latency: The latency at which 95% of requests complete (captures tail latency)
  • Throughput: Requests per second
  • Error rate: Percentage of failed requests (timeouts, connection failures, 5xx errors)
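
As a rough illustration of how these four metrics can be derived from raw request samples, the sketch below uses the nearest-rank percentile method. k6 computes its own summary statistics and may interpolate differently; the `Sample` type, the `summarize` helper, and the sample values are hypothetical and are not taken from the benchmark repository.

```python
import math
from dataclasses import dataclass


@dataclass
class Sample:
    duration_ms: float  # time to complete one request
    ok: bool            # False for timeouts, connection failures, 5xx


def percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of the data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(samples, test_seconds):
    durations = [s.duration_ms for s in samples]
    failed = sum(1 for s in samples if not s.ok)
    return {
        "p50_ms": percentile(durations, 50),
        "p95_ms": percentile(durations, 95),
        "throughput_rps": len(samples) / test_seconds,
        "error_rate_pct": 100 * failed / len(samples),
    }


# Hypothetical data: 20 requests over 2 seconds, one of which failed.
samples = [Sample(duration_ms=float(i), ok=(i != 20)) for i in range(1, 21)]
summary = summarize(samples, test_seconds=2)
print(summary)
```

Note that in this sketch failed requests still contribute their duration to the percentiles, which is one reason latency figures from runs with high error rates need careful reading.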

Test Scenarios:

  • Homepage: Displays category tiles and hero content
  • PLP (Product Listing Page): Shows 20-50 products per category with basic details
  • PDP (Product Detail Page): Aggregates product data, inventory, pricing, and reviews
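
The 18 test scenarios in the result artifacts follow from crossing the two stacks with the three page types and three concurrency levels. A minimal sketch of that matrix; the labels and dictionary keys are illustrative and may not match the names used in the repository.

```python
from itertools import product

# Illustrative labels, not the repository's actual identifiers.
STACKS = ["microservices", "harper"]
PAGES = ["homepage", "plp", "pdp"]
VUS = [20, 200, 2000]

scenarios = [
    {"stack": stack, "page": page, "vus": vus}
    for stack, page, vus in product(STACKS, PAGES, VUS)
]
print(len(scenarios))  # 18
```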

Note on Errors: High error rates in the microservices stack at 200+ VUs indicate saturation—the architecture reached its coordination limits, resulting in timeouts and connection pool exhaustion. This is not a code bug but an architectural ceiling.

Results

Highlights

6.1 Summary Table: p50 Latency by Page Type and Concurrency

| Page Type | Concurrency (VUs) | Microservices Stack | Harper Unified Stack | Performance Gap |
|---|---|---|---|---|
| Homepage | 20 | 7.25 ms | 2.94 ms | -59.4% |
| Homepage | 200 | 7.07 ms | 0.88 ms | -87.6% |
| Homepage | 2,000 | 2.31 ms | 0.73 ms | -68.4% |
| PLP | 20 | 40,740.46 ms* | 5.04 ms | -99.99% |
| PLP | 200 | (100% Error) | 3.13 ms | Harper: 100% Success |
| PLP | 2,000 | (100% Error) | 244.69 ms | Harper: 100% Success |
| PDP | 20 | 8.62 ms* | 3.12 ms | -63.9% |
| PDP | 200 | (100% Error) | 1.39 ms | Harper: 100% Success |
| PDP | 2,000 | (100% Error) | 2.01 ms | Harper: 100% Success |

*Microservices experienced partial failures (3.3-35.3% error rates) even at 20 VUs for PLP and PDP.
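
The Performance Gap column appears to be the relative change in latency from the microservices stack to Harper, so a negative value means Harper was faster. A minimal sketch under that assumption follows; `performance_gap` is a hypothetical helper, and the published figures may have been computed from unrounded measurements, so recomputing from the rounded table values can differ in the last digit.

```python
def performance_gap(microservices_ms, harper_ms):
    """Relative latency change from microservices to Harper, in percent."""
    return (harper_ms - microservices_ms) / microservices_ms * 100


# Homepage p50 values from the table above.
for vus, micro, harper in [(20, 7.25, 2.94), (200, 7.07, 0.88), (2000, 2.31, 0.73)]:
    print(f"{vus} VUs: {performance_gap(micro, harper):.1f}%")
# 20 VUs: -59.4%
# 200 VUs: -87.6%
# 2000 VUs: -68.4%
```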

6.2 Summary Table: p95 Latency by Page Type and Concurrency

| Page Type | Concurrency (VUs) | Microservices Stack | Harper Unified Stack | Performance Gap |
|---|---|---|---|---|
| Homepage | 20 | 10.21 ms | 6.55 ms | -35.8% |
| Homepage | 200 | 15.37 ms | 7.04 ms | -54.2% |
| Homepage | 2,000 | 45.45 ms | 15.03 ms | -66.9% |
| PLP | 20 | 59,997.80 ms* | 15.68 ms | -99.97% |
| PLP | 200 | (100% Error) | 44.98 ms | Harper: 100% Success |
| PLP | 2,000 | (100% Error) | 845.63 ms | Harper: 100% Success |
| PDP | 20 | 20.43 ms* | 7.22 ms | -64.6% |
| PDP | 200 | (100% Error) | 15.90 ms | Harper: 100% Success |
| PDP | 2,000 | (100% Error) | 41.26 ms | Harper: 100% Success |

*Microservices experienced partial failures even at 20 VUs for PLP and PDP.


Saturation Summary: Microservices Failure Points

| Page Type | Failure Concurrency | Error Rate | Error Types |
|---|---|---|---|
| Homepage | 2,000 VUs | 0.02% | Minor connection timeouts |
| PLP | 20 VUs | 35.3% | Timeouts, connection pool exhaustion |
| PLP | 200 VUs | 100% | Complete service failure |
| PLP | 2,000 VUs | 100% | Complete service failure |
| PDP | 20 VUs | 3.3% | Intermittent timeouts |
| PDP | 200 VUs | 100% | Complete service failure |
| PDP | 2,000 VUs | 100% | Complete service failure |

Critical observation: These failures occurred on localhost, where network latency is near-zero. In production environments with real network infrastructure, the saturation point would occur at even lower concurrency levels.

Key Takeaways

  1. Sub-millisecond performance at scale: At 2,000 concurrent users, Harper achieved a p50 latency of 0.73 ms on the homepage and 2.01 ms on the PDP, showing that a unified architecture can hold low latency under load.
  2. Catastrophic microservices failure: The microservices stack reached 100% error rates at 200 VUs for PLP and PDP, indicating an architectural ceiling rather than a tuning problem. The coordination overhead of distributed services created a hard limit on scalability.
  3. Localhost is best-case for microservices: All tests ran on localhost, eliminating physical network latency. Production deployments would show even larger performance gaps due to network overhead, making the unified architecture's advantage more pronounced.

Interpretation

The Fragmentation Tax

Every request in the microservices architecture pays a "fragmentation tax" consisting of:

  • Network serialization: Data must be encoded to JSON, transmitted over HTTP, and decoded at each service boundary
  • Connection overhead: TCP handshakes, connection pool management, and socket operations for each inter-service call
  • Coordination latency: The BFF must wait for responses from multiple services before assembling the final response
  • Context switching: Each service runs in a separate process, requiring OS-level context switches

Even on localhost with near-zero network latency, these overheads compound. A single PDP request requires the BFF to call 4 services (Catalog, Inventory, Pricing, Reviews), resulting in 4 network round-trips, 8 JSON serialization operations, and coordination logic to merge results. This overhead is visible even at 20 VUs, where the microservices stack shows roughly 2.5-2.8x higher p50 latency than Harper on the homepage and PDP.
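
One way to arrive at the figures of 4 round-trips and 8 JSON operations is to count one encode on the service side and one decode on the BFF side for each call. The sketch below simulates that counting in a single process; the service stand-ins, field names, and helper functions are hypothetical, and request bodies and the final BFF-to-browser response are deliberately left out of the count.

```python
import json

# Hypothetical in-memory stand-ins for the four backend services.
SERVICES = {
    "catalog":   lambda sku: {"sku": sku, "name": "Example product"},
    "inventory": lambda sku: {"sku": sku, "in_stock": 12},
    "pricing":   lambda sku: {"sku": sku, "price": 19.99},
    "reviews":   lambda sku: {"sku": sku, "rating": 4.5},
}

counters = {"round_trips": 0, "json_ops": 0}


def encode(obj):
    counters["json_ops"] += 1
    return json.dumps(obj)


def decode(text):
    counters["json_ops"] += 1
    return json.loads(text)


def call_service(name, sku):
    """Simulate one HTTP hop: the service encodes its response, the BFF decodes it."""
    counters["round_trips"] += 1
    wire_body = encode(SERVICES[name](sku))  # service side
    return decode(wire_body)                 # BFF side


def bff_product_detail(sku):
    """Fan out to all four services and merge the results into one view."""
    return {name: call_service(name, sku) for name in SERVICES}


page = bff_product_detail("sku-1")
print(counters)  # {'round_trips': 4, 'json_ops': 8}
```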

Why Harper Stays Low-Latency

Harper's unified runtime eliminates the fragmentation tax entirely:

  • Shared memory space: Application logic and data coexist in the same process, allowing direct memory references instead of network calls
  • Zero internal serialization: Data flows between layers as native JavaScript objects, not JSON strings
  • Single-operation aggregation: Complex queries that join products, inventory, pricing, and reviews execute as single database operations without coordination logic
  • Unified resource management: A single event loop and connection pool eliminate the overhead of managing multiple processes

The result is single-digit-millisecond p50 latency in eight of the nine scenarios, and sub-millisecond p50 latency on the homepage at 200 and 2,000 VUs. The exception is the PLP at 2,000 VUs, where p50 rose to 244.69 ms. Harper completed every request at every load level.
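
To make "direct memory references" concrete, the sketch below assembles a product detail view from objects that already live in the same process. It is written in Python purely as an illustration and is not Harper's API; the table names, fields, and `product_detail` function are hypothetical.

```python
# Hypothetical in-process tables keyed by SKU; names and fields are illustrative.
PRODUCTS = {"sku-1": {"name": "Example product"}}
INVENTORY = {"sku-1": {"in_stock": 12}}
PRICING = {"sku-1": {"price": 19.99}}
REVIEWS = {"sku-1": [{"rating": 5}, {"rating": 4}]}


def product_detail(sku):
    """Assemble the page view from references to objects already in memory."""
    return {
        "product": PRODUCTS[sku],
        "inventory": INVENTORY[sku],
        "pricing": PRICING[sku],
        "reviews": REVIEWS[sku],
    }


view = product_detail("sku-1")
# The view holds the same objects, not copies: nothing was encoded or decoded.
print(view["pricing"] is PRICING["sku-1"])  # True
```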

Failure Is Overhead, Not Defect

The 100% error rates observed in the microservices stack at 200+ VUs are not bugs—they are architectural saturation. The distributed coordination required to serve each request creates a ceiling on throughput:

  • Connection pool exhaustion: Each service has a finite connection pool; under high load, services wait for available connections
  • Timeout cascades: When one service slows down, upstream services time out waiting for responses, triggering retries that amplify load
  • Coordination bottleneck: The BFF becomes a coordination bottleneck, spending CPU cycles managing concurrent requests to multiple downstream services

This saturation occurred on localhost, where network latency is minimal. In production, with real network infrastructure, the saturation point would occur at much lower concurrency levels.

Business Implication

Research shows that ~100ms of additional latency correlates with ~1% reduction in conversion rates for e-commerce applications. The performance gaps observed in this benchmark translate directly to business impact:

  • At 200 VUs, Harper's homepage responds in 0.88ms (p50) vs. microservices' 7.07ms—a 6.19ms difference that could impact user experience
  • For PLP and PDP, microservices fail entirely at 200+ VUs, resulting in 100% lost conversions during peak traffic

While this benchmark does not measure conversion rates directly, the latency and reliability differences suggest meaningful business consequences for high-traffic e-commerce platforms.

Limitations

  • Localhost best-case: All tests ran on localhost, eliminating physical network latency. This represents the best possible scenario for microservices. Production environments with real network infrastructure would show larger performance gaps.
  • Workload-specific: This benchmark focuses on e-commerce page loads with data aggregation. Other workloads (e.g., write-heavy operations, long-running transactions) may show different performance characteristics.
  • Not a universal verdict: Microservices architectures offer benefits beyond raw performance, including organizational boundaries, independent deployment, and technology diversity. This benchmark isolates performance and does not evaluate those trade-offs.
  • Measurement precision: k6's summary output provides p90 and p95 percentiles but not the full percentile distribution. Detailed analysis of tail latency beyond p95 would require raw time-series data or a custom k6 output configuration.
  • Single hardware platform: All tests ran on Apple M1 Max. Results may vary on different CPU architectures, memory configurations, or operating systems.

Conclusion

This benchmark demonstrates that architectural choices have profound performance implications. The Harper unified runtime achieved 100% success rates across all test scenarios, with sub-millisecond p50 latency on the homepage at 200 and 2,000 VUs. The microservices stack reached 100% error rates on the PLP and PDP at moderate concurrency (200+ VUs) and showed roughly 2.5-2.8x higher p50 latency on the homepage and PDP even at low load.

The performance gap stems from fundamental architectural differences: Harper eliminates network hops, serialization overhead, and distributed coordination by colocating application logic and data in a single runtime. The microservices stack, despite running on localhost in ideal conditions, pays a "fragmentation tax" on every request.

For organizations prioritizing performance, system efficiency, and operational simplicity, the unified runtime model offers clear advantages. For teams that value organizational boundaries, independent deployment cycles, and technology diversity, microservices may justify the performance trade-off. This benchmark provides data to inform that decision, showing the concrete costs of distributed architectures in latency, reliability, and scalability.

Readers should interpret these results in context: localhost testing represents the best-case scenario for microservices. Production environments with real network infrastructure would amplify the performance gaps observed here, making the unified architecture's advantages even more pronounced.


