What’s New in Harper 4.6: A Deep Dive Into Vector Indexing, Logging, and Performance

Harper 4.6 introduces native vector indexing (HNSW), developer-first logging, and a plugins API—boosting semantic search, debugging, and modularity, all with minimal performance impact. Ideal for AI-native apps and scaling with ease.

By Nenne Nwodo, Developer Relations
July 9, 2025

Harper 4.6 introduces a powerful trio of enhancements designed for modern application builders: native vector indexing, granular developer-first logging, and performance improvements optimized for global scale. In this Q&A-style companion to our latest webinar, Harper Field CTO Jaxon Repp sat down with Nenne Nwodo to unpack what these upgrades mean in practice—and how they simplify complex tasks without sacrificing power.

How has search evolved over the last 18 months?

Search has shifted from rigid keyword matching to flexible, AI-powered retrieval. What used to be a matter of SQL LIKE queries has now become a matter of meaning—semantic search. As data grows and user expectations rise, developers need search systems that are both fast and meaning-aware.

What’s the difference between semantic search and vector search?

Semantic search is the umbrella term—it's about retrieving content based on meaning rather than exact matches. Vector search is a technique within that umbrella. By transforming data into high-dimensional vectors, it allows systems to compute "closeness" between concepts, not just keywords.
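
To make "closeness" concrete, here is a minimal sketch using cosine similarity, one common way to compare vectors. The three-dimensional vectors and item names are made up for illustration; real embedding models produce far larger vectors, and nothing here reflects Harper's internal implementation.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Hypothetical embeddings (assumption): hand-written values, not model output.
embeddings = {
    "running shoes": [0.9, 0.1, 0.0],
    "trail sneakers": [0.8, 0.2, 0.1],
    "coffee grinder": [0.0, 0.1, 0.9],
}

def rank(query_vector, items):
    """Order item names from most to least similar to the query vector."""
    return sorted(
        items,
        key=lambda name: cosine_similarity(query_vector, items[name]),
        reverse=True,
    )

ranking = rank([1.0, 0.0, 0.0], embeddings)
```

The two footwear items rank ahead of the coffee grinder even though they share no keywords, which is the behavior keyword matching cannot provide.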

Why does Harper 4.6 use HNSW for vector indexing?

HNSW (Hierarchical Navigable Small World) offers a strong trade-off between query speed and accuracy. It organizes vectors into a layered graph, which lets Harper retrieve approximate nearest neighbors by following links between nearby vectors instead of comparing the query against every record. This keeps search fast even as datasets grow.
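
The core idea of avoiding exhaustive comparisons can be sketched with a greedy walk over a proximity graph. This is a deliberately simplified toy, not HNSW itself and not Harper's code: it uses a single layer, a hand-built grid graph, and no candidate list, whereas HNSW builds its graph incrementally and adds a hierarchy of sparser layers on top.

```python
import math

def build_grid_graph(width, height):
    """Toy proximity graph (assumption): each grid point links to its 4 neighbors."""
    points = {(x, y): (float(x), float(y)) for x in range(width) for y in range(height)}
    neighbors = {}
    for (x, y) in points:
        candidates = [(x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)]
        neighbors[(x, y)] = [c for c in candidates if c in points]
    return points, neighbors

def greedy_search(points, neighbors, entry, query):
    """Walk the graph, always moving to the neighbor closest to the query.

    Returns the node where the walk stops and the number of distance
    computations performed along the way.
    """
    current = entry
    current_dist = math.dist(points[current], query)
    comparisons = 1
    while True:
        best, best_dist = current, current_dist
        for n in neighbors[current]:
            d = math.dist(points[n], query)
            comparisons += 1
            if d < best_dist:
                best, best_dist = n, d
        if best == current:
            # No neighbor is closer, so the walk ends here.
            return current, comparisons
        current, current_dist = best, best_dist

points, neighbors = build_grid_graph(30, 30)
query = (21.3, 7.8)
found, comparisons = greedy_search(points, neighbors, entry=(0, 0), query=query)

# Brute force for reference: compares the query against all 900 points.
exact = min(points, key=lambda p: math.dist(points[p], query))
```

On this regular grid the greedy walk reaches the true nearest point while touching only a fraction of the dataset. On real, irregular data a greedy walk can stop at a near neighbor rather than the nearest one, which is why results from graph indexes are described as approximate.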

Are other indexing algorithms supported?

HNSW is the default because it’s well-established and broadly effective. However, Harper’s modular design makes it easy to add custom indexing algorithms if your data or search patterns demand it. Support for multiple embedding models—like OpenAI or local Ollama models—is already built in.

What are some real-world use cases for vector search?

  • E-commerce recommendations: Search by intent, not rigid product filters.
  • Code similarity detection: Identify related code snippets or modules.
  • Video indexing: Search by visual or spoken content in real time.
  • Personalized recommendations: Match patterns in taste or behavior, not just explicit metadata.

What makes Harper's hybrid search capabilities unique?

Harper’s query planner dynamically decides whether to apply vector or attribute filters first, based on dataset cardinality. This hybrid approach blends structured and unstructured search for more accurate and performant results—all abstracted behind a simple developer experience.
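
A rough sketch of that decision follows. The `choose_strategy` rule, the 10% threshold, and the `Product` record are all hypothetical, chosen only to illustrate ordering by cardinality; Harper's actual planner logic is not described in this article.

```python
import math
from dataclasses import dataclass

@dataclass
class Product:
    id: int
    category: str
    embedding: tuple

def choose_strategy(matching_rows, total_rows, selectivity_threshold=0.1):
    """Hypothetical rule: filter first when the attribute filter matches few rows."""
    if total_rows == 0:
        return "filter_first"
    selectivity = matching_rows / total_rows
    return "filter_first" if selectivity <= selectivity_threshold else "vector_first"

def hybrid_search(products, category, query, k, selectivity_threshold=0.1):
    def by_distance(p):
        return math.dist(p.embedding, query)

    # A real planner would estimate this count from index statistics
    # instead of scanning; the scan keeps the sketch self-contained.
    matching = [p for p in products if p.category == category]
    strategy = choose_strategy(len(matching), len(products), selectivity_threshold)
    if strategy == "filter_first":
        # Few rows survive the filter, so rank only those by distance.
        results = sorted(matching, key=by_distance)[:k]
    else:
        # Many rows match, so visit rows nearest-first and keep those that pass the filter.
        results = [p for p in sorted(products, key=by_distance) if p.category == category][:k]
    return strategy, results

# 2 of 50 products are shoes (selective filter); 48 of 50 are books (broad filter).
products = [Product(i, "shoes" if i < 2 else "books", (float(i), 0.0)) for i in range(50)]
shoe_strategy, shoe_results = hybrid_search(products, "shoes", (1.2, 0.0), k=1)
book_strategy, book_results = hybrid_search(products, "books", (1.2, 0.0), k=2)
```

Both paths return the same rows in this exact toy; the point is the order of operations, which determines how much work each query does on a large dataset.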

How does Harper handle updates or deletions in vector indexes?

Vector embeddings are recalculated automatically on record updates using Harper’s dynamic field feature. Deleted records are removed from the index. Bulk re-indexing is also supported for cases like swapping embedding models or tuning algorithm parameters.
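
The lifecycle described here (recompute on update, remove on delete, rebuild on a model swap) can be sketched with a small in-memory class. `EmbeddedIndex` and both stand-in embedding functions are hypothetical; they illustrate the bookkeeping, not Harper's dynamic field API.

```python
class EmbeddedIndex:
    """Toy in-memory store that keeps a derived vector in sync with each record."""

    def __init__(self, embed):
        self._embed = embed    # function: text -> list of floats
        self._records = {}     # record id -> source text
        self._vectors = {}     # record id -> derived embedding

    def put(self, record_id, text):
        # Insert or update: the vector is always recomputed from the new text.
        self._records[record_id] = text
        self._vectors[record_id] = self._embed(text)

    def delete(self, record_id):
        # Remove the record and its vector together so no stale entry remains.
        self._records.pop(record_id, None)
        self._vectors.pop(record_id, None)

    def reindex(self, embed):
        # Bulk re-index, for example after swapping the embedding model.
        self._embed = embed
        self._vectors = {rid: embed(text) for rid, text in self._records.items()}

    def vector(self, record_id):
        return self._vectors.get(record_id)


def letter_counts(text):
    """Stand-in embedding (assumption): counts of the letters a, b, and c."""
    return [float(text.count(ch)) for ch in "abc"]

def length_only(text):
    """A second stand-in 'model', used to demonstrate re-indexing."""
    return [float(len(text))]

index = EmbeddedIndex(letter_counts)
index.put("r1", "abca")
index.put("r1", "bbc")   # the update replaces the old vector
index.put("r2", "cab")
index.delete("r2")       # the vector disappears with the record
```

Calling `index.reindex(length_only)` afterwards rebuilds every stored vector with the new function, which mirrors the bulk re-indexing case.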

What’s new in Harper 4.6 logging and why does it matter?

Harper 4.6 introduces a revamped, developer-friendly logging system:

  • Full HTTP request tracing through the entire stack
  • Per-component configurability
  • Log shape customization
  • Live updates to logging settings without restarts

This is especially valuable for debugging complex workflows in production, without compromising system stability.
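
As an analogy only, the same three ideas (per-component levels, a custom log shape, and changing settings without a restart) look like this with Python's standard `logging` module. The component names are invented, and this is not Harper's configuration syntax; see the Harper docs for the real settings.

```python
import io
import logging

def make_component_logger(name, stream, level):
    """Create a logger for one component with its own level and line format."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    logger.propagate = False  # keep output confined to this component's handler
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    # Log shape: component name, level, then the message.
    handler.setFormatter(logging.Formatter("%(name)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger

output = io.StringIO()
http_log = make_component_logger("demo.http", output, logging.WARNING)
search_log = make_component_logger("demo.search", output, logging.DEBUG)

http_log.debug("request received")         # suppressed: demo.http is at WARNING
search_log.debug("vector query planned")   # emitted: demo.search is at DEBUG

# Raise one component's verbosity while the process keeps running.
http_log.setLevel(logging.DEBUG)
http_log.debug("request received")         # emitted after the live change

lines = output.getvalue().splitlines()
```

Only the component under investigation becomes verbose, so the rest of the system keeps its normal log volume.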

What’s the new Plugins API, and how is it different from extensions?

Plugins replace extensions as the go-forward abstraction for reusable logic. Unlike extensions, plugins:

  • Are dynamically loaded
  • Register via a single-handle method
  • Avoid being loaded by components that don’t need them
  • Simplify how components interface with shared functionality

Extensions are still supported for now, but will eventually be deprecated.
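
The loading behavior in the list above can be illustrated with a small registry that creates a plugin only when a component first asks for it. Everything in this sketch (`PluginRegistry`, the two toy plugins, the component's plugin list) is hypothetical and language-agnostic in spirit; it is not the Harper Plugins API.

```python
class PluginRegistry:
    """Toy registry: plugins are created lazily, the first time a component asks."""

    def __init__(self):
        self._factories = {}  # plugin name -> factory returning a handler function
        self._loaded = {}     # plugin name -> handler, populated on first use

    def register(self, name, factory):
        self._factories[name] = factory

    def handler_for(self, name):
        if name not in self._loaded:
            self._loaded[name] = self._factories[name]()
        return self._loaded[name]

    def loaded_plugins(self):
        return sorted(self._loaded)


def make_uppercase_plugin():
    # The plugin exposes one entry point: a function that handles a payload.
    return lambda payload: payload.upper()

def make_reverse_plugin():
    return lambda payload: payload[::-1]

registry = PluginRegistry()
registry.register("uppercase", make_uppercase_plugin)
registry.register("reverse", make_reverse_plugin)

# This hypothetical component declares only the "uppercase" plugin.
component_plugins = ["uppercase"]
result = "hello"
for plugin_name in component_plugins:
    result = registry.handler_for(plugin_name)(result)
```

The "reverse" plugin is registered but never instantiated, because no component requested it.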

How does Harper address performance degradation over time?

Two key challenges affect long-term performance: massive data growth and rising user expectations. Harper addresses both with:

  • Horizontal scale and intelligent sharding
  • Real-time indexing optimizations
  • Component-level replication
  • A query planner that adapts based on data and workload

All while preserving the simplicity developers expect from document databases like MongoDB.

What’s the performance cost of enhanced logging?

Virtually none—unless you deliberately configure it that way. Logging in 4.6 is opt-in, per component, and can be fine-tuned for shape and frequency. Harper minimizes disk writes and avoids logging overload, ensuring observability without sacrificing performance.

What are common gotchas when migrating from multi-system architectures?

The most common friction comes from mindset shifts, not technical blockers. Harper consolidates multi-service stacks into a single, composable platform, eliminating synchronization issues and latency from chained services. The challenge is unlearning old patterns—once users build their first endpoint, the benefits quickly become obvious.

How important are GPUs in Harper's vector pipeline?

GPUs accelerate the generation of vector embeddings, especially at scale. But Harper supports CPU-based embedding, local testing, and token-based APIs (like OpenAI) as well. You choose the performance/cost tradeoff that fits your use case.

How does Harper handle conflict resolution in active-active replication?

Harper uses CRDTs with versioning and last-writer-wins logic to resolve write conflicts, although simultaneous microsecond-level updates to the same record are rare. This keeps data consistent across distributed nodes without degrading performance.
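
Last-writer-wins can be sketched as a merge function over versioned values. The `Version` fields and the node-id tie-breaker are assumptions made for this example; the article does not specify how Harper orders writes with identical timestamps.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Version:
    value: str
    timestamp: float  # time of the write, as recorded by the writing node
    node_id: str      # tie-breaker when timestamps are equal (assumption)

def merge(a, b):
    """Last-writer-wins: keep the version with the larger (timestamp, node_id)."""
    return a if (a.timestamp, a.node_id) >= (b.timestamp, b.node_id) else b

# Two nodes accept conflicting writes to the same record.
from_node_a = Version("shipped", timestamp=100.000002, node_id="node-a")
from_node_b = Version("cancelled", timestamp=100.000001, node_id="node-b")

# Each node merges what it holds with what it receives, in opposite order.
state_on_a = merge(from_node_a, from_node_b)
state_on_b = merge(from_node_b, from_node_a)
```

Because the merge gives the same answer regardless of argument order and is safe to repeat, every node converges on the same value no matter the order in which updates arrive.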

Final Takeaway

Harper 4.6 represents a leap forward in developer experience, search capability, and system visibility. Whether you’re building AI-native search experiences or maintaining mission-critical APIs, this release gives you the tools to simplify your stack and scale with confidence.

Check out the docs and try the new features. Feedback? Questions? We’re on LinkedIn, X, Threads, BlueSky, and Slack. You can also contact us directly through our contact form.


