What’s New in Harper 4.6: A Deep Dive Into Vector Indexing, Logging, and Performance

Harper 4.6 introduces native vector indexing (HNSW), developer-first logging, and a plugins API—boosting semantic search, debugging, and modularity, all with minimal performance impact. Ideal for AI-native apps and scaling with ease.

By Nenne Nwodo, Developer Relations
July 9, 2025

Harper 4.6 introduces a powerful trio of enhancements designed for modern application builders: native vector indexing, granular developer-first logging, and performance improvements optimized for global scale. In this Q&A-style companion to our latest webinar, Harper Field CTO Jaxon Repp sat down with Nenne Nwodo to unpack what these upgrades mean in practice—and how they simplify complex tasks without sacrificing power.

How has search evolved over the last 18 months?

Search has shifted from rigid keyword matching to flexible, AI-powered retrieval. What used to be a matter of SQL LIKE queries has now become a matter of meaning—semantic search. As data grows and user expectations rise, developers need search systems that are both fast and meaning-aware.

What’s the difference between semantic search and vector search?

Semantic search is the umbrella term—it's about retrieving content based on meaning rather than exact matches. Vector search is a technique within that umbrella. By transforming data into high-dimensional vectors, it allows systems to compute "closeness" between concepts, not just keywords.
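To make "closeness" concrete, here is a minimal sketch of cosine similarity, one common way to compare two vectors. The three-dimensional vectors and item names are invented for illustration; real embedding models produce hundreds or thousands of dimensions, and nothing here is Harper's API.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical toy "embeddings" (assumption: hand-written, not model output).
EMBEDDINGS = {
    "running shoes": [0.9, 0.1, 0.0],
    "trail sneakers": [0.8, 0.2, 0.1],
    "coffee maker": [0.0, 0.1, 0.9],
}


def rank(query: list[float]) -> list[str]:
    """Return item names ordered from most to least similar to the query."""
    return sorted(
        EMBEDDINGS,
        key=lambda name: cosine_similarity(query, EMBEDDINGS[name]),
        reverse=True,
    )
```

Calling `rank([1.0, 0.0, 0.0])` places the two footwear items ahead of the coffee maker even though they share no keywords, which is the behavior vector search relies on.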

Why does Harper 4.6 use HNSW for vector indexing?

HNSW (Hierarchical Navigable Small World) offers a strong trade-off between query speed and accuracy. It structures vector data as a layered graph so that Harper can retrieve approximate nearest neighbors without comparing the query against every stored vector. This keeps search fast even as datasets grow.
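The core idea can be sketched with a single graph layer: link each point to a few neighbors, then walk greedily toward the query. This is a deliberately simplified illustration, not HNSW itself (which stacks several such layers and keeps a candidate list) and not Harper's implementation. The grid data and function names are assumptions chosen so the walk is easy to follow.

```python
import math

Point = tuple[float, float]


def build_knn_graph(points: list[Point], k: int) -> list[list[int]]:
    """Link every point to its k nearest neighbours (exact, O(n^2) build)."""
    graph = []
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: math.dist(p, points[j]))
        graph.append(others[:k])
    return graph


def greedy_search(
    points: list[Point], graph: list[list[int]], query: Point, entry: int = 0
) -> tuple[int, int]:
    """Walk the graph toward the query.

    Returns (index of the point where the walk stops, distance evaluations).
    The walk can stop at a local minimum, which is why the result is
    approximate in general.
    """
    current = entry
    current_dist = math.dist(points[current], query)
    evaluations = 1
    while True:
        best, best_dist = current, current_dist
        for neighbour in graph[current]:
            d = math.dist(points[neighbour], query)
            evaluations += 1
            if d < best_dist:
                best, best_dist = neighbour, d
        if best == current:  # no neighbour is closer, so stop here
            return current, evaluations
        current, current_dist = best, best_dist


# Hypothetical data set: a 10 x 10 grid of 2-D points.
POINTS: list[Point] = [(float(x), float(y)) for x in range(10) for y in range(10)]
GRAPH = build_knn_graph(POINTS, k=4)
```

On this grid, a search for `(7.2, 3.9)` starting from the corner reaches `(7.0, 4.0)` while evaluating fewer distances than a full scan of all 100 points. The extra layers in HNSW exist to shorten that walk further on large data sets.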

Are other indexing algorithms supported?

HNSW is the default because it’s well-established and broadly effective. However, Harper’s modular design makes it easy to add custom indexing algorithms if your data or search patterns demand it. Support for multiple embedding models—like OpenAI or local Ollama models—is already built in.

What are some real-world use cases for vector search?

  • E-commerce recommendations: Search by intent, not rigid product filters.
  • Code similarity detection: Identify related code snippets or modules.
  • Video indexing: Search by visual or spoken content in real time.
  • Personalized recommendations: Match patterns in taste or behavior, not just explicit metadata.

What makes Harper's hybrid search capabilities unique?

Harper’s query planner dynamically decides whether to apply vector or attribute filters first, based on dataset cardinality. This hybrid approach blends structured and unstructured search for more accurate and performant results—all abstracted behind a simple developer experience.
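As a rough sketch of that decision, the example below picks a plan from how many records match the attribute filter. The 10% threshold, the `Record` shape, and the brute-force distance ranking are all assumptions made for illustration; Harper's actual planner and its cost model are not described in this post.

```python
import math
from dataclasses import dataclass


@dataclass
class Record:
    id: int
    category: str
    embedding: list[float]


def choose_plan(matching: int, total: int, threshold: float = 0.1) -> str:
    """Pick 'filter_first' when the attribute filter is selective."""
    if total == 0:
        return "filter_first"
    return "filter_first" if matching / total <= threshold else "vector_first"


def hybrid_search(
    records: list[Record], category: str, query: list[float], k: int
) -> tuple[str, list[int]]:
    """Return (plan used, ids of the k nearest records in the category)."""

    def distance(record: Record) -> float:
        return math.dist(record.embedding, query)

    # A real database would read this count from index statistics
    # instead of scanning; the scan keeps the sketch self-contained.
    matching = sum(1 for r in records if r.category == category)
    plan = choose_plan(matching, len(records))

    if plan == "filter_first":
        # Few rows match: filter, then rank only the survivors.
        candidates = [r for r in records if r.category == category]
        hits = sorted(candidates, key=distance)[:k]
    else:
        # Many rows match: visit records nearest-first, keep matches,
        # and stop as soon as k have been collected.
        hits = []
        for r in sorted(records, key=distance):
            if len(hits) >= k:
                break
            if r.category == category:
                hits.append(r)
    return plan, [r.id for r in hits]
```

Both plans return the same records here because the ranking is exact. The point of choosing between them is cost: ranking a handful of filtered rows is cheap, while filtering a short nearest-first stream is cheaper when most rows would pass the filter anyway.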

How does Harper handle updates or deletions in vector indexes?

Vector embeddings are recalculated automatically on record updates using Harper’s dynamic field feature. Deletions are cleanly removed from the index. Even bulk re-indexing is supported for cases like swapping embedding models or tuning algorithm parameters.
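The lifecycle described above can be sketched as a store that treats the embedding as a derived field. Everything in this example is hypothetical: `toy_embed` is a stand-in for a real embedding model, and `VectorStore` is not Harper's dynamic field feature, only an illustration of recompute-on-write, remove-on-delete, and bulk re-indexing.

```python
import string
from collections import Counter
from typing import Callable

Embedder = Callable[[str], list[float]]


def toy_embed(text: str) -> list[float]:
    """Stand-in for an embedding model: normalised letter frequencies."""
    counts = Counter(c for c in text.lower() if c in string.ascii_lowercase)
    total = sum(counts.values()) or 1
    return [counts[c] / total for c in string.ascii_lowercase]


class VectorStore:
    def __init__(self, embed: Embedder = toy_embed) -> None:
        self._embed = embed
        self.records: dict[str, str] = {}
        self.index: dict[str, list[float]] = {}

    def put(self, key: str, text: str) -> None:
        """Insert or update a record; the embedding is recomputed every time."""
        self.records[key] = text
        self.index[key] = self._embed(text)

    def delete(self, key: str) -> None:
        """Remove the record and its vector together."""
        self.records.pop(key, None)
        self.index.pop(key, None)

    def reindex(self, embed: Embedder) -> None:
        """Swap the embedding function and rebuild every vector."""
        self._embed = embed
        self.index = {key: embed(text) for key, text in self.records.items()}
```

Because callers only ever write the source text, the vector cannot drift out of sync with the record it describes.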

What’s new in Harper 4.6 logging and why does it matter?

Harper 4.6 introduces a revamped, developer-friendly logging system:

  • Full HTTP request tracing through the entire stack
  • Per-component configurability
  • Log shape customization
  • Live updates to logging settings without restarts

This is especially valuable for debugging complex workflows in production, without compromising system stability.
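Three of those ideas (per-component levels, a custom log shape, and changing levels at runtime) can be illustrated with Python's standard `logging` module. This is an analogy only: the component names `app.http` and `app.replication` and the JSON shape are assumptions, and Harper's own logging configuration is set through its config rather than through this API.

```python
import json
import logging
from typing import TextIO


class JsonFormatter(logging.Formatter):
    """Custom log shape: one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps(
            {
                "level": record.levelname,
                "component": record.name,
                "message": record.getMessage(),
            }
        )


def configure(stream: TextIO) -> None:
    """Quiet default for every component, verbose for the HTTP component."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())

    app = logging.getLogger("app")
    app.handlers = [handler]
    app.propagate = False
    app.setLevel(logging.WARNING)  # default inherited by child loggers

    logging.getLogger("app.http").setLevel(logging.DEBUG)  # per-component override


def set_component_level(component: str, level: int) -> None:
    """Change one component's verbosity while the process keeps running."""
    logging.getLogger(component).setLevel(level)
```

After `configure(sys.stderr)`, debug messages from `app.http` are emitted while those from `app.replication` are dropped, until `set_component_level("app.replication", logging.DEBUG)` turns them on without a restart.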

What’s the new Plugins API, and how is it different from extensions?

Plugins replace extensions as the go-forward abstraction for reusable logic. Unlike extensions, plugins:

  • Are dynamically loaded
  • Register via a single handler method
  • Avoid being loaded by components that don’t need them
  • Simplify how components interface with shared functionality

Extensions are still supported for now, but will eventually be deprecated.
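The general pattern behind those bullets, a single entry point per plugin plus loading only on first use, can be sketched as follows. `PluginRegistry`, `register`, and `handle` are hypothetical names for this illustration and do not reflect the signatures of Harper's Plugins API.

```python
from typing import Callable

Handler = Callable[[dict], dict]


class PluginRegistry:
    """Each plugin exposes one handler; it is built only when first needed."""

    def __init__(self) -> None:
        self._factories: dict[str, Callable[[], Handler]] = {}
        self._loaded: dict[str, Handler] = {}

    def register(self, name: str, factory: Callable[[], Handler]) -> None:
        """Record how to build the plugin without building it yet."""
        self._factories[name] = factory

    def handle(self, name: str, request: dict) -> dict:
        """Load the plugin on first use, then pass the request to its handler."""
        if name not in self._loaded:
            self._loaded[name] = self._factories[name]()
        return self._loaded[name](request)

    @property
    def loaded(self) -> list[str]:
        """Names of the plugins that have actually been loaded."""
        return sorted(self._loaded)


def make_auth_plugin() -> Handler:
    """Hypothetical plugin: marks every request as authenticated."""
    return lambda request: {**request, "authenticated": True}


def make_metrics_plugin() -> Handler:
    """Hypothetical plugin: tags every request as counted."""
    return lambda request: {**request, "counted": True}
```

A component that only calls `registry.handle("auth", ...)` never triggers `make_metrics_plugin`, which mirrors the idea that plugins are not loaded by components that do not need them.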

How does Harper address performance degradation over time?

Two key challenges affect long-term performance: massive data growth and rising user expectations. Harper addresses both with:

  • Horizontal scale and intelligent sharding
  • Real-time indexing optimizations
  • Component-level replication
  • A query planner that adapts based on data and workload

All of this comes while preserving the simplicity developers expect from document databases like MongoDB.

What’s the performance cost of enhanced logging?

Virtually none—unless you deliberately configure it that way. Logging in 4.6 is opt-in, per component, and can be fine-tuned for shape and frequency. Harper minimizes disk writes and avoids logging overload, ensuring observability without sacrificing performance.

What are common gotchas when migrating from multi-system architectures?

The most common friction comes from mindset shifts, not technical blockers. Harper consolidates multi-service stacks into a single, composable platform, eliminating synchronization issues and latency from chained services. The challenge is unlearning old patterns—once users build their first endpoint, the benefits quickly become obvious.

How important are GPUs in Harper's vector pipeline?

GPUs accelerate the generation of vector embeddings, especially at scale. But Harper supports CPU-based embedding, local testing, and token-based APIs (like OpenAI) as well. You choose the performance/cost tradeoff that fits your use case.

How does Harper handle conflict resolution in active-active replication?

Harper uses CRDTs with versioning and last-writer-wins logic to resolve write conflicts, although simultaneous microsecond-level updates to the same record are rare. This keeps data consistent across distributed nodes without degrading performance.
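A last-writer-wins register is one of the simplest CRDTs, and a minimal sketch shows why replicas converge. The `Versioned` type, the integer timestamp, and the node-id tie-breaker are assumptions for illustration; the post does not describe Harper's actual version format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Versioned:
    value: str
    timestamp: int  # e.g. microseconds since the epoch (assumption)
    node_id: str    # tie-breaker when two timestamps collide (assumption)


def merge(a: Versioned, b: Versioned) -> Versioned:
    """Last-writer-wins: keep the write with the greater (timestamp, node_id).

    Assumes a given (timestamp, node_id) pair identifies exactly one write,
    which makes the merge commutative, associative, and idempotent.
    """
    return a if (a.timestamp, a.node_id) >= (b.timestamp, b.node_id) else b
```

Because `merge` gives the same answer regardless of the order in which replicas exchange writes, every node ends up with the same value once all writes have been delivered.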

Final Takeaway

Harper 4.6 represents a leap forward in developer experience, search capability, and system visibility. Whether you’re building AI-native search experiences or maintaining mission-critical APIs, this release gives you the tools to simplify your stack and scale with confidence.

Check out the docs and try the new features. Feedback? Questions? We’re on LinkedIn, X, Threads, BlueSky, and Slack. You can also contact us directly through our contact form.
