Your API cache is secretly a database

Most teams treat a cache as a black box: URL-keyed blobs with a TTL, useful for speed and nothing else. In Harper, cached data lands in a real table inside the same query engine. That means filtering, joining, real-time subscriptions, and vector search all work against it.
Kris Zyp
SVP of Engineering
at Harper
June 3, 2026

Caching is usually the first thing a team reaches for when an API gets slow, and the last thing they think about afterward. You put a cache in front of an endpoint, responses get faster, and the work is done. The cache becomes a black box: opaque response bodies keyed by URL, with a TTL bolted on, sitting in front of the origin. It does exactly one thing.

That framing is what limits caching to a narrow role. The moment you need to do anything with the cached data, such as filter it, reshape it for a client, join it to something else, or push updates to a browser, the cache has nothing to offer. You go back to the origin, or you stand up another system. So caching gets treated as a tactical speed fix rather than as the start of a data layer.

In Harper the cache is a table. Not a sidecar in front of a table. The cached data lands in a real table, in the same process as the query engine. Once the data is there, everything Harper can do to a table, it can do to your cache. That changes what a cache is good for.

The rest of this post walks through that progression on a single cached API, from a drop-in read-through cache to a queryable, relational, real-time, semantically searchable data layer. The upstream is a public products API, and none of the steps touch it. There's a complete, tested example to follow along with in the caching template.

Start with an ordinary cache

A cache table is a table with an expiration and a source. The source is just a resource that knows how to fetch a record:

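// Source resource: knows how to fetch one product from the upstream API.
class ProductAPI extends Resource {
  async get() {
    const response = await fetch(`https://dummyjson.com/products/${this.getId()}`);
    // Surface upstream failures instead of caching an error body as a record.
    if (!response.ok) throw new Error(`Upstream returned ${response.status}`);
    return response.json();
  }
}
// Cache misses on ProductCache are resolved through ProductAPI.
tables.ProductCache.sourcedFrom(ProductAPI);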

type ProductCache @table(expiration: 3600) {
  id: Int @primaryKey
  title: String
  price: Float
  # ...the fields you care about
}

This is the part everyone already does. A read for an uncached id triggers one upstream fetch and stores the result; every read after that is served locally until the TTL expires. Concurrent misses for the same id collapse into a single upstream request, so a cold key under load doesn't stampede the origin. So far, nothing unusual. This is a cache.

The difference is where the data ended up. It's in a table, with a schema, in the database.
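To make the request-collapsing behavior described above concrete, here is a minimal sketch in plain JavaScript that runs without Harper. It is not Harper's implementation; createReadThroughCache, fetchRecord, and ttlMs are hypothetical names chosen for illustration.

```javascript
// Conceptual sketch only; not Harper's implementation.
// fetchRecord(id) is a hypothetical async function that calls the upstream API.
function createReadThroughCache(fetchRecord, ttlMs, now = Date.now) {
  const entries = new Map(); // id -> { value, expiresAt }
  const inFlight = new Map(); // id -> promise for a pending upstream fetch

  return function get(id) {
    // Fresh hit: serve locally without touching the upstream.
    const hit = entries.get(id);
    if (hit && hit.expiresAt > now()) return Promise.resolve(hit.value);

    // Miss: reuse a pending fetch so concurrent misses share one upstream call.
    let pending = inFlight.get(id);
    if (!pending) {
      pending = Promise.resolve()
        .then(() => fetchRecord(id))
        .then((value) => {
          entries.set(id, { value, expiresAt: now() + ttlMs });
          return value;
        })
        .finally(() => inFlight.delete(id));
      inFlight.set(id, pending);
    }
    return pending;
  };
}
```

Three simultaneous calls to get(1) on a cold cache would invoke fetchRecord once, and all three callers receive the same record.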

Query the data you cached

Because the cache is a table, it answers queries the origin API may never have supported. Mark the fields you filter and sort on as indexed, and Harper's REST query language works directly against the cached records:

GET /ProductCache/?price=le=50&sort(-rating)&limit(10)&select(id,title,price,rating)

Filters, sorting, pagination, field projection, and boolean grouping, all over data that arrived as plain JSON from an endpoint that offered none of it. You didn't build a query layer; you cached some data and got one.
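For readers new to the query syntax, the sketch below spells out what the request above asks for, written as plain JavaScript over an in-memory array. The sample records and the queryProducts helper are hypothetical, and Harper evaluates the query against indexes rather than scanning like this.

```javascript
// Hypothetical sample of already-cached records.
const cached = [
  { id: 1, title: 'Mascara', price: 9.99, rating: 4.9, stock: 5 },
  { id: 2, title: 'Sofa', price: 1200, rating: 4.2, stock: 2 },
  { id: 3, title: 'Lipstick', price: 12.5, rating: 3.8, stock: 0 },
  { id: 4, title: 'Tumbler', price: 50, rating: 4.5, stock: 40 },
];

// Illustrative helper mirroring each part of the REST query.
function queryProducts(records, { maxPrice, limit, fields }) {
  return records
    .filter((r) => r.price <= maxPrice) // price=le=50
    .sort((a, b) => b.rating - a.rating) // sort(-rating): highest rating first
    .slice(0, limit) // limit(10)
    .map((r) => Object.fromEntries(fields.map((f) => [f, r[f]]))); // select(...)
}

const result = queryProducts(cached, {
  maxPrice: 50,
  limit: 10,
  fields: ['id', 'title', 'price', 'rating'],
});
console.log(result.map((r) => r.id)); // ids 1, 4, 3 in that order
```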

There is a caveat worth stating plainly. Read-through caching populates one record per request, so a query only sees what's been cached so far. To make the whole collection queryable, you warm it once from the upstream's list endpoint. That's a deliberate step rather than magic, but it's a few lines, and from then on the cache behaves like any other table.
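The warm-up step can be sketched as a small paging loop. This is an illustration under stated assumptions, not the template's code: warmCache, fetchPage, and putRecord are hypothetical names, and the upstream is assumed to page with limit/skip and return an object shaped like { products, total }.

```javascript
// Pages through an upstream list endpoint and stores every record.
// fetchPage(skip, limit) and putRecord(record) are injected so the loop
// stays independent of any particular HTTP client or database API.
async function warmCache(fetchPage, putRecord, pageSize = 100) {
  let skip = 0;
  let stored = 0;
  for (;;) {
    const { products, total } = await fetchPage(skip, pageSize);
    for (const product of products) {
      await putRecord(product);
      stored += 1;
    }
    skip += products.length;
    // Stop on an empty page or once every record has been seen.
    if (products.length === 0 || skip >= total) return stored;
  }
}

// One possible fetchPage for the products API used in this post
// (assumes limit/skip query parameters and a { products, total } body).
const fetchDummyJsonPage = (skip, limit) =>
  fetch(`https://dummyjson.com/products?limit=${limit}&skip=${skip}`).then((r) => r.json());
```

Inside a Harper component, putRecord would write each record into the cache table; the exact call is left to the caching template rather than guessed at here.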

Reshape it at the edge

Clients rarely want data in exactly the shape the origin returns. Normally that reshaping lives in a backend-for-frontend service you write and operate. In Harper it's a computed field, resolved on read from the record's own attributes:

salePrice: Float @computed(from: "price - price * (discountPercentage || 0) / 100")
inStock: Boolean @computed(from: "stock > 0")

No extra storage, no upstream change, and the computed fields are queryable and selectable like any other. That's a BFF in two lines. Heavier reshaping (renaming, nesting, merging several sources into one response) means extending the table resource and overriding get(), but the simple cases don't need it.
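As a quick check of the arithmetic, here are plain-JavaScript equivalents of the two computed expressions applied to a hypothetical record. The function names simply mirror the field names; nothing here is Harper API.

```javascript
// Same expressions as the computed fields, written as functions of a record.
const salePrice = (r) => r.price - (r.price * (r.discountPercentage || 0)) / 100;
const inStock = (r) => r.stock > 0;

const product = { price: 20, discountPercentage: 10, stock: 0 }; // hypothetical record

console.log(salePrice(product)); // 18
console.log(inStock(product)); // false
// A record without discountPercentage falls back to 0, so the price is unchanged.
console.log(salePrice({ price: 20, stock: 3 })); // 20
```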

Relate two cached APIs

Cache a second resource and you can join them. Define a relationship by the field that links them:

type ProductCache @table(expiration: 3600) {
  categoryInfo: CategoryCache @relationship(from: "category")
}

Now you can embed the related record or filter by its fields:

GET /ProductCache/1?select(title,categoryInfo{slug,name})
GET /ProductCache/?categoryInfo.slug=beauty&limit(5)

Harper resolves the relationship locally, with no N+1 round-trips back to the origin to assemble a joined view. Two independently cached APIs become one normalized, low-latency read model. (Related fields are selected with braces; parentheses are reserved for query functions like select and sort.)
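Conceptually, the two requests above amount to a local lookup on the linking field. The sketch below shows that in plain JavaScript with hypothetical sample data; it assumes the category cache is keyed by the same slug that products store in category, which may differ in your schema.

```javascript
// Hypothetical cached records; in this sketch the category cache is keyed by slug.
const categories = new Map([
  ['beauty', { slug: 'beauty', name: 'Beauty' }],
  ['furniture', { slug: 'furniture', name: 'Furniture' }],
]);
const products = [
  { id: 1, title: 'Mascara', category: 'beauty' },
  { id: 2, title: 'Sofa', category: 'furniture' },
];

// Resolve the relationship by looking up the linking field locally.
const withCategory = (p) => ({ ...p, categoryInfo: categories.get(p.category) });

// Roughly: GET /ProductCache/1?select(title,categoryInfo{slug,name})
const one = withCategory(products.find((p) => p.id === 1));
const embedded = { title: one.title, categoryInfo: one.categoryInfo };

// Roughly: GET /ProductCache/?categoryInfo.slug=beauty&limit(5)
const beauty = products
  .map(withCategory)
  .filter((p) => p.categoryInfo.slug === 'beauty')
  .slice(0, 5);

console.log(embedded, beauty.map((p) => p.title));
```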

Stream it

An exported table publishes a WebSocket/SSE endpoint at the same path as its REST endpoint, so clients can subscribe to a record and react to every change with no additional code. Pair that with push invalidation, so the cache reflects upstream changes the moment you learn about them rather than when a timer happens to expire:

// drop a cached entry on demand; the next read refetches from source
static async post(target, data) {
  if ((await data)?.action === 'invalidate') {
    await this.invalidate(target);
    return { status: 'invalidated' };
  }
}

const ws = new WebSocket('ws://localhost:9926/ProductCache/5');
ws.onmessage = (e) => console.log('product 5 changed:', JSON.parse(e.data));

The result is REST for reads and the same model streamed live over WebSocket, MQTT, or SSE, all from one resource definition. Few systems offer that combination of interfaces over shared data without gluing several products together.
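For completeness, here is one way an upstream webhook or admin script might call the invalidation handler shown above. This is a sketch under assumptions: buildInvalidateRequest is a hypothetical helper, the handler is assumed to be reachable at POST /ProductCache/<id> on the same port as the WebSocket example, and any authentication your instance requires is omitted.

```javascript
// Builds the HTTP request for the invalidation handler.
// Assumes the handler is mounted at POST /ProductCache/<id>; adjust to your routing.
function buildInvalidateRequest(baseUrl, id) {
  return {
    url: `${baseUrl}/ProductCache/${encodeURIComponent(id)}`,
    init: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ action: 'invalidate' }),
    },
  };
}

// Usage (needs a running instance, so it is left commented out):
// const { url, init } = buildInvalidateRequest('http://localhost:9926', 5);
// const response = await fetch(url, init);
```

Keeping the request construction separate from the fetch call makes it easy to log or test the payload before pointing it at a live instance.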

Search it by meaning

The same cached records can be searched semantically. Add a vector index, embed each record's text on the way in, and query with a natural-language prompt:

embedding: [Float] @indexed(type: "HNSW", distance: "cosine")

return tables.ProductCache.search({
  select: ['id', 'title', 'price', '$distance'],
  sort: { attribute: 'embedding', target: await embed(query) },
  limit: 5,
});

A query for "something to keep my drink cold" returns tumblers and coolers, ranked by distance, with no keyword overlap required and no separate vector database to operate alongside the cache. Generating the embeddings does currently reach outside the database: you point embed() at an external provider (OpenAI, a local Ollama model, or similar) to turn text into vectors, and Harper stores and searches them. That last gap is closing. Built-in embedding generation on Fabric is coming (harper#510), so the embedding step will move in-process alongside the storage and search.
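To show what "ranked by distance" means, the sketch below ranks records by cosine distance with a brute-force scan in plain JavaScript. It is an illustration only: cosineDistance and nearest are hypothetical helpers, the embeddings are toy two-dimensional vectors, and "1 minus cosine similarity" is one common definition that may not match the exact value Harper reports. An HNSW index approximates this ranking without scanning every record.

```javascript
// Cosine distance as 1 - cosine similarity (one common definition).
function cosineDistance(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Brute-force nearest neighbours: score every record, sort ascending, keep the top few.
function nearest(records, target, limit) {
  return records
    .map((r) => ({ id: r.id, title: r.title, $distance: cosineDistance(r.embedding, target) }))
    .sort((x, y) => x.$distance - y.$distance)
    .slice(0, limit);
}

// Toy 2-dimensional embeddings; real ones have hundreds of dimensions.
const records = [
  { id: 1, title: 'Tumbler', embedding: [1, 0] },
  { id: 2, title: 'Sofa', embedding: [0, 1] },
  { id: 3, title: 'Cooler', embedding: [3, 1] },
];

console.log(nearest(records, [1, 0], 2).map((r) => r.title)); // Tumbler first, then Cooler
```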

Why this matters

Teams adopt caching because it's easy to justify and easy to insert: minimally invasive, immediate performance, low risk. That's a good way to get in the door. The usual problem is that a plain cache stops there. It's a dead end with no path to anything more valuable.

A cache that's a table doesn't dead-end. The structured data you cached for speed is already queryable, already reshapeable, already joinable, already streamable, already searchable. Each step above was a few lines, and none of them required touching the origin or adopting a new system. The easy "land" of an API cache and the deeper "expand" of a real data layer turn out to be the same thing. You just keep using more of what's already there.

That's the part I think is underappreciated, including by us at times. The cache you added to make an endpoint faster is already the foundation for the next several features. The details still need decisions: warm-up, write-through, embedding providers, and eventual freshness. But the shape holds. A working version of everything here is in the caching template if you want to start from something that runs.

