Introducing Structon: Random-Access Binary Encoding for JavaScript

Deserializing entire records to read one field is a bottleneck at scale. Structon stores objects in a binary format where any field is reachable by byte offset, with lazy getters that never allocate until you access a property. It's the encoding Harper has used internally for years, now a standalone package.

Kris Zyp
SVP of Engineering
at Harper
June 1, 2026

When Harper evaluates query conditions, it frequently needs to access individual fields from potentially large numbers of stored records. For single-condition queries on an indexed field, Harper can scan the secondary index directly and never touch most records. But for multi-condition AND queries—say, age > 25 AND city == 'Denver'—only the first condition can be satisfied by an index scan; every candidate record still needs the city field checked. And for unindexed fields, there's no choice but to read records directly. Either way, the question becomes: how cheaply can you read one field from a stored record?

The naïve approach (deserialize the entire record into a JavaScript object, read the property, discard the object) works fine at small scale. At large scale it becomes the bottleneck: every allocation, every field decoded but never read, and every garbage collection those allocations trigger adds up.

Random-access encoding is the solution: store records in a binary format where any individual field can be read by jumping directly to its byte offset, without touching the rest of the record. We've been using this approach inside Harper for years, as a somewhat hidden capability within msgpackr. With the release of structon, we're making it a standalone, reusable package.

How It Works

The core idea is that objects with the same set of keys—the same "shape"—share a structure definition that describes the byte layout. The encoder builds up these structure definitions incrementally as it sees new shapes. Once a shape is known, every subsequent record of that shape writes only values, not field names. The decoder can identify the structure from a short header in the bytes, look up the field positions, and provide lazy getter properties that read directly from the raw buffer on demand.
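
To make the shared-shape idea concrete, here is a minimal sketch of a shape registry. It is illustrative only and is not structon's implementation: the `ShapeRegistry` class and its method names are hypothetical, and this sketch treats key order as part of the shape.

```javascript
// Illustrative only: a tiny shape registry, not structon's internals.
class ShapeRegistry {
  constructor() {
    this.idsBySignature = new Map(); // joined key names -> structure id
    this.structures = [];            // structure id -> ordered field names
  }

  // Returns the id for this object's shape, registering it on first sight.
  idFor(object) {
    const keys = Object.keys(object);
    const signature = keys.join('\u0000');
    let id = this.idsBySignature.get(signature);
    if (id === undefined) {
      id = this.structures.length;
      this.structures.push(keys);
      this.idsBySignature.set(signature, id);
    }
    return id;
  }

  // A stored record carries its structure id plus values only, no field names.
  toRecord(object) {
    return { id: this.idFor(object), values: Object.values(object) };
  }

  // Field names are recovered from the shared structure definition.
  fromRecord({ id, values }) {
    const keys = this.structures[id];
    return Object.fromEntries(keys.map((key, index) => [key, values[index]]));
  }
}

const registry = new ShapeRegistry();
const alice = registry.toRecord({ name: 'Alice', age: 30 });
const bob = registry.toRecord({ name: 'Bob', age: 41 });
const carol = registry.toRecord({ name: 'Carol', city: 'Denver' });

console.log(alice.id, bob.id, carol.id);   // 0 0 1
console.log(registry.structures.length);   // 2
console.log(registry.fromRecord(bob));     // { name: 'Bob', age: 41 }
```

Alice and Bob share one structure definition, so the second record adds nothing to the registry. Only Carol's new shape does.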

The binary layout for a struct record looks like:

[header: 1–4 bytes][fixed-width field section][variable ref section]

The fixed-width section has one slot per field:

  • Numbers get 1, 4, or 8 bytes depending on type
  • Dates get 8 bytes (stored as a float64)
  • Strings and nested objects store a small integer offset/length pair pointing into the ref section

The ref section follows: first the raw bytes of all strings, then the encoded bytes of any nested objects.
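
The layout can be sketched with a toy encoder. The slot widths, the one-byte header, the little-endian byte order, and the `encodeRecord` helper below are all assumptions chosen for illustration. They are not structon's actual wire format.

```javascript
// Toy layout for illustration; widths and header size are assumptions,
// not structon's actual wire format.
const SLOT_SIZE = { uint8: 1, float64: 8, string: 4 }; // string = uint16 offset + uint16 length
const HEADER_SIZE = 1; // one byte holding the structure id

function encodeRecord(structureId, structure, object) {
  const textEncoder = new TextEncoder();
  const fixedSize = structure.reduce((sum, field) => sum + SLOT_SIZE[field.type], 0);

  // Encode strings up front so the size of the ref section is known.
  const stringBytes = new Map();
  let refSize = 0;
  for (const field of structure) {
    if (field.type === 'string') {
      const encoded = textEncoder.encode(object[field.key]);
      stringBytes.set(field.key, encoded);
      refSize += encoded.length;
    }
  }

  const bytes = new Uint8Array(HEADER_SIZE + fixedSize + refSize);
  const view = new DataView(bytes.buffer);
  view.setUint8(0, structureId);

  const refStart = HEADER_SIZE + fixedSize;
  let slot = HEADER_SIZE; // write position in the fixed-width section
  let refOffset = 0;      // write position relative to the ref section
  for (const field of structure) {
    if (field.type === 'uint8') {
      view.setUint8(slot, object[field.key]);
    } else if (field.type === 'float64') {
      view.setFloat64(slot, object[field.key], true); // little-endian
    } else {
      const encoded = stringBytes.get(field.key);
      view.setUint16(slot, refOffset, true);
      view.setUint16(slot + 2, encoded.length, true);
      bytes.set(encoded, refStart + refOffset);
      refOffset += encoded.length;
    }
    slot += SLOT_SIZE[field.type];
  }
  return bytes;
}

const structure = [
  { key: 'age', type: 'uint8' },
  { key: 'score', type: 'float64' },
  { key: 'name', type: 'string' },
];
const encoded = encodeRecord(0, structure, { name: 'Alice', age: 30, score: 98.6 });
console.log(encoded.length); // 1 + (1 + 8 + 4) + 5 = 19
```

Because every slot has a known width, the position of each field is the same in every record of this shape. That is what makes direct jumps possible.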

When you access a property on the decoded object, the getter reads from those fixed offsets in the original buffer. You're not allocating a new string unless you actually access the property. You're not decoding a nested object unless you access it. For a query checking record.city == 'Denver', that's a short string read from the buffer—nothing else.
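
A rough sketch of the lazy-getter side, under the same kind of assumptions: the `LazyRecord` class, the field offsets, and the hand-built 19-byte buffer are hypothetical and only mimic the idea. Getters are defined once on the prototype, and a counter shows that the string is not decoded until it is read.

```javascript
// Illustrative decoder for a toy layout (an assumption, not structon's format):
// [structure id: 1 byte][age: uint8][score: float64 LE][name: uint16 offset, uint16 length][ref section]
const FIELDS = [
  { key: 'age', type: 'uint8', offset: 1 },
  { key: 'score', type: 'float64', offset: 2 },
  { key: 'name', type: 'string', offset: 10 },
];
const REF_START = 14;
const textDecoder = new TextDecoder();
let stringsDecoded = 0; // instrumentation: counts string allocations

class LazyRecord {
  constructor(bytes) {
    this.bytes = bytes;
    this.view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  }
}

// Getters live on the prototype, so they are created once per shape.
for (const field of FIELDS) {
  Object.defineProperty(LazyRecord.prototype, field.key, {
    get() {
      if (field.type === 'uint8') return this.view.getUint8(field.offset);
      if (field.type === 'float64') return this.view.getFloat64(field.offset, true);
      const start = REF_START + this.view.getUint16(field.offset, true);
      const length = this.view.getUint16(field.offset + 2, true);
      stringsDecoded++;
      return textDecoder.decode(this.bytes.subarray(start, start + length));
    },
  });
}

// Hand-build one record in the toy layout.
const bytes = new Uint8Array(19);
const writer = new DataView(bytes.buffer);
writer.setUint8(0, 0);            // structure id
writer.setUint8(1, 30);           // age
writer.setFloat64(2, 98.6, true); // score
writer.setUint16(10, 0, true);    // name offset within the ref section
writer.setUint16(12, 5, true);    // name length
bytes.set(new TextEncoder().encode('Alice'), REF_START);

const record = new LazyRecord(bytes);
const age = record.age;
const decodedAfterAge = stringsDecoded;
const name = record.name;
const decodedAfterName = stringsDecoded;

console.log(age, decodedAfterAge);   // 30 0
console.log(name, decodedAfterName); // Alice 1
```

Reading `age` touches one byte of the buffer and creates no string. The `name` string is only materialized on the line that asks for it.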

Why Not JSON, MessagePack, or BSON?

JSON requires full text parsing to reach any field. There's no notion of byte offsets; every character must be scanned. Random access to a field deep in a large JSON object isn't possible without parsing everything before it.

Plain MessagePack encodes fields as key-value pairs, sequentially. To read a specific field, you have to walk through the byte stream until you find it. This is faster than JSON text parsing, but it's still sequential—no fixed offsets, no direct jumps.
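
The cost of sequential traversal can be seen in a deliberately simplified key-value format. This is not real MessagePack: the length-prefixed layout and the `encodeSequential` and `findField` helpers are hypothetical stand-ins that only illustrate the walk.

```javascript
// Toy sequential format (not real MessagePack): each field is
// [key length: 1 byte][key bytes][value length: 1 byte][value bytes].
const textEncoder = new TextEncoder();
const textDecoder = new TextDecoder();

function encodeSequential(object) {
  const parts = [];
  for (const [key, value] of Object.entries(object)) {
    const keyBytes = textEncoder.encode(key);
    const valueBytes = textEncoder.encode(String(value));
    parts.push(keyBytes.length, ...keyBytes, valueBytes.length, ...valueBytes);
  }
  return Uint8Array.from(parts);
}

// Walks the stream field by field until the wanted key is found.
function findField(bytes, wantedKey) {
  let position = 0;
  let fieldsVisited = 0;
  while (position < bytes.length) {
    const keyLength = bytes[position];
    const key = textDecoder.decode(bytes.subarray(position + 1, position + 1 + keyLength));
    position += 1 + keyLength;
    const valueLength = bytes[position];
    fieldsVisited++;
    if (key === wantedKey) {
      const value = textDecoder.decode(bytes.subarray(position + 1, position + 1 + valueLength));
      return { value, fieldsVisited };
    }
    position += 1 + valueLength; // skip a value we did not need
  }
  return { value: undefined, fieldsVisited };
}

const bytes = encodeSequential({ id: 7, name: 'Alice', age: 30, city: 'Denver' });
const result = findField(bytes, 'city');
console.log(result); // { value: 'Denver', fieldsVisited: 4 }
```

Reaching `city` means reading the key and length of every field before it. The work grows with the field's position in the record rather than staying constant.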

BSON (MongoDB's format) does enable random-ish access via a length-prefix-then-skip traversal, but it stores field names inside every document. For a table of 100,000 records that all have the fields { id, name, age, email, createdAt }, BSON repeats those five field name strings in every single stored document. That's a lot of redundant bytes, both on disk and in memory, and you still have to traverse them sequentially to find what you want.
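
As a back-of-the-envelope check on that redundancy, the snippet below counts only the field-name bytes for the example table. It relies on BSON storing each element name as a NUL-terminated string, and it ignores type bytes and values, which any format has to represent in some form.

```javascript
// BSON stores each element as: 1 type byte, the field name as a
// NUL-terminated string, then the value. Only name bytes and their
// terminators are counted here.
const fieldNames = ['id', 'name', 'age', 'email', 'createdAt'];
const recordCount = 100_000;
const textEncoder = new TextEncoder();

const nameBytesPerDocument = fieldNames.reduce(
  (sum, fieldName) => sum + textEncoder.encode(fieldName).length + 1, // +1 for the NUL terminator
  0,
);
const totalNameBytes = nameBytesPerDocument * recordCount;

console.log(nameBytesPerDocument); // 28
console.log(totalNameBytes);       // 2800000
```

For this small schema that is roughly 2.8 MB spent on repeated names alone. Longer or more numerous field names scale the figure proportionally.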

Structon shares structure definitions across all records with the same shape. The field names are stored once; individual records store only values, laid out at fixed offsets. The result is records that are more compact than BSON and faster to read than anything requiring sequential traversal.

Standalone Package, Previously Hidden

This encoding has been part of msgpackr since early on, as the struct.js module enabled by the randomAccessStructure: true option. It wasn't designed as a standalone API: it registered global hooks into msgpackr's encode/decode path and wasn't intended for independent use.

With msgpackr 2.0, we've removed struct.js from msgpackr entirely. structon is now where this functionality lives. It's also more general: instead of being coupled to msgpackr's internals, structon provides a createStructon(BaseClass) factory that wraps any compatible encoder class—msgpackr's Packr or cbor-x's Encoder—and adds struct encoding on top.

Usage

Install the package:

npm install structon

Then wrap your base encoder:

import { Packr } from 'msgpackr';
import { createStructon } from 'structon';

const Structon = createStructon(Packr);

// Pass structures: [] to enable shared structure persistence
const codec = new Structon({ structures: [] });

const encoded = codec.encode({ name: 'Alice', age: 30, score: 98.6 });
const record = codec.decode(encoded);

// Property access is lazy — reads directly from the binary buffer
console.log(record.age);   // 30
console.log(record.name);  // 'Alice'

The same factory works with cbor-x:

import { Encoder } from 'cbor-x';
import { createStructon } from 'structon';

const Structon = createStructon(Encoder);
const codec = new Structon({ structures: [] });

If you were previously using msgpackr with randomAccessStructure: true, migrating to structon is the path forward with msgpackr 2.0.

Persistence

Structure definitions need to persist across process restarts to remain useful—otherwise every new process starts with no known shapes and can't read previously encoded data. structon integrates with msgpackr's and cbor-x's standard persistence hooks (getStructures/saveStructures for msgpackr, getShared/saveShared for cbor-x). How you persist them—RocksDB, LMDB, a local file—is up to you.
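
A minimal sketch of the persistence hooks, using the msgpackr hook names mentioned above. Several things here are assumptions: a `Map` stands in for a durable store, `createPersistenceHooks` is a hypothetical helper, and JSON is used for serialization only to keep the example dependency-free. Structon's structure definitions may carry data that does not survive a JSON round trip, so a real setup might serialize them with the codec itself.

```javascript
// A Map stands in for whatever durable store you use (RocksDB, LMDB, a file).
const store = new Map();
const STRUCTURES_KEY = 'structures';

// Hooks in the shape msgpackr documents: getStructures() returns the saved
// list (or an empty one), saveStructures(structures) writes the current list.
function createPersistenceHooks(backingStore) {
  return {
    getStructures() {
      const saved = backingStore.get(STRUCTURES_KEY);
      return saved ? JSON.parse(saved) : [];
    },
    saveStructures(structures) {
      // JSON is an assumption made for this sketch.
      backingStore.set(STRUCTURES_KEY, JSON.stringify(structures));
    },
  };
}

const hooks = createPersistenceHooks(store);
const structuresBeforeSave = hooks.getStructures();
hooks.saveStructures([['name', 'age'], ['name', 'city']]);

// A "restarted process" builds fresh hooks over the same store and
// recovers the shapes it needs to decode older records.
const structuresAfterRestart = createPersistenceHooks(store).getStructures();
console.log(structuresBeforeSave.length);   // 0
console.log(structuresAfterRestart.length); // 2

// Assumed wiring, based on the hook names in the text (not run here):
// const codec = new Structon({ structures: [], ...hooks });
```

The important property is that the saved list is loaded before any previously encoded record is decoded. In a concurrent system the save step also needs to be coordinated with writes, which this sketch does not attempt.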

In Harper, structure definitions are stored in table metadata in the database. When a record with a new shape is first encoded, Harper saves that structure definition so all future records with the same shape can reuse it without any overhead. After the first encounter of a given shape, the encoding path therefore costs effectively the same as plain MessagePack. You can also update record structures or schemas and use heterogeneous record shapes without affecting existing data, because each record retains a reference to the structure it needs; this makes it easy to migrate to new structures. If you use structon or msgpackr directly, you need to track and persist these structures yourself. Tracking stored structures can be complicated, especially when intertwined with transactions and concurrency, but Harper handles this for you.

(The only scenario where random-access struct encoding is likely a net negative is highly dynamic data where nearly every record has a different set of property names—which is generally a sign the data would be better modeled as a Map anyway.)

Performance

The performance advantage shows up most clearly in filtered query workloads: multi-condition queries that need to check one or two fields from each candidate record, and full scans of large tables where most fields are never needed. With random-access struct encoding, those access patterns skip deserialization of everything they don't read, and the GC pressure from allocating intermediate objects that are immediately discarded goes away.

What's Next

Structon is published on npm. The source is at github.com/HarperFast/structon. The binary format is identical to what msgpackr's struct.js produced, so data encoded with earlier versions of Harper or msgpackr with randomAccessStructure: true is fully readable by structon—and vice versa.
