Myth Busted: Parallelization in our Node.js Database

This article debunks the myth that Node.js can’t handle parallelization because it’s single-threaded. It shows how Harper’s Node-based database uses worker threads and other techniques to achieve parallel processing and high performance.
Eli Palmer
Engineer
at Harper
December 19, 2017

Like any heavily adopted programming language, Node.js has its critics. Some of the criticism is accurate, but there’s one specific complaint that really grinds my gears.

“JavaScript is single-threaded and therefore can’t do parallelization.” There are variations on this depending on how granular or pedantic the critic gets, but the idea is the same: the event loop is single-threaded, so Node is single-threaded. Comparing a single Node process to a single Java process, this is true, and by design. But since we at Harper have built a Node.js database, it would be pretty crazy if we weren’t able to do parallelization.

However, Node provides a way around this by offering child processes and clustering in its core modules. A few lines of boilerplate let us create a cluster of Node processes, one per core of your machine. With some debatably minor architectural planning, we can parallelize nearly any computation or operation without the hassles of shared memory. What’s better is that clustering is not constrained to the host machine’s cores: one can create a cluster of processes, or a cluster of clusters, distributed across a network.

Cluster initialization looks like this:

const cluster = require('cluster');
const http = require('http');
const numCPUs = require('os').cpus().length;

if (cluster.isMaster) { // renamed cluster.isPrimary in Node 16+
  console.log(`Master ${process.pid} is running`);

  // Fork workers.
  for (let i = 0; i < numCPUs; i++) {
    cluster.fork();
  }

  cluster.on('exit', (worker, code, signal) => {
    console.log(`worker ${worker.process.pid} died`);
  });
} else {
  // Workers can share any TCP connection
  // In this case it is an HTTP server
  http.createServer((req, res) => {
    res.writeHead(200);
    res.end('hello world\n');
  }).listen(8000);

  console.log(`Worker ${process.pid} started`);
}

It is that easy. This code was taken directly from the cluster docs; you can find the core clustering documentation here. The cluster module duplicates the process running this code as many times as cluster.fork() is called. The master process runs the first branch: it forks the workers, then simply sits and listens on a port. When a request comes in, it passes the request to one of the forked workers, which run the else branch.

Clusters are excellent when you need to perform the same task repeatedly, such as serving web requests. Clusters run the same code across all processes, and even though we can’t use shared memory, we can send messages between processes.

But what if we need to run different code? We can’t spawn a cluster every time we need to pass some work off to another thread. Here Node provides the ability to spawn child processes that execute the command or module specified at initialization. This trivial code creates 4 child processes, each of which executes the code in hello.js.

// app.js
const child_process = require('child_process');

// ...

// For simplicity we hard-coded 4 as the number of processes we want to spawn.
// We could also use numCPUs = require('os').cpus().length to query the system
// for the number of cores available.
for (let i = 0; i < 4; i++) {
  let my_process = child_process.fork('hello.js');

  my_process.on('message', (msg) => {
    console.log(`Parent received message from pid ${my_process.pid}`);
    console.log(`${msg}`);
  });
}

As mentioned above, child processes and cluster workers can communicate with the parent process via an inter-process communication channel. In our example, each child process runs the hello.js module and sends a message to its parent, identifying itself.

// hello.js

console.log(`Hello world, I am ${process.pid}`);
process.send(`Hi mom, I am process ${process.pid}`);

Our output shows:

Hello world, I am 1200
Hello world, I am 1198
Parent received message from pid 1200
Hi mom, I am process 1200
Parent received message from pid 1198
Hi mom, I am process 1198
Hello world, I am 1199
Parent received message from pid 1199
Hi mom, I am process 1199
Hello world, I am 1197
Parent received message from pid 1197
Hi mom, I am process 1197

From our output, we can see the results of calling process.send() in our child processes. Node uses the handler we defined in my_process.on() to consume each message and act on it. Rather than using shared memory as we would in a more traditional language, we pass data over this inter-process communication channel to facilitate parallelism. We will get deeper into the details and performance gains of parallelism in a later blog post. By giving us more direct control over parallelism and threading, however, we feel that Node.js actually lets us better control resource usage, which is critical for IoT database use cases, as we discuss in our blog Growing Pains with Industrial IoT.


