The Definitive Guide for Monitoring Node.js Applications

In the previous chapters of Node.js at Scale we learned how you can get Node.js testing and TDD right, and how you can use Nightwatch.js for end-to-end testing.

In this article, we will learn about running and monitoring Node.js applications in production. Let's discuss these topics:

  • What is monitoring?
  • What should be monitored?
  • Open-source monitoring solutions
  • SaaS and On-premise monitoring offerings

What is Node.js Monitoring?

Monitoring is observing the quality of software over time. The products and tools available in this industry usually go by the term Application Performance Monitoring, or APM for short.

If you have a Node.js application in a staging or production environment, you can (and should) do monitoring on different levels:

You can monitor

  • regions,
  • zones,
  • individual servers and,
  • of course, the Node.js software that runs on them.

In this guide we will deal with the software components only, because if you run in a cloud environment, the others are usually taken care of for you.

What should be monitored?

Each application you write in Node.js produces a lot of data about its behavior.

There are different layers where an APM tool should collect data from. The more of them covered, the more insights you'll get about your system's behavior.

  • Service level
  • Host level
  • Instance (or process) level

The list you can find below collects the most crucial problems you'll run into while you maintain a Node.js application in production. We'll also discuss how monitoring helps to solve them and what kind of data you'll need to do so.

Problem 1.: Service Downtimes

If your application is unavailable, your customers can't spend money on your sites. If your APIs are down, your business partners and the services depending on them will fail as well because of you.

We all know how cringeworthy it is to apologize for service downtimes.

Your topmost priority should be preventing failures and providing 100% availability for your application.

Running a production app comes with great responsibility.

Node.js APMs can help you detect and prevent downtimes, since they usually collect service-level metrics.

This data can show if your application handles requests properly, although it won't always tell you whether your public sites or APIs are available.

To get proper coverage of downtimes, we recommend setting up a pinger as well, which can emulate user behavior and provide reliable data on availability. If you want to cover everything, don't forget to include different regions like the US, Europe and Asia too.
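To make the idea of a pinger more concrete, here is a minimal sketch that uses only Node's built-in http and https modules. The function name ping, the 5 second timeout and the "2xx or 3xx means up" rule are assumptions made for illustration - a real pinger would also run from several regions and store its results.

```javascript
const http = require('http')
const https = require('https')

// Resolve with { up, statusCode, ms } for a single availability check.
// Never rejects: network errors and timeouts simply count as "down".
function ping (url, timeoutMs = 5000) {
  return new Promise((resolve) => {
    const started = Date.now()
    const client = url.startsWith('https:') ? https : http
    const finish = (up, statusCode) =>
      resolve({ up, statusCode, ms: Date.now() - started })

    const req = client.get(url, (res) => {
      res.resume() // discard the body, we only need the status code
      finish(res.statusCode >= 200 && res.statusCode < 400, res.statusCode)
    })
    // abort requests that hang; destroy() triggers the 'error' handler below
    req.setTimeout(timeoutMs, () => req.destroy(new Error('timeout')))
    req.on('error', () => finish(false, null))
  })
}
```

You could call it on a timer, for example once a minute against a health-check URL of your own, and feed the results into your alerting.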

Problem 2.: Slow Services, Terrible Response Times

Slow response times have a huge impact on conversion rate, as well as on product usage. The faster your product is, the more customers and user satisfaction you'll have.

Usually, all Node.js APMs can show if your services are slowing down, but interpreting that data requires further analysis.

I recommend doing two things to find the real reasons for slowing services.

  • Collect data on a process level too. Check out each instance of a service to figure out what happens under the hood.
  • Request CPU profiles when your services slow down and analyze them to find the faulty functions.
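As a rough illustration of what process-level data can look like, the sketch below takes a snapshot using only built-in Node.js APIs (process.memoryUsage, process.cpuUsage and process.uptime). The function name and the shape of the returned object are assumptions; an APM agent would collect similar values periodically and ship them to a backend.

```javascript
// Snapshot of process-level metrics using only built-in Node.js APIs.
function collectProcessMetrics () {
  const memory = process.memoryUsage()
  const cpu = process.cpuUsage() // microseconds of CPU time used by this process
  return {
    pid: process.pid,
    uptimeSeconds: process.uptime(),
    rssBytes: memory.rss,
    heapUsedBytes: memory.heapUsed,
    heapTotalBytes: memory.heapTotal,
    cpuUserMs: cpu.user / 1000,
    cpuSystemMs: cpu.system / 1000
  }
}

console.log(collectProcessMetrics())
```

Comparing these snapshots between the instances of the same service is often enough to spot the one process that behaves differently from the rest.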

Eliminating performance bottlenecks enables you to scale your software more efficiently and also to optimize your budget.

Problem 3.: Solving Memory Leaks is Hard

Our Node.js Consulting & Development expertise allowed us to build huge enterprise systems and help developers make them better.

What we see constantly is that Memory Leaks in Node.js applications are quite frequent and that finding out what causes them is among the greatest struggles Node developers face.

This impression is backed by data as well. Our Node.js Developer Survey showed that Memory Leaks cause a lot of headaches for even the best engineers.

To find memory leaks, you have to know exactly when they happen.

Some APMs collect memory usage data which can be used to recognize a leak. What you should look for is a steady growth of memory usage which ends in a service crash and restart (at the time of writing, Node runs out of memory at about 1.4 GB by default).

Node.js memory leak shown in Trace, the node.js monitoring tool

If your APM collects data on the Garbage Collector as well, you can look for the same pattern. As extra objects in a Node app's memory pile up, the time spent with Garbage Collection increases simultaneously. This is a great indicator of a memory leak.
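The "steady growth" pattern can also be checked mechanically. The sketch below is only a heuristic under stated assumptions: samples are heapUsed readings taken at a fixed interval (ideally smoothed, since normal garbage collection makes memory usage go up and down), and both the function name and the minGrowthRatio knob are made up for this example.

```javascript
// Returns true when memory usage grew in (almost) every interval between samples.
// `samples` are heapUsed readings in bytes taken at a fixed interval.
// `minGrowthRatio` is an illustrative tuning knob, not a standard value.
function looksLikeLeak (samples, minGrowthRatio = 0.9) {
  if (samples.length < 3) return false // too little data to call it a trend
  let growing = 0
  for (let i = 1; i < samples.length; i++) {
    if (samples[i] > samples[i - 1]) growing++
  }
  return growing / (samples.length - 1) >= minGrowthRatio
}

console.log(looksLikeLeak([10, 20, 30, 40, 50])) // true
console.log(looksLikeLeak([10, 50, 12, 48, 11])) // false
```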

After figuring out that you have a leak, request a memory heapdump and look for the extra objects!

This sounds easy in theory but can be challenging in practice.

What you can do is request two heapdumps from your production system with a monitoring tool, and analyze these dumps with Chrome's DevTools. If you look for the extra objects in comparison mode, you'll end up seeing what piles up in your app's memory.
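If you don't have a monitoring tool at hand, newer Node.js versions (11.13 and later) can write such a dump themselves through the built-in v8 module. This is a minimal sketch, not the way any particular APM does it; the function name, the label argument and the choice of the temp directory are assumptions.

```javascript
const v8 = require('v8')
const os = require('os')
const path = require('path')

// Write a heap snapshot that Chrome DevTools can load in its Memory tab.
// Taking a snapshot blocks the event loop and needs extra memory,
// so trigger it deliberately and not on every request.
function takeHeapSnapshot (label) {
  const file = path.join(
    os.tmpdir(),
    `${label}-${process.pid}-${Date.now()}.heapsnapshot`
  )
  return v8.writeHeapSnapshot(file) // returns the name of the file it wrote
}
```

Take one snapshot, let the service run while memory grows, take a second one, then load both files into DevTools and switch to the comparison view.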

If you'd like a more detailed rundown on these steps, I wrote an article about finding a Node.js memory leak in Ghost, where I go into more detail.

Problem 4.: Depending on Code Written by Anonymous

Most Node.js applications rely heavily on npm. We can end up with a lot of dependencies written by developers of unknown expertise and intentions.

Roughly 76% of Node shops use vulnerable packages, while open source projects regularly grow stale, neglecting to fix security flaws.

There are a couple of possible steps to lower the security risks of using npm packages.

  1. Audit your modules with the Node Security Platform CLI
  2. Look for unused dependencies with the depcheck tool
  3. Use the npm stats API, or browse historic download stats, to find out if others are using a package
  4. Use the npm view <pkg> maintainers command to avoid packages maintained by only a few
  5. Use the npm outdated command or Greenkeeper to learn whether you're using the latest version of a package.

Going through these steps can consume a lot of your time, so picking a Node.js Monitoring Tool which can warn you about insecure dependencies is highly recommended.

Problem 6.: Email Alerts often go Unnoticed

Let's be honest. We are developers who like spending time writing code - not going through our email account every 10 minutes.

In my experience, email alerts usually go unread, and it's very easy to miss a major outage or problem if we depend only on them.

Email is a subpar method to learn about issues in production.

I guess that you also don't want to watch dashboards for potential issues 24/7. This is why it is important to look for an APM with great alerting capabilities.

What I recommend is using pager systems like Opsgenie or PagerDuty to learn about critical issues. Pair up the monitoring solution of your choice with one of these systems if you'd like to know about your alerts instantly.

A few alerting best-practices we follow at RisingStack:

  • Always keep alerting simple and alert on symptoms
  • Aim to have as few alerts as possible - associated with end-user pain
  • Alert on high response time and error rates as high up in the stack as possible
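To show what alerting on symptoms can mean in code, here is a small sketch that looks at a window of finished requests and raises at most two alerts: one for the error rate and one for the 95th percentile response time. The thresholds, the field names (statusCode, durationMs) and the function name are illustrative assumptions, not recommendations.

```javascript
// Decide whether a time window of requests should trigger an alert.
// Thresholds are illustrative assumptions, not recommendations.
function evaluateAlerts (requests, { maxErrorRate = 0.05, maxP95Ms = 1000 } = {}) {
  if (requests.length === 0) return []

  const errors = requests.filter((r) => r.statusCode >= 500).length
  const errorRate = errors / requests.length

  const sorted = requests.map((r) => r.durationMs).sort((a, b) => a - b)
  // nearest-rank 95th percentile
  const p95 = sorted[Math.ceil(0.95 * sorted.length) - 1]

  const alerts = []
  if (errorRate > maxErrorRate) alerts.push({ type: 'error-rate', value: errorRate })
  if (p95 > maxP95Ms) alerts.push({ type: 'slow-responses', value: p95 })
  return alerts
}
```

Both checks describe what the end user experiences, which keeps the number of alerts low no matter how many internal components sit behind the service.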

Problem 7.: Finding Crucial Errors in the Code

If a feature is broken on your site, it can prevent customers from achieving their goals. Sometimes it can be a sign of bad code quality. Make sure you have proper test coverage for your codebase and a good QA process (preferably automated).

If you use an APM that collects errors from your app then you'll be able to find the ones which occur more often.

The more data your APM has access to, the better the chances of finding and fixing critical issues. We recommend using a monitoring tool which collects and visualises stack traces as well - so you'll be able to find the root causes of errors in a distributed system.

In the next part of the article, I will show you one open-source, and one SaaS / on-premises Node.js monitoring solution that will help you operate your applications.

Prometheus - an Open-Source, General Purpose Monitoring Platform

Prometheus is an open-source systems monitoring and alerting toolkit originally built at SoundCloud.

Prometheus was started in 2012, and since then, many companies and organizations have adopted the tool. It is a standalone open source project, maintained independently of any company.

In 2016, Prometheus joined the Cloud Native Computing Foundation, right after Kubernetes.

The most important features of Prometheus are:

  • a multi-dimensional data model (time series identified by metric name and key/value pairs),
  • a flexible query language to leverage this dimensionality,
  • time series collection happens via a pull model over HTTP by default,
  • pushing time series is supported via an intermediary gateway.
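To make the data model from the first bullet more tangible, the sketch below renders a metric by hand in the Prometheus text exposition format: a metric name, optional key/value labels, and a value. It is a simplified illustration (renderMetric is a made-up helper that does not escape label values), and in practice a client library does this for you.

```javascript
// Render one metric in the Prometheus text exposition format.
// `samples` is a list of { labels, value }; labels may be omitted.
// Simplification: label values are not escaped.
function renderMetric ({ name, help, type, samples }) {
  const lines = [`# HELP ${name} ${help}`, `# TYPE ${name} ${type}`]
  for (const { labels = {}, value } of samples) {
    const pairs = Object.entries(labels).map(([key, val]) => `${key}="${val}"`)
    const labelPart = pairs.length > 0 ? `{${pairs.join(',')}}` : ''
    lines.push(`${name}${labelPart} ${value}`)
  }
  return lines.join('\n') + '\n'
}

console.log(renderMetric({
  name: 'http_requests_total', // hypothetical metric name
  help: 'Requests served, by status code',
  type: 'counter',
  samples: [
    { labels: { code: '200' }, value: 1027 },
    { labels: { code: '500' }, value: 3 }
  ]
}))
```

Each distinct combination of label values is its own time series, and this is the dimensionality the query language works with.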

Node.js monitoring with Prometheus

As you could see from the previous features, Prometheus is a general purpose monitoring solution, so you can use it with any language or technology you prefer.

Check out the official Prometheus getting started pages if you'd like to give it a try.

Before you start monitoring your Node.js services, you need to add instrumentation to them via one of the Prometheus client libraries.

For this, there is a Node.js client module called prom-client. It supports histograms, summaries, gauges and counters.

Essentially, all you have to do is require the Prometheus client, then expose its output at an endpoint:

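const Prometheus = require('prom-client')
const server = require('express')()

// recent prom-client versions collect the default metrics only on request
Prometheus.collectDefaultMetrics()

server.get('/metrics', async (req, res) => {
  res.set('Content-Type', Prometheus.register.contentType)
  res.end(await Prometheus.register.metrics()) // a Promise in recent versions
})

server.listen(process.env.PORT || 3000)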

This endpoint will produce output that Prometheus can consume - something like this:

# HELP process_start_time_seconds Start time of the process since unix epoch in seconds.
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1490433285  
# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 33046528  
# HELP nodejs_eventloop_lag_seconds Lag of event loop in seconds.
# TYPE nodejs_eventloop_lag_seconds gauge
nodejs_eventloop_lag_seconds 0.000089751  
# HELP nodejs_active_handles_total Number of active handles.
# TYPE nodejs_active_handles_total gauge
nodejs_active_handles_total 4  
# HELP nodejs_active_requests_total Number of active requests.
# TYPE nodejs_active_requests_total gauge
nodejs_active_requests_total 0  
# HELP nodejs_version_info Node.js version info.
# TYPE nodejs_version_info gauge
nodejs_version_info{version="v4.4.2",major="4",minor="4",patch="2"} 1  

Of course, these are just the default metrics which were collected by the module we have used - you can extend it with yours. In the example below we collect the number of requests served:

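const Prometheus = require('prom-client')
const server = require('express')()

Prometheus.collectDefaultMetrics()

const PrometheusMetrics = {
  // recent prom-client versions take a configuration object
  requestCounter: new Prometheus.Counter({ name: 'throughput', help: 'The number of requests served' })
}

server.use((req, res, next) => {
  PrometheusMetrics.requestCounter.inc()
  next()
})

server.get('/metrics', async (req, res) => {
  res.end(await Prometheus.register.metrics())
})

server.listen(process.env.PORT || 3000)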

Once you run it, the /metrics endpoint will include the throughput metrics as well:

# HELP process_start_time_seconds Start time of the process since unix epoch in seconds.
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1490433805  
# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 25120768  
# HELP nodejs_eventloop_lag_seconds Lag of event loop in seconds.
# TYPE nodejs_eventloop_lag_seconds gauge
nodejs_eventloop_lag_seconds 0.144927586  
# HELP nodejs_active_handles_total Number of active handles.
# TYPE nodejs_active_handles_total gauge
nodejs_active_handles_total 0  
# HELP nodejs_active_requests_total Number of active requests.
# TYPE nodejs_active_requests_total gauge
nodejs_active_requests_total 0  
# HELP nodejs_version_info Node.js version info.
# TYPE nodejs_version_info gauge
nodejs_version_info{version="v4.4.2",major="4",minor="4",patch="2"} 1  
# HELP throughput The number of requests served
# TYPE throughput counter
throughput 5  

Once you have exposed all the metrics you have, you can start querying and visualizing them - for that, please refer to the official Prometheus query documentation and the visualization documentation.

As you can imagine, instrumenting your codebase can take quite some time - since you have to create your dashboards and alerts to make sense of the data. While these solutions can sometimes provide greater flexibility for your use-case than hosted solutions, it can take months to implement them, and then you have to deal with operating them as well.

If you have the time to dig deep into the topic, you'll be fine with it.

Meet Trace - our SaaS, and On-premises Node.js Monitoring Tool

As we just discussed, running your own solution requires domain knowledge, as well as expertise on how to do proper monitoring. You have to figure out what aggregation to use for what kind of metrics, and so on.

This is why it can make a lot of sense to go with a hosted monitoring solution - whether it is a SaaS product or an on-premises offering.

At RisingStack, we are developing our own Node.js Monitoring Solution, called Trace. We built into Trace all the experience we gained through years of providing professional Node services.

What's nice about Trace is that you get all the metrics you need by adding just a single line of code to your application - so it really takes only a few seconds to get started.

require('@risingstack/trace')

After this, the Trace collector automatically gathers your application's performance data and visualizes it for you in an easy to understand way.

Just a few things Trace is capable of doing with your production Node app:

  1. Send alerts about Downtimes, Slow services & Bad Status Codes.
  2. Ping your websites and APIs with an external service + show Apdex metrics.
  3. Collect data on service, host and instance levels as well.
  4. Automatically create a (10 second-long) CPU profile in a production environment in case of a slowdown.
  5. Collect data on memory consumption and garbage collection.
  6. Create memory heapdumps automatically in case of a Memory Leak in production.
  7. Show errors and stack traces from your application.
  8. Visualize whole transaction call-chains in a distributed system.
  9. Show how your services communicate with each other on a live map.
  10. Automatically detect npm packages with security vulnerabilities.
  11. Mark new deployments and measure their effectiveness.
  12. Integrate with Slack, PagerDuty, and Opsgenie - so you'll never miss an alert.

Although Trace is currently a SaaS solution, we'll make an on-premises version available as well soon.

It will be able to do exactly the same as the cloud version, but it will run on Amazon VPC or in your own datacenter. If you're interested in it, let's talk!


I hope that in this chapter of Node.js at Scale I was able to give useful advice about monitoring your Node.js application. In the next article, you will learn how to debug Node.js applications in an easy way.

Node.js End-to-End Testing with Nightwatch.js

In this article, we are going to take a look at how you can do end-to-end testing with Node.js, using Nightwatch.js, a Node.js powered end-to-end testing framework.

In the previous chapter of Node.js at Scale, we discussed Node.js Testing and Getting TDD Right. If you did not read that article, or if you are unfamiliar with unit testing and TDD (test-driven development), I recommend checking that out before continuing with this article.

What is Node.js end-to-end testing?

Before jumping into example code and learning to implement end-to-end testing for a Node.js project, it's worth exploring what end-to-end tests really are.

First of all, end-to-end testing is part of the black-box testing toolbox. This means that as a test writer, you are examining functionality without any knowledge of the internal implementation, so without seeing any source code.

Secondly, end-to-end testing can also be used as user acceptance testing, or UAT. UAT is the process of verifying that the solution actually works for the user. This process does not focus on finding small typos, but on issues that can crash the system or make it dysfunctional for the user.

Enter Nightwatch.js

Nightwatch.js enables you to "write end-to-end tests in Node.js quickly and effortlessly that run against a Selenium/WebDriver server".

Nightwatch is shipped with the following features:

  • a built-in test runner,
  • control over the Selenium server,
  • support for hosted Selenium providers, like BrowserStack or SauceLabs,
  • CSS and Xpath selectors.

Installing Nightwatch

To run Nightwatch locally, we have to do a little bit of extra work - we will need a standalone Selenium server locally, as well as a webdriver, so we can use Chrome/Firefox to test our applications locally.

With these three tools, we are going to implement the flow shown in the diagram below.

Figure: Node.js end-to-end testing with Nightwatch.js flowchart

STEP 1: Add Nightwatch

You can add Nightwatch to your project simply by running npm install nightwatch --save-dev.

This places the Nightwatch executable in your ./node_modules/.bin folder, so you don't have to install it globally.

STEP 2: Download Selenium

Selenium is a suite of tools to automate web browsers across many platforms.

Prerequisite: make sure you have JDK installed, with at least version 7. If you don't have it, you can grab it from here.

The Selenium server is a Java application which is used by Nightwatch to connect to various browsers. You can download the binary from here.

Once you have downloaded the JAR file, create a bin folder inside your project, and place it there. We will set up Nightwatch to use it, so you don't have to manually start the Selenium server.

STEP 3: Download Chromedriver

ChromeDriver is a standalone server which implements the W3C WebDriver wire protocol for Chromium.

To grab the executable, head over to the downloads section, and place it in the same bin folder.

STEP 4: Configuring Nightwatch.js

The basic Nightwatch configuration happens through a JSON configuration file.

Let's create a nightwatch.json file, and fill it with:

{
  "src_folders" : ["tests"],
  "output_folder" : "reports",

  "selenium" : {
    "start_process" : true,
    "server_path" : "./bin/selenium-server-standalone-3.3.1.jar",
    "log_path" : "",
    "port" : 4444,
    "cli_args" : {
      "webdriver.chrome.driver" : "./bin/chromedriver"
    }
  },

  "test_settings" : {
    "default" : {
      "launch_url" : "http://localhost",
      "selenium_port"  : 4444,
      "selenium_host"  : "localhost",
      "desiredCapabilities": {
        "browserName": "chrome",
        "javascriptEnabled": true,
        "acceptSslCerts": true
      }
    }
  }
}

With this configuration file, we told Nightwatch where it can find the binary of the Selenium server and the Chromedriver, as well as the location of the tests we want to run.

Quick Recap

So far, we have installed Nightwatch, downloaded the standalone Selenium server, as well as the Chromedriver. With these steps, you have all the necessary tools to create end-to-end tests using Node.js and Selenium.

Writing your first Nightwatch Test

Let's add a new file in the tests folder, called homepage.js.

We are going to take the example from the Nightwatch getting started guide. Our test script will go to Google, search for Rembrandt, and check the Wikipedia page:

module.exports = {
  'Demo test Google' : function (client) {
    client
      .url('http://www.google.com')
      .waitForElementVisible('body', 1000)
      .setValue('input[type=text]', 'rembrandt van rijn')
      .waitForElementVisible('button[name=btnG]', 1000)
      .click('button[name=btnG]')
      .pause(1000)
      .assert.containsText('ol#rso li:first-child',
        'Rembrandt - Wikipedia')
      .end()
  }
}

The only thing left to do is to run Nightwatch itself! For that, I recommend adding a new script to our package.json's scripts section:

"scripts": {
  "test-e2e": "nightwatch"
}

The very last thing you have to do is to run the tests using this command:

npm run test-e2e  

If everything goes well, your test will open up Chrome, then Google and Wikipedia.

Nightwatch.js in Your Project

Now that you understand what end-to-end testing is, and how you can set up Nightwatch, it is time to start adding it to your project.

For that, you have to consider some aspects - but please note that there are no silver bullets here. Depending on your business needs, you may answer the following questions differently:

  • Where should I run? On staging? On production? When I build my containers?
  • What are the test scenarios I want to test?
  • When and who should write end-to-end tests?

Summary & Next Up

In this chapter of Node.js at Scale we have learned:

  • how to set up Nightwatch,
  • how to configure it to use a standalone Selenium server,
  • and how to write basic end-to-end tests.

In the next chapter, we are going to explore how you can monitor production Node.js infrastructures.

CQRS Explained

What is CQRS?

CQRS is an architectural pattern, where the acronym stands for Command Query Responsibility Segregation. We can talk about CQRS when the data read operations are separated from the data write operations, and they happen on a different interface.

In most CQRS systems, read and write operations use different data models, sometimes even different data stores. This kind of segregation makes it easier to scale read and write operations and to control security - but adds extra complexity to your system.

Node.js at Scale is a collection of articles focusing on the needs of companies with bigger Node.js installations and advanced Node developers.

The level of segregation can vary in CQRS systems:

  • single data stores and separated model for reading and updating data
  • separated data stores and separated model for reading and updating data

In the simplest data store separation, we can use read-only replicas to achieve segregation.

Why and when to use CQRS?

In a typical data management system, all CRUD (Create Read Update Delete) operations are executed on the same interface of the entities in a single data storage, like creating, updating, querying and deleting table rows in an SQL database via the same model.

CQRS really shines compared to the traditional approach (using a single model) when you build complex data models to validate and fulfil your business logic when data manipulation happens. Read operations compared to update and write operations can be very different or much simpler - like accessing a subset of your data only.
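The difference between the two sides can be sketched in a few lines. This is an in-memory illustration only, with made-up names (handleRegisterUser, queryUserCard) and two Maps standing in for the write and read data stores: the command side carries the validation and the full record, the query side only serves a small, display-ready subset.

```javascript
// Write side: a command handler that validates, then updates the write model.
const writeStore = new Map() // id -> full user record
// Read side: a separate, simpler model shaped for display.
const readStore = new Map() // id -> { id, displayName }

function handleRegisterUser (command) {
  if (!command.email || !command.email.includes('@')) {
    throw new Error('invalid email')
  }
  if (writeStore.has(command.id)) {
    throw new Error('user already exists')
  }
  const user = {
    id: command.id,
    email: command.email,
    name: command.name,
    createdAt: Date.now()
  }
  writeStore.set(user.id, user)
  // In a real system this sync would usually be asynchronous (events, replication).
  readStore.set(user.id, { id: user.id, displayName: user.name })
}

function queryUserCard (id) {
  return readStore.get(id) || null
}
```

Note that the query function never touches the write model, so the two sides can be scaled, secured and stored independently.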

Real world example

In our Node.js Monitoring Tool, we use CQRS to segregate saving and representing the data. For example, when you see a distributed tracing visualization on our UI, the data behind it arrived in smaller chunks from our customers' application agents to our public collector API.

In the collector API, we only do a thin validation and send the data to a messaging queue for processing. On the other end of the queue, workers are consuming messages and resolving all the necessary dependencies via other services. These workers are also saving the transformed data to the database.

If any issue happens, we send back the message with exponential backoff and max limit to our messaging queue. Compared to this complex data writing flow, on the representation side of the flow, we only query a read-replica database and visualize the result to our customers.
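A retry policy with exponential backoff and a limit could look something like the sketch below. The concrete numbers (100 ms base delay, 30 second cap, 5 attempts) and the message shape are assumptions for illustration, not the values used in Trace.

```javascript
// Delay before the next retry (attempt starts at 0): base * 2^attempt, capped.
function backoffDelay (attempt, { baseMs = 100, maxMs = 30000 } = {}) {
  return Math.min(maxMs, baseMs * 2 ** attempt)
}

// Returns the message to requeue with its delay, or null once the limit is reached.
function nextRetry (message, maxAttempts = 5) {
  const attempt = message.attempt || 0
  if (attempt >= maxAttempts) return null
  return { ...message, attempt: attempt + 1, delayMs: backoffDelay(attempt) }
}

console.log(backoffDelay(0)) // 100
console.log(backoffDelay(3)) // 800
console.log(nextRetry({ id: 'msg-1', attempt: 5 })) // null
```

Messages that run out of retries are usually moved aside for inspection instead of being dropped silently.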

Figure: Microservice with CQRS - Trace by RisingStack data processing with CQRS

CQRS and Event Sourcing

I've seen many times that people confuse these two concepts. Both of them are heavily used in event-driven infrastructures, like event-driven microservices, but they mean very different things.

To read more about Event Sourcing with Examples, check out our previous Node.js at Scale article.

Reporting database - Denormalizer

In some event driven systems, CQRS is implemented in a way that the system contains one or multiple Reporting databases.

A Reporting database is an entirely different read-only storage that models and persists the data in the best format for representing it. It's okay to store it in a denormalized format to optimize it for the client needs. In some cases, the reporting database contains only derived data, even from multiple data sources.

In a microservices architecture, we call a service the Denormalizer if it listens for some events and maintains a Reporting Database based on these. The client reads from the Denormalizer service's reporting database.

An example can be that the user profile service emits a user.edit event with { id: 1, name: 'John Doe', state: 'churn' } payload, the Denormalizer service listens to it but only stores the { name: 'John Doe' } in its Reporting Database, because the client is not interested in the internal state churn of the user.
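The same example as a runnable sketch, where an EventEmitter stands in for the message broker and a Map stands in for the Reporting Database (both are simplifications for illustration):

```javascript
const { EventEmitter } = require('events')

const bus = new EventEmitter() // stands in for a real message broker
const reportingDb = new Map() // stands in for the Reporting Database

// Denormalizer: keep only what the client needs from each event.
bus.on('user.edit', ({ id, name }) => {
  reportingDb.set(id, { name })
})

// The user profile service emits the full payload.
bus.emit('user.edit', { id: 1, name: 'John Doe', state: 'churn' })

console.log(reportingDb.get(1)) // { name: 'John Doe' }
```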

It can be hard to keep a Reporting Database in sync. Usually, we can only aim for eventual consistency.

A CQRS Node.js Example Repo

For our CQRS with Denormalizer Node.js example visit our cqrs-example GitHub repository.

CQRS is a powerful architectural pattern to segregate read and write operations and their interfaces, but it also adds extra complexity to your system. In most cases, you shouldn't use CQRS for the whole system, only for specific parts where the complexity and scalability make it necessary.

In the next chapter of the Node.js at Scale series we'll discuss Node.js Testing and Getting TDD Right. Read on! :)

I’m happy to answer your CQRS related questions in the comments section!

Event Sourcing with Examples in Node.js

Event Sourcing is a powerful architectural pattern to handle complex application states that may need to be rebuilt, re-played, audited or debugged.

From this article you can learn what Event Sourcing is, and when you should use it. We’ll also take a look at some Event Sourcing examples with code snippets.

Event Sourcing

Event Sourcing is a software architecture pattern which makes it possible to reconstruct past states (and the latest state as well). It achieves this by storing every state change as a sequence of events.

The state of your application is something like a user's account balance or subscription at a particular time. This current state may only exist in memory.

Good examples of Event Sourcing are version control systems that store the current state as diffs. The current state is your latest source code, and the events are your commits.

Why is Event Sourcing useful?

In our hypothetical example, you are working on an online money transfer site, where every customer has an account balance. Imagine that you just started working on a beautiful Monday morning when it suddenly turns out that you made a mistake and used the wrong currency exchange rate for the whole past week. In this case, every account which sent or received money in the last seven days is in a corrupt state.

With event sourcing, there’s no need to panic!

If your site uses event sourcing, you can revert the account balances to their previous uncorrupted state, fix the exchange rate and replay all the events until now. That's it, your job and reputation are saved!

Other use-cases

You can use events to audit or debug state changes in your system. They can also be useful for handling SaaS subscriptions. In a usual subscription based system, your users can buy a plan, upgrade it, downgrade it, pro-rate a current price, cancel a plan, apply a coupon, and so on... A good event log can be very useful to figure out what happened.

So with event sourcing you can:

  • Rebuild states completely
  • Replay states from a specific time
  • Reconstruct the state of a specific moment for temporary query

What is an Event?

An Event is something that happened in the past. An Event is not a snapshot of a state at a specific time; it's the action itself with all the information that's necessary to replay it.

An event should be a simple object which describes some action that occurred. Events should be immutable and stored in an append-only way. Their immutable, append-only nature makes them suitable to use as audit logs too.

This is what makes it possible to undo and redo events, or even replay them from a specific timestamp.
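Here is a minimal sketch of what "immutable and append-only" can mean in code. The createEventLog helper is made up for this example, it keeps everything in memory, and it uses the position in the log as a logical timestamp - a real event store would persist the events and probably use real timestamps or sequence numbers.

```javascript
// A minimal append-only event log. Events are frozen on append and the log
// only hands out filtered copies of its list, so history cannot be edited in place.
// Note: Object.freeze is shallow, nested objects would need extra care.
function createEventLog () {
  const events = []
  return {
    append (event) {
      const stored = Object.freeze({ ...event, time: events.length })
      events.push(stored)
      return stored
    },
    // Events from a given logical time onwards, oldest first.
    since (time = 0) {
      return events.filter((event) => event.time >= time)
    }
  }
}

const log = createEventLog()
log.append({ type: 'open', id: 'account1', balance: 150 })
log.append({ type: 'open', id: 'account2', balance: 0 })
console.log(log.since(1).length) // 1
```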

Be careful with External Systems!

Like any software pattern, Event Sourcing can be challenging at some points as well.

The external systems that your application communicates with are usually not prepared for event sourcing, so you should be careful when you replay your events. I’m sure that you don’t wish to charge your customers twice or send all welcome emails again.

To solve this challenge, you should handle replays in your communication layers!
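One possible way to do that is to route every external call through a thin wrapper that knows whether a replay is in progress. The sketch below is an assumption about how such a layer could look, with hypothetical names (createNotifier, sendEmail); it is not taken from a specific library.

```javascript
// Wrap an external call so that it is skipped while events are being replayed.
// `sendEmail` is a placeholder for any side-effecting client.
function createNotifier (sendEmail) {
  let replaying = false
  return {
    setReplaying (value) { replaying = value },
    welcome (user) {
      if (replaying) return false // state is rebuilt, nobody is emailed again
      sendEmail(user.email, 'Welcome!')
      return true
    }
  }
}

const sent = []
const notifier = createNotifier((to, subject) => sent.push({ to, subject }))

notifier.welcome({ email: 'jane@example.com' }) // live event: email goes out
notifier.setReplaying(true)
notifier.welcome({ email: 'jane@example.com' }) // replayed event: skipped
console.log(sent.length) // 1
```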

Command Sourcing

Command Sourcing is a different approach from Event Sourcing - make sure you don’t mix ‘em up by accident!

Event Sourcing:

  • Persist only changes in state
  • Replay can be side-effect free

Command Sourcing:

  • Persist Commands
  • Replay may trigger side-effects

Example for Event Sourcing

In this simple example, we will apply Event Sourcing for our accounts:

// current account states (how it looks in our DB now)
const accounts = {
  account1: { balance: 100 },
  account2: { balance: 50 }
}
// past events (should be persisted somewhere, for example in a DB)
const events = [
  { type: 'open', id: 'account1', balance: 150, time: 0 },
  { type: 'open', id: 'account2', balance: 0, time: 1 },
  { type: 'transfer', fromId: 'account1', toId: 'account2', amount: 50, time: 2 }
]

Let's rebuild the latest state from scratch, using our event log:

// complete rebuild
const accounts = events.reduce((accounts, event) => {
  if (event.type === 'open') {
    // the account does not exist before its 'open' event, so create it here
    accounts[event.id] = { balance: event.balance }
  } else if (event.type === 'transfer') {
    accounts[event.fromId].balance -= event.amount
    accounts[event.toId].balance += event.amount
  }
  return accounts
}, {})

Undo the latest event:

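// undo last event: apply its inverse to a copy of the current state
// (slice leaves the append-only log intact, splice would remove the event)
const previousAccounts = events.slice(-1).reduce((state, event) => {
  if (event.type === 'open') {
    delete state[event.id]
  } else if (event.type === 'transfer') {
    state[event.fromId].balance += event.amount
    state[event.toId].balance -= event.amount
  }
  return state
}, JSON.parse(JSON.stringify(accounts)))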

Query accounts state at a specific time:

// query specific time
function getAccountsAtTime (time) {
  return events.reduce((accounts, event) => {
    // skip events that happened after the requested time
    if (event.time > time) {
      return accounts
    }

    if (event.type === 'open') {
      accounts[event.id] = { balance: event.balance }
    } else if (event.type === 'transfer') {
      accounts[event.fromId].balance -= event.amount
      accounts[event.toId].balance += event.amount
    }
    return accounts
  }, {})
}

const accounts = getAccountsAtTime(1)

Learning more

For more detailed examples, you can check out our Event Sourcing Example repository.

In the next part of the Node.js at Scale series, we’ll learn about Command Query Responsibility Segregation. Make sure you check back in a week!

If you have any questions on this topic, please let me know in the comments section below!

Node.js Async Best Practices & Avoiding the Callback Hell

In this post, we cover what tools and techniques you have at your disposal when handling Node.js asynchronous operations: async.js, promises, generators and async functions.

After reading this article, you’ll know how to avoid the despised callback hell!

Node.js at Scale is a collection of articles focusing on the needs of companies with bigger Node.js installations and advanced Node developers. Chapters:

Asynchronous programming in Node.js

Previously we have gathered a strong knowledge about asynchronous programming in JavaScript and understood how the Node.js event loop works.

If you did not read these articles, I highly recommend them as introductions!

The Problem with Node.js Async

Node.js itself is single-threaded, but some tasks can run in parallel thanks to its asynchronous nature.

But what does running in parallel mean in practice?

Since we program a single-threaded VM, it is essential that we do not block execution by waiting for I/O, but handle I/O operations concurrently with the help of Node.js's event-driven APIs.
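
To make the difference concrete, here is a minimal sketch that reads the same file once with the blocking API and once with the non-blocking one. The temporary file name and the order array are only there for illustration.

```javascript
const fs = require('fs')
const os = require('os')
const path = require('path')

// A throwaway file, created only for this illustration.
const file = path.join(os.tmpdir(), 'blocking-demo.txt')
fs.writeFileSync(file, 'hello')

const order = []

// Blocking: nothing else can run until the whole file has been read.
const blockingContents = fs.readFileSync(file, 'utf8')
order.push('sync read finished')

// Non-blocking: the read is handed off and the callback runs later,
// after the currently executing code has finished.
fs.readFile(file, 'utf8', (err, contents) => {
  if (err) throw err
  order.push('async read finished')
  console.log(order)
})
order.push('code after the async call')
```

The line after fs.readFile runs before the file contents arrive, which is exactly what lets a single thread serve other work while it waits for I/O.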

Let’s take a look at some fundamental patterns, and learn how we can write resource efficient, non-blocking code, with the built-in solutions of Node.js and some third-party libraries.

The Classical Approach - Callbacks

Let's take a look at these simple async operations. They do nothing special, just fire a timer and call a function once the timer has finished.

function fastFunction (done) {
  setTimeout(function () {
    done()
  }, 100)
}

function slowFunction (done) {
  setTimeout(function () {
    done()
  }, 300)
}

Seems easy, right?

Our higher-order functions can be executed sequentially or in parallel with the basic "pattern" of nesting callbacks, but this method can lead to an untameable callback hell.

function runSequentially (callback) {
  fastFunction((err, data) => {
    if (err) return callback(err)
    console.log(data)   // results of a

    slowFunction((err, data) => {
      if (err) return callback(err)
      console.log(data) // results of b

      // here you can continue running more tasks
    })
  })
}

Avoiding Callback Hell with Control Flow Managers

To become an efficient Node.js developer, you have to avoid the constantly growing indentation level, produce clean and readable code and be able to handle complex flows.

Let me show you some of the libraries we can use to organize our code in a nice and maintainable way!

#1: Meet the Async Module

Async is a utility module which provides straightforward, powerful functions for working with asynchronous JavaScript.

Async contains some common patterns for asynchronous flow control that respect error-first callbacks.

Let's see what our previous example would look like using async!

async.waterfall([fastFunction, slowFunction], () => console.log('done'))

What kind of witchcraft just happened?

Actually, there is no magic to reveal. You can easily implement your own async job-runner which can run tasks in parallel and wait for each of them to be ready.
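
Such a job-runner could look like the sketch below. It is not part of the async module: runParallel, fast and slow are hypothetical names, and the tasks are assumed to follow the error-first callback convention.

```javascript
// Hypothetical helper, not part of the async module: runs error-first
// callback tasks at the same time and collects their results in order.
function runParallel (tasks, callback) {
  const results = []
  let pending = tasks.length
  let finished = false

  if (pending === 0) return callback(null, results)

  tasks.forEach((task, index) => {
    task((err, data) => {
      if (finished) return // an earlier task already failed
      if (err) {
        finished = true
        return callback(err)
      }
      results[index] = data
      pending -= 1
      if (pending === 0) {
        finished = true
        callback(null, results)
      }
    })
  })
}

const fast = (done) => setTimeout(() => done(null, 'fast'), 10)
const slow = (done) => setTimeout(() => done(null, 'slow'), 30)

runParallel([slow, fast], (err, results) => {
  if (err) throw err
  console.log(results) // results keep the order of the tasks array
})
```

All tasks are started right away, so the total waiting time is roughly that of the slowest task rather than the sum of all of them.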

Let's take a look at what async does under the hood!

// taken from
function(tasks, callback) {
    callback = once(callback || noop);
    if (!isArray(tasks)) return callback(new Error('First argument to waterfall must be an array of functions'));
    if (!tasks.length) return callback();
    var taskIndex = 0;

    function nextTask(args) {
        if (taskIndex === tasks.length) {
            return callback.apply(null, [null].concat(args));
        }

        // the callback that gets injected into every task
        var taskCallback = onlyOnce(rest(function(err, args) {
            if (err) {
                return callback.apply(null, [err].concat(args));
            }
            nextTask(args);
        }));

        args.push(taskCallback);

        var task = tasks[taskIndex++];
        task.apply(null, args);
    }

    nextTask([]);
}

Essentially, a new callback is injected into the functions, and this is how async knows when a function is finished.
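
Stripped of its safeguards, the same idea fits into a few lines. The following is a simplified, hypothetical re-implementation for illustration only (miniWaterfall is not an async API); it passes the results of each task on to the next one.

```javascript
// Simplified, hypothetical re-implementation for illustration only;
// the real module adds safeguards such as once() and onlyOnce().
function miniWaterfall (tasks, callback) {
  let taskIndex = 0

  function nextTask (args) {
    if (taskIndex === tasks.length) {
      return callback(null, ...args)
    }
    const task = tasks[taskIndex++]
    // the injected callback: this is how we learn that the task finished
    task(...args, (err, ...results) => {
      if (err) return callback(err)
      nextTask(results)
    })
  }

  nextTask([])
}

miniWaterfall([
  (done) => setTimeout(() => done(null, 1), 10),
  (value, done) => setTimeout(() => done(null, value + 1), 10)
], (err, result) => {
  if (err) throw err
  console.log(result)
})
```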

#2: Using co - generator based flow-control for Node.js

If you would rather not stick to the strict callback protocol, then co can be a good choice for you.

co is a generator based control flow tool for Node.js and the browser, using promises, letting you write non-blocking code in a nice-ish way.

co is a powerful alternative which takes advantage of generator functions tied with promises without the overhead of implementing custom iterators.

const fastPromise = new Promise((resolve, reject) => {
  fastFunction(resolve)
})

const slowPromise = new Promise((resolve, reject) => {
  slowFunction(resolve)
})

co(function * () {
  yield fastPromise
  yield slowPromise
}).then(() => {
  console.log('done')
})

At the time of writing, I suggest going with co, since async/await, one of the most awaited Node.js features, is only available in the nightly, unstable v7.x builds. But if you are already using Promises, switching from co to async functions will be easy.

This syntactic sugar on top of Promises and Generators will eliminate the problem of callbacks and even help you to build nice flow control structures. Almost like writing synchronous code, right?

Stable Node.js branches will receive this update in the near future, so you will be able to remove co and just do the same.

Flow Control in Practice

As we have just learned several tools and tricks to handle async, it is time to do some practice with fundamental control flows to make our code more efficient and clean.

Let’s take an example and write a route handler for our web app, where the request can be resolved after 3 steps: validateParams, dbQuery and serviceCall.

If you'd like to write them without any helper, you'd most probably end up with something like this. Not so nice, right?

// validateParams, dbQuery, serviceCall are higher-order functions
function handler (done) {
  validateParams((err) => {
    if (err) return done(err)
    dbQuery((err, dbResults) => {
      if (err) return done(err)
      serviceCall((err, serviceResults) => {
        done(err, { dbResults, serviceResults })
      })
    })
  })
}

Instead of the callback-hell, we can use the async library to refactor our code, as we have already learned:

// validateParams, dbQuery, serviceCall are higher-order functions
function handler (done) {
  async.waterfall([validateParams, dbQuery, serviceCall], done)
}

Let's take it a step further! Rewrite it to use Promises:

// validateParams, dbQuery, serviceCall are thunks
function handler () {
  return validateParams()
    .then(dbQuery)
    .then(serviceCall)
    .then((result) => {
      return result
    })
}

Also, you can use co powered generators with Promises:

// validateParams, dbQuery, serviceCall are thunks
const handler = co.wrap(function * () {
  yield validateParams()
  const dbResults = yield dbQuery()
  const serviceResults = yield serviceCall()
  return { dbResults, serviceResults }
})

It feels like "synchronous" code, but it is still doing async jobs one after another.

Let's see how this snippet should work with async / await.

// validateParams, dbQuery, serviceCall are thunks
async function handler () {
  await validateParams()
  const dbResults = await dbQuery()
  const serviceResults = await serviceCall()
  return { dbResults, serviceResults }
}

Takeaway rules for Node.js & Async

Fortunately, Node.js eliminates the complexities of writing thread-safe code. You just have to stick to these rules to keep things smooth:

  • As a rule of thumb, prefer async over sync API, because using a non-blocking approach gives superior performance over the synchronous scenario.

  • Always use the best fitting flow control or a mix of them in order to reduce the time spent waiting for I/O to complete.
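
As an illustration of the second rule, here is a minimal sketch that assumes two I/O operations which do not depend on each other. The wait helper and the timings are made up for the example; dbQuery and serviceCall are stand-ins, not the handlers from above.

```javascript
// Hypothetical stand-ins for two independent I/O operations.
const wait = (ms, value) => new Promise((resolve) => setTimeout(resolve, ms, value))
const dbQuery = () => wait(30, 'db results')
const serviceCall = () => wait(30, 'service results')

// One after the other: total time is roughly the sum of both waits.
async function sequential () {
  const dbResults = await dbQuery()
  const serviceResults = await serviceCall()
  return { dbResults, serviceResults }
}

// Started together: total time is roughly the longer of the two waits.
async function concurrent () {
  const [dbResults, serviceResults] = await Promise.all([dbQuery(), serviceCall()])
  return { dbResults, serviceResults }
}

concurrent().then((result) => console.log(result))
```

Promise.all only helps when the operations are really independent; if the service call needs the database results, the sequential form is the right one.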

You can find all of the code from this article in this repository.

If you have any questions or suggestions for the article, please let me know in the comments!

In the next part of the Node.js at Scale series, we take a look at Event Sourcing with Examples.

Advanced Node.js Project Structure Tutorial

Project structuring is an important topic because the way you bootstrap your application can determine the whole development experience throughout the life of the project.

In this Node.js project structure tutorial I’ll answer some of the most common questions we receive at RisingStack about structuring advanced Node applications, and help you with structuring a complex project.

These are the goals that we are aiming for:

  • Writing an application that is easy to scale and maintain.
  • The config is well separated from the business logic.
  • Our application can consist of multiple process types.

The Node.js Project Structure

Our example application listens to tweets on Twitter and tracks certain keywords. In case of a keyword match, the tweet is sent to a RabbitMQ queue, from which it is processed and saved to Redis. We will also have a REST API exposing the tweets we have saved.

You can take a look at the code on GitHub. The file structure for this project looks like the following:

|-- config
|   |-- components
|   |   |-- common.js
|   |   |-- logger.js
|   |   |-- rabbitmq.js
|   |   |-- redis.js
|   |   |-- server.js
|   |   `-- twitter.js
|   |-- index.js
|   |-- social-preprocessor-worker.js
|   |-- twitter-stream-worker.js
|   `-- web.js
|-- models
|   |-- redis
|   |   |-- index.js
|   |   `-- redis.js
|   |-- tortoise
|   |   |-- index.js
|   |   `-- tortoise.js
|   `-- twitter
|       |-- index.js
|       `-- twitter.js
|-- scripts
|-- test
|   `-- setup.js
|-- web
|   |-- middleware
|   |   |-- index.js
|   |   `-- parseQuery.js
|   |-- router
|   |   |-- api
|   |   |   |-- tweets
|   |   |   |   |-- get.js
|   |   |   |   |-- get.spec.js
|   |   |   |   `-- index.js
|   |   |   `-- index.js
|   |   `-- index.js
|   |-- index.js
|   `-- server.js
|-- worker
|   |-- social-preprocessor
|   |   |-- index.js
|   |   `-- worker.js
|   `-- twitter-stream
|       |-- index.js
|       `-- worker.js
|-- index.js
`-- package.json

In this example we have 3 processes:

  • twitter-stream-worker: The process is listening on Twitter for keywords and sends the tweets to a RabbitMQ queue.
  • social-preprocessor-worker: The process is listening on the RabbitMQ queue and saves the tweets to Redis and removes old ones.
  • web: The process is serving a REST API with a single endpoint: GET /api/v1/tweets?limit&offset.

We will get to what differentiates a web and a worker process, but let's start with the config.

How to handle different environments and configurations?

Load your deployment specific configurations from environment variables and never add them to the codebase as constants. These are the configurations that can vary between deployments and runtime environments, like CI, staging or production. Basically, you can have the same code running everywhere.

A good test for whether the config is correctly separated from the application internals is that the codebase could be made public at any moment. This means that you can be protected from accidentally leaking secrets or compromising credentials on version control.

The environment variables can be accessed via the process.env object. Keep in mind that all the values have a type of String, so you might need to use type conversions.

// config/config.js
'use strict'

// required environment variables (names inferred from the config below)
const requiredEnvVars = ['NODE_ENV', 'PORT']
requiredEnvVars.forEach((name) => {
  if (!process.env[name]) {
    throw new Error(`Environment variable ${name} is missing`)
  }
})

const config = {
  env: process.env.NODE_ENV,
  logger: {
    level: process.env.LOG_LEVEL || 'info',
    enabled: process.env.BOOLEAN ? process.env.BOOLEAN.toLowerCase() === 'true' : false
  },
  server: {
    port: Number(process.env.PORT)
  }
  // ...
}

module.exports = config

Config validation

Validating environment variables is also a quite useful technique. It can help you catch configuration errors on startup, before your application does anything else. You can read more about the benefits of early error detection of configurations by Adrian Colyer in this blog post.

This is what our improved config file looks like with schema validation using the joi validator:

// config/config.js
'use strict'

// joi.validate() and array arguments match the joi releases of the time;
// newer joi versions use schema.validate() and spread arguments instead
const joi = require('joi')

const envVarsSchema = joi.object({
  NODE_ENV: joi.string()
    .valid(['development', 'production', 'test', 'provision'])
    .required(),
  PORT: joi.number()
    .required(),
  LOGGER_LEVEL: joi.string()
    .valid(['error', 'warn', 'info', 'verbose', 'debug', 'silly'])
    .default('info'),
  LOGGER_ENABLED: joi.boolean()
    .default(false)
}).unknown() // process.env also holds variables that we do not validate

const { error, value: envVars } = joi.validate(process.env, envVarsSchema)
if (error) {
  throw new Error(`Config validation error: ${error.message}`)
}

const config = {
  env: envVars.NODE_ENV,
  isTest: envVars.NODE_ENV === 'test',
  isDevelopment: envVars.NODE_ENV === 'development',
  logger: {
    level: envVars.LOGGER_LEVEL,
    enabled: envVars.LOGGER_ENABLED
  },
  server: {
    port: envVars.PORT
  }
  // ...
}

module.exports = config

Config splitting

Splitting the configuration by components can be a good solution to forego a single, growing config file.

// config/components/logger.js
'use strict'

const joi = require('joi')

const envVarsSchema = joi.object({
  LOGGER_LEVEL: joi.string()
    .valid(['error', 'warn', 'info', 'verbose', 'debug', 'silly'])
    .default('info'),
  LOGGER_ENABLED: joi.boolean()
    .default(false)
}).unknown() // process.env also holds variables that we do not validate

const { error, value: envVars } = joi.validate(process.env, envVarsSchema)
if (error) {
  throw new Error(`Config validation error: ${error.message}`)
}

const config = {
  logger: {
    level: envVars.LOGGER_LEVEL,
    enabled: envVars.LOGGER_ENABLED
  }
}

module.exports = config

Then in the config.js file we only need to combine the components.

// config/config.js
'use strict'

const common = require('./components/common')  
const logger = require('./components/logger')  
const redis = require('./components/redis')  
const server = require('./components/server')

module.exports = Object.assign({}, common, logger, redis, server)  

You should never group your config together into "environment" specific files, like config/production.js for production. It doesn't scale well as your app expands into more deployments over time.

How to organize a multi-process application?

The process is the main building block of a modern application. An app can have multiple stateless processes, just like in our example. HTTP requests can be handled by a web process and long-running or scheduled background tasks by a worker. They are stateless, because any data that needs to be persisted is stored in a stateful database. For this reason, adding more concurrent processes is very simple. These processes can be independently scaled based on the load or other metrics.

In the previous section, we saw how to break down the config into components. This comes in very handy when having different process types. Each type can have its own config, requiring only the components it needs, without expecting unused environment variables.

In the config/index.js file:

// config/index.js
'use strict'

const processType = process.env.PROCESS_TYPE

let config
try {
  config = require(`./${processType}`)
} catch (ex) {
  if (ex.code === 'MODULE_NOT_FOUND') {
    throw new Error(`No config for process type: ${processType}`)
  }

  throw ex
}

module.exports = config
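
A per-process config file, such as config/web.js, could then combine only the components that the process needs. The sketch below is an assumption about how such files might look, not code from the repository: the component objects are inlined stand-ins for what config/components/*.js would export, and all values are made up.

```javascript
'use strict'

// Hypothetical stand-ins for the objects exported by
// config/components/*.js; in the project they would be loaded with require().
const common = { env: 'production' }
const logger = { logger: { level: 'info', enabled: true } }
const server = { server: { port: 3000 } }
const rabbitmq = { rabbitmq: { uri: 'amqp://localhost' } }

// What a config/web.js could export: the web process needs no RabbitMQ settings.
const webConfig = Object.assign({}, common, logger, server)

// What a config/twitter-stream-worker.js could export: no HTTP server settings.
const twitterStreamWorkerConfig = Object.assign({}, common, logger, rabbitmq)

console.log(Object.keys(webConfig))
console.log(Object.keys(twitterStreamWorkerConfig))
```

Since each component validates only its own environment variables, a worker can start without PORT being set, and the web process can start without any queue credentials.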

In the root index.js file we start the process selected with the PROCESS_TYPE environment variable:

// index.js
'use strict'

const processType = process.env.PROCESS_TYPE

if (processType === 'web') {
  require('./web')
} else if (processType === 'twitter-stream-worker') {
  require('./worker/twitter-stream')
} else if (processType === 'social-preprocessor-worker') {
  require('./worker/social-preprocessor')
} else {
  throw new Error(`${processType} is an unsupported process type. Use one of: 'web', 'twitter-stream-worker', 'social-preprocessor-worker'!`)
}

The nice thing about this is that we still have one application, but we have managed to split it into multiple, independent processes. Each of them can be started and scaled individually, without influencing the other parts. You can achieve this without sacrificing your DRY codebase, because parts of the code, like the models, can be shared between the different processes.

How to organize your test files?

Place your test files next to the tested modules using some kind of naming convention, like <module_name>.spec.js and <module_name>.e2e.spec.js. Your tests should live together with the tested modules, keeping them in sync. It would be really hard to find and maintain the tests and the corresponding functionality when the test files are completely separated from the business logic.

A separated /test folder can hold all the additional test setup and utilities not used by the application itself.

Where to put your build and script files?

We tend to create a /scripts folder where we put our bash and node scripts for database synchronization, front-end builds and so on. This folder separates them from your application code and prevents you from putting too many script files into the root directory. List them in your npm scripts for easier usage.

I hope you enjoyed this article on project structuring. I highly recommend checking out our previous article on the subject, where we laid out the 5 fundamentals of Node.js project structuring.

If you have any questions, please let me know in the comments. In the next chapter of the Node.js at Scale series, we’re going to dive deep into JavaScript clean coding. See you next week!

Node.js Garbage Collection Explained

In this article, you are going to learn how Node.js garbage collection works, what happens in the background when you write code and how memory is freed up for you.

Ancient garbage collector in action

With Node.js at Scale we are creating a collection of articles focusing on the needs of companies with bigger Node.js installations, and developers who already learned the basics of Node.

Memory Management in Node.js Applications

Every application needs memory to work properly. Memory management provides ways to dynamically allocate memory chunks for programs when they request it, and free them when they are no longer needed - so that they can be reused.

Application-level memory management can be manual or automatic. The automatic memory management usually involves a garbage collector.

The following code snippet shows how memory can be allocated in C, using manual memory management:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main() {

   char name[20];
   char *description;

   strcpy(name, "RisingStack");

   // memory allocation (31 characters plus the terminating '\0')
   description = malloc( 32 * sizeof(char) );

   if( description == NULL ) {
      fprintf(stderr, "Error - unable to allocate required memory\n");
      return 1;
   }

   strcpy( description, "Trace by RisingStack is an APM.");

   printf("Company name = %s\n", name );
   printf("Description: %s\n", description );

   // release memory
   free(description);

   return 0;
}
In manual memory management, it is the responsibility of the developer to free up the unused memory portions. Managing your memory this way can introduce several major bugs to your applications:

  • Memory leaks when the used memory space is never freed up.
  • Wild/dangling pointers appear when an object is deleted, but the pointer is reused. Serious security issues can be introduced when other data structures are overwritten or sensitive information is read.

Luckily for you, Node.js comes with a garbage collector, and you don't need to manually manage memory allocation.

The Concept of the Garbage Collector

Garbage collection is a way of managing application memory automatically. The job of the garbage collector (GC) is to reclaim memory occupied by unused objects (garbage). It was first used in LISP in 1959, invented by John McCarthy.

The GC knows that an object is no longer in use when no other object holds a reference to it, that is, when it can no longer be reached from the root.

Memory before the garbage collection

The following diagram shows what memory can look like if you have objects with references to each other, along with some objects that nothing references. The latter are the objects that can be collected by a garbage collector run.

Memory state before Node.js garbage collection

Memory after the garbage collection

Once the garbage collector has run, the objects that are unreachable get deleted, and the memory space is freed up.

Memory state after Node.js garbage collection

The Advantages of Using a Garbage Collector

  • it prevents wild/dangling pointers bugs,
  • it won't try to free up space that was already freed up,
  • it will protect you from some types of memory leaks.

Of course, using a garbage collector doesn't solve all of your problems, and it’s not a silver bullet for memory management. Let's take a look at things that you should keep in mind!

Things to Keep in Mind When Using a Garbage Collector

  • performance impact - in order to decide what can be freed up, the GC consumes computing power
  • unpredictable stalls - modern GC implementations try to avoid "stop-the-world" collections

Node.js Garbage Collection & Memory Management in Practice

The easiest way of learning is by doing - so I am going to show you what happens in the memory with different code snippets.

The Stack

The stack contains local variables and pointers to objects on the heap or pointers defining the control flow of the application.

In the following example, both a and b will be placed on the stack.

function add (a, b) {
  return a + b
}

add(4, 5)

The Heap

The heap is dedicated to storing reference type objects, like strings or objects.

The Car object created in the following snippet is placed on the heap.

function Car (opts) {
  this.name = opts.name
}

const LightningMcQueen = new Car({name: 'Lightning McQueen'})

After this, the memory would look something like this:

Node.js Garbage Collection First Step - Object Placed in the Memory Heap

Let's add more cars, and see what our memory would look like!

function Car (opts) {
  this.name = opts.name
}

const LightningMcQueen = new Car({name: 'Lightning McQueen'})
const SallyCarrera = new Car({name: 'Sally Carrera'})
const Mater = new Car({name: 'Mater'})

Node.js Garbage Collection Second Step - More elements added to the heap

If the GC ran now, nothing could be freed up, as the root has a reference to every object.

Let's make it a little bit more interesting, and add some parts to our cars!

function Engine (power) {
  this.power = power
}

function Car (opts) {
  this.name = opts.name
  this.engine = new Engine(opts.power)
}

let LightningMcQueen = new Car({name: 'Lightning McQueen', power: 900})
let SallyCarrera = new Car({name: 'Sally Carrera', power: 500})
let Mater = new Car({name: 'Mater', power: 100})

Node.js Garbage Collection - Assigning values to the objects in the heap

What would happen, if we no longer use Mater, but redefine it and assign some other value, like Mater = undefined?

Node.js Garbage Collection - Redefining values

As a result, the original Mater object cannot be reached from the root object, so on the next garbage collector run it will be freed up:

Node.js Garbage Collection - Freeing up the unreachable object
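
In code, the reassignment described above is a single line. The sketch below also keeps a second reference to the engine (spareEngine is a name made up for this example) to show that an object stays alive as long as anything reachable still points to it.

```javascript
function Engine (power) {
  this.power = power
}

function Car (opts) {
  this.name = opts.name
  this.engine = new Engine(opts.power)
}

let Mater = new Car({ name: 'Mater', power: 100 })
const spareEngine = Mater.engine // a second reference to the Engine object

// Drop the only reference to the Car object: it becomes unreachable and
// is eligible for collection on a later GC run.
Mater = undefined

// The Engine is still reachable through spareEngine, so it must survive.
console.log(spareEngine.power)
```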

Now that we understand the basics of the garbage collector's expected behaviour, let's take a look at how it is implemented in V8!

Garbage Collection Methods

In one of our previous articles we dealt with how the Node.js garbage collection methods work, so I strongly recommend reading that article.

Here are the most important things you’ll learn there:

New Space and Old Space

The heap has two main segments, the New Space and the Old Space. The New Space is where new allocations happen; it is fast to collect garbage here and it has a size of ~1-8 MB. Objects living in the New Space are called the Young Generation.

The Old Space is where the objects that survived the collector in the New Space are promoted to - they are called the Old Generation. Allocation in the Old Space is fast, however collection is expensive, so it is performed infrequently.

Young Generation

Usually, ~20% of the Young Generation survives into the Old Generation. Collection in the Old Space will only commence once it is getting exhausted. To do so the V8 engine uses two different collection algorithms.

Scavenge and Mark-Sweep collection

Scavenge collection is fast and runs on the Young Generation, while the slower Mark-Sweep collection runs on the Old Generation.
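
You can get a rough feeling for this from inside a process with process.memoryUsage(). The sketch below is only an illustration: the loop size is arbitrary, and the exact numbers depend on the Node.js version, the platform and on when V8 decides to collect.

```javascript
// Rough illustration only: exact numbers depend on the Node.js version,
// the platform and on when V8 decides to collect.
const toMB = (bytes) => Math.round(bytes / 1024 / 1024 * 100) / 100

const before = process.memoryUsage().heapUsed

// Allocate many short-lived objects; most of them die young and are
// candidates for the fast Scavenge collection.
let checksum = 0
for (let i = 0; i < 1e5; i++) {
  const temporary = { index: i, label: 'item ' + i }
  checksum += temporary.label.length
}

const after = process.memoryUsage().heapUsed
console.log(`heap used before: ${toMB(before)} MB, after: ${toMB(after)} MB`)
```

Running the same script with node --trace-gc prints a line for every collection, which shows how often Scavenge runs compared to Mark-Sweep.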

A Real-Life Example - The Meteor Case-Study

In 2013, the creators of Meteor announced their findings about a memory leak they ran into. The problematic code snippet was the following:

var theThing = null
var replaceThing = function () {
  var originalThing = theThing
  var unused = function () {
    if (originalThing)
      console.log("hi")
  }
  theThing = {
    longStr: new Array(1000000).join('*'),
    someMethod: function () {
      console.log(someMessage)
    }
  }
}
setInterval(replaceThing, 1000)

Well, the typical way that closures are implemented is that every function object has a link to a dictionary-style object representing its lexical scope. If both functions defined inside replaceThing actually used originalThing, it would be important that they both get the same object, even if originalThing gets assigned to over and over, so both functions share the same lexical environment. Now, Chrome's V8 JavaScript engine is apparently smart enough to keep variables out of the lexical environment if they aren't used by any closures - from the Meteor blog.
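
In this snippet, someMethod shares its lexical environment with unused, and that environment references originalThing, so every new theThing keeps the previous one alive. A commonly cited fix, which as far as I know is also the one suggested in the Meteor post, is to clear originalThing at the end of replaceThing. The sketch below calls replaceThing in a loop instead of from setInterval so that it terminates; the log messages are placeholders.

```javascript
var theThing = null
var replaceThing = function () {
  var originalThing = theThing
  var unused = function () {
    if (originalThing) console.log('hi')
  }
  theThing = {
    longStr: new Array(1000000).join('*'),
    someMethod: function () {
      console.log('someMethod called')
    }
  }
  // Break the chain: the shared lexical environment no longer
  // references the previous theThing, so it can be collected.
  originalThing = null
}

// Called a few times here instead of from setInterval so the sketch terminates.
for (var i = 0; i < 5; i++) {
  replaceThing()
}
```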

Further reading:

Next up

In the next chapter of the Node.js at Scale tutorial series we will take a deep dive into writing native Node.js modules.

In the meantime, let us know in the comments sections if you have any questions!

Understanding the Node.js Event Loop

This article helps you to understand how the Node.js event loop works, and how you can leverage it to build fast applications. We’ll also discuss the most common problems you might encounter, and the solutions for them.

With Node.js at Scale we are creating a collection of articles focusing on the needs of companies with bigger Node.js installations, and developers who already learned the basics of Node.

The problem

Most of the backends behind websites don’t need to do complicated computations. Our programs spend most of their time waiting for the disk to read and write, or waiting for the wire to transmit our message and send back the answer.

IO operations can be orders of magnitude slower than data processing. Take this for example: SSDs can have a read speed of 200-730 MB/s, at least the high-end ones. Reading just one kilobyte of data would take 1.4 microseconds, but during this time a CPU clocked at 2GHz could have performed 2,800 instruction-processing cycles.

For network communications it can be even worse, just try and ping

$ ping
64 bytes from icmp_seq=0 ttl=52 time=33.017 ms  
64 bytes from icmp_seq=1 ttl=52 time=83.376 ms  
64 bytes from icmp_seq=2 ttl=52 time=26.552 ms  
64 bytes from icmp_seq=3 ttl=52 time=40.153 ms  
64 bytes from icmp_seq=4 ttl=52 time=37.291 ms  
64 bytes from icmp_seq=5 ttl=52 time=58.692 ms  
64 bytes from icmp_seq=6 ttl=52 time=45.245 ms  
64 bytes from icmp_seq=7 ttl=52 time=27.846 ms  

The average latency is about 44 milliseconds. Just while waiting for a packet to make a round-trip on the wire, the previously mentioned processor can perform 88 millions of cycles.
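
These back-of-the-envelope numbers can be reproduced in a few lines. The sketch assumes the figures quoted in the text (a 2 GHz clock, 730 MB/s read speed, one kilobyte taken as 1000 bytes) and the round-trip times from the ping output.

```javascript
const CLOCK_HZ = 2e9     // 2 GHz CPU, as in the text
const READ_SPEED = 730e6 // 730 MB/s, the high-end SSD figure

// Time to read one kilobyte (1000 bytes) from the SSD.
const readSeconds = 1000 / READ_SPEED
const cyclesDuringRead = readSeconds * CLOCK_HZ

// Average of the round-trip times from the ping output.
const pingsMs = [33.017, 83.376, 26.552, 40.153, 37.291, 58.692, 45.245, 27.846]
const averageMs = pingsMs.reduce((sum, ms) => sum + ms, 0) / pingsMs.length
const cyclesDuringPing = averageMs / 1000 * CLOCK_HZ

console.log(`${(readSeconds * 1e6).toFixed(2)} µs per kilobyte, ~${Math.round(cyclesDuringRead)} cycles`)
console.log(`${averageMs.toFixed(2)} ms per round trip, ~${Math.round(cyclesDuringPing / 1e6)} million cycles`)
```

The disk read costs a few thousand cycles, while a single network round trip costs tens of millions, which is why waiting on the network dominates.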

The solution

Most operating systems provide some kind of an Asynchronous IO interface, which allows you to start processing data that does not require the result of the communication while the communication is still going on.

This can be achieved in several ways. Nowadays it is mostly done by leveraging the possibilities of multithreading at the cost of extra software complexity. For example reading a file in Java or Python is a blocking operation. Your program cannot do anything else while it is waiting for the network / disk communication to finish. All you can do - at least in Java - is to fire up a different thread then notify your main thread when the operation has finished.

It is tedious, complicated, but gets the job done. But what about Node? Well, we are surely facing some problems as Node.js - or more like V8 - is single-threaded. Our code can only run in one thread.

EDIT: This is not entirely true. Both Java and Python have async interfaces, but using them is definitely more difficult than in Node.js. Thanks to Shahar and Dirk Harrington for pointing this out.

You might have heard that in a browser, setting setTimeout(someFunction, 0) can sometimes fix things magically. But why does setting a timeout to 0, deferring execution by 0 milliseconds fix anything? Isn’t it the same as simply calling someFunction immediately? Not really.
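
You can observe the difference with a small experiment. In this sketch, the order array is only there to record what ran when.

```javascript
const order = []

function someFunction () {
  order.push('someFunction')
}

order.push('before')
setTimeout(someFunction, 0) // scheduled, not called
order.push('after')

// someFunction only runs once the current code has run to completion.
setTimeout(() => console.log(order.join(' -> ')), 10)
```

This prints before -> after -> someFunction: even with a delay of 0, the function has to wait until everything that is currently running has finished.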

First of all, let's take a look at the call stack, or simply, “stack”. I am going to make things simple, as we only need to understand the very basics of the call stack. In case you are familiar with how it works, feel free to jump to the next section.

Whenever you call a function, its return address, parameters and local variables are pushed to the stack. If you call another function from the currently running function, its contents are pushed on top in the same manner as the previous one - with its return address.

For the sake of simplicity I will say that 'a function is pushed' to the top of the stack from now on, even though it is not exactly correct.

Let's take a look!

function main () {
  const hypotenuse = getLengthOfHypotenuse(3, 4)
  console.log(hypotenuse)
}

function getLengthOfHypotenuse(a, b) {
  const squareA = square(a)
  const squareB = square(b)
  const sumOfSquares = squareA + squareB
  return Math.sqrt(sumOfSquares)
}

function square(number) {
  return number * number
}

main()

main is called first:

The main function

then main calls getLengthOfHypotenuse with 3 and 4 as arguments

The getLengthOfHypotenuse function

afterwards square is called with the value of a

The square(a) function

when square returns, it is popped from the stack, and its return value is assigned to squareA. squareA is added to the stack frame of getLengthOfHypotenuse

Variable a

same goes for the next call to square

The square(b) function

Variable b

in the next line the expression squareA + squareB is evaluated


then Math.sqrt is called with sumOfSquares


now all is left for getLengthOfHypotenuse is to return the final value of its calculation

The return function

the returned value gets assigned to hypotenuse in main


the value of hypotenuse is logged to console

The console log

finally, main returns without any value, gets popped from the stack leaving it empty
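
If you want to see these frames for yourself, you can capture a stack trace at the deepest point of the walkthrough. This is a small sketch built on the example above: stackAtSquare is a helper variable introduced here, and main returns the value instead of logging it.

```javascript
let stackAtSquare = null

function main () {
  const hypotenuse = getLengthOfHypotenuse(3, 4)
  return hypotenuse
}

function getLengthOfHypotenuse (a, b) {
  const squareA = square(a)
  const squareB = square(b)
  return Math.sqrt(squareA + squareB)
}

function square (number) {
  // Capture the call stack the first time we get here.
  if (!stackAtSquare) stackAtSquare = new Error('stack snapshot').stack
  return number * number
}

main()
console.log(stackAtSquare)
```

The printed trace lists square on top, followed by getLengthOfHypotenuse and main, which is the same order in which the frames sit on the stack.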


SIDE NOTE: You saw that local variables are popped from the stack when the function's execution finishes. This happens only when you work with simple values such as numbers, strings and booleans. Values of objects, arrays and the like are stored in the heap, and your variable is merely a pointer to them. If you pass such a variable on, you only pass the pointer, which makes these values mutable from different stack frames. When the function is popped from the stack, only the pointer to the object gets popped, leaving the actual value in the heap. The garbage collector takes care of freeing up space once the objects have outlived their usefulness.
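A small sketch of the difference described in the side note; the function and variable names are made up for this illustration.

```javascript
// Numbers are copied into the callee's stack frame.
function incrementNumber (value) {
  value += 1
  return value
}

// Objects live in the heap; the callee receives a pointer to the same object.
function incrementCounter (counter) {
  counter.value += 1
}

const count = 1
incrementNumber(count)

const counter = { value: 1 }
incrementCounter(counter)

console.log(count) // 1, the caller's number is untouched
console.log(counter.value) // 2, the shared object was mutated
```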

Enter Node.js Event Loop

The Node.js Event Loop - cat version

No, not this loop. :)

So what happens when we call something like setTimeout, http.get, process.nextTick, or fs.readFile? None of these can be found in V8's code; they are provided by the Web APIs in Chrome's case, and by the C++ API in the case of Node.js. To understand this, we have to understand the order of execution a little bit better.

Let's take a look at a more common Node.js application: a server listening on localhost:3000/. Upon getting a request, the server will call a remote weather service for a <city>, print some kind messages to the console, and forward the response to the caller after receiving it.

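'use strict'
const express = require('express')
const superagent = require('superagent')
const app = express()

app.get('/', sendWeatherOfRandomCity)

function sendWeatherOfRandomCity (request, response) {
  getWeatherOfRandomCity(request, response)
  sayHi()
}

// Example values only: the original list of cities is not preserved here
const CITIES = ['london', 'paris', 'budapest']

// Placeholder: the address of the weather service is not preserved here
const WEATHER_SERVICE_URL = 'http://weather.example'

function getWeatherOfRandomCity (request, response) {
  const city = CITIES[Math.floor(Math.random() * CITIES.length)]
  superagent.get(`${WEATHER_SERVICE_URL}/${city}`)
    .end((err, res) => {
      if (err) {
        console.log('O snap')
        return response.status(500).send('There was an error getting the weather, try looking out the window')
      }
      const responseText = res.text
      response.send(responseText)
      console.log('Got the weather')
    })

  console.log('Fetching the weather, please be patient')
}

function sayHi () {
  console.log('Hi')
}

app.listen(3000)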

What will be printed out aside from getting the weather when a request is sent to localhost:3000?

If you have some experience with Node, you shouldn't be surprised that even though console.log('Fetching the weather, please be patient') appears after console.log('Got the weather') in the code, the former will print first, resulting in:

Fetching the weather, please be patient
Hi
Got the weather

What happened? Even though V8 is single-threaded, the underlying C++ API of Node isn't. This means that whenever we call a non-blocking operation, Node will call some code that runs concurrently with our JavaScript code under the hood. Once this hidden thread receives the value it is waiting for or throws an error, the provided callback is called with the necessary parameters.

SIDE NOTE: The ‘some code’ we mentioned is actually part of libuv. libuv is the open source library that handles the thread pool, does the signaling and all the other magic that is needed to make asynchronous tasks work. It was originally developed for Node.js, but a lot of other projects make use of it by now.
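The same ordering can be reproduced without a web server. The following is a minimal sketch that uses fs.stat on the current working directory as a stand-in for any non-blocking call; the order array is only there to record what ran when.

```javascript
const fs = require('fs')

const order = []

// fs.stat is non-blocking: the work happens outside our JavaScript code,
// and the callback is invoked later with the result.
fs.stat(process.cwd(), (err) => {
  if (err) throw err
  order.push('callback')
  console.log(order.join(' -> ')) // main script -> callback
})

// This line runs first, because the current call stack has to finish
// before any callback can be invoked.
order.push('main script')
```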

To peek under the hood, we need to introduce two new concepts: the event loop and the task queue.

Task queue

JavaScript is a single-threaded, event-driven language. This means that we can attach listeners to events, and when such an event fires, the listener executes the callback we provided.

Whenever you call setTimeout, http.get or fs.readFile, Node.js hands these operations over to libuv, which handles them outside the main JavaScript thread (using the operating system's asynchronous facilities or its thread pool), allowing V8 to keep executing our code. Node also calls the callback when the timer has run down or the I/O or HTTP operation has finished.

These callbacks can enqueue other tasks and those functions can enqueue others and so on. This way you can read a file while processing a request in your server, and then make an http call based on the read contents without blocking other requests from being handled.

"#nodejs sends IO operations to different threads so #v8 can keep executing our code" via @RisingStack #javascript

However, we only have one main thread and one call-stack, so in case there is another request being served when the said file is read, its callback will need to wait for the stack to become empty. The limbo where callbacks are waiting for their turn to be executed is called the task queue (or event queue, or message queue). Callbacks are being called in an infinite loop whenever the main thread has finished its previous task, hence the name 'event loop'.

In our previous example it would look something like this:

  1. express registers a handler for the 'request' event that will be called when a request arrives at '/'
  2. skips the functions and starts listening on port 3000
  3. the stack is empty, waiting for the 'request' event to fire
  4. upon an incoming request, the long-awaited event fires, express calls the provided handler sendWeatherOfRandomCity
  5. sendWeatherOfRandomCity is pushed to the stack
  6. getWeatherOfRandomCity is called and pushed to the stack
  7. Math.floor and Math.random are called, pushed to the stack and popped, a random element of CITIES is assigned to city
  8. superagent.get is called with '${city}', the handler is set for the end event.
  9. the http request to ${city} is sent to a background thread, and the execution continues
  10. 'Fetching the weather, please be patient' is logged to the console, getWeatherOfRandomCity returns
  11. sayHi is called, 'Hi' is printed to the console
  12. sendWeatherOfRandomCity returns, gets popped from the stack leaving it empty
  13. waiting for ${city} to send its response
  14. once the response has arrived, the end event is fired.
  15. the anonymous handler we passed to .end() is called, gets pushed to the stack with all variables in its closure, meaning it can see and modify the values of express, superagent, app, CITIES, request, response, city and all the functions we have defined
  16. response.send() gets called either with a 200 or a 500 statusCode, but again it is sent to a background thread, so the response stream is not blocking our execution, and the anonymous handler is popped from the stack.

So now we can understand why the previously mentioned setTimeout hack works. Even though we set the counter to zero, it defers the execution until the current stack is empty and the tasks already waiting in the task queue have been processed, allowing the browser to redraw the UI, or Node to serve other requests.
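As a quick check of this claim, here is a minimal sketch; someFunction and order are placeholder names.

```javascript
const order = []

function someFunction () {
  order.push('someFunction')
  console.log(order.join(' -> ')) // rest of the script -> someFunction
}

// Deferred: someFunction has to go through the task queue.
setTimeout(someFunction, 0)

// Still part of the current call stack, so this runs first.
order.push('rest of the script')
```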

Microtasks and Macrotasks

If this wasn't enough, we actually have more than one task queue: one for microtasks and another for macrotasks.

examples of microtasks:

  • process.nextTick
  • promises
  • Object.observe

examples of macrotasks:

  • setTimeout
  • setInterval
  • setImmediate
  • I/O
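A minimal sketch of how the two kinds of tasks are ordered relative to each other. It assumes a plain CommonJS script, where Node.js runs process.nextTick callbacks before promise callbacks; both run before the timer.

```javascript
const order = []

// Macrotask: waits in the task queue.
setTimeout(() => {
  order.push('setTimeout (macrotask)')
  console.log(order.join('\n'))
}, 0)

// Microtasks: run as soon as the current stack is empty.
Promise.resolve().then(() => order.push('promise (microtask)'))
process.nextTick(() => order.push('nextTick (microtask)'))

// Plain synchronous code always comes first.
order.push('synchronous code')
```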

Let's take a look at the following code:

console.log('script start')

const interval = setInterval(() => {
  console.log('setInterval')
}, 0)

setTimeout(() => {
  console.log('setTimeout 1')
  Promise.resolve().then(() => {
    console.log('promise 3')
  }).then(() => {
    console.log('promise 4')
  }).then(() => {
    setTimeout(() => {
      console.log('setTimeout 2')
      Promise.resolve().then(() => {
        console.log('promise 5')
      }).then(() => {
        console.log('promise 6')
      }).then(() => {
        // stop the interval once the last promise handler has run
        clearInterval(interval)
      })
    }, 0)
  })
}, 0)

Promise.resolve().then(() => {
  console.log('promise 1')
}).then(() => {
  console.log('promise 2')
})

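this will log to the console:

script start
promise 1
promise 2
setInterval
setTimeout 1
promise 3
promise 4
setInterval
setTimeout 2
setInterval
promise 5
promise 6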

According to the WHATWG specification, exactly one (macro)task should get processed from the macrotask queue in one cycle of the event loop. After said macrotask has finished, all of the available microtasks will be processed within the same cycle. While these microtasks are being processed, they can queue more microtasks, which will all be run one by one, until the microtask queue is exhausted.
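The last point, microtasks queuing further microtasks, can be sketched like this; chainMicrotasks is a made-up helper for the illustration.

```javascript
const order = []

// A macrotask that is already waiting in the task queue.
setTimeout(() => {
  order.push('macrotask')
  console.log(order.join(', '))
  // microtask 3, microtask 2, microtask 1, macrotask
}, 0)

// Each microtask queues the next one while the queue is being drained.
function chainMicrotasks (remaining) {
  if (remaining === 0) return
  Promise.resolve().then(() => {
    order.push(`microtask ${remaining}`)
    chainMicrotasks(remaining - 1)
  })
}

chainMicrotasks(3)
```

Even though the timer was scheduled first, it has to wait until the microtask queue is completely empty.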

This diagram tries to make the picture a bit clearer:

The Node.js Event Loop

In our case:

Cycle 1:

  1. `setInterval` is scheduled as task
  2. `setTimeout 1` is scheduled as task
  3. in `Promise.resolve 1` both `then`s are scheduled as microtasks
  4. the stack is empty, microtasks are run

Task queue: setInterval, setTimeout 1

Cycle 2:

  1. the microtask queue is empty, `setInterval`'s handler can be run, another `setInterval` is scheduled as a task, right behind `setTimeout 1`

Task queue: setTimeout 1, setInterval

Cycle 3:

  1. the microtask queue is empty, `setTimeout 1`'s handler can be run, `promise 3` and `promise 4` are scheduled as microtasks,
  2. handlers of `promise 3` and `promise 4` are run, `setTimeout 2` is scheduled as a task

Task queue: setInterval, setTimeout 2

Cycle 4:

  1. the microtask queue is empty, `setInterval`'s handler can be run, another `setInterval` is scheduled as a task, right behind `setTimeout 2`

Task queue: setTimeout 2, setInterval

Cycle 5:

  1. `setTimeout 2`'s handler is run, `promise 5` and `promise 6` are scheduled as microtasks

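Now the handlers of promise 5 and promise 6 should be run, clearing our interval, but in the Node.js version this was written for, setInterval is run once more before them. If you run this code in Chrome, you will get the expected behavior. Node.js 11 and later also run the microtask queue after every single timer callback, so they behave like the browser here.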

We can fix this in Node too with process.nextTick and some mind-boggling callback hell.

console.log('script start')

const interval = setInterval(() => {
  console.log('setInterval')
}, 0)

setTimeout(() => {
  console.log('setTimeout 1')
  process.nextTick(() => {
    console.log('nextTick 3')
    process.nextTick(() => {
      console.log('nextTick 4')
      setTimeout(() => {
        console.log('setTimeout 2')
        process.nextTick(() => {
          console.log('nextTick 5')
          process.nextTick(() => {
            console.log('nextTick 6')
            // stop the interval once the last nextTick handler has run
            clearInterval(interval)
          })
        })
      }, 0)
    })
  })
}, 0)

process.nextTick(() => {
  console.log('nextTick 1')
  process.nextTick(() => {
    console.log('nextTick 2')
  })
})

This is the exact same logic as our beloved promises use, only a little bit more hideous. At least it gets the job done the way we expected.

Tame the async beast!

As we saw, we need to manage and pay attention to both task queues, and to the event loop when we write an app in Node.js - in case we wish to leverage all its power, and if we want to keep our long running tasks from blocking the main thread.

The event loop might be a slippery concept to grasp at first, but once you get the hang of it, you won't be able to imagine that there is life without it. The continuation passing style that can lead to a callback hell might look ugly, but we have Promises, and soon we will have async-await in our hands... and while we are (a)waiting, you can simulate async-await using co and/or koa.

One last piece of parting advice:

Knowing how Node.js and V8 handle long-running executions, you can start using this to your advantage. You might have heard before that you should send your long-running loops to the task queue. You can do it by hand or make use of async.js.
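Doing it by hand could look something like the sketch below. sumInChunks is a hypothetical helper, not part of any library: it processes a slice of the array, then uses setImmediate to queue the next slice as a separate task, so other callbacks get a chance to run in between.

```javascript
// Sums a large array in chunks, yielding to the event loop between chunks.
function sumInChunks (numbers, chunkSize, done) {
  let index = 0
  let total = 0

  function processChunk () {
    const end = Math.min(index + chunkSize, numbers.length)
    for (; index < end; index++) {
      total += numbers[index]
    }
    if (index < numbers.length) {
      setImmediate(processChunk) // queue the next chunk as a new task
    } else {
      done(total)
    }
  }

  processChunk()
}

const numbers = Array.from({ length: 1000 }, (_, i) => i + 1)
let result = null

sumInChunks(numbers, 100, total => {
  result = total
  console.log(total) // 500500
})
```

The chunk size is a trade-off: smaller chunks keep the process more responsive, larger chunks finish the whole job sooner.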

Happy coding!

If you have any questions or thoughts, share them in the comments, I’ll be there! The next part of the Node.js at Scale series discusses Garbage Collection in Node.js, I recommend checking it out!

How the module system, CommonJS & require works

In the third chapter of Node.js at Scale you are about to learn how the Node.js module system and CommonJS work, and what require does under the hood.

With Node.js at Scale we are creating a collection of articles focusing on the needs of companies with bigger Node.js installations, and developers who already learned the basics of Node.

CommonJS to the rescue

The JavaScript language didn’t have a native way of organizing code before the ES2015 standard. Node.js filled this gap with the CommonJS module format. In this article we will learn how the Node.js module system works, how you can organize your modules and what the new ES standard means for the future of Node.js.

"#JavaScript didn't have a mature module system before #nodejs. That gap was filled with #commonjs" via @RisingStack

What is the module system?

Modules are the fundamental building blocks of the code structure. The module system allows you to organize your code, hide information and only expose the public interface of a component using module.exports. Every time you use the require call, you are loading another module.

The simplest example can be the following using CommonJS:

// add.js
function add (a, b) {
  return a + b
}

module.exports = add

To use the add module we have just created, we have to require it.

// index.js
const add = require('./add')

console.log(add(4, 5))  

Under the hood, add.js is wrapped by Node.js this way:

(function (exports, require, module, __filename, __dirname) {
  function add (a, b) {
    return a + b
  }

  module.exports = add
})

This is why you can access the global-like variables like require and module. It also ensures that your variables are scoped to your module rather than the global object.

"Modules are the fundamental building blocks of the code structure." via @RisingStack #nodejs

How does require work?

The module loading mechanism in Node.js caches modules on the first require call. This means that every time you use require('awesome-module') you will get the same instance of awesome-module, which ensures that the modules are singleton-like and have the same state across your application.

You can load native modules and path references from your file system or installed modules. If the identifier passed to the require function is not a native module or a file reference (beginning with /, ../, ./ or similar), then Node.js will look for installed modules. It will walk your file system looking for the referenced module in the node_modules folder. It starts from the parent directory of your current module and then moves to the parent directory until it finds the right module or until the root of the file system is reached.
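The directory walk can be approximated in a few lines. nodeModulesLookupPaths is a hypothetical helper written for this illustration, not the actual implementation in Node core, and it ignores global folders; its output can be compared with the built-in module.paths array of a module in the same directory.

```javascript
const path = require('path')

// Lists the node_modules folders that would be checked for a module
// living in startDir, from the closest one up to the file system root.
function nodeModulesLookupPaths (startDir) {
  const candidates = []
  let current = path.resolve(startDir)

  while (true) {
    // node_modules is not appended to a directory already named node_modules
    if (path.basename(current) !== 'node_modules') {
      candidates.push(path.join(current, 'node_modules'))
    }
    const parent = path.dirname(current)
    if (parent === current) break // reached the file system root
    current = parent
  }

  return candidates
}

console.log(nodeModulesLookupPaths('/home/ry/projects/app'))
```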

Require under the hood - module.js

The module dealing with module loading in the Node core is called module.js, and can be found in lib/module.js in the Node.js repository.

The most important functions to check here are the _load and _compile functions.


Module._load

This function checks whether the module is in the cache already - if so, it returns the exports object.

If the module is native, it calls the NativeModule.require() with the filename and returns the result.

Otherwise, it creates a new module for the file and saves it to the cache. Then it loads the file contents before returning its exports object.


Module._compile

The compile function runs the file contents in the correct scope or sandbox, and exposes helper variables like require, module or exports to the file.

Figure: How require works in Node.js (from James M. Snell)

How to organize the code?

In our applications, we need to find the right balance of cohesion and coupling when creating modules. The desirable scenario is to achieve high cohesion and loose coupling of the modules.

A module must be focused only on a single part of the functionality to have high cohesion. Loose coupling means that the modules should not have a global or shared state. They should only communicate by passing parameters, and they are easily replaceable without touching your broader codebase.

"The desirable scenario is to achieve high cohesion and loose coupling of the modules." via @RisingStack #nodejs

We usually export named functions or constants in the following way:

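'use strict'

// illustrative constant, the value is only an example
const CONNECTION_LIMIT = 0

function connect () { /* ... */ }

module.exports = {
  CONNECTION_LIMIT,
  connect
}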

What’s in your node_modules?

The node_modules folder is the place where Node.js looks for modules. npm v2 and npm v3 install your dependencies differently. You can find out what version of npm you are using by executing:

npm --version  

npm v2

npm 2 installs all dependencies in a nested way: the dependencies of each of your primary packages are placed in that package's own node_modules folder.

npm v3

npm 3 attempts to flatten these secondary dependencies and install them in the root node_modules folder. This means that you can’t tell by looking at your node_modules which packages are your explicit or implicit dependencies. It is also possible that the installation order changes your folder structure, because npm 3 is non-deterministic in this respect.

You can make sure that your node_modules directory is always the same by installing packages only from a package.json. In this case, it installs your dependencies in alphabetical order, which also means that you will get the same folder tree. This is important because the modules are cached using their path as the lookup key. Each package can have its own child node_modules folder, which might result in multiple instances of the same package and of the same module.

How to handle your modules?

There are two main ways for wiring modules. One of them is using hard coded dependencies, explicitly loading one module into another using a require call. The other method is to use a dependency injection pattern, where we pass the components as a parameter or we have a global container (known as IoC, or Inversion of Control container), which centralizes the management of the modules.

We can allow Node.js to manage the modules' life cycle by using hard coded module loading. It organizes your packages in an intuitive way, which makes understanding and debugging easy.

Dependency Injection is rarely used in a Node.js environment, although it is a useful concept. The DI pattern can result in an improved decoupling of the modules. Instead of explicitly defining dependencies for a module, they are received from the outside. Therefore they can be easily replaced with modules having the same interfaces.

Let’s see an example for DI modules using the factory pattern:

class Car {
  constructor (options) {
    this.engine = options.engine
  }

  start () {
    // delegate to whatever engine was injected
    this.engine.start()
  }
}

function create (options) {
  return new Car(options)
}

module.exports = create
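To show why this is convenient, here is a sketch of how the factory might be used with a stand-in engine, for example in a test. It repeats a compact version of the factory so that it runs on its own, and it assumes that start() simply delegates to the injected engine; fakeEngine and calls are names invented for the example.

```javascript
// Compact copy of the factory, so the sketch is self-contained.
class Car {
  constructor (options) {
    this.engine = options.engine
  }

  start () {
    this.engine.start()
  }
}

const create = options => new Car(options)

// Hypothetical stand-in engine that only records that it was started.
const calls = []
const fakeEngine = {
  start () {
    calls.push('engine started')
  }
}

// The car does not care which engine it gets, as long as it has start().
const car = create({ engine: fakeEngine })
car.start()

console.log(calls) // [ 'engine started' ]
```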

The ES2015 module system

As we saw above, the CommonJS module system evaluates modules at runtime, wrapping them into a function before execution. ES2015 modules don’t need to be wrapped, since the import/export bindings are created before the module is evaluated. This incompatibility is the reason why, at the time of writing, no JavaScript runtime supported ES modules. There was a lot of discussion about the topic and a proposal was in DRAFT state; Node.js has since added ES module support alongside CommonJS.

To read an in-depth explanation of the biggest differences between CommonJS and the ESM, read the following article by James M Snell.

Next up

I hope this article contained valuable information about the module system and how require works. If you have any questions or insights on the topic, please share them in the comments. In the next chapter of the Node.js at Scale series, we are going to take a deep dive and learn about the event loop.

npm Publishing Tutorial

In the second chapter of Node.js at Scale you are going to learn how to expand the npm registry with your own modules. This tutorial is also going to explain how versioning works.

npm Module Publishing

When writing Node.js apps, there are so many things on npm that can help us be more productive. We don't have to deal with low-level things like padding a string from the left, because there are already existing modules for that which are (eventually) available on the npm registry.

Where do these modules come from?

The modules are stored in a huge registry which is powered by a CouchDB instance.

The official public npm registry is powered by a CouchDB database, which has a public mirror. The code for the couchapp is publicly available as well.

How do modules make it to the registry?

People like you write them for themselves or for their co-workers and they share the code with their fellow JavaScript developers.

When should I consider publishing?

  • If you want to share code between projects,
  • if you think that others might run into the very same problem and you'd like to help them,
  • if you have a bit (or even more) code that you think you can make use of later.

Creating a module

First let's create a module: npm init -y should take care of it, as you've learned in the previous post.

{
  "name": "npm-publishing",
  "version": "1.0.0",
  "description": "",
  "main": "index.js",
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "keywords": [],
  "author": "",
  "repository": {
    "type": "git",
    "url": "git+"
  },
  "bugs": {
    "url": ""
  },
  "license": "ISC"
}

Let's break this down really quick. These fields in your package.json are mandatory when you're building a module for others to use.

First, you should give your module a distinct name because it has to be unique in the npm registry. Make sure it does not collide with any trademarks out there! main describes which file will be returned when your users do a require('modulename'). You can leave it as default or set it to any file in your project, but make sure you actually point it to a valid filename.

keywords should also be included because npm is going to index your package based on those fields and people will be able to find your module if they search those keywords in npm's search, or in any third party npm search site.

author, well obviously that's going to be you, but if anyone helps you develop your project, be so kind as to include them too! :) Also, it is very important to include where people can contact you if they'd like to.

In the repository field, you can see where the code is hosted, and the bugs section tells you where you can file bugs if you find one in the package. To quickly jump to the bug report site you can use npm bugs modulename.

#1 Licensing

Solid licensing and license adoption help the adoption of Node by large companies. Code is a valuable resource, and sharing it has its own costs.

Licensing is really hard, but this site can help you pick one that fits your needs.

Generally when people publish modules to npm they use the MIT license.

The MIT License is a permissive free software license originating at the Massachusetts Institute of Technology (MIT). As a permissive license, it puts only very limited restriction on reuse and has therefore an excellent license compatibility.

#2 Semantic Versioning

Versioning is so important that it deserves its own section.

Most of the modules in the npm registry follow the specification called semantic versioning. Semantic versioning describes the version of a software as 3 numbers separated by "."-s. It describes how this version number has to change when changes are made to the software itself.

Given a version number MAJOR.MINOR.PATCH, increment the:

  • MAJOR version when you make incompatible API changes,
  • MINOR version when you add functionality in a backwards-compatible manner, and
  • PATCH version when you make backwards-compatible bug fixes.

Additional labels for the pre-release and the build metadata are available as extensions to the MAJOR.MINOR.PATCH format.
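The three rules can be written down as a tiny function. This is only a sketch of the numbering scheme: bumpVersion is a made-up helper that ignores pre-release and build metadata labels. In practice, the npm version command performs this kind of bump for you.

```javascript
// Returns the next version for a given release type.
// Lower-order numbers reset to zero when a higher-order one is bumped.
function bumpVersion (version, release) {
  const [major, minor, patch] = version.split('.').map(Number)

  switch (release) {
    case 'major':
      return `${major + 1}.0.0`
    case 'minor':
      return `${major}.${minor + 1}.0`
    case 'patch':
      return `${major}.${minor}.${patch + 1}`
    default:
      throw new Error(`Unknown release type: ${release}`)
  }
}

console.log(bumpVersion('1.4.2', 'major')) // 2.0.0
console.log(bumpVersion('1.4.2', 'minor')) // 1.5.0
console.log(bumpVersion('1.4.2', 'patch')) // 1.4.3
```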

These numbers are for machines, not for humans! Don't assume that people will be discouraged from using your libraries when you often change the major version.

"If you break your API, think about your users and BUMP THAT MAJOR!" via @RisingStack #nodejs #semver

You have to start versioning at 1.0!

Most people think that changes made while the software is still in its "beta" phase do not need to respect semantic versioning. They are wrong! It is really important to communicate breaking changes to your users even in the beta phase. Always think about your users who want to experiment with your project.

#3 Documentation

Having proper documentation is imperative if you’d like to share your code with others. Putting a README file in your project’s root folder is usually enough, and if you publish it to the registry, npm will generate a site like this one. It's all done automatically and it helps other people when they try to use your code.

Before publishing, make sure you have all documentation in place and up to date.

#4 Keeping secret files out of your package

Using a specific file called .npmignore will keep your secret or private files from being published. Use that to your advantage: add the files you do not wish to upload to .npmignore.

If there is no .npmignore file but there is a .gitignore, npm will use that instead by default. Like git, npm looks for .npmignore and .gitignore files in all subdirectories of your package, not only in the root directory.

#5 Encouraging contributions

When you open up your code to the public, you should consider adding some guidelines on how to contribute. Make sure contributors know how to help you deal with software bugs and add new features to your module.

There are a few of these available, but in general you should consider using GitHub's issue and pull-request templates.

npm publish

Now you understand everything that's necessary to publish your first module. To do so, you can type: npm publish and the npm-cli will upload the code to the registry.

Congratulations, your module is now public on the npm registry! Visit its page on the npm website for the public URL.

If you published something public to npm, it's going to stay there forever. There is little you can do to make it non-discoverable. Once it hits the public registry, every other replica that's connected to it will copy all the data. Be careful when publishing.

I published something that I didn't mean to.

We're human. We make mistakes, but what can be done now? Since the recent leftpad scandal, npm changed the unpublish policy. If there is no package on the registry that depends on your package, then you're fine to unpublish it, but remember all the replicas will copy all the data so someone somewhere will always be able to get it. If it contained any secrets, make sure you change them after the act, and remember to add them to the .npmignore file for the next publish.

"If you accidentally published secrets to #npm change & add them to the .npmignore file!" via @RisingStack #nodejs

Private Scoped Packages

If you don't want or you're not allowed to publish code to a public registry (for any corporate reasons), npm allows organizations to open an organization account so that they can push to the registry without being public. This way you can share private code between you and your co-workers.

Further read on how to set it up:

npm enterprise

If you'd like to further tighten your security by running a registry by yourself, you can do that pretty easily. npm has an on-premise version that can be run behind corporate firewalls. Read more about setting up npm enterprise.

Build something!

Now that you know all these things, go and build something. If you’re up for a little bragging, make sure you tweet us (@risingstack) the name of the package this tutorial helped you to build! If you have any questions, you’ll find me in the comments.

Happy publishing!

In the next part of the Node.js at Scale series, you're going to learn about the Node.js module system and require.