Node.js Async Best Practices & Avoiding Callback Hell - Node.js at Scale

In this post, we cover what tools and techniques you have at your disposal when handling Node.js asynchronous operations: async.js, promises, generators and async functions.

After reading this article, you’ll know how to avoid the despised callback hell!

Node.js at Scale is a collection of articles focusing on the needs of companies with bigger Node.js installations and of advanced Node developers.

Asynchronous programming in Node.js

Previously, we gathered solid knowledge about asynchronous programming in JavaScript and learned how the Node.js event loop works.

If you did not read these articles, I highly recommend them as introductions!

The Problem with Node.js Async

Node.js itself is single-threaded, but some tasks can run in parallel, thanks to its asynchronous nature.

But what does running in parallel mean in practice?

Since we program a single-threaded VM, it is essential that we do not block execution by waiting for I/O; instead, we handle these operations concurrently with the help of Node.js's event-driven APIs.

Let’s take a look at some fundamental patterns, and learn how we can write resource efficient, non-blocking code, with the built-in solutions of Node.js and some third-party libraries.

The Classical Approach - Callbacks

Let's take a look at two simple async operations. They do nothing special: each fires a timer and calls back once the timer has finished.

function fastFunction (done) {
  setTimeout(function () {
    done(null, 'fast')
  }, 100)
}

function slowFunction (done) {
  setTimeout(function () {
    done(null, 'slow')
  }, 300)
}

Seems easy, right?

Our higher-order functions can be executed sequentially or in parallel with the basic "pattern" of nesting callbacks - but using this method can lead to untameable callback hell.

function runSequentially (callback) {
  fastFunction((err, data) => {
    if (err) return callback(err)
    console.log(data)   // result of fastFunction

    slowFunction((err, data) => {
      if (err) return callback(err)
      console.log(data) // result of slowFunction

      // here you can continue running more tasks
    })
  })
}
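The sequential version above has a parallel counterpart as well. A minimal counter-based sketch could look like this (the two timer functions are restated here with error-first results, so the snippet is self-contained):

```javascript
// The two timer functions from above, passing their results error-first
function fastFunction (done) {
  setTimeout(function () {
    done(null, 'fast')
  }, 100)
}

function slowFunction (done) {
  setTimeout(function () {
    done(null, 'slow')
  }, 300)
}

function runInParallel (callback) {
  const results = []
  let finished = 0

  // Both timers start immediately; a shared counter tells us
  // when every task has called back
  fastFunction((err, data) => {
    if (err) return callback(err)
    results[0] = data
    if (++finished === 2) callback(null, results)
  })

  slowFunction((err, data) => {
    if (err) return callback(err)
    results[1] = data
    if (++finished === 2) callback(null, results)
  })
}
```

Running runInParallel takes roughly as long as the slowest task (~300 ms) instead of the sum of both (~400 ms), which is the whole point of running them concurrently.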


Avoiding Callback Hell with Control Flow Managers

To become an efficient Node.js developer, you have to avoid constantly growing indentation levels, produce clean and readable code, and be able to handle complex flows.

Let me show you some of the libraries we can use to organize our code in a nice and maintainable way!




#1: Meet the Async Module

Async is a utility module that provides straightforward, powerful functions for working with asynchronous JavaScript.

Async implements common patterns for asynchronous flow control, following the error-first callback convention.

Let's see what our previous example looks like using async!

async.waterfall([fastFunction, slowFunction], () => {  
  console.log('done')
})

What kind of witchcraft just happened?

Actually, there is no magic to reveal. You can easily implement your own async job-runner that runs tasks in parallel and waits for each to be ready.

Let's take a look at what async does under the hood!

// taken from https://github.com/caolan/async/blob/master/lib/waterfall.js
function(tasks, callback) {  
    callback = once(callback || noop);
    if (!isArray(tasks)) return callback(new Error('First argument to waterfall must be an array of functions'));
    if (!tasks.length) return callback();
    var taskIndex = 0;

    function nextTask(args) {
        if (taskIndex === tasks.length) {
            return callback.apply(null, [null].concat(args));
        }

        var taskCallback = onlyOnce(rest(function(err, args) {
            if (err) {
                return callback.apply(null, [err].concat(args));
            }
            nextTask(args);
        }));

        args.push(taskCallback);

        var task = tasks[taskIndex++];
        task.apply(null, args);
    }

    nextTask([]);
}

Essentially, a new callback is injected into the functions, and this is how async knows when a function is finished.
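The same callback-injection idea powers async's parallel helpers, too. Here is a minimal job-runner sketch - a simplified illustration, not async's actual implementation - that runs tasks in parallel and waits for each to be ready:

```javascript
// tasks: an array of functions, each taking an error-first callback
function runParallel (tasks, callback) {
  const results = []
  let remaining = tasks.length
  let failed = false

  tasks.forEach(function (task, index) {
    // inject a callback into each task so we know when it has finished
    task(function (err, result) {
      if (failed) return
      if (err) {
        failed = true
        return callback(err)
      }
      results[index] = result
      if (--remaining === 0) {
        callback(null, results)
      }
    })
  })
}
```

Indexing by position keeps the results in the order the tasks were given, no matter which task finishes first.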

#2: Using co - generator based flow-control for Node.js

If you would rather not stick to the solid callback protocol, then co can be a good choice for you.

co is a generator-based control flow tool for Node.js and the browser that uses promises, letting you write non-blocking code in a nicer way.

co is a powerful alternative that takes advantage of generator functions tied to promises, without the overhead of implementing custom iterators.

const fastPromise = new Promise((resolve, reject) => {
  fastFunction((err, data) => {
    if (err) return reject(err)
    resolve(data)
  })
})

const slowPromise = new Promise((resolve, reject) => {
  slowFunction((err, data) => {
    if (err) return reject(err)
    resolve(data)
  })
})

co(function * () {  
  yield fastPromise
  yield slowPromise
}).then(() => {
  console.log('done')
})

For now, I suggest going with co, since the long-awaited async/await functionality of Node.js is only available in the nightly, unstable v7.x builds. But if you are already using promises, switching from co to async functions will be easy.

This syntactic sugar on top of promises and generators eliminates the problem of callbacks and even helps you build nice flow-control structures. It is almost like writing synchronous code, right?

Stable Node.js branches will receive this update in the near future, so you will be able to remove co and achieve the same with plain async functions.

Flow Control in Practice

Now that we have learned several tools and tricks for handling async, it is time to practice with some fundamental control flows to make our code more efficient and clean.

Let's take an example and write a route handler for our web app, where a request can be resolved after three steps: validateParams, dbQuery and serviceCall.

If you'd like to write them without any helper, you'd most probably end up with something like this. Not so nice, right?

// validateParams, dbQuery, serviceCall are higher-order functions
// DONT
function handler (done) {  
  validateParams((err) => {
    if (err) return done(err)
    dbQuery((err, dbResults) => {
      if (err) return done(err)
      serviceCall((err, serviceResults) => {
        done(err, { dbResults, serviceResults })
      })
    })
  })
}

Instead of the callback-hell, we can use the async library to refactor our code, as we have already learned:

// validateParams, dbQuery, serviceCall are higher-order functions
function handler (done) {  
  async.waterfall([validateParams, dbQuery, serviceCall], done)
}

Let's take it a step further! Rewrite it to use Promises:

// validateParams, dbQuery, serviceCall are thunks
function handler () {  
  return validateParams()
    .then(dbQuery)
    .then(serviceCall)
    .then((result) => {
      console.log(result)
      return result
    })
}

Also, you can use co-powered generators with Promises:

// validateParams, dbQuery, serviceCall are thunks
const handler = co.wrap(function * () {  
  yield validateParams()
  const dbResults = yield dbQuery()
  const serviceResults = yield serviceCall()
  return { dbResults, serviceResults }
})

It feels like "synchronous" code, but it is still doing async jobs one after the other.

Let's see how this snippet works with async/await.

// validateParams, dbQuery, serviceCall are thunks
async function handler () {
  await validateParams()
  const dbResults = await dbQuery()
  const serviceResults = await serviceCall()
  return { dbResults, serviceResults }
}


Takeaway rules for Node.js & Async

Fortunately, Node.js eliminates the complexities of writing thread-safe code. You just have to stick to these rules to keep things smooth:

  • As a rule of thumb, prefer the async APIs over the sync ones, because the non-blocking approach gives superior performance compared to the synchronous scenario.

  • Always use the best-fitting flow control, or a mix of them, in order to reduce the time spent waiting for I/O to complete.
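As an illustration of mixing flow controls: if dbQuery and serviceCall are independent of each other, you can run them in parallel once validation has succeeded. The thunks below are hypothetical stubs standing in for the real steps of the handler example, so the snippet runs on its own:

```javascript
// Hypothetical stubs standing in for the real steps
const validateParams = () => Promise.resolve()
const dbQuery = () => Promise.resolve('db results')
const serviceCall = () => Promise.resolve('service results')

function handler () {
  // validate first, then run the two independent steps in parallel
  return validateParams()
    .then(() => Promise.all([dbQuery(), serviceCall()]))
    .then(([dbResults, serviceResults]) => ({ dbResults, serviceResults }))
}
```

Compared to the fully sequential versions above, the total latency here is validation plus the slower of the two remaining steps, not the sum of all three.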

You can find all of the code from this article in this repository.

If you have any questions or suggestions for the article, please let me know in the comments!

In the next part of the Node.js at Scale series, we take a look at Event Sourcing with Examples.

Node Hero - Understanding Async Programming in Node.js

This is the third post of the tutorial series called Node Hero - in these chapters you can learn how to get started with Node.js and deliver software products using it.

In this chapter, I’ll guide you through async programming principles, and show you how to do async in JavaScript and Node.js.

Synchronous Programming

In traditional programming practice, most I/O operations happen synchronously. If you think about Java, and about how you would read a file using Java, you would end up with something like this:

// IOUtils comes from the Apache Commons IO library
try (FileInputStream inputStream = new FileInputStream("foo.txt")) {
    String fileContent = IOUtils.toString(inputStream);
}

What happens in the background? The main thread will be blocked until the file is read, which means that nothing else can be done in the meantime. To solve this problem and utilize your CPU better, you would have to manage threads manually.

If you have more blocking operations, the event queue gets even worse:

[Figure: non-async, blocking operations. The red bars show when the process is waiting for an external resource's response and is blocked, the black bars show when your code is running, and the green bars show the rest of the application.]

To resolve this issue, Node.js introduced an asynchronous programming model.


Asynchronous programming in Node.js

Asynchronous I/O is a form of input/output processing that permits other processing to continue before the transmission has finished.

In the following example, I will show you a simple file reading process in Node.js - both in a synchronous and an asynchronous way - with the intention of showing you what can be achieved by avoiding blocking your applications.

Let's start with a simple example - reading a file using Node.js in a synchronous way:

const fs = require('fs')  
let content  
try {  
  content = fs.readFileSync('file.md', 'utf-8')
} catch (ex) {
  console.log(ex)
}
console.log(content)  

What just happened here? We tried to read a file using the synchronous interface of the fs module. It works as expected: the content variable will contain the content of file.md. The problem with this approach is that Node.js is blocked until the operation finishes - meaning it can do absolutely nothing while the file is being read.

Let's see how we can fix it!

Asynchronous programming - as we know it now in JavaScript - can only be achieved with functions being first-class citizens of the language: they can be passed around like any other variable to other functions. Functions that take other functions as arguments are called higher-order functions.

One of the easiest examples of a higher-order function:

const numbers = [2, 4, 1, 5, 4]

function isBiggerThanTwo (num) {
  return num > 2
}

numbers.filter(isBiggerThanTwo) // [4, 5, 4]

In the example above we pass in a function to the filter function. This way we can define the filtering logic.

This is how callbacks were born: if you pass a function to another function as a parameter, the receiving function can call it once it has finished its job. There is no need to return values - you just call another function with the results.

These so-called error-first callbacks are at the heart of Node.js itself - the core modules use them, as do most of the modules found on npm.

const fs = require('fs')  
fs.readFile('file.md', 'utf-8', function (err, content) {  
  if (err) {
    return console.log(err)
  }

  console.log(content)
})

Things to notice here:

  • error handling: instead of a try-catch block, you have to check for errors in the callback
  • no return value: async functions don't return values directly - results are passed to the callbacks
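The convention is just as easy to follow when you author an API yourself. A small sketch - readConfig, its parameters and its result are made up for illustration:

```javascript
// A hypothetical error-first async API: call back with (err) on failure,
// (null, result) on success - never both, and never more than once
function readConfig (path, callback) {
  setTimeout(function () {
    if (!path) {
      return callback(new Error('path is required'))
    }
    callback(null, { path: path, debug: false })
  }, 10)
}

readConfig('app.json', function (err, config) {
  if (err) {
    return console.log(err)
  }
  console.log(config)
})
```

Keeping the error as the first argument means every caller is forced to at least see the failure case before touching the result.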

Let's modify this file a little bit to see how it works in practice:

const fs = require('fs')

console.log('start reading a file...')

fs.readFile('file.md', 'utf-8', function (err, content) {  
  if (err) {
    console.log('error happened during reading the file')
    return console.log(err)
  }

  console.log(content)
})

console.log('end of the file')  

The output of this script will be:

start reading a file...  
end of the file  
error happened during reading the file  

As you can see, once we started to read our file the execution continued, and the application printed end of the file. Our callback was only called once the file read had finished (with an error in this run, as the file could not be read). How is that possible? Meet the event loop.

The Event Loop

The event loop is at the heart of Node.js / JavaScript - it is responsible for scheduling asynchronous operations.

Before diving deeper, let's make sure we understand what event-driven programming is.

Event-driven programming is a programming paradigm in which the flow of the program is determined by events such as user actions (mouse clicks, key presses), sensor outputs, or messages from other programs/threads.

In practice, it means that applications act on events.

Also, as we have already learned in the first chapter, Node.js is single-threaded from a developer's point of view. This means that you don't have to deal with threads and synchronizing them - Node.js abstracts this complexity away. Everything except your code runs in parallel.
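You can observe this scheduling with a three-line experiment: synchronous code always runs to completion before any queued callback gets a turn, even a timer with a 0 ms delay.

```javascript
const order = []

order.push('sync 1')
setTimeout(function () {
  // this callback is queued; it only runs after the synchronous code finished
  order.push('timer callback')
}, 0)
order.push('sync 2')

setTimeout(function () {
  console.log(order) // [ 'sync 1', 'sync 2', 'timer callback' ]
}, 10)
```

Both push calls run first, and the 0 ms timer callback is only dispatched by the event loop once the current tick of synchronous code is done.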


Async Control Flow

As now you have a basic understanding of how async programming works in JavaScript, let's take a look at a few examples on how you can organize your code.

Async.js

To avoid the so-called callback hell, one thing you can do is start using async.js.

Async.js helps to structure your applications and makes control flow easier.

Let’s check a short example of using Async.js, and then rewrite it by using Promises.

The following snippet maps over three files to collect stats on them:

async.map(['file1', 'file2', 'file3'], fs.stat, function (err, results) {
    // results is now an array of stats for each file
})

Promises

The Promise object is used for deferred and asynchronous computations. A Promise represents an operation that hasn't completed yet but is expected in the future.

In practice, the previous example could be rewritten as follows:

function stats (file) {
  return new Promise((resolve, reject) => {
    fs.stat(file, (err, data) => {
      if (err) {
        return reject(err)
      }
      resolve(data)
    })
  })
}

Promise.all([  
  stats('file1'),
  stats('file2'),
  stats('file3')
])
.then((data) => console.log(data))
.catch((err) => console.log(err))

Of course, if you use a method that already has a Promise interface, the example can be a lot shorter as well.


Next Up: Your First Node.js Server

In the next chapter, you will learn how to fire up your first Node.js HTTP server - subscribe to our newsletter for updates.

In the meantime if you have any questions, don't hesitate to ask!