Gergely Nemeth

Node.js and microservices, organizer of @Oneshotbudapest @nodebp @jsconfbp


How to Debug Node.js with the Best Tools Available

Debugging - the process of finding and fixing defects in software - can be a challenging task in any language. Node.js is no exception.

Luckily, the tooling for finding these issues has improved a lot in recent years. Let's take a look at the options you have to find and fix bugs in your Node.js applications!

We will dive into two different aspects of debugging Node.js applications. The first is logging, so you can keep an eye on production systems and get events from them. After logging, we will take a look at how you can debug your applications in development environments.

Logging in Node.js

Logging takes place during the execution of your application to provide an audit trail that can be used to understand the activity of the system and to diagnose problems - in other words, to find and fix bugs.

For logging purposes, you have lots of options when building Node.js applications. Some npm modules ship with built-in logging that can be turned on when needed using the debug module. For your own applications, you have to pick a logger too! We will take a look at pino.

Before jumping into logging libraries, let's take a look at what requirements they have to fulfil:

  • timestamps - it is crucial to know which event happened when,
  • formatting - log lines must be easily understandable by humans, and straightforward to parse for applications,
  • log destination - it should always be the standard output/error; applications should not concern themselves with log routing,
  • log levels - log events have different severity levels, in most cases, you won't be interested in debug or info level events.
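A logger meeting these four requirements can be sketched in just a few lines (a simplified illustration only - use a real library in production; the helper names are made up):

```javascript
// Minimal structured logger: timestamp, parseable format, level, stdout
const LEVELS = { debug: 20, info: 30, warn: 40, error: 50 }

function logLine (level, msg) {
  // one JSON document per line: trivial for machines to parse,
  // still readable for humans
  return JSON.stringify({
    time: new Date().toISOString(), // timestamps
    level: LEVELS[level],           // severity as a number
    msg                             // the event itself
  })
}

// always write to standard output - log routing is not the app's job
console.log(logLine('info', 'server started'))
```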

The debug module of Node.js

Recommendation: use for modules published on npm

Let's see how it makes your life easier! Imagine that you have a Node.js module that serves requests, as well as sends some out.

// index.js
const debugHttpIncoming = require('debug')('http:incoming')  
const debugHttpOutgoing = require('debug')('http:outgoing')

let outgoingRequest = {  
  url: 'https://risingstack.com'
}

// sending some request
debugHttpOutgoing('sending request to %s', outgoingRequest.url)

let incomingRequest = {  
  body: '{"status": "ok"}'
}

// serving some request
debugHttpIncoming('got JSON body %s', incomingRequest.body)

Once you have it, start your application this way:

DEBUG=http:incoming,http:outgoing node index.js  

The output will be something like this:

Output of Node.js Debugging

Also, the debug module supports wildcards with the * character. To get the same result we got previously, we simply could start our application with DEBUG=http:* node index.js.

What's really nice about the debug module is that a lot of modules on npm (like Express or Koa) ship with it - as of the time of writing this article, more than 14,000 modules.

The pino logger module

Recommendation: use for your applications when performance is key

Pino is an extremely fast Node.js logger, inspired by bunyan. In many cases, pino is over 6x faster than alternatives like bunyan or winston:

benchWinston*10000:     2226.117ms  
benchBunyan*10000:      1355.229ms  
benchDebug*10000:       445.291ms  
benchLogLevel*10000:    322.181ms  
benchBole*10000:        291.727ms  
benchPino*10000:        269.109ms  
benchPinoExtreme*10000: 102.239ms  

Getting started with pino is straightforward:

const pino = require('pino')()

pino.info('hello pino')  
pino.info('the answer is %d', 42)  
pino.error(new Error('an error'))  

The above snippet produces the following log lines:

{"pid":28325,"hostname":"Gergelys-MacBook-Pro.local","level":30,"time":1492858757722,"msg":"hello pino","v":1}
{"pid":28325,"hostname":"Gergelys-MacBook-Pro.local","level":30,"time":1492858757724,"msg":"the answer is 42","v":1}
{"pid":28325,"hostname":"Gergelys-MacBook-Pro.local","level":50,"time":1492858757725,"msg":"an error","type":"Error","stack":"Error: an error\n    at Object.<anonymous> (/Users/gergelyke/Development/risingstack/node-js-at-scale-debugging/pino.js:5:12)\n    at Module._compile (module.js:570:32)\n    at Object.Module._extensions..js (module.js:579:10)\n    at Module.load (module.js:487:32)\n    at tryModuleLoad (module.js:446:12)\n    at Function.Module._load (module.js:438:3)\n    at Module.runMain (module.js:604:10)\n    at run (bootstrap_node.js:394:7)\n    at startup (bootstrap_node.js:149:9)\n    at bootstrap_node.js:509:3","v":1}
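The level field in these lines is numeric; pino's default mapping is trace: 10, debug: 20, info: 30, warn: 40, error: 50 and fatal: 60. When reading logs by hand, a tiny helper (hypothetical, not part of pino) can translate them back:

```javascript
// pino's default numeric levels, mapped back to their names
const PINO_LEVELS = { 10: 'trace', 20: 'debug', 30: 'info', 40: 'warn', 50: 'error', 60: 'fatal' }

function levelName (line) {
  // every pino log line is a standalone JSON document
  return PINO_LEVELS[JSON.parse(line).level] || 'unknown'
}

console.log(levelName('{"level":30,"msg":"hello pino"}')) // → info
```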

The Built-in Node.js Debugger module

Node.js ships with an out-of-process debugging utility, accessible via a TCP-based protocol and built-in debugging client. You can start it using the following command:

$ node debug index.js

This is not a fully featured debugging agent - you won't have a fancy user interface - however, simple inspections are possible.

You can add breakpoints to your code by adding the debugger statement into your codebase:

const express = require('express')  
const app = express()

app.get('/', (req, res) => {  
  debugger
  res.send('ok')
})

This way the execution of your script will be paused at that line, then you can start using the commands exposed by the debugging agent:

  • cont or c - continue execution,
  • next or n - step next,
  • step or s - step in,
  • out or o - step out,
  • repl - evaluate expressions in the script's context.

V8 Inspector Integration for Node.js

The V8 inspector integration allows attaching Chrome DevTools to Node.js instances for debugging by using the Chrome Debugging Protocol.

V8 Inspector can be enabled by passing the --inspect flag when starting a Node.js application:

$ node --inspect index.js

In most cases, it makes sense to stop the execution of the application at the very first line of your codebase and continue from there. This way you won't miss any code that runs before the debugger attaches.

$ node --inspect-brk index.js



How to Debug Node.js with Visual Studio Code

Most modern IDEs have some support for debugging applications - so does VS Code. It has built-in debugging support for Node.js.

What you can see below, is the debugging interface of VS Code - with the context variables, watched expressions, call stack and breakpoints.

VS Code Debugging Layout Image credit: Visual Studio Code

One of the most valuable features of the integrated Visual Studio Code debugger is the ability to add conditional breakpoints. With conditional breakpoints, the breakpoint will be hit whenever the expression evaluates to true.
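When a debugger doesn't offer conditional breakpoints, you can approximate one with a guarded debugger statement (the loop below is just an illustration):

```javascript
// Pause only when the interesting case is hit - outside a debugging
// session the debugger statement is a no-op, so the loop runs normally
let total = 0
for (let i = 0; i < 1000; i++) {
  if (i === 500) debugger // acts like a conditional breakpoint on i === 500
  total += i
}
console.log(total) // → 499500
```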

If you need more advanced settings for VS Code, it comes with a configuration file, .vscode/launch.json which describes how the debugger should be launched. The default launch.json looks something like this:

{
    "version": "0.2.0",
    "configurations": [
        {
            "type": "node",
            "request": "launch",
            "name": "Launch Program",
            "program": "${workspaceRoot}/index.js"
        },
        {
            "type": "node",
            "request": "attach",
            "name": "Attach to Port",
            "address": "localhost",
            "port": 5858
        }
    ]
}

For advanced configuration settings of launch.json go to https://code.visualstudio.com/docs/editor/debugging#_launchjson-attributes.

For more information on debugging with Visual Studio Code, visit the official site: https://code.visualstudio.com/docs/editor/debugging.

Next Up

If you have any questions about debugging, please let me know in the comments section.

In the next episode of the Node.js at Scale series, we are going to talk about profiling Node.js applications.

Announcing Free Node.js Monitoring & Debugging with Trace

Today, we’re excited to announce that Trace, our Node.js monitoring & debugging tool is now free for open-source projects.

What is Trace?

We launched Trace a year ago with the intention of helping developers looking for a Node.js-specific APM which is easy to use and helps with the most difficult aspects of building Node projects, like:

  • finding memory leaks in a production environment
  • profiling CPU usage to find bottlenecks
  • tracing distributed call-chains
  • avoiding security leaks & bad npm packages

.. and so on.

Node.js Monitoring with Trace by RisingStack - Performance Metrics chart

Why are we giving it away for free?

We use a ton of open-source technology every day, and we are also the maintainers of some.

We know from experience that developing an open-source project is hard work, which requires a lot of knowledge and persistence.

Trace will save a lot of time for those who use Node for their open-source projects.

How to get started with Trace?

  1. Visit trace.risingstack.com and sign up - it's free.
  2. Connect your app with Trace.
  3. Head over to this form and tell us a little bit about your project.

Done. Your open-source project will be monitored for free as a result.

If you need help with Node.js Monitoring & Debugging..

Just drop us a tweet at @RisingStack if you have any additional questions about the tool or the process.

If you'd like to read a little bit more about the topic, I recommend reading our previous article, The Definitive Guide for Monitoring Node.js Applications.

One more thing

At the same time of making Trace available for open-source projects, we're announcing our new line of business at RisingStack:

Commercial Node.js support, aimed at enterprises with Node.js applications running in a production environment.

RisingStack now helps to bootstrap and operate Node.js apps - no matter what life cycle they are in.


Disclaimer: We retain the exclusive right to accept or deny your application to use Trace by RisingStack for free.

Mastering the Node.js CLI & Command Line Options

Node.js comes with a lot of CLI options to expose built-in debugging & to modify how V8, the JavaScript engine works.

In this post, we have collected the most important CLI commands to help you become more productive.

Accessing Node.js CLI Options

To get a full list of all available Node.js CLI options in your current distribution of Node.js, you can access the manual page from the terminal using:

$ man node

Usage: node [options] [ -e script | script.js ] [arguments]  
       node debug script.js [arguments] 

Options:  
  -v, --version         print Node.js version
  -e, --eval script     evaluate script
  -p, --print           evaluate script and print result
  -c, --check           syntax check script without executing
...

As you can see in the first usage section, you have to provide the options before the script you want to run.

Take the following file:

console.log(new Buffer(100))  

To take advantage of the --zero-fill-buffers option, you have to run your application using:

$ node --zero-fill-buffers index.js

This way the application will produce the correct output, instead of random memory garbage:

<Buffer 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ... >  

CLI Options

Now that we have seen how to instruct Node.js to use CLI options, let's see what other options there are!

--version or -v

Using node --version, or node -v for short, you can print the version of Node.js you are using.

$ node -v
v6.10.0  

--eval or -e

Using the --eval option, you can run JavaScript code right from your terminal. The modules which are predefined in the REPL can also be used without requiring them, like the http or fs modules.

$ node -e 'console.log(3 + 2)'
5  
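For example, the os module works without a require call (assuming a standard Node.js installation):

```shell
# the os module is predefined, no require() necessary
node -e 'console.log(os.platform())'
```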

--print or -p

The --print option works the same way as --eval; however, it prints the result of the expression. To achieve the same output as in the previous example, we can simply leave out the console.log:

$ node -p '3 + 2'
5  

--check or -c

Available since v4.2.0

The --check option instructs Node.js to check the syntax of the provided file, without actually executing it.

Take the following example again:

console.log(new Buffer(100)  

As you can see, a closing ) is missing. Once you run this file using node index.js, it will produce the following output:

/Users/gergelyke/Development/risingstack/mastering-nodejs-cli/index.js:1
(function (exports, require, module, __filename, __dirname) { console.log(new Buffer(100)
                                                                                        ^
SyntaxError: missing ) after argument list  
    at Object.exports.runInThisContext (vm.js:76:16)
    at Module._compile (module.js:542:28)
    at Object.Module._extensions..js (module.js:579:10)
    at Module.load (module.js:487:32)
    at tryModuleLoad (module.js:446:12)
    at Function.Module._load (module.js:438:3)
    at Module.runMain (module.js:604:10)
    at run (bootstrap_node.js:394:7)

Using the --check option you can check for the same issue, without executing the script, using node --check index.js. The output will be similar, except you won't get the stack trace, as the script never ran:

/Users/gergelyke/Development/risingstack/mastering-nodejs-cli/index.js:1
(function (exports, require, module, __filename, __dirname) { console.log(new Buffer(100)
                                                                                        ^
SyntaxError: missing ) after argument list  
    at startup (bootstrap_node.js:144:11)
    at bootstrap_node.js:509:3

The --check option can come in handy when you want to see if your script is syntactically correct, without executing it.




--inspect[=host:port]

Available since v6.3.0

Using node --inspect will activate the inspector on the provided host and port. If they are not provided, the default is 127.0.0.1:9229. The debugging tools attached to Node.js instances communicate via a TCP port using the Chrome Debugging Protocol.

--inspect-brk[=host:port]

Available since v7.6.0

The --inspect-brk has the same functionality as the --inspect option, however it pauses the execution at the first line of the user script.

$ node --inspect-brk index.js 
Debugger listening on port 9229.  
Warning: This is an experimental feature and could change at any time.  
To start debugging, open the following URL in Chrome:  
    chrome-devtools://devtools/bundled/inspector.html?experiments=true&v8only=true&ws=127.0.0.1:9229/86dd44ef-c865-479e-be4d-806d622a4813

Once you have run this command, just copy and paste the URL you got to start debugging your Node.js process.

--zero-fill-buffers

Available since v6.0.0

Node.js can be started using the --zero-fill-buffers command line option to force all newly allocated Buffer instances to be automatically zero-filled upon creation. The reason to do so is that newly allocated Buffer instances can contain sensitive data.

It should only be used when it is necessary to enforce that newly created Buffer instances cannot contain sensitive data, as it has a significant impact on performance.

Also note, that some Buffer constructors got deprecated in v6.0.0:

  • new Buffer(array)
  • new Buffer(arrayBuffer[, byteOffset [, length]])
  • new Buffer(buffer)
  • new Buffer(size)
  • new Buffer(string[, encoding])

Instead, you should use Buffer.alloc(size[, fill[, encoding]]), Buffer.from(array), Buffer.from(buffer), Buffer.from(arrayBuffer[, byteOffset[, length]]) and Buffer.from(string[, encoding]).
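The safe replacements in action - Buffer.alloc zero-fills by default, and Buffer.from copies the given content:

```javascript
// Buffer.alloc zero-fills the new buffer - no --zero-fill-buffers needed
const zeroed = Buffer.alloc(4)
console.log(zeroed) // → <Buffer 00 00 00 00>

// Buffer.from copies the provided content instead of exposing raw memory
const greeting = Buffer.from('hello', 'utf8')
console.log(greeting.toString()) // → hello
```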

You can read more on the security implications of the Buffer module on the Snyk blog.

--prof-process

Using the --prof-process option, Node.js will process a V8 profiler log and output a human-readable summary.

To use it, first you have to run your applications using:

node --prof index.js  

Once you have run it, a new file will be placed in your working directory, with the isolate- prefix.

Then, you have to run the Node.js process with the --prof-process option:

node --prof-process isolate-0x102001600-v8.log > output.txt  

This file will contain metrics from the V8 profiler, like how much time was spent in the C++ layer, or in the JavaScript part, and which function calls took how much time. Something like this:

[C++]:
   ticks  total  nonlib   name
     16   18.4%   18.4%  node::ContextifyScript::New(v8::FunctionCallbackInfo<v8::Value> const&)
      4    4.6%    4.6%  ___mkdir_extended
      2    2.3%    2.3%  void v8::internal::String::WriteToFlat<unsigned short>(v8::internal::String*, unsigned short*, int, int)
      2    2.3%    2.3%  void v8::internal::ScavengingVisitor<(v8::internal::MarksHandling)1, (v8::internal::LoggingAndProfiling)0>::ObjectEvacuationStrategy<(v8::internal::ScavengingVisitor<(v8::internal::MarksHandling)1, (v8::internal::LoggingAndProfiling)0>::ObjectContents)1>::VisitSpecialized<24>(v8::internal::Map*, v8::internal::HeapObject**, v8::internal::HeapObject*)

[Summary]:
   ticks  total  nonlib   name
      1    1.1%    1.1%  JavaScript
     70   80.5%   80.5%  C++
      5    5.7%    5.7%  GC
      0    0.0%          Shared libraries
     16   18.4%          Unaccounted

To get a full list of Node.js CLI options, check out the official documentation here.


V8 Options

You can print all the available V8 options using the --v8-options command line option.

Currently, V8 exposes more than 100 command line options - here we just picked a few to showcase some of the functionality they can provide. Some of these options can drastically change how V8 behaves, so use them with caution!

--harmony

With the harmony flag, you can enable all completed harmony features.

--max_old_space_size

With this option, you can set the maximum size of the old space on the heap, which directly affects how much memory your process can allocate.

This setting can come in handy when you run in low-memory environments.
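For example, to cap the old space at roughly 400 megabytes (the option's unit is MB), you could start your process like this - here printing the effective heap limit via the built-in v8 module to verify the setting:

```shell
# limit the old space to ~400 MB, then print the effective heap limit in bytes
node --max_old_space_size=400 -e "console.log(require('v8').getHeapStatistics().heap_size_limit)"
```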

--optimize_for_size

With this option, you can instruct V8 to optimize the memory space for size - even if the application gets slower.

Just as the previous option, it can be useful in low memory environments.


Environment Variables

NODE_DEBUG=module[,…]

Setting this environment variable enables the core modules to print debug information. You can run the previous example like this to get debug information on the module core component (instead of module, you can go for http, fs, etc...):

$ NODE_DEBUG=module node index.js

The output will be something like this:

MODULE 7595: looking for "/Users/gergelyke/Development/risingstack/mastering-nodejs-cli/index.js" in ["/Users/gergelyke/.node_modules","/Users/gergelyke/.node_libraries","/Users/gergelyke/.nvm/versions/node/v6.10.0/lib/node"]  
MODULE 7595: load "/Users/gergelyke/Development/risingstack/mastering-nodejs-cli/index.js" for module "."  

NODE_PATH=path

Using this setting, you can add extra paths for the Node.js process to search modules in.

OPENSSL_CONF=file

Using this environment variable, you can load an OpenSSL configuration file on startup.

For a full list of supported environment variables, check out the official Node.js docs.

Let's Contribute to the CLI related Node Core Issues!

As you can see, the CLI is a really useful tool which becomes better with each Node version!

If you'd like to contribute to its advancement, you can help by checking out the currently open issues at https://github.com/nodejs/node/labels/cli !


Node.js War Stories: Debugging Issues in Production

In this article, you can read stories from Netflix, RisingStack & nearForm about Node.js issues in production - so you can learn from our mistakes and avoid repeating them. You'll also learn what methods we used to debug these Node.js issues.

Special shoutout to Yunong Xiao of Netflix, Matteo Collina of nearForm & Shubhra Kar from Strongloop for helping us with their insights for this post!


At RisingStack, we have accumulated a tremendous amount of experience running Node apps in production over the past 4 years - thanks to our Node.js consulting, training and development business.

Like the Node teams at Netflix & nearForm, we picked up the habit of always writing thorough postmortems, so the whole team (and now the whole world) can learn from the mistakes we made.

Netflix & Debugging Node: Know your Dependencies

Let's start with a slowdown story from Yunong Xiao, which happened with our friends at Netflix.

The trouble started with the Netflix team noticing that their applications' response time increased progressively - some of their endpoints' latency increased by 10ms every hour.

This was also reflected in the growing CPU usage.

Netflix debugging Nodejs in production with the Request latency graph Request latencies for each region over time - photo credit: Netflix

At first, they started to investigate whether the request handler was responsible for slowing things down.

After testing it in isolation, it turned out that the request handler had a constant response time of around 1ms.

So the problem was not there, and they started to suspect that it was deeper in the stack.

The next thing Yunong & the Netflix team tried was CPU flame graphs and Linux Perf Events.

Flame graph of Netflix Nodejs slowdown Flame graph of the Netflix slowdown - photo credit: Netflix

What you can see in the flame graph above is that

  • it has high stacks (which means a lot of function calls)
  • and the boxes are wide (meaning we are spending quite some time in those functions).

After further inspection, the team found that Express's router.handle and router.handle.next have lots of references.

The Express.js source code reveals a couple of interesting tidbits:

  • Route handlers for all endpoints are stored in one global array.
  • Express.js recursively iterates through and invokes all handlers until it finds the right route handler.

Before revealing the solution to this mystery, we have to cover one more detail:

Netflix's codebase contained a piece of code that ran every 6 minutes, grabbing new route configs from an external resource and updating the application's route handlers to reflect the changes.

This was done by deleting old handlers and adding new ones. Accidentally, it also added the same static handler over and over again - even before the API route handlers. As it turned out, this caused the extra 10ms of response time every hour.
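The failure mode is easy to reproduce in isolation (a simplified sketch - not Netflix's actual code): when the periodic refresh keeps appending handlers instead of replacing them, every request walks an ever-growing array.

```javascript
// Simplified Express-style router: handlers live in one global array
// and are matched by a linear scan
const handlers = []

function refreshRoutes () {
  // BUG: appends the same static handler again on every refresh
  handlers.push({ path: '/static', fn: () => 'static' })
  handlers.push({ path: '/api', fn: () => 'api' })
}

// simulate 10 periodic refresh cycles
for (let i = 0; i < 10; i++) refreshRoutes()

// every request now scans through all the duplicates first
console.log(handlers.length) // → 20
```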

Takeaways from Netflix's Issue

  • Always know your dependencies - first, you have to fully understand them before going into production with them.
  • Observability is key - flame graphs helped the Netflix engineering team to get to the bottom of the issue.

Read the full story here: Node.js in Flames.




RisingStack CTO: "Crypto takes time"

You may have already heard the story of how we broke down the monolithic infrastructure of Trace (our Node.js monitoring solution) into microservices from our CTO, Peter Marton.

The issue we'll talk about now is a slowdown which affected Trace in production:

As the very first versions of Trace ran on a PaaS, it used the public cloud to communicate with other services of ours.

To ensure the integrity of our requests, we decided to sign all of them. To do so, we went with Joyent's HTTP signing library. What's really great about it is that the request module supports HTTP signatures out of the box.

This solution was not only expensive, but it also had a bad impact on our response times.

network delay in nodejs request visualized by trace The network delay built up our response times - photo: Trace

As you can see on the graph above, the given endpoint had a response time of 180ms - however, 100ms of that was just the network delay between the two services.

As the first step, we migrated from the PaaS provider to Kubernetes. We expected that our response times would be a lot better, as we could leverage internal networking.

We were right - our latency improved.

However, we expected better results - and a lot bigger drop in our CPU usage. The next step was to do CPU profiling, just like the guys at Netflix:

crypto sign function taking up cpu time

As you can see on the screenshot, the crypto.sign function takes up most of the CPU time, by consuming 10ms on each request. To solve this, you have two options:

  • if you are running in a trusted environment, you can drop request signing,
  • if you are in an untrusted environment, you can scale up your machines to have stronger CPUs.

Takeaways from Peter Marton

  • Latency in-between your services has a huge impact on user experience - whenever you can, leverage internal networking.
  • Crypto can take a LOT of time.

nearForm: Don't block the Node.js Event Loop

React is more popular than ever. Developers use it for both the frontend and the backend, or they even take a step further and use it to build isomorphic JavaScript applications.

However, rendering React pages can put some heavy load on the CPU, as rendering complex React components is CPU bound.

When your Node.js process is rendering, it blocks the event loop because of its synchronous nature.

As a result, the server can become entirely unresponsive - requests accumulate, which all puts load on the CPU.

What can be even worse is that even requests whose clients are no longer waiting will be served - still putting load on the Node.js application, as Matteo Collina of nearForm explains.

It is not just React, but string operations in general. If you are building JSON REST APIs, you should always pay attention to JSON.parse and JSON.stringify.

As Shubhra Kar from Strongloop (now Joyent) explained, parsing and stringifying huge payloads can take a lot of time as well (blocking the event loop in the meantime).

function requestHandler(req, res) {  
  const body = req.rawBody
  let parsedBody
  try {
    parsedBody = JSON.parse(body)
  }
  catch(e) {
    // return here - otherwise res.end would be called twice
    return res.end('Error parsing the body')
  }
  res.end('Record successfully received')
}

Simple request handler

The example above shows a simple request handler, which just parses the body. For small payloads, it works like a charm - however, if the JSON's size can be measured in megabytes, the execution time can be seconds instead of milliseconds. The same applies for JSON.stringify.
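You can observe this yourself by timing JSON.parse on a larger payload (the sizes below are arbitrary):

```javascript
// Build a payload of 100,000 records and measure how long parsing blocks
const payload = JSON.stringify(
  Array.from({ length: 100000 }, (_, i) => ({ id: i, name: 'record-' + i }))
)

const start = process.hrtime()
const parsed = JSON.parse(payload) // the event loop is blocked for this entire call
const [sec, nano] = process.hrtime(start)

console.log('parsed ' + parsed.length + ' records in ' + (sec * 1e3 + nano / 1e6).toFixed(1) + 'ms')
```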

To mitigate these issues, first, you have to know about them. For that, you can use Matteo's loopbench module, or Trace's event loop metrics feature.

With loopbench, you can return a status code of 503 to the load balancer, if the request cannot be fulfilled. To enable this feature, you have to use the instance.overLimit option. This way ELB or NGINX can retry it on a different backend, and the request may be served.
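The decision itself can be sketched like this (a simplified illustration of the idea - loopbench measures the event loop delay for you; the function name is made up):

```javascript
// Shed load when the measured event loop delay exceeds the allowed maximum -
// this mirrors what loopbench's overLimit flag signals
function statusFor (eventLoopDelayMs, maxDelayMs) {
  return eventLoopDelayMs > maxDelayMs
    ? 503 // overloaded: let the load balancer retry on another backend
    : 200 // healthy: serve the request
}

console.log(statusFor(120, 42)) // → 503
console.log(statusFor(5, 42))   // → 200
```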

Once you know about the issue and understand it, you can start working on fixing it - you can do it either by leveraging Node.js streams or by tweaking the architecture you are using.

Takeaways from nearForm

  • Always pay attention to CPU bound operations - the more you have, the more pressure you put on your event loop.
  • String operations are CPU-heavy operations.

Debugging Node.js Issues in Production

I hope these examples from Netflix, RisingStack & nearForm will help you to debug your Node.js apps in Production.

If you'd like to learn more, I recommend checking out these recent posts which will help you to deepen your Node knowledge:

If you have any questions, please let us know in the comments!

The Definitive Guide for Monitoring Node.js Applications

In the previous chapters of Node.js at Scale we learned how you can get Node.js testing and TDD right, and how you can use Nightwatch.js for end-to-end testing.

In this article, we will learn about running and monitoring Node.js applications in Production. Let's discuss these topics:

  • What is monitoring?
  • What should be monitored?
  • Open-source monitoring solutions
  • SaaS and On-premise monitoring offerings

What is Node.js Monitoring?

Monitoring is observing the quality of software over time. The available products and tools in this industry usually go by the term Application Performance Monitoring, or APM for short.

If you have a Node.js application in a staging or production environment, you can (and should) do monitoring on different levels:

You can monitor

  • regions,
  • zones,
  • individual servers and,
  • of course, the Node.js software that runs on them.

In this guide, we will deal with the software components only - if you run in a cloud environment, the others are usually taken care of for you.

What should be monitored?

Each application you write in Node.js produces a lot of data about its behavior.

There are different layers from which an APM tool should collect data. The more of them are covered, the more insight you'll get into your system's behavior.

  • Service level
  • Host level
  • Instance (or process) level

The list you can find below collects the most crucial problems you'll run into while you maintain a Node.js application in production. We'll also discuss how monitoring helps to solve them and what kind of data you'll need to do so.

Problem 1.: Service Downtimes

If your application is unavailable, your customers can't spend money on your sites. If your APIs are down, your business partners and the services depending on them will fail as well because of you.

We all know how cringeworthy it is to apologize for service downtimes.

Your topmost priority should be preventing failures and providing 100% availability for your application.

Running a production app comes with great responsibility.

Node.js APMs can easily help you detect and prevent downtimes, since they usually collect service-level metrics.

This data can show if your application handles requests properly, although it won't always tell you if your public sites or APIs are available.

To have proper coverage of downtimes, we recommend setting up a pinger as well, which can emulate user behavior and provide foolproof data on availability. If you want to cover everything, don't forget to include different regions like the US, Europe, and Asia too.

Problem 2.: Slow Services, Terrible Response Times

Slow response times have a huge impact on conversion rates, as well as on product usage. The faster your product is, the more customers and the higher user satisfaction you'll have.

Usually, all Node.js APMs can show if your services are slowing down, but interpreting that data requires further analysis.

I recommend doing two things to find the real reasons for slowing services.

  • Collect data on a process level too. Check out each instance of a service to figure out what happens under the hood.
  • Request CPU profiles when your services slow down and analyze them to find the faulty functions.

Eliminating performance bottlenecks enables you to scale your software more efficiently and also to optimize your budget.

Problem 3.: Solving Memory Leaks is Hard

Our Node.js consulting & development expertise has allowed us to build huge enterprise systems and help developers make them better.

What we see constantly is that Memory Leaks in Node.js applications are quite frequent and that finding out what causes them is among the greatest struggles Node developers face.

This impression is backed by data as well. Our Node.js Developer Survey showed that memory leaks cause a lot of headaches for even the best engineers.

To find memory leaks, you have to know exactly when they happen.

Some APMs collect memory usage data which can be used to recognize a leak. What you should look for is steady growth of memory usage that ends in a service crash & restart (as Node runs out of memory after about 1.4 gigabytes).

Node.js memory leak shown in Trace, the node.js monitoring tool

If your APM collects data on the garbage collector as well, you can look for the same pattern. As extra objects pile up in a Node app's memory, the time spent on garbage collection increases simultaneously. This is a great indicator of a memory leak.

After figuring out that you have a leak, request a memory heapdump and look for the extra objects!

This sounds easy in theory but can be challenging in practice.

What you can do is request 2 heapdumps from your production system with a Monitoring tool, and analyze these dumps with Chrome's DevTools. If you look for the extra objects in comparison mode, you'll end up seeing what piles up in your app's memory.

If you'd like a more detailed rundown of these steps, I wrote an article about finding a Node.js memory leak in Ghost, where I go into more detail.

Problem 4.: Depending on Code Written by Anonymous Developers

Most Node.js applications heavily rely on npm. We can end up with a lot of dependencies written by developers of unknown expertise and intentions.

Roughly 76% of Node shops use vulnerable packages, while open source projects regularly grow stale, neglecting to fix security flaws.

There are a couple of possible steps to lower the security risks of using npm packages.

  1. Audit your modules with the Node Security Platform CLI
  2. Look for unused dependencies with the depcheck tool
  3. Use the npm stats API, or browse historic stats on npm-stat.com, to find out if others are using a package
  4. Use the npm view <pkg> maintainers command to avoid packages maintained by only a few
  5. Use the npm outdated command or Greenkeeper to learn whether you're using the latest version of a package.
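Several of these commands emit JSON, which makes them easy to wire into CI. As a sketch, here is a summarizer for the object that `npm outdated --json` prints (the `current` and `latest` fields are part of that output; the package data below is made up):

```javascript
// Turn the object printed by `npm outdated --json` into readable lines.
function outdatedSummary(outdatedJson) {
  return Object.keys(outdatedJson).map((name) => {
    const info = outdatedJson[name]
    return `${name}: ${info.current} -> latest ${info.latest}`
  })
}
```

In CI you could fail the build (or open a ticket) whenever the returned array is non-empty.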

Going through these steps can consume a lot of your time, so picking a Node.js monitoring tool that can warn you about insecure dependencies is highly recommended.

Problem 6.: Email Alerts often go Unnoticed

Let's be honest. We are developers who like spending time writing code - not going through our email accounts every 10 minutes.

In my experience, email alerts usually go unread, and it's very easy to miss a major outage or problem if we depend only on them.

Email is a subpar method to learn about issues in production.

I guess that you also don't want to watch dashboards for potential issues 24/7. This is why it is important to look for an APM with great alerting capabilities.

What I recommend is to use pager systems like Opsgenie or PagerDuty to learn about critical issues. Pair the monitoring solution of your choice with one of these systems if you'd like to know about your alerts instantly.

A few alerting best-practices we follow at RisingStack:

  • Always keep alerting simple and alert on symptoms
  • Aim to have as few alerts as possible - associated with end-user pain
  • Alert on high response time and error rates as high up in the stack as possible

Problem 7.: Finding Crucial Errors in the Code

If a feature is broken on your site, it can prevent customers from achieving their goals. Sometimes it can be a sign of bad code quality. Make sure you have proper test coverage for your codebase and a good QA process (preferably automated).

If you use an APM that collects errors from your app, you'll be able to find the ones that occur most often.

The more data your APM can access, the better the chances of finding and fixing critical issues. We recommend using a monitoring tool that collects and visualizes stack traces as well - so you'll be able to find the root causes of errors in a distributed system.


In the next part of the article, I will show you one open-source, and one SaaS / on-premises Node.js monitoring solution that will help you operate your applications.

Prometheus - an Open-Source, General Purpose Monitoring Platform

Prometheus is an open-source systems monitoring and alerting toolkit originally built at SoundCloud.

Prometheus was started in 2012, and since then, many companies and organizations have adopted the tool. It is a standalone open source project and maintained independently of any company.

In 2016, Prometheus joined the Cloud Native Computing Foundation, right after Kubernetes.

The most important features of Prometheus are:

  • a multi-dimensional data model (time series identified by metric name and key/value pairs),
  • a flexible query language to leverage this dimensionality,
  • time series collection happens via a pull model over HTTP by default,
  • pushing time series is supported via an intermediary gateway.

Node.js monitoring with Prometheus

As you can see from these features, Prometheus is a general-purpose monitoring solution, so you can use it with any language or technology you prefer.

Check out the official Prometheus getting started pages if you'd like to give it a try.

Before you start monitoring your Node.js services, you need to add instrumentation to them via one of the Prometheus client libraries.

For this, there is a Node.js client module, which you can find here. It supports histograms, summaries, gauges and counters.

Essentially, all you have to do is require the Prometheus client, then expose its output at an endpoint:

const Prometheus = require('prom-client')  
const server = require('express')()

server.get('/metrics', (req, res) => {  
  res.end(Prometheus.register.metrics())
})

server.listen(process.env.PORT || 3000)  

This endpoint will produce an output that Prometheus can consume - something like this:

# HELP process_start_time_seconds Start time of the process since unix epoch in seconds.
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1490433285  
# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 33046528  
# HELP nodejs_eventloop_lag_seconds Lag of event loop in seconds.
# TYPE nodejs_eventloop_lag_seconds gauge
nodejs_eventloop_lag_seconds 0.000089751  
# HELP nodejs_active_handles_total Number of active handles.
# TYPE nodejs_active_handles_total gauge
nodejs_active_handles_total 4  
# HELP nodejs_active_requests_total Number of active requests.
# TYPE nodejs_active_requests_total gauge
nodejs_active_requests_total 0  
# HELP nodejs_version_info Node.js version info.
# TYPE nodejs_version_info gauge
nodejs_version_info{version="v4.4.2",major="4",minor="4",patch="2"} 1  
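The format above is deliberately simple: `# HELP` and `# TYPE` lines are comments, and every other line is a metric name (optionally with labels in curly braces) followed by a sample value. A toy parser makes that concrete - this is a sketch for understanding the text format, not a replacement for Prometheus' own parser:

```javascript
// Parse Prometheus' plain-text exposition format into { name: value }
// pairs. Comment lines (# HELP / # TYPE) are skipped; any labels stay
// part of the key, since they come before the final space.
function parseMetrics(text) {
  const samples = {}
  for (const raw of text.split('\n')) {
    const line = raw.trim()
    if (!line || line.startsWith('#')) continue
    const lastSpace = line.lastIndexOf(' ')
    samples[line.slice(0, lastSpace)] = Number(line.slice(lastSpace + 1))
  }
  return samples
}
```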

Of course, these are just the default metrics collected by the module we used - you can extend them with your own. In the example below, we collect the number of requests served:

const Prometheus = require('prom-client')  
const server = require('express')()

const PrometheusMetrics = {  
  requestCounter: new Prometheus.Counter('throughput', 'The number of requests served')
}

server.use((req, res, next) => {  
  PrometheusMetrics.requestCounter.inc()
  next()
})

server.get('/metrics', (req, res) => {  
  res.end(Prometheus.register.metrics())
})

server.listen(3000)  

Once you run it, the /metrics endpoint will include the throughput metrics as well:

# HELP process_start_time_seconds Start time of the process since unix epoch in seconds.
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1490433805  
# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 25120768  
# HELP nodejs_eventloop_lag_seconds Lag of event loop in seconds.
# TYPE nodejs_eventloop_lag_seconds gauge
nodejs_eventloop_lag_seconds 0.144927586  
# HELP nodejs_active_handles_total Number of active handles.
# TYPE nodejs_active_handles_total gauge
nodejs_active_handles_total 0  
# HELP nodejs_active_requests_total Number of active requests.
# TYPE nodejs_active_requests_total gauge
nodejs_active_requests_total 0  
# HELP nodejs_version_info Node.js version info.
# TYPE nodejs_version_info gauge
nodejs_version_info{version="v4.4.2",major="4",minor="4",patch="2"} 1  
# HELP throughput The number of requests served
# TYPE throughput counter
throughput 5  

Once you have exposed all your metrics, you can start querying and visualizing them - for that, please refer to the official Prometheus querying documentation and the visualization documentation.
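Counters like `throughput` above only ever increase, so what you usually chart is their rate of change - which is what PromQL's `rate()` function computes. The core arithmetic, as a sketch (it ignores counter resets after a restart, which a real implementation must handle):

```javascript
// Per-second rate between two samples of a monotonically increasing
// counter. Returns 0 for an empty or negative time window.
function counterRate(prevValue, prevTimeMs, currValue, currTimeMs) {
  const seconds = (currTimeMs - prevTimeMs) / 1000
  return seconds > 0 ? (currValue - prevValue) / seconds : 0
}
```

For example, going from 5 to 15 served requests over a 10-second window is a rate of 1 request per second.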

As you can imagine, instrumenting your codebase can take quite some time, since you have to create your dashboards and alerts to make sense of the data. While these solutions can sometimes provide greater flexibility for your use-case than hosted ones, implementing them can take months - and then you have to operate them as well.

If you have the time to dig deep into the topic, you'll be fine with it.

Meet Trace - our SaaS, and On-premises Node.js Monitoring Tool

As we just discussed, running your own solution requires domain knowledge, as well as expertise in how to do proper monitoring. You have to figure out what aggregation to use for which kinds of metrics, and so on.

This is why it can make a lot of sense to go with a hosted monitoring solution - whether it is a SaaS product or an on-premises offering.

At RisingStack, we are developing our own Node.js Monitoring Solution, called Trace. We built all the experience into Trace which we gained through the years of providing professional Node services.

What's nice about Trace is that you get all the metrics you need by adding only a single line of code to your application - so it really takes only a few seconds to get started.

require('@risingstack/trace')  

After this, the Trace collector automatically gathers your application's performance data and visualizes it for you in an easy to understand way.

Just a few things Trace is capable of doing with your production Node app:

  1. Send alerts about Downtimes, Slow services & Bad Status Codes.
  2. Ping your websites and APIs with an external service + show Apdex metrics.
  3. Collect data on service, host and instance levels as well.
  4. Automatically create a (10 second-long) CPU profile in a production environment in case of a slowdown.
  5. Collect data on memory consumption and garbage collection.
  6. Create memory heapdumps automatically in case of a Memory Leak in production.
  7. Show errors and stack traces from your application.
  8. Visualize whole transaction call-chains in a distributed system.
  9. Show how your services communicate with each other on a live map.
  10. Automatically detect npm packages with security vulnerabilities.
  11. Mark new deployments and measure their effectiveness.
  12. Integrate with Slack, Pagerduty, and Opsgenie - so you'll never miss an alert.

Although Trace is currently a SaaS solution, we'll make an on-premises version available as well soon.

It will be able to do exactly the same as the cloud version, but it will run on Amazon VPC or in your own datacenter. If you're interested in it, let's talk!

Summary

I hope that in this chapter of Node.js at Scale I was able to give useful advice about monitoring your Node.js application. In the next article, you will learn how to debug Node.js applications in an easy way.

Node.js End-to-End Testing with Nightwatch.js

In this article, we are going to take a look at how you can do end-to-end testing with Node.js, using Nightwatch.js, a Node.js powered end-to-end testing framework.

In the previous chapter of Node.js at Scale, we discussed Node.js Testing and Getting TDD Right. If you did not read that article, or if you are unfamiliar with unit testing and TDD (test-driven development), I recommend checking that out before continuing with this article.

What is Node.js end-to-end testing?

Before jumping into example codes and learning to implement end-to-end testing for a Node.js project, it's worth exploring what end-to-end tests really are.

First of all, end-to-end testing is part of the black-box testing toolbox. This means that as a test writer, you are examining functionality without any knowledge of internal implementation. So without seeing any source code.

Secondly, end-to-end testing can also be used as user acceptance testing, or UAT. UAT is the process of verifying that the solution actually works for the user. This process is not focusing on finding small typos, but issues that can crash the system, or make it dysfunctional for the user.

Enter Nightwatch.js

Nightwatch.js enables you to "write end-to-end tests in Node.js quickly and effortlessly that run against a Selenium/WebDriver server".

Nightwatch is shipped with the following features:

  • a built-in test runner,
  • can control the selenium server,
  • support for hosted selenium providers, like BrowserStack or SauceLabs,
  • CSS and XPath selectors.

Installing Nightwatch

To run Nightwatch locally, we have to do a little bit of extra work - we will need a standalone Selenium server, as well as a WebDriver, so we can use Chrome/Firefox to test our applications locally.

With these three tools, we are going to implement the flow shown in the diagram below.

Flowchart of Node.js end-to-end testing with Nightwatch.js. Photo credit: nightwatchjs.org

STEP 1: Add Nightwatch

You can add Nightwatch to your project simply by running npm install nightwatch --save-dev.

This places the Nightwatch executable in your ./node_modules/.bin folder, so you don't have to install it globally.

STEP 2: Download Selenium

Selenium is a suite of tools to automate web browsers across many platforms.

Prerequisite: make sure you have the JDK installed, with at least version 7. If you don't have it, you can grab it from here.

The Selenium server is a Java application which is used by Nightwatch to connect to various browsers. You can download the binary from here.

Once you have downloaded the JAR file, create a bin folder inside your project, and place it there. We will set up Nightwatch to use it, so you don't have to manually start the Selenium server.

STEP 3: Download Chromedriver

ChromeDriver is a standalone server which implements the W3C WebDriver wire protocol for Chromium.

To grab the executable, head over to the downloads section, and place it to the same bin folder.

STEP 4: Configuring Nightwatch.js

The basic Nightwatch configuration happens through a JSON configuration file.

Let's create a nightwatch.json file, and fill it with:

{
  "src_folders" : ["tests"],
  "output_folder" : "reports",

  "selenium" : {
    "start_process" : true,
    "server_path" : "./bin/selenium-server-standalone-3.3.1.jar",
    "log_path" : "",
    "port" : 4444,
    "cli_args" : {
      "webdriver.chrome.driver" : "./bin/chromedriver"
    }
  },

  "test_settings" : {
    "default" : {
      "launch_url" : "http://localhost",
      "selenium_port"  : 4444,
      "selenium_host"  : "localhost",
      "desiredCapabilities": {
        "browserName": "chrome",
        "javascriptEnabled": true,
        "acceptSslCerts": true
      }
    }
  }
}

With this configuration file, we told Nightwatch where it can find the binaries of the Selenium server and ChromeDriver, as well as the location of the tests we want to run.


You shouldn't rely only on e2e testing for QA. Trace helps you to find all issues before your users do.

Node.js monitoring & debugging from the experts of RisingStack
Learn more


Quick Recap

So far, we have installed Nightwatch, downloaded the standalone Selenium server, as well as the Chromedriver. With these steps, you have all the necessary tools to create end-to-end tests using Node.js and Selenium.

Writing your first Nightwatch Test

Let's add a new file in the tests folder, called homepage.js.

We are going to take the example from the Nightwatch getting started guide. Our test script will go to Google, search for Rembrandt, and check the Wikipedia page:

module.exports = {  
  'Demo test Google' : function (client) {
    client
      .url('http://www.google.com')
      .waitForElementVisible('body', 1000)
      .assert.title('Google')
      .assert.visible('input[type=text]')
      .setValue('input[type=text]', 'rembrandt van rijn')
      .waitForElementVisible('button[name=btnG]', 1000)
      .click('button[name=btnG]')
      .pause(1000)
      .assert.containsText('ol#rso li:first-child',
        'Rembrandt - Wikipedia')
      .end()
  }
}

The only thing left to do is to run Nightwatch itself! For that, I recommend adding a new script into our package.json's scripts section:

"scripts": {
  "test-e2e": "nightwatch"
}

The very last thing you have to do is to run the tests using this command:

npm run test-e2e  

If everything goes well, your test will open up Chrome, then Google and Wikipedia.

Nightwatch.js in Your Project

Now that you understand what end-to-end testing is and how to set up Nightwatch, it is time to start adding it to your project.

For that, you have to consider some aspects - but please note, that there are no silver bullets here. Depending on your business needs, you may answer the following questions differently:

  • Where should my tests run? On staging? On production? Or when I build my containers?
  • What are the test scenarios I want to test?
  • When and who should write end-to-end tests?

Summary & Next Up

In this chapter of Node.js at Scale we have learned:

  • how to set up Nightwatch,
  • how to configure it to use a standalone Selenium server,
  • and how to write basic end-to-end tests.

In the next chapter, we are going to explore how you can monitor production Node.js infrastructures.

Digital Transformation with the Node.js Stack

In this article, we explore the nine main areas of digital transformation and show the benefits of implementing Node.js in them. At the end, we'll lay out a digital transformation roadmap to help you get started with this process.

Note that implementing Node.js is not the goal of a digital transformation project - it is just a great tool that opens up possibilities any organization can take advantage of.


Digital transformation is achieved by using modern technology to radically improve the performance of business processes, applications, or even whole enterprises.

One of the available technologies that enable companies to go through a major performance shift is Node.js and its ecosystem. It is a tool that grants improvement opportunities that organizations should take advantage of:

  • Increased developer productivity,
  • DevOps or NoOps practices,
  • and shipping software to production in a short time using the proxy approach,

just to mention a few.

The 9 Areas of Digital Transformation

Digital transformation projects can improve a company in nine main areas. The following elements were identified by MIT research on digital transformation, in which 157 executives from 50 companies (typically with $1 billion or more in annual sales) were interviewed.

#1. Understanding your Customers Better

Companies are heavily investing in systems to understand specific market segments and geographies better. They have to figure out what leads to customer happiness and customer dissatisfaction.

Many enterprises are building analytics capabilities to better understand their customers. Information derived this way can be used for data-driven decisions.

#2. Achieving Top-Line Growth

Digital transformation can also be used to enhance in-person sales conversations. Instead of paper-based presentations or slides, salespeople can use great-looking, interactive, tablet-based presentations.

Understanding customers better helps enterprises to transform and improve the sales experience with more personalized sales and customer service.

#3. Building Better Customer Touch Points

Customer service can be improved tremendously with new digital services - for example, by introducing new channels of communication. Instead of going to a local branch of a business, customers can talk to support through Twitter or Facebook.

Self-service digital tools can be developed that both save time for the customer and save money for the company.

#4. Process Digitization

With automation, companies can focus their employees on more strategic tasks, innovation, or creativity rather than repetitive efforts.

#5. Worker Enablement

Virtualization of individual work (separating the work process from the location of work) has become an enabler of knowledge sharing. Information and expertise are accessible in real time to frontline employees.

#6. Data-Driven Performance Management

With the proper analytical capabilities, decisions can be made on real data and not on assumptions.

Digital transformation is changing the way strategic decisions are made. With new tools, strategic planning sessions can include more stakeholders, not just a small group.

#7. Digitally Extended Businesses

Many companies extend their physical offerings with digital ones. Examples include:

  • news outlets augmenting their print offering with digital content,
  • FMCG companies extending to e-commerce.

#8. New Digital Businesses

With digital transformation, companies not only extend their current offerings but also come up with new digital products that complement the traditional ones. Examples include connected devices, like GPS trackers that can report activity to the cloud and provide value to customers through recommendations.

#9. Digital Globalization

Global shared services, like shared finance or HR, enable organizations to build truly global operations.


The Digital Transformation Benefits of Implementing Node.js

Organizations are looking for the most effective way to achieve digital transformation - and among these companies, Node.js is becoming the de facto technology for building out digital capabilities.

Node.js is a JavaScript runtime built on Chrome's V8 JavaScript engine. Node.js uses an event-driven, non-blocking I/O model that makes it lightweight and efficient.

In other words: Node.js offers you the possibility to write servers in JavaScript with incredible performance.

Increased Developer Productivity with Same-Language Stacks

When PayPal started using Node.js, they reported a 2x increase in productivity compared to their previous Java stack. How is that even possible?

npm, the Node.js package manager, has an incredible number of modules that can be used instantly. This saves a lot of development effort for the team.

Secondly, as Node.js applications are written using JavaScript, front-end developers can also easily understand what's going on and make necessary changes.

This saves you valuable time again, as developers use the same language on the entire stack.

100% Business Availability, even with Extreme Load

Around 1.5 billion dollars are spent online in the US on Black Friday each year.

It is crucial that your site can keep up with the traffic. This is why Walmart, one of the biggest retailers, is using Node.js to serve 500 million pageviews on Black Friday without a hitch.

Fast Apps = Satisfied Customers

As your velocity increases because of the productivity gains, you can ship features/products sooner. Products that run faster result in better user experience.

A Kissmetrics study showed that 40% of people abandon a website that takes more than 3 seconds to load, and 47% of consumers expect a web page to load in 2 seconds or less.

To read more on the benefits of using Node.js, you can download our Node.js is Enterprise Ready ebook.


Your Digital Transformation Roadmap with Node.js

As with most new technologies introduced to a company, it’s worth taking baby-steps first with Node.js as well. As a short framework for introducing Node.js, we recommend the following steps:

  • building a Node.js core team,
  • picking a small part of the application to be rewritten/extended using Node.js,
  • extending the scope of the project to the whole organization.

Step 1 - Building your Node.js Core Team

The core Node.js team should consist of people with JavaScript experience on both the backend and the frontend. It's not crucial that the backend engineers have prior Node.js experience; the important aspect is the vision they bring to the team.

Introducing Node.js is not just about JavaScript - members of the operations team have to join the core team as well.

The introduction of Node.js to an organization does not stop at excelling at Node.js - it also means adopting modern DevOps or NoOps practices, including but not limited to continuous integration and delivery.

Step 2 - Embracing The Proxy Approach

To incrementally replace old systems or to extend their functionality easily, your team can use the proxy approach.

For the features or functions you want to replace, create a small and simple Node.js application and proxy some of your load to it. This proxy does not necessarily have to be written in Node.js. With this approach, you can easily benefit from a modularized, service-oriented architecture.

Another way to use proxies is to write them in Node.js and make them talk to the legacy systems. This way you have the option to optimize the data being sent. PayPal was one of the first adopters of Node.js at scale, and they started with this proxy approach as well.

The biggest advantages of these solutions are that you can put Node.js into production in a short amount of time, measure your results, and learn from them.

Step 3 - Measure Node.js, Be Data-Driven

For the successful introduction of Node.js during a digital transformation project, it is crucial to set up a series of benchmarks to compare the results between the legacy system and the new Node.js applications. These data points can be response times, throughput or memory and CPU usage.
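When comparing response times, averages hide outliers, so compare percentiles (p95/p99) of the collected samples instead of means. A small sketch of the computation using the nearest-rank method:

```javascript
// Nearest-rank percentile: p is 0-100, samples may be in any order.
function percentile(samples, p) {
  if (samples.length === 0) throw new Error('no samples')
  const sorted = samples.slice().sort((a, b) => a - b)
  const rank = Math.ceil((p / 100) * sorted.length)
  return sorted[Math.max(0, rank - 1)]
}
```

Running the same benchmark against the legacy system and the Node.js service, then comparing `percentile(samples, 95)` of both, gives you a defensible before/after number.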

Orchestrating The Node.js Stack

As mentioned previously, introducing Node.js does not stop at excelling at Node.js itself - introducing continuous integration and delivery is crucial as well.

Also, from an operations point of view, it is important to add containers to ship applications with confidence.

For orchestration - to operate the containers running the Node.js applications - we encourage companies to adopt Kubernetes, an open-source system for automating deployment, scaling, and management of containerized applications.

RisingStack and Digital Transformation with Node.js

RisingStack enables amazing companies to succeed with Node.js and related technologies to stay ahead of the competition. We have provided professional Node.js development and consulting services since the early days of Node.js, helping companies like Lufthansa or Cisco to thrive with this technology.

10 Best Practices for Writing Node.js REST APIs

In this article we cover best practices for writing Node.js REST APIs, including topics like naming your routes, authentication, black-box testing & using proper cache headers for these resources.

One of the most popular use-cases for Node.js is writing RESTful APIs. Still, while we help our customers find issues in their applications with Trace, our Node.js monitoring tool, we constantly see that developers have a lot of problems with REST APIs.

I hope these best-practices we use at RisingStack can help:

#1 - Use HTTP Methods & API Routes

Imagine that you are building a Node.js RESTful API for creating, updating, retrieving, or deleting users. For these operations, HTTP already has the adequate toolset: POST, PUT, GET, PATCH, and DELETE.

As a best practice, your API routes should always use nouns as resource identifiers. Speaking of the user's resources, the routing can look like this:

  • POST /user or PUT /user/:id to create a new user,
  • GET /user to retrieve a list of users,
  • GET /user/:id to retrieve a user,
  • PATCH /user/:id to modify an existing user record,
  • DELETE /user/:id to remove a user.
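The mapping from method + path to operation is mechanical, which is exactly what a framework's router does for you. A framework-free sketch of the table above (the action names are illustrative):

```javascript
// Resolve an HTTP method and path to one of the user operations above.
// Returns { action, id } on a match, or null otherwise.
function resolveUserRoute(method, path) {
  if (method === 'POST' && path === '/user') return { action: 'create', id: null }
  if (method === 'GET' && path === '/user') return { action: 'list', id: null }
  const single = path.match(/^\/user\/([^/]+)$/)
  if (single) {
    const byMethod = { GET: 'read', PATCH: 'update', DELETE: 'remove', PUT: 'replace' }
    const action = byMethod[method]
    if (action) return { action, id: single[1] }
  }
  return null
}
```

With Express, the equivalent would simply be `app.get('/user/:id', handler)` and friends - the framework does this matching for you.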

#2 - Use HTTP Status Codes Correctly

If something goes wrong while serving a request, you must set the correct status code for that in the response:

  • 2xx, if everything was okay,
  • 3xx, if the resource was moved,
  • 4xx, if the request cannot be fulfilled because of a client error (like requesting a resource that does not exist),
  • 5xx, if something went wrong on the API side (like an exception happened).

If you are using Express, setting the status code is as easy as res.status(500).send({error: 'Internal server error happened'}). Similarly with Restify: res.status(201).

For a full list, check the list of HTTP status codes.
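In practice it helps to centralize the error-to-status mapping in one place instead of scattering numeric codes through handlers. A sketch - the error names here are illustrative, not a standard:

```javascript
// Central mapping from application error types to HTTP status codes.
const STATUS_BY_ERROR = {
  ValidationError: 400,   // client sent malformed input -> 4xx
  UnauthorizedError: 401, // missing/invalid credentials -> 4xx
  NotFoundError: 404      // resource does not exist -> 4xx
}

// Anything unmapped is treated as a server-side failure.
function statusFor(err) {
  return STATUS_BY_ERROR[err.name] || 500
}
```

An Express error handler could then end with `res.status(statusFor(err)).send({ error: err.message })`.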

#3 - Use HTTP headers to Send Metadata

To attach metadata about the payload you are about to send, use HTTP headers. Such headers can carry information about:

  • pagination,
  • rate limiting,
  • or authentication.

A list of standardized HTTP headers can be found here.

If you need to set any custom metadata in your headers, it used to be a best practice to prefix them with X-. For example, if you were using CSRF tokens, a common (but non-standard) way was to name them X-Csrf-Token. However, RFC 6648 deprecated this convention. New APIs should make their best effort to use header names that do not conflict with other applications. For example, OpenStack prefixes its headers with OpenStack:

OpenStack-Identity-Account-ID  
OpenStack-Networking-Host-Name  
OpenStack-Object-Storage-Policy  
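Pagination is a typical use for such metadata headers. A sketch that builds un-prefixed names in the spirit of RFC 6648 (the header names themselves are made up for illustration):

```javascript
// Build pagination metadata headers for a list response.
// Header names here are illustrative, not a standard.
function paginationHeaders(page, perPage, total) {
  return {
    'Pagination-Page': String(page),
    'Pagination-Per-Page': String(perPage),
    'Pagination-Total': String(total),
    'Pagination-Total-Pages': String(Math.ceil(total / perPage))
  }
}
```

In Express you could apply them with `res.set(paginationHeaders(page, perPage, total))` before sending the body.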

Note that the HTTP standard does not define any size limit on headers; however, Node.js (as of writing this article) imposes an 80 KB size limit on the headers object for practical reasons:

"Don't allow the total size of the HTTP headers (including the status line) to exceed HTTP_MAX_HEADER_SIZE. This check is here to protect embedders against denial-of-service attacks where the attacker feeds us a never-ending header that the embedder keeps buffering."

From the Node.js HTTP parser

#4 - Pick the right framework for your Node.js REST API

It is important to pick the framework that suits your use-case the most.

Express, Koa or Hapi

Express, Koa, and Hapi can be used to create browser applications, and as such, they support templating and rendering - just to name a few features. If your application needs to provide a user-facing side as well, it makes sense to go for them.

Restify

On the other hand, Restify focuses on helping you build REST services. It exists to let you build "strict" API services that are maintainable and observable. Restify also comes with automatic DTrace support for all your handlers.

Restify is used in production in major applications like npm or Netflix.

#5 - Black-Box Test your Node.js REST APIs

One of the best ways to test your REST APIs is to treat them as black boxes.

Black-box testing is a method of testing where the functionality of an application is examined without the knowledge of its internal structures or workings. So none of the dependencies are mocked or stubbed, but the system is tested as a whole.

One of the modules that can help you with black-box testing Node.js REST APIs is supertest.

A simple test case which checks if a user is returned using the test runner mocha can be implemented like this:

const request = require('supertest')
// `app` is your Express application; the require path below is illustrative
const app = require('../app')

describe('GET /user/:id', function () {  
  it('returns a user', function () {
    // newer mocha versions accept a returned promise instead of done()
    return request(app)
      .get('/user/1')
      .set('Accept', 'application/json')
      .expect(200, {
        id: '1',
        name: 'John Math'
      })
  })
})

You may ask: how does the data get populated into the database that serves the REST API?

In general, it is a good approach to write your tests so that they make as few assumptions about the state of the system as possible. Still, in some scenarios you can find yourself in a spot where you need to know exactly what the state of the system is, so you can make assertions and achieve higher test coverage.

So based on your needs, you can populate the database with test data in one of the following ways:

  • run your black-box test scenarios on a known subset of production data,
  • populate the database with crafted data before the test cases are run.

Of course, black-box testing does not mean that you don't have to do unit testing as well - you still have to write unit tests for your APIs.




#6 - Do JWT-Based, Stateless Authentication

Since your REST APIs must be stateless, your authentication layer must be stateless as well. For this, JWT (JSON Web Token) is ideal.

JWT consists of three parts:

  • Header, containing the type of the token and the hashing algorithm
  • Payload, containing the claims
  • Signature (JWT does not encrypt the payload, just signs it!)
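The header and payload are just base64url-encoded JSON, so you can inspect them with nothing but the Node.js standard library. A quick sketch - the token below is the well-known example token used in the curl call later in this section, and keep in mind that decoding is not verification:

```javascript
// Decode the header and payload of a JWT - note that this does NOT
// verify the signature, it only makes the contents readable
function decodeJwt (token) {
  const [header, payload] = token.split('.')
  return {
    header: JSON.parse(Buffer.from(header, 'base64').toString()),
    payload: JSON.parse(Buffer.from(payload, 'base64').toString())
  }
}

const token = 'eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.' +
  'eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiYWRtaW4iOnRydWV9.' +
  'TJVA95OrM7E2cBab30RMHrHDcEfxjoYZgeFONFh7HgQ'

console.log(decodeJwt(token).header) // { alg: 'HS256', typ: 'JWT' }
```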

Adding JWT-based authentication to your application is very straightforward:

const koa = require('koa')  // koa 1.x, which uses generator-based middleware
const jwt = require('koa-jwt')

const app = koa()

app.use(jwt({
  secret: 'very-secret'
}))

// Protected middleware
app.use(function * () {
  // the content of the token is available on this.state.user
  this.body = {
    secret: '42'
  }
})

app.listen(3000)

After that, the API endpoints are protected with JWT. To access the protected endpoints, you have to provide the token in the Authorization header field.

curl --header "Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiYWRtaW4iOnRydWV9.TJVA95OrM7E2cBab30RMHrHDcEfxjoYZgeFONFh7HgQ" my-website.com  

One thing you may have noticed is that the JWT module does not depend on any database layer. This is because all JWT tokens can be verified on their own, and they can also contain time-to-live values.

Also, you always have to make sure that all your API endpoints are only accessible through a secure connection using HTTPS.

In a previous article, we explained web authentication methods in detail - I recommend checking it out!

#7 - Use Conditional Requests

Conditional requests are HTTP requests which are executed differently depending on specific HTTP headers. You can think of these headers as preconditions: if they are met, the requests will be executed in a different way.

These headers check whether a version of a resource stored on the server matches a given version of the same resource. The precondition value they carry can be:

  • the timestamp of the last modification,
  • or an entity tag, which differs for each version.

These headers are:

  • Last-Modified (to indicate when the resource was last modified),
  • Etag (to indicate the entity tag),
  • If-Modified-Since (used with the Last-Modified header),
  • If-None-Match (used with the Etag header).

Let's take a look at an example!

The client below did not have any previous version of the doc resource, so it sent neither the If-Modified-Since nor the If-None-Match header when requesting it. The server then responds with the Etag and Last-Modified headers properly set.

Node.js RESTful API with conditional request, without previous versions

From the MDN Conditional request documentation

The client can set the If-Modified-Since and If-None-Match headers once it requests the same resource again - since it now has a version. If the resource has not changed, the server simply responds with the 304 - Not Modified status and does not send the resource again.

Node.js RESTful API with conditional request, with previous versions

From the MDN Conditional request documentation

#8 - Embrace Rate Limiting

Rate limiting is used to control how many requests a given consumer can send to the API.

To tell your API users how many requests they have left, set the following headers:

  • X-Rate-Limit-Limit, the number of requests allowed in a given time interval,
  • X-Rate-Limit-Remaining, the number of requests remaining in the same interval,
  • X-Rate-Limit-Reset, the time when the rate limit will be reset.

Most HTTP frameworks support it out of the box (or with plugins). For example, if you are using Koa, there is the koa-ratelimit package.

Note that the time window can vary for different API providers - for example, GitHub uses an hour, while Twitter uses 15 minutes.
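To show how the counters behind those headers move, here is a naive in-memory fixed-window limiter - a production setup would keep the counters in a shared store such as Redis, which is what plugins like koa-ratelimit typically do:

```javascript
const WINDOW_MS = 15 * 60 * 1000 // 15-minute window, like Twitter's
const LIMIT = 100

// consumerId -> { count, reset } - in-memory only, for illustration
const hits = new Map()

function rateLimit (consumerId, now = Date.now()) {
  const entry = hits.get(consumerId)
  if (!entry || now >= entry.reset) {
    // first request, or the previous window has expired
    hits.set(consumerId, { count: 1, reset: now + WINDOW_MS })
  } else {
    entry.count += 1
  }
  const { count, reset } = hits.get(consumerId)
  return {
    allowed: count <= LIMIT,
    headers: {
      'X-Rate-Limit-Limit': LIMIT,
      'X-Rate-Limit-Remaining': Math.max(0, LIMIT - count),
      'X-Rate-Limit-Reset': reset
    }
  }
}
```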

#9 - Create a Proper API Documentation

You write APIs so others can use and benefit from them. Providing API documentation for your Node.js REST APIs is crucial.

The following open-source projects can help you with creating documentation for your APIs:

  • Swagger (and the OpenAPI specification),
  • API Blueprint.

Alternatively, if you want to use a hosted product, you can go for Apiary.

#10 - Don't Miss The Future of APIs

In the past years, two major query languages for APIs have emerged - namely GraphQL from Facebook and Falcor from Netflix. But why do we even need them?

Imagine the following RESTful resource request:

/org/1/space/2/docs/1/collaborators?include=email&page=1&limit=10

This can get out of hand quite easily - as you'd like to get the same response format for all your models all the time. This is where GraphQL and Falcor can help.

About GraphQL

GraphQL is a query language for APIs and a runtime for fulfilling those queries with your existing data. GraphQL provides a complete and understandable description of the data in your API, gives clients the power to ask for exactly what they need and nothing more, makes it easier to evolve APIs over time, and enables powerful developer tools. - Read more here.

About Falcor

Falcor is the innovative data platform that powers the Netflix UIs. Falcor allows you to model all your backend data as a single Virtual JSON object on your Node server. On the client, you work with your remote JSON object using familiar JavaScript operations like get, set, and call. If you know your data, you know your API. - Read more here.

Amazing REST APIs for Inspiration

If you are about to start developing a Node.js REST API or creating a new version of an older one, we have collected four real-life examples that are worth checking out:

I hope that now you have a better understanding of how APIs should be written using Node.js. Please let me know in the comments if you miss anything!

Yarn vs npm - The State of Node.js Package Managers

With the v7.4 release, npm 4 became the bundled, default package manager for Node.js. In the meantime, Facebook released their own package manager solution, called Yarn.

Let's take a look at the state of Node.js package managers, what they can do for you, and when you should pick which one!

Yarn - the new kid on the block

Fast, reliable and secure dependency management - this is the promise of Yarn, the new dependency manager created by the engineers of Facebook.

But can Yarn live up to the expectations?

Yarn - the Node.js package manager

Installing Yarn

There are several ways of installing Yarn. If you have npm installed, you can just install Yarn with npm:

npm install yarn --global  

However, the way recommended by the Yarn team is to install it via your native OS package manager - on a Mac, that will probably be brew:

brew update  
brew install yarn  

Yarn Under the Hood

Yarn has a lot of performance and security improvements under the hood. Let's see what these are!

Offline cache

When you install a package using Yarn (using yarn add packagename), it places the package on your disk. During the next install, this package will be used instead of sending an HTTP request to get the tarball from the registry.

Your cached modules will be put into ~/.yarn-cache, prefixed with the registry name and postfixed with the module's version.

This means that if you install the 4.4.5 version of express with Yarn, it will be put into ~/.yarn-cache/npm-express-4.4.5.


Deterministic Installs

Yarn uses lockfiles (yarn.lock) and a deterministic install algorithm. We can say goodbye to the "but it works on my machine" bugs.

The lockfile looks something like this:

Yarn lockfile

It contains the exact version numbers of all your dependencies - just like with an npm shrinkwrap file.
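An entry in yarn.lock looks roughly like this - the package name and checksum below are placeholders:

```
some-module@^1.0.0:
  version "1.0.3"
  resolved "https://registry.yarnpkg.com/some-module/-/some-module-1.0.3.tgz#<checksum>"
```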

License checks

Yarn comes with a handy license checker, which can become really powerful in case you have to check the licenses of all the modules you depend on.

yarn licenses

Potential issues/questions

Yarn is still in its early days, so it’s no surprise that there are some questions arising when you start using it.

What’s going on with the default registry?

By default, the Yarn CLI uses a different registry than the original one: https://registry.yarnpkg.com. So far there is no explanation of why it does not use the same registry.

Does Facebook have plans to make incompatible API changes and split the community?

Contributing back to npm?

One of the most logical questions that can come up when talking about Yarn is: why not talk with the CLI team at npm, and work together?

If the problem is speed, I am sure all npm users would like to get those improvements as well.

When it comes to deterministic installs, the existing npm-shrinkwrap.json format could have been fixed instead of introducing a new lockfile.

Why the strange versioning?

In the world of Node.js and npm, versions start at 1.0.0.

At the time of writing this article, Yarn is at 0.18.1.

Is something missing to make Yarn stable? Does Yarn simply not follow semver?

npm 4

npm is the default package manager we all know, and npm 4 has been bundled with every Node.js release since v7.4.

Updating npm

To start using npm version 4, you just have to update your current CLI version:

npm install npm -g  

At the time of writing this article, this command will install npm version 4.1.1, which was released on 12/11/2016. Let's see what changed in this version!

Changes since version 3

  • npm search is now reimplemented to stream results, and sorting is no longer supported,
  • npm scripts no longer prepend the path of the node executable used to run npm before running scripts,
  • prepublish has been deprecated - you should use prepare from now on,
  • npm outdated returns 1 if it finds outdated packages,
  • partial shrinkwraps are no longer supported - the npm-shrinkwrap.json is considered a complete manifest,
  • Node.js 0.10 and 0.12 are no longer supported,
  • npm doctor was added, which diagnoses the user's environment and recommends solutions for potential npm-related problems.

As you can see, the team at npm was quite busy as well - both npm and Yarn made great progress in the past months.

Conclusion

It is great to see a new, open-source npm client - no doubt, a lot of effort went into making Yarn great!

Hopefully, we will see the improvements of Yarn incorporated into npm as well, so users of both will benefit from each other's improvements.

Yarn vs. npm - Which one to pick?

If you are working on proprietary software, it does not really matter which one you use. With npm, you can use npm-shrinkwrap.json, while with Yarn you can use yarn.lock.

The team at Yarn published a great article on why lockfiles should be committed all the time, I recommend checking it out: https://yarnpkg.com/blog/2016/11/24/lockfiles-for-all


Node.js Interview Questions and Answers (2017 Edition)

Two years ago we published our first article on common Node.js Interview Questions and Answers. Since then a lot of things improved in the JavaScript and Node.js ecosystem, so it was time to update it.

Important Disclaimers

It is never a good practice to judge someone just by questions like these, but these can give you an overview of the person's experience in Node.js.

But obviously, these questions do not give you the big picture of someone's mindset and thinking.

I think that a real-life problem can show a lot more of a candidate's knowledge - so we encourage you to do pair programming with the developers you are going to hire.

Finally and most importantly: we are all humans, so make your hiring process as welcoming as possible. These questions are not meant to be used as "Questions & Answers" but just to drive the conversation.

Node.js Interview Questions for 2017

  • What is an error-first callback?
  • How can you avoid callback hells?
  • What are Promises?
  • What tools can be used to assure consistent style? Why is it important?
  • When should you use npm and when Yarn?
  • What's a stub? Name a use case!
  • What's a test pyramid? Give an example!
  • What's your favorite HTTP framework and why?
  • When are background/worker processes useful? How can you handle worker tasks?
  • How can you secure your HTTP cookies against XSS attacks?
  • How can you make sure your dependencies are safe?

The Answers

What is an error-first callback?

Error-first callbacks are used to pass both errors and data. The error must be the first parameter, and you have to check it to see if something went wrong. Additional arguments are used to pass data.

fs.readFile(filePath, function(err, data) {  
  if (err) {
    // handle the error, the return is important here
    // so execution stops here
    return console.log(err)
  }
  // use the data object
  console.log(data)
})

How can you avoid callback hells?

There are lots of ways to solve the issue of callback hells:

  • modularization: break the callbacks into independent functions,
  • use Promises,
  • use async/await (or generators with a control-flow library such as co).

What are Promises?

Promises are a concurrency primitive, first described in the 80s. Now they are part of most modern programming languages to make your life easier. Promises can help you better handle async operations.

An example can be the following snippet, which after 100ms prints out the result string to the standard output. Also, note the catch, which can be used for error handling. Promises are chainable.

new Promise((resolve, reject) => {  
  setTimeout(() => {
    resolve('result')
  }, 100)
})
  .then(console.log)
  .catch(console.error)

What tools can be used to assure consistent style? Why is it important?

When working in a team, consistent style is important, so team members can modify more projects easily, without having to get used to a new style each time.

Also, it can help eliminate programming issues using static analysis.

Tools that can help:

  • ESLint,
  • Standard.

If you’d like to be even more confident, I suggest you to learn and embrace the JavaScript Clean Coding principles as well!


What's a stub? Name a use case!

Stubs are functions/programs that simulate the behavior of components/modules. Stubs provide canned answers to function calls made during test cases.

An example can be writing a file, without actually doing so.

const fs = require('fs')
const sinon = require('sinon')
const expect = require('chai').expect

// replace fs.writeFile so that no file is actually written
const writeFileStub = sinon.stub(fs, 'writeFile').callsFake(function (path, data, cb) {
  return cb(null)
})

fs.writeFile('./output.json', '{}', function () {})

expect(writeFileStub.called).to.be.true
writeFileStub.restore()

What's a test pyramid? Give an example!

A test pyramid describes the ratio of how many unit tests, integration tests and end-to-end tests you should write.

An example for an HTTP API may look like this:

  • lots of low-level unit tests for models (dependencies are stubbed),
  • fewer integration tests, where you check how your models interact with each other (dependencies are not stubbed),
  • even fewer end-to-end tests, where you call your actual endpoints (dependencies are not stubbed).

What's your favorite HTTP framework and why?

There is no right answer for this. The goal is to understand how deeply the candidate knows the framework he/she uses, and whether he/she can explain the pros and cons of picking that framework.

When are background/worker processes useful? How can you handle worker tasks?

Worker processes are extremely useful if you'd like to do data processing in the background, like sending out emails or processing images.

There are lots of options for this, like RabbitMQ or Kafka.

How can you secure your HTTP cookies against XSS attacks?

XSS occurs when the attacker injects executable JavaScript code into the HTML response.

To mitigate these attacks, you have to set flags on the set-cookie HTTP header:

  • HttpOnly - this attribute is used to help prevent attacks such as cross-site scripting since it does not allow the cookie to be accessed via JavaScript.
  • secure - this attribute tells the browser to only send the cookie if the request is being sent over HTTPS.

So it would look something like this: Set-Cookie: sid=<cookie-value>; HttpOnly. If you are using Express, cookie-session sets it by default.

How can you make sure your dependencies are safe?

When writing Node.js applications, ending up with hundreds or even thousands of dependencies can easily happen.
For example, if you depend on Express, you depend on 27 other modules directly, and of course on those dependencies' dependencies as well, so manually checking all of them is not an option!

The only option is to automate the update / security audit of your dependencies. For that there are free and paid options:

  • the Node Security Platform CLI (nsp),
  • Snyk,
  • Greenkeeper.

Node.js Interview Puzzles

The following part of the article is useful if you’d like to prepare for an interview that involves puzzles, or tricky questions.

What's wrong with the code snippet?

new Promise((resolve, reject) => {  
  throw new Error('error')
}).then(console.log)

The Solution

There is no catch after the then, so the error will be silent - there will be no indication that an error was thrown.

To fix it, you can do the following:

new Promise((resolve, reject) => {  
  throw new Error('error')
}).then(console.log).catch(console.error)

If you have to debug a huge codebase, and you don't know which Promise can potentially hide an issue, you can use the unhandledRejection hook. It will print out all unhandled Promise rejections.

process.on('unhandledRejection', (err) => {  
  console.log(err)
})

What's wrong with the following code snippet?

function checkApiKey (apiKeyFromDb, apiKeyReceived) {  
  if (apiKeyFromDb === apiKeyReceived) {
    return true
  }
  return false
}

The Solution

When you compare security credentials it is crucial that you don't leak any information, so you have to make sure that you compare them in fixed time. If you fail to do so, your application will be vulnerable to timing attacks.

But why does it work like that?

V8, the JavaScript engine used by Node.js, tries to optimize the code you run from a performance point of view. It compares the strings character by character, and once a mismatch is found, it stops the comparison. So the more characters the attacker gets right, the longer the comparison takes - and this timing difference leaks information about the key.

To solve this issue, you can use the npm module called cryptiles.

const cryptiles = require('cryptiles')

function checkApiKey (apiKeyFromDb, apiKeyReceived) {
  return cryptiles.fixedTimeComparison(apiKeyFromDb, apiKeyReceived)
}

What's the output of following code snippet?

Promise.resolve(1)  
  .then((x) => x + 1)
  .then((x) => { throw new Error('My Error') })
  .catch(() => 1)
  .then((x) => x + 1)
  .then((x) => console.log(x))
  .catch(console.error)

The Answer

The short answer is 2 - however with this question I'd recommend asking the candidates to explain what will happen line-by-line to understand how they think. It should be something like this:

  1. A new Promise is created, that will resolve to 1.
  2. The resolved value is incremented with 1 (so it is 2 now), and returned instantly.
  3. The resolved value is discarded, and an error is thrown.
  4. The error is discarded, and a new value (1) is returned.
  5. Execution does not stop after the catch: since the exception was handled, the chain continues, and a new, incremented value (2) is returned.
  6. The value is printed to the standard output.
  7. This line won't run, as there was no exception.

A day may work better than questions

Spending at least half a day with your possible next hire is worth more than a thousand of these questions.

Once you do that, you will better understand if the candidate is a good cultural fit for the company and has the right skill set for the job.

Do you miss anything? Let us know!

What was the craziest interview question you had to answer? What's your favorite question / puzzle to ask? Let us know in the comments! :)