Node.js War Stories: Debugging Issues in Production

In this article, you can read stories from Netflix, RisingStack & nearForm about Node.js issues in production - so you can learn from our mistakes and avoid repeating them. You'll also learn what methods we used to debug these Node.js issues.

Special shoutout to Yunong Xiao of Netflix, Matteo Collina of nearForm & Shubhra Kar from Strongloop for helping us with their insights for this post!


At RisingStack, we have accumulated a tremendous amount of experience running Node.js apps in production over the past 4 years - thanks to our Node.js consulting, training and development business.

Like the Node.js teams at Netflix & nearForm, we picked up the habit of always writing thorough postmortems, so the whole team (and now the whole world) can learn from the mistakes we made.

Netflix & Debugging Node: Know your Dependencies

Let's start with a slowdown story from Yunong Xiao, which happened with our friends at Netflix.

The trouble started with the Netflix team noticing that their application's response time was increasing progressively - some of their endpoints' latency grew by 10ms every hour.

This was also reflected in the growing CPU usage.

Netflix debugging Nodejs in production with the Request latency graph Request latencies for each region over time - photo credit: Netflix

At first, they started to investigate whether the request handler was responsible for slowing things down.

After testing it in isolation, it turned out that the request handler had a constant response time around 1ms.

So the problem was not there, and they started to suspect that it was deeper in the stack.

The next thing Yunong & the Netflix team tried was CPU flame graphs and Linux perf events.

Flame graph of Netflix Nodejs slowdown Flame graph of the Netflix slowdown - photo credit: Netflix

What you can see in the flame graph above is that

  • it has high stacks (which means a lot of function calls)
  • and the boxes are wide (meaning we are spending quite some time in those functions).

After further inspection, the team found that Express's router.handle and router.handle.next have lots of references.

The Express.js source code reveals a couple of interesting tidbits:

  • Route handlers for all endpoints are stored in one global array.
  • Express.js recursively iterates through and invokes all handlers until it finds the right route handler.

Before revealing the solution to this mystery, we have to mention one more detail:

Netflix's codebase contained periodically running code that executed every 6 minutes, grabbed new route configs from an external resource and updated the application's route handlers to reflect the changes.

This was done by deleting old handlers and adding new ones. Accidentally, it also added the same static handler all over again - even before the API route handlers. As it turned out, this caused the extra 10ms of response time every hour.
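
To get a feel for the effect, here is a hypothetical sketch (not Netflix's actual code - the names and the refresh interval are made up for illustration) of how re-registering handlers without removing the old ones makes Express's handler stack grow, so every request has to walk through more and more handlers:

const express = require('express')
const app = express()

// stand-in for fetching route configs from an external resource
function loadRouteConfigs () {
  return [{ path: '/status', handler: (req, res) => res.send('ok') }]
}

function refreshRoutes () {
  // BUG: the previously registered handlers are never removed, so another
  // copy of the static handler is pushed onto the router stack every time
  app.use(express.static('public'))

  loadRouteConfigs().forEach((config) => {
    app.get(config.path, config.handler)
  })
}

// each refresh makes every request walk through one more handler,
// which is why latency keeps creeping up over time
refreshRoutes()
setInterval(refreshRoutes, 6 * 60 * 1000)

app.listen(3000)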

Takeaways from Netflix's Issue

  • Always know your dependencies - you have to fully understand them before going into production with them.
  • Observability is key - flame graphs helped the Netflix engineering team to get to the bottom of the issue.

Read the full story here: Node.js in Flames.



RisingStack CTO: "Crypto takes time"

You may have already heard the story of how we broke down the monolithic infrastructure of Trace (our Node.js monitoring solution) into microservices from our CTO, Peter Marton.

The issue we'll talk about now is a slowdown which affected Trace in production:

The very first versions of Trace ran on a PaaS and used the public cloud to communicate with our other services.

To ensure the integrity of our requests, we decided to sign all of them. To do so, we went with Joyent's HTTP signing library. What's really great about it is that the request module supports HTTP signatures out of the box.
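
As a rough sketch (the URL, key id and key path are placeholders, not our actual setup), signing an outgoing request with the request module looks something like this:

const fs = require('fs')
const request = require('request')

request.get('https://internal-service.example.com/v1/metrics', {
  httpSignature: {
    keyId: 'trace-collector',                           // placeholder key id
    key: fs.readFileSync('./private-key.pem', 'utf-8')  // placeholder private key
  }
}, (err, response, body) => {
  if (err) {
    return console.error(err)
  }
  console.log(response.statusCode, body)
})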

This solution was not only expensive, but it also had a bad impact on our response times.

network delay in nodejs request visualized by trace The network delay built up our response times - photo: Trace

As you can see on the graph above, the given endpoint had a response time of 180ms, however, 100ms of that was just the network delay between the two services.

As the first step, we migrated from the PaaS provider to Kubernetes. We expected that our response times would be a lot better, as we could leverage internal networking.

We were right - our latency improved.

However, we expected better results - and a much bigger drop in our CPU usage. The next step was to do CPU profiling, just like the guys at Netflix:

crypto sign function taking up cpu time

As you can see on the screenshot, the crypto.sign function takes up most of the CPU time, consuming 10ms on each request. To solve this, you have two options:

  • if you are running in a trusted environment, you can drop request signing,
  • if you are in an untrusted environment, you can scale up your machines to have stronger CPUs.

Takeaways from Peter Marton

  • Latency in-between your services has a huge impact on user experience - whenever you can, leverage internal networking.
  • Crypto can take a LOT of time.

nearForm: Don't block the Node.js Event Loop

React is more popular than ever. Developers use it for both the frontend and the backend, or they even take a step further and use it to build isomorphic JavaScript applications.

However, rendering React pages can put some heavy load on the CPU, as rendering complex React components is CPU bound.

When your Node.js process is rendering, it blocks the event loop because of its synchronous nature.

As a result, the server can become entirely unresponsive - requests accumulate, which all puts load on the CPU.

What's even worse is that requests which no longer have a waiting client will still be served - continuing to put load on the Node.js application, as Matteo Collina of nearForm explains.
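
A minimal sketch of the problem (App is a hypothetical, CPU-heavy component): server-side rendering with renderToString is synchronous, so nothing else can be processed while it runs.

const React = require('react')
const ReactDOMServer = require('react-dom/server')
const App = require('./App') // hypothetical, complex component tree

function renderPageHandler (req, res) {
  // renderToString is synchronous and CPU bound - while it runs,
  // the event loop cannot serve any other incoming request
  const html = ReactDOMServer.renderToString(React.createElement(App))
  res.end(html)
}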

It is not just React, but string operations in general. If you are building JSON REST APIs, you should always pay attention to JSON.parse and JSON.stringify.

As Shubhra Kar from Strongloop (now Joyent) explained, parsing and stringifying huge payloads can take a lot of time as well (and block the event loop in the meantime).

function requestHandler (req, res) {
  const body = req.rawBody
  let parsedBody
  try {
    parsedBody = JSON.parse(body)
  }
  catch (e) {
    // return here, otherwise the success message below is sent as well
    res.statusCode = 400
    return res.end('Error parsing the body')
  }
  res.end('Record successfully received')
}

Simple request handler

The example above shows a simple request handler, which just parses the body. For small payloads, it works like a charm - however, if the JSON's size can be measured in megabytes, the execution time can be seconds instead of milliseconds. The same applies for JSON.stringify.

To mitigate these issues, first, you have to know about them. For that, you can use Matteo's loopbench module, or Trace's event loop metrics feature.

With loopbench, you can return a status code of 503 to the load balancer if the request cannot be fulfilled - to do so, check the instance.overLimit flag. This way ELB or NGINX can retry it on a different backend, and the request may still be served.
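
A minimal sketch of the idea, assuming a plain Node.js HTTP server (the Retry-After value and the handler logic are illustrative only):

const http = require('http')
const loopbench = require('loopbench')()

const server = http.createServer((req, res) => {
  if (loopbench.overLimit) {
    // the event loop is lagging too much - shed the load
    res.statusCode = 503
    res.setHeader('Retry-After', 10)
    return res.end('Service Unavailable')
  }
  res.end('OK')
})

server.listen(3000)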

Once you know about the issue and understand it, you can start working on fixing it - you can do it either by leveraging Node.js streams or by tweaking the architecture you are using.

Takeaways from nearForm

  • Always pay attention to CPU bound operations - the more you have, the more pressure you put on your event loop.
  • String operations are CPU-heavy operations.

Debugging Node.js Issues in Production

I hope these examples from Netflix, RisingStack & nearForm will help you to debug your Node.js apps in Production.

If you'd like to learn more, I recommend checking out these recent posts which will help you to deepen your Node knowledge:

If you have any questions, please let us know in the comments!

Yarn vs npm - The State of Node.js Package Managers

With the v7.4 release, npm 4 became the bundled, default package manager for Node.js. In the meantime, Facebook released their own package manager solution, called Yarn.

Let's take a look at the state of Node.js package managers, what they can do for you, and when you should pick which one!

Yarn - the new kid on the block

Fast, reliable and secure dependency management - this is the promise of Yarn, the new dependency manager created by the engineers of Facebook.

But can Yarn live up to the expectations?

Yarn - the node.js package manager

Installing Yarn

There are several ways of installing Yarn. If you have npm installed, you can just install Yarn with npm:

npm install yarn --global  

However, the recommended way by the Yarn team is to install it via your native OS package manager - if you are on a Mac, it will probably be brew:

brew update  
brew install yarn  

Yarn Under the Hood

Yarn has a lot of performance and security improvements under the hood. Let's see what these are!

Offline cache

When you install a package using Yarn (using yarn add packagename), it places the package on your disk. During the next install, this package will be used instead of sending an HTTP request to get the tarball from the registry.

Your cached module will be put into ~/.yarn-cache, prefixed with the registry name and postfixed with the module's version.

This means that if you install the 4.4.5 version of express with Yarn, it will be put into ~/.yarn-cache/npm-express-4.4.5.


Deterministic Installs

Yarn uses lockfiles (yarn.lock) and a deterministic install algorithm. We can say goodbye to the "but it works on my machine" bugs.

The lockfile looks something like this:

Yarn lockfile

It contains the exact version numbers of all your dependencies - just like with an npm shrinkwrap file.

License checks

Yarn comes with a handy license checker, which can become really powerful in case you have to check the licenses of all the modules you depend on.

yarn licenses

Potential issues/questions

Yarn is still in its early days, so it’s no surprise that there are some questions arising when you start using it.

What’s going on with the default registry?

By default, the Yarn CLI uses a different registry, and not the original one: https://registry.yarnpkg.com. So far there is no explanation on why it does not use the same registry.

Does Facebook have plans to make incompatible API changes and split the community?

Contributing back to npm?

One of the most logical questions that can come up when talking about Yarn is: why don't they talk with the CLI team at npm and work together?

If the problem is speed, I am sure all npm users would like to get those improvements as well.

When it comes to deterministic installs: instead of coming up with a new lockfile format, the existing npm-shrinkwrap.json could have been fixed.

Why the strange versioning?

In the world of Node.js and npm, versions start at 1.0.0.

At the time of writing this article, Yarn is at 0.18.1.

Is something missing to make Yarn stable? Does Yarn simply not follow semver?

npm 4

npm is the default package manager we all know, and it has been bundled with each Node.js release since v7.4.

Updating npm

To start using npm version 4, you just have to update your current CLI version:

npm install npm -g  

At the time of writing this article, this command will install npm version 4.1.1, which was released on 12/11/2016. Let's see what changed in this version!

Changes since version 3

  • npm search is now reimplemented to stream results, and sorting is no longer supported,
  • npm scripts no longer prepend the path of the node executable used to run npm before running scripts,
  • prepublish has been deprecated - you should use prepare from now on,
  • npm outdated now exits with return code 1 if it finds outdated packages,
  • partial shrinkwraps are no longer supported - the npm-shrinkwrap.json is considered a complete manifest,
  • Node.js 0.10 and 0.12 are no longer supported,
  • a new npm doctor command was added, which diagnoses the user's environment and recommends solutions for potential npm-related problems.

As you can see, the team at npm was quite busy as well - both npm and Yarn made great progress in the past months.

Conclusion

It is great to see a new, open-source npm client - no doubt, a lot of effort went into making Yarn great!

Hopefully, we will see the improvements of Yarn incorporated into npm as well, so users of both will benefit from each other's improvements.

Yarn vs. npm - Which one to pick?

If you are working on proprietary software, it does not really matter which one you use. With npm, you can use npm-shrinkwrap.json, while with Yarn you can use yarn.lock.

The team at Yarn published a great article on why lockfiles should be committed all the time, I recommend checking it out: https://yarnpkg.com/blog/2016/11/24/lockfiles-for-all


Node.js Interview Questions and Answers (2017 Edition)

Two years ago we published our first article on common Node.js Interview Questions and Answers. Since then a lot of things improved in the JavaScript and Node.js ecosystem, so it was time to update it.

Important Disclaimers

It is never a good practice to judge someone just by questions like these, but these can give you an overview of the person's experience in Node.js.

But obviously, these questions do not give you the big picture of someone's mindset and thinking.

I think that a real-life problem can show a lot more of a candidate's knowledge - so we encourage you to do pair programming with the developers you are going to hire.

Finally and most importantly: we are all humans, so make your hiring process as welcoming as possible. These questions are not meant to be used as "Questions & Answers" but just to drive the conversation.

Node.js Interview Questions for 2017

  • What is an error-first callback?
  • How can you avoid callback hells?
  • What are Promises?
  • What tools can be used to assure consistent style? Why is it important?
  • When should you use npm and when Yarn?
  • What's a stub? Name a use case!
  • What's a test pyramid? Give an example!
  • What's your favorite HTTP framework and why?
  • How can you secure your HTTP cookies against XSS attacks?
  • How can you make sure your dependencies are safe?

The Answers

What is an error-first callback?

Error-first callbacks are used to pass errors and data. The error is the first argument, and you have to check it to see if something went wrong. Additional arguments are used to pass data.

fs.readFile(filePath, function(err, data) {  
  if (err) {
    // handle the error, the return is important here
    // so execution stops here
    return console.log(err)
  }
  // use the data object
  console.log(data)
})

How can you avoid callback hells?

There are lots of ways to solve the issue of callback hells: modularization (breaking callbacks into small, named functions), using a control flow library like async, using generators with co, or using Promises and async/await.

What are Promises?

Promises are a concurrency primitive, first described in the 80s. Now they are part of most modern programming languages to make your life easier. Promises can help you better handle async operations.

An example can be the following snippet, which after 100ms prints out the result string to the standard output. Also, note the catch, which can be used for error handling. Promises are chainable.

new Promise((resolve, reject) => {  
  setTimeout(() => {
    resolve('result')
  }, 100)
})
  .then(console.log)
  .catch(console.error)

What tools can be used to assure consistent style? Why is it important?

When working in a team, consistent style is important, so team members can modify more projects easily, without having to get used to a new style each time.

Also, it can help eliminate programming issues using static analysis.

Tools that can help: linters like ESLint, or preset styles like the JavaScript Standard Style.

If you’d like to be even more confident, I suggest you learn and embrace the JavaScript Clean Coding principles as well!


What's a stub? Name a use case!

Stubs are functions/programs that simulate the behaviors of components/modules. Stubs provide canned answers to function calls made during test cases.

An example can be writing a file, without actually doing so.

var fs = require('fs')
var sinon = require('sinon')

// stub fs.writeFile so no file is actually written to disk
var writeFileStub = sinon.stub(fs, 'writeFile', function (path, data, cb) {
  return cb(null)
})

expect(writeFileStub).to.be.called  // expect comes from an assertion library like chai with sinon-chai
writeFileStub.restore()

What's a test pyramid? Give an example!

A test pyramid describes the ratio of how many unit tests, integration tests and end-to-end tests you should write.

An example for an HTTP API may look like this:

  • lots of low-level unit tests for models (dependencies are stubbed),
  • fewer integration tests, where you check how your models interact with each other (dependencies are not stubbed),
  • even fewer end-to-end tests, where you call your actual endpoints (dependencies are not stubbed).

What's your favorite HTTP framework and why?

There is no right answer for this. The goal here is to understand how deeply the candidate knows the framework she/he uses, and whether she/he can explain the pros and cons of picking it.

When are background/worker processes useful? How can you handle worker tasks?

Worker processes are extremely useful if you'd like to do data processing in the background, like sending out emails or processing images.

There are lots of options for this like RabbitMQ or Kafka.

How can you secure your HTTP cookies against XSS attacks?

XSS occurs when the attacker injects executable JavaScript code into the HTML response.

To mitigate these attacks, you have to set flags on the set-cookie HTTP header:

  • HttpOnly - this attribute is used to help prevent attacks such as cross-site scripting since it does not allow the cookie to be accessed via JavaScript.
  • secure - this attribute tells the browser to only send the cookie if the request is being sent over HTTPS.

So it would look something like this: Set-Cookie: sid=<cookie-value>; HttpOnly. If you are using Express with cookie-session, this is handled by default.
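
A minimal sketch, assuming the cookie-session middleware for Express (the session name and signing key are placeholders):

const express = require('express')
const cookieSession = require('cookie-session')

const app = express()

app.use(cookieSession({
  name: 'sid',
  keys: ['a-long-random-secret'], // placeholder signing key
  httpOnly: true,                 // not accessible from JavaScript
  secure: true                    // only sent over HTTPS
}))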

How can you make sure your dependencies are safe?

When writing Node.js applications, ending up with hundreds or even thousands of dependencies can easily happen.
For example, if you depend on Express, you depend on 27 other modules directly, and of course on those dependencies' dependencies as well - so manually checking all of them is not an option!

The only option is to automate the update / security audit of your dependencies. For that, there are free and paid options, like the Node Security Platform (nsp), Snyk or Greenkeeper.

Node.js Interview Puzzles

The following part of the article is useful if you’d like to prepare for an interview that involves puzzles, or tricky questions.

What's wrong with the code snippet?

new Promise((resolve, reject) => {  
  throw new Error('error')
}).then(console.log)

The Solution

There is no catch after the then, so the error will be a silent one - there will be no indication that an error was thrown.

To fix it, you can do the following:

new Promise((resolve, reject) => {  
  throw new Error('error')
}).then(console.log).catch(console.error)

If you have to debug a huge codebase, and you don't know which Promise can potentially hide an issue, you can use the unhandledRejection hook. It will print out all unhandled Promise rejections.

process.on('unhandledRejection', (err) => {  
  console.log(err)
})

What's wrong with the following code snippet?

function checkApiKey (apiKeyFromDb, apiKeyReceived) {  
  if (apiKeyFromDb === apiKeyReceived) {
    return true
  }
  return false
}

The Solution

When you compare security credentials it is crucial that you don't leak any information, so you have to make sure that you compare them in fixed time. If you fail to do so, your application will be vulnerable to timing attacks.

But why does it work like that?

V8, the JavaScript engine used by Node.js, tries to optimize the code you run from a performance point of view. It starts comparing the strings character by character, and once a mismatch is found, it stops the comparison operation. So the more characters the attacker gets right at the beginning of the secret, the more time the comparison takes.

To solve this issue, you can use the npm module called cryptiles.

const cryptiles = require('cryptiles')

function checkApiKey (apiKeyFromDb, apiKeyReceived) {
  return cryptiles.fixedTimeComparison(apiKeyFromDb, apiKeyReceived)
}
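
Since Node.js 6.6, a similar fixed-time comparison is also available in core as crypto.timingSafeEqual - a minimal sketch:

const crypto = require('crypto')

function checkApiKey (apiKeyFromDb, apiKeyReceived) {
  const expected = Buffer.from(apiKeyFromDb)
  const received = Buffer.from(apiKeyReceived)
  // timingSafeEqual throws if the buffers have different lengths,
  // so check the lengths first
  return expected.length === received.length &&
    crypto.timingSafeEqual(expected, received)
}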

What's the output of following code snippet?

Promise.resolve(1)  
  .then((x) => x + 1)
  .then((x) => { throw new Error('My Error') })
  .catch(() => 1)
  .then((x) => x + 1)
  .then((x) => console.log(x))
  .catch(console.error)

The Answer

The short answer is 2 - however with this question I'd recommend asking the candidates to explain what will happen line-by-line to understand how they think. It should be something like this:

  1. A new Promise is created, that will resolve to 1.
  2. The resolved value is incremented with 1 (so it is 2 now), and returned instantly.
  3. The resolved value is discarded, and an error is thrown.
  4. The error is discarded, and a new value (1) is returned.
  5. Execution did not stop after the catch - since the exception was already handled, the chain continued, and a new, incremented value (2) is returned.
  6. The value is printed to the standard output.
  7. This line won't run, as there was no exception.

A day may work better than questions

Spending at least half a day with your possible next hire is worth more than a thousand of these questions.

Once you do that, you will better understand if the candidate is a good cultural fit for the company and has the right skill set for the job.

Do you miss anything? Let us know!

What was the craziest interview question you had to answer? What's your favorite question / puzzle to ask? Let us know in the comments! :)


Node.js Best Practices - How to Become a Better Developer in 2017

A year ago we wrote a post on How to Become a Better Node.js Developer in 2016 which was a huge success - so we thought now it is time to revisit the topics and prepare for 2017!

In this article, we will go through the most important Node.js best practices for 2017, topics that you should care about and educate yourself in. Let’s start!

Node.js Best Practices for 2017

Use ES2015

Last year we advised you to use ES2015 - however, a lot has changed since.

Back then, Node.js v4 was the LTS version, and it had support for 57% of the ES2015 functionality. A year passed and ES2015 support grew to 99% with Node v6.

If you are on the latest Node.js LTS version, you don't need babel anymore to use the whole ES2015 feature set. But even with this said, on the client side you’ll probably still need it!

For more information on which Node.js version supports which ES2015 features, I'd recommend checking out node.green.

Use Promises

Promises are a concurrency primitive, first described in the 80s. Now they are part of most modern programming languages to make your life easier.

Imagine the following example code that reads a file, parses it, and prints the name of the package. Using callbacks, it would look something like this:

const fs = require('fs')

fs.readFile('./package.json', 'utf-8', function (err, data) {
  if (err) {
    return console.log(err)
  }

  let pkg
  try {
    pkg = JSON.parse(data)
  } catch (ex) {
    return console.log(ex)
  }
  console.log(pkg.name)
})

Wouldn't it be nice to rewrite the snippet into something more readable? Promises help you with that:

fs.readFileAsync('./package.json').then(JSON.parse).then((data) => {  
  console.log(data.name)
})
.catch((e) => {
  console.error('error reading/parsing file', e)
})

Of course, for now, the fs API does not have a readFileAsync method that returns a Promise. To make it work, you have to wrap it with a promisification helper like bluebird's promisifyAll.
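
A minimal sketch of that wrapping, assuming the bluebird library:

const Promise = require('bluebird')
const fs = Promise.promisifyAll(require('fs'))

// promisifyAll adds an *Async, Promise-returning variant of each method
fs.readFileAsync('./package.json', 'utf-8')
  .then(JSON.parse)
  .then((pkg) => console.log(pkg.name))
  .catch((e) => console.error('error reading/parsing file', e))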

Use the JavaScript Standard Style

When it comes to code style, it is crucial to have a company-wide standard, so when you have to change projects, you can be productive starting from day zero, without having to get used to a new ruleset because of different presets.

At RisingStack we have incorporated the JavaScript Standard Style in all of our projects.

Node.js best practices - The Standard JS Logo

With Standard, there are no decisions to make and no .eslintrc, .jshintrc, or .jscsrc files to manage. It just works. You can find the Standard rules here.




Use Docker - Containers are Production Ready in 2017!

You can think of Docker images as deployment artifacts - Docker containers wrap up a piece of software in a complete filesystem that contains everything it needs to run: code, runtime, system tools, system libraries – anything you can install on a server.

But why should you start using Docker?

  • it enables you to run your applications in isolation,
  • as a consequence, it makes your deployments more secure,
  • Docker images are lightweight,
  • they enable immutable deployments,
  • and with them, you can mirror production environments locally.

To get started with Docker, head over to the official getting started tutorial. Also, for orchestration we recommend checking out our Kubernetes best practices article.

Monitor your Applications

If something breaks in your Node.js application, you should be the first one to know about it, not your customers.

One of the newer open-source solutions that can help you achieve this is Prometheus, an open-source systems monitoring and alerting toolkit originally built at SoundCloud. The only downside of Prometheus is that you have to set it up and host it yourself.

If you are looking for an out-of-the-box solution with support, Trace by RisingStack is a great solution developed by us.

Trace will help you with

  • alerting,
  • memory and CPU profiling in production systems,
  • distributed tracing and error searching,
  • performance monitoring,
  • and keeping your npm packages secure!

Node.js Best Practices for 2017 - Use Trace and Profling

Use Messaging for Background Processes

If you are using HTTP for sending messages, then whenever the receiving party is down, all your messages are lost. However, if you pick a persistent transport layer, like a message queue to send messages, you won't have this problem.

If the receiving service is down, the messages will be kept, and can be processed later. If the service is not down, but there is an issue, processing can be retried, so no data gets lost.

An example: you'd like to send out thousands of emails. In this case, you would just have to put some basic information on the queue, like the target email address and the first name, and a background worker could easily put together the email's content and send the emails out.

What's really great about this approach is that you can scale it whenever you want, and no traffic will be lost. If you see that there are millions of emails to be sent out, you can add extra workers, and they can consume the very same queue.

You have lots of options for message queues, like RabbitMQ, Kafka, NSQ or AWS SQS.
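
As an illustrative sketch (the queue name and message format are made up), a background email worker consuming jobs from RabbitMQ with the amqplib client could look like this:

const amqp = require('amqplib')

amqp.connect('amqp://localhost')
  .then((connection) => connection.createChannel())
  .then((channel) => {
    const queue = 'emails'
    return channel.assertQueue(queue, { durable: true })
      .then(() => {
        channel.consume(queue, (message) => {
          const { email, firstName } = JSON.parse(message.content.toString())
          // put together and send the email here...
          console.log(`sending email to ${firstName} <${email}>`)
          // ...then acknowledge the message, so it is removed from the queue
          channel.ack(message)
        })
      })
  })
  .catch((err) => console.error(err))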

Use the Latest LTS Node.js version

To get the best of both worlds (stability and new features), we recommend using the latest LTS (long-term support) version of Node.js. As of writing this article, it is version 6.9.2.

To easily switch Node.js versions, you can use nvm. Once you've installed it, switching to the LTS version takes only two commands:

nvm install 6.9.2  
nvm use 6.9.2  


Use Semantic Versioning

We conducted a Node.js Developer Survey a few months ago, which allowed us to get some insights on how people use semantic versioning.

Unfortunately, we found out that only 71% of our respondents use semantic versioning when publishing/consuming modules. This number should be higher in our opinion - everyone should use it! Why? Because updating packages without semver can easily break Node.js apps.

Node.js Best Practices for 2017 - Semantic versioning survey results

Versioning your application / modules is critical - your consumers must know if a new version of a module is published and what needs to be done on their side to get the new version.

This is where semantic versioning comes into the picture. Given a version number MAJOR.MINOR.PATCH, increment the:

  • MAJOR version when you make incompatible API changes,
  • MINOR version when you add functionality (without breaking the API), and
  • PATCH version when you make backwards-compatible bug fixes.

npm also uses SemVer when installing your dependencies, so when you publish modules, always make sure to respect it. Otherwise, you can break other people's applications!
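
To see how the usual version ranges behave, here is a small sketch using the semver module (the version numbers are arbitrary examples):

const semver = require('semver')

// a caret range accepts new minor and patch versions...
console.log(semver.satisfies('4.15.2', '^4.4.5')) // true

// ...but not a new major version, since that may contain breaking changes
console.log(semver.satisfies('5.0.0', '^4.4.5'))  // false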

Secure Your Applications

Securing your users' and customers' data should be one of your top priorities in 2017. In 2016 alone, hundreds of millions of user accounts were compromised as a result of poor security.

To get started with Node.js Security, read our Node.js Security Checklist, which covers topics like:

  • Security HTTP Headers,
  • Brute Force Protection,
  • Session Management,
  • Insecure Dependencies,
  • or Data Validation.

After you’ve embraced the basics, check out my Node Interactive talk on Surviving Web Security with Node.js!

Learn Serverless

Serverless started with the introduction of AWS Lambda. Since then, it has been growing fast, with a blooming open-source community.

In the next years, serverless will become a major factor for building new applications. If you'd like to stay on the edge, you should start learning it today.

One of the most popular solutions is the Serverless Framework, which helps in deploying AWS Lambda functions.

Attend and Speak at Conferences and Meetups

Attending conferences and meetups are great ways to learn about new trends, use-cases or best practices. Also, it is a great forum to meet new people.

To take it one step forward, I'd like to encourage you to speak at one of these events as well!

As public speaking is tough, and “imagine everyone's naked” is the worst advice, I'd recommend checking out speaking.io for tips on public speaking!

Become a better Node.js developer in 2017

As 2017 will be the year of Node.js, we’d like to help you get the most out of it!

We just launched a new study program called "Owning Node.js" which helps you to become confident in:

  • Async Programming with Node.js
  • Creating servers with Express
  • Using Databases with Node
  • Project Structuring and building scalable apps



If you have any questions about the article, find me in the comments section!

The 10 Most Important Node.js Articles of 2016

2016 was an exciting year for Node.js developers. I mean - just take a look at this picture:

Every Industry has adopted Node.js

Looking back through the 6-year-long history of Node.js, we can tell that our favorite framework has finally matured to be used by the greatest enterprises, from all around the world, in basically every industry.

Another piece of great news is that Node.js is the biggest open source platform ever - with 15 million+ downloads/month and more than a billion package downloads/week. Contributions have also risen, with more than 1,100 developers having built Node.js into the platform it is today.

To summarize this year, we collected the 10 most important articles we recommend to read. These include the biggest scandals, events, and improvements surrounding Node.js in 2016.

Let's get started!

#1: How one developer broke Node, Babel and thousands of projects in 11 lines of JavaScript

Programmers were shocked looking at broken builds and failed installations after Azer Koçulu unpublished more than 250 of his modules from NPM in March 2016 - breaking thousands of modules, including Node and Babel.

Koçulu deleted his code because one of his modules was called Kik - same as the instant messaging app - so the lawyers of Kik claimed brand infringement, and then NPM took the module away from him.

"This situation made me realize that NPM is someone’s private land where corporate is more powerful than the people, and I do open source because Power To The People." - Azer Koçulu

One of Azer's modules was left-pad, which padded out the lefthand side of strings with zeroes or spaces. Unfortunately, thousands of modules depended on it.

You can read the rest of this story in The Register's great article, with updates on the outcome of this event.

#2: Facebook partners with Google to launch a new JavaScript package manager

In October 2016, Facebook & Google launched Yarn, a new package manager for JavaScript.

The reason? There were a couple of fundamental problems with npm for Facebook's workflow.

  • At Facebook’s scale npm didn’t quite work well.
  • npm slowed down the company’s continuous integration workflow.
  • Checking all of the modules into a repository was also inefficient.
  • npm is, by design, nondeterministic — yet Facebook’s engineers needed a consistent and reliable system for their DevOps workflow.

Instead of hacking around npm’s limitations, Facebook wrote Yarn from scratch:

  • Yarn does a better job at caching files locally.
  • Yarn is also able to parallelize some of its operations, which speeds up the install process for new modules.
  • Yarn uses lockfiles and a deterministic install algorithm to create consistent file structures across machines.
  • For security reasons, Yarn does not allow developers who write packages to execute other code that’s needed as part of the install process.

Yarn, which promises to even give developers that don’t work at Facebook’s scale a major performance boost, still uses the npm registry and is essentially a drop-in replacement for the npm client.

You can read the full article with the details on TechCrunch.

#3: Debugging Node.js with Chrome DevTools

New support for Node.js debuggability landed in Node.js master in May.

To use the new debugging tool, you have to

  • nvm install node
  • Run Node with the inspect flag: node --inspect index.js
  • Open the provided URL you got, starting with “chrome-devtools://..”

Read the great tutorial from Paul Irish to get all the features and details right!

#4: How I built an app with 500,000 users in 5 days on a $100 server

Jonathan Zarra, the creator of GoChat for Pokémon GO reached 1 million users in 5 days. Zarra had a hard time paying for the servers (around $4,000 / month) that were necessary to host 1M active users.

He never expected to get this many users. He built the app as an MVP, planning to care about scalability later - in effect, he built it to fail.

Zarra was already talking to VCs to grow and monetize his app, still thinking he could deal with scalability later.

He was wrong.

Thanks to its poor design, GoChat was unable to scale to that many users and went down. A lot of users were lost, and a lot of money was spent.

500,000 users in 5 days on $100/month server

Erik Duindam, the CTO of Unboxd, has been designing and building web platforms for hundreds of millions of active users throughout his career.

Frustrated by the poor design and sad fate of Zarra's GoChat, Erik decided to build his own solution, GoSnaps: The Instagram/Snapchat for Pokémon GO.

Erik was able to build a scalable MVP with Node.js in 24 hours, which could easily handle 500k unique users.

The whole setup ran on one medium Google Cloud server of $100/month, plus (cheap) Google Cloud Storage for the storage of images - and it was still able to perform exceptionally well.

GoSnap - The Node.js MVP that can Scale

How did he do it? Well, you can read the full story for the technical details:


#5: Getting Started with Node.js - The Node Hero Tutorial Series

The aim of the Node Hero tutorial series is to help novice developers get started with Node.js and deliver software products with it!

Node Hero - Getting started with Node.js

You can find the full table of contents below:

  1. Getting started with Node.js
  2. Using NPM
  3. Understanding async programming
  4. Your first Node.js HTTP server
  5. Node.js database tutorial
  6. Node.js request module tutorial
  7. Node.js project structure tutorial
  8. Node.js authentication using Passport.js
  9. Node.js unit testing tutorial
  10. Debugging Node.js applications
  11. Node.js Security Tutorial
  12. Deploying Node.js application to a PaaS
  13. Monitoring Node.js Applications

#6: Using RabbitMQ & AMQP for Distributed Work Queues in Node.js

This tutorial helps you to use RabbitMQ to coordinate work between work producers and work consumers.

Unlike Redis, RabbitMQ's sole purpose is to provide a reliable and scalable messaging solution with many features that are not present or hard to implement in Redis.

RabbitMQ is a server that runs locally, or in some node on the network. The clients can be work producers, work consumers or both, and they will talk to the server using a protocol named Advanced Message Queuing Protocol (AMQP).

You can read the full tutorial here.

#7: Node.js, TC-39, and Modules

James M Snell, IBM Technical Lead for Node.js, attended his first TC-39 meeting in late September.

The reason?

One of the newer JavaScript language features defined by TC-39 — namely, Modules — has been causing the Node.js core team a bit of trouble.

James and Bradley Farias (@bradleymeck) have been trying to figure out how to best implement support for ECMAScript Modules (ESM) in Node.js without causing more trouble and confusion than it would be worth.

ECMAScript modules vs. CommonJS

Because of the complexity of the issues involved, sitting down face to face with the members of TC-39 was deemed to be the most productive path forward.

The full article discusses what they found and understood from this conversation.

#8: The Node.js Developer Survey & its Results

We at Trace by RisingStack conducted a survey during 2016 Summer to find out how developers use Node.js.

The results show that MongoDB, RabbitMQ, AWS, Jenkins, Docker and Amazon Container Services are the go-to choices for developing, containerizing and shipping Node.js applications.

The results also reveal Node.js developers' major pain point: debugging.

Node.js Survey - How do you identify issues in your app? Using logs.

You can read the full article with the Node.js survey results and graphs here.

#9: The Node.js Foundation Pledges to Manage Node Security Issues with New Collaborative Effort

The Node.js Foundation announced at Node.js Interactive North America that it will oversee the Node.js Security Project, which was founded by Adam Baldwin and previously managed by ^Lift.

As part of the Node.js Foundation, the Node.js Security Project will provide a unified process for discovering and disclosing security vulnerabilities found in the Node.js module ecosystem. Governance for the project will come from a working group within the foundation.

The Node.js Foundation will take over the following responsibilities from ^Lift:

  • Maintaining an entry point for ecosystem vulnerability disclosure;
  • Maintaining a private communication channel for vulnerabilities to be vetted;
  • Vetting participants in the private security disclosure group;
  • Facilitating ongoing research and testing of security data;
  • Owning and publishing the base dataset of disclosures, and
  • Defining a standard for the data, which tool vendors can build on top of, and to which security vendors can add data and value as well.

You can read the full article discussing every detail on The New Stack.

#10: The Node.js Maturity Checklist

The Node.js Maturity Checklist gives you a starting point to understand how well Node.js is adopted in your company.

The checklist follows your adoption through establishing company culture, teaching your employees, setting up your infrastructure, writing code and running the application.

You can find the full Node.js Maturity Checklist here.


JavaScript Clean Coding Best Practices

Writing clean code is what you must know and do in order to call yourself a professional developer. There is no reasonable excuse for doing anything less than your best.

In this blog post, we will cover general clean coding principles for naming and using variables & functions, as well as some JavaScript specific clean coding best practices.

“Even bad code can function. But if the code isn’t clean, it can bring a development organization to its knees.” — Robert C. Martin (Uncle Bob)


Node.js at Scale is a collection of articles focusing on the needs of companies with bigger Node.js installations and advanced Node developers. Chapters:

First of all, what does clean coding mean?

Clean coding means that in the first place you write code for your later self and for your co-workers and not for the machine.

Your code must be easily understandable for humans.

"Write code for your later self and for your co-workers in the first place - not for the machine." via @RisingStack

Click To Tweet

You know you are working with clean code when every routine you read turns out to be pretty much what you expected.

JavaScript Clean Coding: The only valid measurement of code quality is WTFs/minute

JavaScript Clean Coding Best Practices

Now that we know what every developer should aim for, let’s go through the best practices!

How should I name my variables?

Use intention-revealing names, and don't worry if that means long variable names - clarity beats saving a few keystrokes.

If you follow this practice, your names become searchable, which helps a lot when you do refactors or you are just looking for something.

// DON'T
let d  
let elapsed  
const ages = arr.map((i) => i.age)

// DO
let daysSinceModification  
const agesOfUsers = users.map((user) => user.age)  

Also, make meaningful distinctions and don't add extra, unnecessary nouns to the variable names, like their type (Hungarian notation).

// DON'T
let nameString  
let theUsers

// DO
let name  
let users  

Make your variable names easy to pronounce, because it takes less effort for the human mind to process them.

When you are doing code reviews with your fellow developers, these names are easier to reference.

// DON'T
let fName, lName  
let cntr

let full = false  
if (cart.size > 100) {  
  full = true
}

// DO
let firstName, lastName  
let counter

const MAX_CART_SIZE = 100  
// ...
const isFull = cart.size > MAX_CART_SIZE  

In short, don't cause extra mental mapping with your names.

How should I write my functions?

Your functions should do one thing only on one level of abstraction.

Functions should do one thing. They should do it well. They should do it only. — Robert C. Martin (Uncle Bob)

// DON'T
function getUserRouteHandler (req, res) {  
  const { userId } = req.params
  // inline SQL query
  knex('user')
    .where({ id: userId })
    .first()
    .then((user) => res.json(user))
}

// DO
// User model (eg. models/user.js)
const tableName = 'user'  
const User = {  
  getOne (userId) {
    return knex(tableName)
      .where({ id: userId })
      .first()
  }
}

// route handler (eg. server/routes/user/get.js)
function getUserRouteHandler (req, res) {  
  const { userId } = req.params
  User.getOne(userId)
    .then((user) => res.json(user))
}

After you've written your functions properly, you can test how well you did with CPU profiling, which helps you find bottlenecks.


Use long, descriptive names

A function name should be a verb or a verb phrase, and it needs to communicate its intent, as well as the order and intent of the arguments.

A long descriptive name is way better than a short, enigmatic name or a long descriptive comment.

// DON'T
/**
 * Invite a new user with its email address
 * @param {String} user email address
 */
function inv (user) { /* implementation */ }

// DO
function inviteUser (emailAddress) { /* implementation */ }  


Avoid long argument lists

Use a single object parameter and destructuring assignment instead. It also makes handling optional parameters much easier.

// DON'T
function getRegisteredUsers (fields, include, fromDate, toDate) { /* implementation */ }  
getRegisteredUsers(['firstName', 'lastName', 'email'], ['invitedUsers'], '2016-09-26', '2016-12-13')

// DO
function getRegisteredUsers ({ fields, include, fromDate, toDate }) { /* implementation */ }  
getRegisteredUsers({  
  fields: ['firstName', 'lastName', 'email'],
  include: ['invitedUsers'],
  fromDate: '2016-09-26',
  toDate: '2016-12-13'
})

Reduce side effects

Use pure functions without side effects, whenever you can. They are really easy to use and test.

// DON'T
function addItemToCart (cart, item, quantity = 1) {  
  const alreadyInCart = cart.get(item.id) || 0
  cart.set(item.id, alreadyInCart + quantity)
  return cart
}

// DO
// not modifying the original cart
function addItemToCart (cart, item, quantity = 1) {  
  const cartCopy = new Map(cart)
  const alreadyInCart = cartCopy.get(item.id) || 0
  cartCopy.set(item.id, alreadyInCart + quantity)
  return cartCopy
}

// or by inverting the method location:
// you can expect that the original object will be mutated
// addItemToCart(cart, item, quantity) -> cart.addItem(item, quantity)
const cart = new Map()  
Object.assign(cart, {  
  addItem (item, quantity = 1) {
    const alreadyInCart = this.get(item.id) || 0
    this.set(item.id, alreadyInCart + quantity)
    return this
  }
})


Organize your functions in a file according to the stepdown rule

Higher level functions should be on top and lower level functions below, which makes the source code natural to read.

// DON'T
// "I need the full name for something..."
function getFullName (user) {  
  return `${user.firstName} ${user.lastName}`
}

function renderEmailTemplate (user) {  
  // "oh, here"
  const fullName = getFullName(user)
  return `Dear ${fullName}, ...`
}

// DO
function renderEmailTemplate (user) {  
  // "I need the full name of the user"
  const fullName = getFullName(user)
  return `Dear ${fullName}, ...`
}

// "I use this for the email template rendering"
function getFullName (user) {  
  return `${user.firstName} ${user.lastName}`
}


Query or modification

Functions should either do something (modify) or answer something (query), but not both.
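
A small illustrative sketch of the rule, following the DON'T/DO convention used above (the counter example is made up):

// DON'T - answers a question and modifies state at the same time
function getAndIncrementCounter (counter) {
  counter.value += 1
  return counter.value
}

// DO - separate the modification from the query
function incrementCounter (counter) {
  counter.value += 1
}

function getCounterValue (counter) {
  return counter.value
}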


Everyone likes to write JavaScript differently - what to do?

As JavaScript is dynamic and loosely typed, it is especially prone to programmer errors.

Use project- or company-wide linter rules and a formatting style.

The stricter the rules, the less effort will go into pointing out bad formatting in code reviews. It should cover things like consistent naming, indentation size, whitespace placement and even semicolons.

"The stricter the linter rules, the less effort needed to point out bad formatting in code reviews." by @RisingStack

Click To Tweet

The Standard JS style is quite nice to start with, but in my opinion, it isn't strict enough. I can agree with most of the rules in the Airbnb style.

How to write nice async code?

Use Promises whenever you can.

Promises are natively available from Node 4. Instead of writing nested callbacks, you can have chainable Promise calls.

// AVOID
asyncFunc1((err, result1) => {  
  asyncFunc2(result1, (err, result2) => {
    asyncFunc3(result2, (err, result3) => {
      console.log(result3)
    })
  })
})

// PREFER
asyncFuncPromise1()  
  .then(asyncFuncPromise2)
  .then(asyncFuncPromise3)
  .then((result) => console.log(result))
  .catch((err) => console.error(err))

Most of the libraries out there have both callback and Promise interfaces - prefer the latter. You can even convert callback APIs into Promise-based ones by wrapping them with packages like es6-promisify.

// AVOID
const fs = require('fs')

function readJSON (filePath, callback) {  
  fs.readFile(filePath, (err, data) => {
    if (err) {
      return callback(err)
    }

    try {
      callback(null, JSON.parse(data))
    } catch (ex) {
      callback(ex)
    }
  })
}

readJSON('./package.json', (err, pkg) => { console.log(err, pkg) })

// PREFER
const fs = require('fs')  
const promisify = require('es6-promisify')

const readFile = promisify(fs.readFile)  
function readJSON (filePath) {  
  return readFile(filePath)
    .then((data) => JSON.parse(data))
}

readJSON('./package.json')  
  .then((pkg) => console.log(pkg))
  .catch((err) => console.error(err))

The next step would be to use async/await (≥ Node 7) or generators with co (≥ Node 4) to achieve synchronous-like control flow for your asynchronous code.

const request = require('request-promise-native')

function getExtractFromWikipedia (title) {  
  return request({
    uri: 'https://en.wikipedia.org/w/api.php',
    qs: {
      titles: title,
      action: 'query',
      format: 'json',
      prop: 'extracts',
      exintro: true,
      explaintext: true
    },
    method: 'GET',
    json: true
  })
    .then((body) => Object.keys(body.query.pages).map((key) => body.query.pages[key].extract))
    .then((extracts) => extracts[0])
    .catch((err) => {
      console.error('getExtractFromWikipedia() error:', err)
      throw err
    })
} 

// PREFER
async function getExtractFromWikipedia (title) {  
  let body
  try {
    body = await request({ /* same parameters as above */ })
  } catch (err) {
    console.error('getExtractFromWikipedia() error:', err)
    throw err
  }

  const extracts = Object.keys(body.query.pages).map((key) => body.query.pages[key].extract)
  return extracts[0]
}

// or
const co = require('co')

const getExtractFromWikipedia = co.wrap(function * (title) {  
  let body
  try {
    body = yield request({ /* same parameters as above */ })
  } catch (err) {
    console.error('getExtractFromWikipedia() error:', err)
    throw err
  }

  const extracts = Object.keys(body.query.pages).map((key) => body.query.pages[key].extract)
  return extracts[0]
})

getExtractFromWikipedia('Robert Cecil Martin')  
  .then((robert) => console.log(robert))


How should I write performant code?

In the first place, you should write clean code, then use profiling to find performance bottlenecks.

Never try to write smart, performant code first; instead, optimize the code when you need to, and refer to true impact instead of micro-benchmarks.

"Write clean code first and optimize it when you need to. Refer to true impact instead of micro-benchmarks!"

Click To Tweet

There are, however, some straightforward scenarios - like eagerly initializing what you can (e.g. joi schemas used in route handlers, which add serious overhead if recreated for every request) and using asynchronous instead of blocking code.
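
A minimal sketch of the eager initialization case, assuming the joi validation library and an Express-style handler (the schema and handler names are illustrative):

const joi = require('joi')

// DO - build the schema once, when the module is loaded
const userSchema = joi.object({
  name: joi.string().required(),
  email: joi.string().email().required()
})

function createUserHandler (req, res) {
  // DON'T rebuild the schema inside the handler - that would add
  // the schema construction overhead to every single request
  const result = joi.validate(req.body, userSchema)
  if (result.error) {
    return res.status(400).json({ error: result.error.message })
  }
  res.json({ ok: true })
}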


Next up in Node.js at Scale

In the next episode of this series, we’ll discuss advanced Node.js async best practices and avoiding the callback hell!

If you have any questions regarding clean coding, don’t hesitate and let me know in the comments!


Node.js Tutorial Videos: Debugging, Async, Memory Leaks, CPU Profiling

At RisingStack, we're continuously working on delivering Node.js tutorials to help developers overcome their biggest obstacles, and become even better, week-by-week.

In our recent Node.js survey, we've been told that debugging, understanding/using async programming, handling callbacks and memory leaks are amongst the greatest pain points one faces on her/his journey to become a Node Hero.

This is why we came up with the idea of a new video tutorial series called Owning Node.js.

In this three-part video series, we're going through all of these topics in a detailed way - by showing and explaining the actual coding process to you.

All of the videos are captioned, so you'll have no problem with understanding what's going on by enabling the subtitles!

So, let's start Owning Node.js together!


Node.js Debugging Made Easy

In this very first video, I'm going to show you how to use the debug module, the built-in debugger, and Chrome DevTools to find and fix issues easily!


Node.js Async Programming Done Right

In the second Node.js tutorial video, I'm going to show you how you can handle asynchronous operations easily, and how you can build performant applications in Node.js using them!


So, we are going to take a look at error handling with asynchronous operations, and learn how you can use the async module to handle multiple callbacks at the same time.


CPU and Memory Profiling with Node.js

In the 3rd Node.js tutorial of the series, I teach you how to create CPU profiles and memory heapdumps, and how to analyze them in the Chrome DevTools profiler. You'll learn how to detect memory leaks and bottlenecks easily.


More Node.js tutorials: Announcing the Node Hero Program

I hope these videos made things clearer! If you'd like to keep getting better, I've got good news for you!

We're launching the NODE HERO program as of today, which contains further webinars and screencasts, live-coding sessions and access to our Node.js Debugging and Monitoring solution called Trace.

I highly recommend checking it out if you'd like to become an even better Node.js developer! See you there!


Node.js Garbage Collection Explained

In this article, you are going to learn how Node.js garbage collection works, what happens in the background when you write code and how memory is freed up for you.

Ancient garbage collector in action

With Node.js at Scale we are creating a collection of articles focusing on the needs of companies with bigger Node.js installations, and developers who already learned the basics of Node.

Memory Management in Node.js Applications

Every application needs memory to work properly. Memory management provides ways to dynamically allocate memory chunks for programs when they request it, and free them when they are no longer needed - so that they can be reused.

Application-level memory management can be manual or automatic. The automatic memory management usually involves a garbage collector.

The following code snippet shows how memory can be allocated in C, using manual memory management:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main() {

   char name[20];
   char *description;

   strcpy(name, "RisingStack");

   // memory allocation - 40 bytes is enough for the description and the terminating '\0'
   description = malloc( 40 * sizeof(char) );

   if( description == NULL ) {
      fprintf(stderr, "Error - unable to allocate required memory\n");
      return 1;
   }

   strcpy( description, "Trace by RisingStack is an APM.");

   printf("Company name = %s\n", name );
   printf("Description: %s\n", description );

   // release memory
   free(description);

   return 0;
}

In manual memory management, it is the responsibility of the developer to free up the unused memory portions. Managing your memory this way can introduce several major bugs to your applications:

  • Memory leaks when the used memory space is never freed up.
  • Wild/dangling pointers appear when an object is deleted, but the pointer is reused. Serious security issues can be introduced when other data structures are overwritten or sensitive information is read.

Luckily for you, Node.js comes with a garbage collector, and you don't need to manually manage memory allocation.


The Concept of the Garbage Collector

Garbage collection is a way of managing application memory automatically. The job of the garbage collector (GC) is to reclaim memory occupied by unused objects (garbage). It was first used in LISP in 1959, invented by John McCarthy.

The way the GC knows that objects are no longer in use is that no other reachable object holds a reference to them.

"A garbage collector was first used in LISP in 1959, invented by John McCarthy." via @RisingStack

Click To Tweet

Memory before the garbage collection

The following diagram shows how the memory can look if you have objects with references to each other, along with some objects that nothing references anymore. These are the objects that can be collected by a garbage collector run.

Memory state before Node.js garbage collection

Memory after the garbage collection

Once the garbage collector runs, the unreachable objects get deleted, and the memory space is freed up.

Memory state after Node.js garbage collection

The Advantages of Using a Garbage Collector

  • it prevents wild/dangling pointers bugs,
  • it won't try to free up space that was already freed up,
  • it will protect you from some types of memory leaks.

Of course, using a garbage collector doesn't solve all of your problems, and it’s not a silver bullet for memory management. Let's take a look at things that you should keep in mind!

"Using a garbage collector doesn't solve all of your memory management problems with #nodejs!" via @RisingStack

Click To Tweet


Things to Keep in Mind When Using a Garbage Collector

  • performance impact - in order to decide what can be freed up, the GC consumes computing power
  • unpredictable stalls - modern GC implementations try to avoid "stop-the-world" collections, but shorter pauses can still occur

Node.js Garbage Collection & Memory Management in Practice

The easiest way of learning is by doing - so I am going to show you what happens in the memory with different code snippets.

The Stack

The stack contains local variables and pointers to objects on the heap or pointers defining the control flow of the application.

In the following example, both a and b will be placed on the stack.

function add (a, b) {  
  return a + b
}

add(4, 5)  



Need help with enterprise-grade Node.js Development?
Hire the experts of RisingStack!


The Heap

The heap is dedicated to store reference type objects, like strings or objects.

The Car object created in the following snippet is placed on the heap.

function Car (opts) {  
  this.name = opts.name
}

const LightningMcQueen = new Car({name: 'Lightning McQueen'})  

After this, the memory would look something like this:

Node.js Garbage Collection First Step - Object Placed in the Memory Heap

Let's add more cars and see what our memory would look like!

function Car (opts) {  
  this.name = opts.name
}

const LightningMcQueen = new Car({name: 'Lightning McQueen'})  
const SallyCarrera = new Car({name: 'Sally Carrera'})  
const Mater = new Car({name: 'Mater'})  

Node.js Garbage Collection Second Step - More elements added to the heap

If the GC ran now, nothing could be freed up, as the root has a reference to every object.

Let's make it a little bit more interesting, and add some parts to our cars!

function Engine (power) {  
  this.power = power
}

function Car (opts) {  
  this.name = opts.name
  this.engine = new Engine(opts.power)
}

let LightningMcQueen = new Car({name: 'Lightning McQueen', power: 900})  
let SallyCarrera = new Car({name: 'Sally Carrera', power: 500})  
let Mater = new Car({name: 'Mater', power: 100})  

Node.js Garbage Collection - Assigning values to the objects in the heap

What would happen if we no longer used Mater, but reassigned it to some other value, like Mater = undefined?
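Continuing the previous snippet, that reassignment is just one line:

Mater = undefined // the original Mater object loses its last reference from the root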

Node.js Garbage Collection - Redefining values

As a result, the original Mater object cannot be reached from the root object, so on the next garbage collector run it will be freed up:

Node.js Garbage Collection - Freeing up the unreachable object

Now that we understand the expected behaviour of the garbage collector, let's take a look at how it is implemented in V8!

Garbage Collection Methods

In one of our previous articles we dealt with how the Node.js garbage collection methods work, so I strongly recommend reading that article.

Here are the most important things you’ll learn there:

New Space and Old Space

The heap has two main segments, the New Space and the Old Space. The New Space is where new allocations happen; it is fast to collect garbage here and it has a size of ~1-8 MB. Objects living in the New Space are called the Young Generation.

The Old Space is where the objects that survived collection in the New Space are promoted - they are called the Old Generation. Allocation in the Old Space is fast, but collection is expensive, so it is performed infrequently.

Young Generation

Usually, ~20% of the Young Generation survives into the Old Generation. Collection in the Old Space only commences once it is close to being exhausted. To do this, the V8 engine uses two different collection algorithms.

Scavenge and Mark-Sweep collection

Scavenge collection is fast and runs on the Young Generation, while the slower Mark-Sweep collection runs on the Old Generation.
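If you'd like to see these heap spaces in your own process, the built-in v8 module exposes them - here is a minimal sketch (the metric names come from V8, the rounding is just for readability):

const v8 = require('v8')

// list every heap space (new_space, old_space, code_space, ...) and how much of it is used
v8.getHeapSpaceStatistics().forEach(function (space) {
  console.log(space.space_name, (space.space_used_size / 1024 / 1024).toFixed(2) + ' MB used')
})

// overall memory usage of the process (rss, heapTotal, heapUsed, ...)
console.log(process.memoryUsage())

You can also start your process with node --trace-gc to make V8 log every scavenge and mark-sweep run to the console.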

A Real-Life Example - The Meteor Case-Study

In 2013, the creators of Meteor announced their findings about a memory leak they ran into. The problematic code snippet was the following:

var theThing = null  
var replaceThing = function () {  
  var originalThing = theThing
  var unused = function () {
    if (originalThing)
      console.log("hi")
  }
  theThing = {
    longStr: new Array(1000000).join('*'),
    someMethod: function () {
      console.log(someMessage)
    }
  };
};
setInterval(replaceThing, 1000)  

Well, the typical way that closures are implemented is that every function object has a link to a dictionary-style object representing its lexical scope. If both functions defined inside replaceThing actually used originalThing, it would be important that they both get the same object, even if originalThing gets assigned to over and over, so both functions share the same lexical environment. Now, Chrome's V8 JavaScript engine is apparently smart enough to keep variables out of the lexical environment if they aren't used by any closures - from the Meteor blog.
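One possible way to avoid this kind of leak - in line with the fix described in the Meteor post - is to clear the captured reference at the end of replaceThing, so the shared lexical scope no longer keeps the previous (huge) theThing alive. A minimal sketch of the fixed snippet:

var theThing = null
var replaceThing = function () {
  var originalThing = theThing
  var unused = function () {
    if (originalThing)
      console.log("hi")
  }
  theThing = {
    longStr: new Array(1000000).join('*'),
    someMethod: function () {
      console.log('someMessage')
    }
  };
  // clear the reference, so the closure no longer holds on to the previous theThing
  originalThing = null
};
setInterval(replaceThing, 1000)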


Next up

In the next chapter of the Node.js at Scale tutorial series we will take a deep dive into writing native Node.js modules.

In the meantime, let us know in the comments section if you have any questions!


Graceful shutdown with Node.js and Kubernetes

Graceful shutdown with Node.js and Kubernetes

This article helps you understand what graceful shutdown is, what its main benefits are, and how you can set up the graceful shutdown of a Kubernetes application. We’ll discuss how you can validate and benchmark this process, and what the most common mistakes are that you should avoid.

Graceful shutdown

We can speak about the graceful shutdown of our application when all of the resources it used and all of the traffic and/or data processing it handled are closed and released properly.

It means that no database connection remains open and no ongoing request fails because we stop our application.

"Graceful shutdown: when all of the resources & data processing are closed and released properly." via @RisingStack

Click To Tweet

Possible scenarios for a graceful web server shutdown:

  1. App gets notified to stop (receives SIGTERM)
  2. App lets the load balancer know that it’s not ready to receive further requests
  3. App serves all the ongoing requests
  4. App releases all of the resources correctly: DB, queue, etc.
  5. App exits with "success" status code (process.exit())

This article goes deep into shutting down web servers properly, but you should also apply these techniques to your worker processes: it’s highly recommended to stop consuming queues on SIGTERM and to finish the current task/job.
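For a worker process, a minimal sketch of this could look like the following - fetchJob and handleJob are hypothetical stand-ins for whatever queue client you use:

// stop picking up new jobs after SIGTERM, but finish the one in progress
let isShuttingDown = false

process.on('SIGTERM', function onSigterm () {
  isShuttingDown = true
})

function processNextJob () {
  if (isShuttingDown) {
    console.info('Shutting down, not accepting new jobs')
    return process.exit(0)
  }

  // fetchJob() and handleJob() stand in for your real queue client calls
  fetchJob()
    .then(handleJob)
    .then(processNextJob)
    .catch(function (err) {
      console.error(err)
      processNextJob()
    })
}

processNextJob()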

Why is it important?

If we don't stop our application correctly, we are wasting resources like DB connections and we may also break ongoing requests. An HTTP request doesn't recover automatically - if we fail to serve it, then we simply missed it.

"If we don't stop our app correctly, we're wasting resources & may also break ongoing requests." via @RisingStack

Click To Tweet

Graceful start

We should only start our application when all of the dependencies and database connections are ready to handle our traffic.

Possible scenarios for a graceful web server start:

  1. App starts (npm start)
  2. App opens DB connections
  3. App listens on port
  4. App tells the load balancer that it’s ready for requests
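Following the steps above, a minimal sketch of a graceful start could look like this - the ./db module and its connect() method are just assumptions for illustration:

const http = require('http')
const db = require('./db') // assumed helper that exposes a promise-returning connect()

const server = http.createServer(function handler (req, res) {
  res.end('ok')
})

// 1-2. open the DB connections first
db.connect()
  .then(function onDbReady () {
    // 3. start listening only when the dependencies are ready
    server.listen(3000, function onListen () {
      console.info('Ready for requests')
      // 4. from this point on the readiness endpoint can return 200
    })
  })
  .catch(function onDbError (err) {
    console.error('Could not connect to the database', err)
    process.exit(1)
  })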

Graceful shutdown in a Node.js application

First of all, you need to listen for the SIGTERM signal and catch it:

process.on('SIGTERM', function onSigterm () {  
  console.info('Got SIGTERM. Graceful shutdown start', new Date().toISOString())
  // start graceful shutdown here
  shutdown()
})

After that, you can close your server, then close your resources and exit the process:

function shutdown() {  
  server.close(function onServerClosed (err) {
    if (err) {
      console.error(err)
      process.exit(1)
    }

    closeMyResources(function onResourcesClosed (err) {
      // error handling
      process.exit()
    })
  })
}

Sounds easy right? Maybe a little bit too easy.

What about the load balancer? How will it know that your app is no longer ready to receive requests? What about keep-alive connections? Will they keep the server open for a longer time? What if my app gets SIGKILLed in the meantime?

Graceful shutdown with Kubernetes

If you’d like to learn a little bit about Kubernetes, you can read our Moving a Node.js app from PaaS to Kubernetes Tutorial. For now, let's just focus on the shutdown.

Kubernetes comes with a resource called Service. Its job is to route traffic to your pods (~instances of your app). Kubernetes also comes with a resource called Deployment that describes how your applications should behave during exit, scaling and deployment - and you can also define health checks here. We will combine these resources for the perfect graceful shutdown and handover during new deploys at high traffic.

We would like to see throughput charts like the one below, with consistent rpm and no deployment side effects at all:

Graceful shutdown example: Throughput time in Trace by RisingStack Throughput metrics shown in Trace - no change at deploy

Ok, let's see how to solve this challenge.

Setting up graceful shutdown

In Kubernetes, for a proper graceful shutdown we need to add a readinessProbe to our application’s Deployment yaml and let the Service’s load balancer know during shutdown that we won't serve more requests, so it should stop sending them. We can close the server, tear down the DB connections and exit only after that.

How does it work?

Kubernetes graceful shutdown flowchart

  1. pod receives SIGTERM signal because Kubernetes wants to stop it - because of a deploy, scaling, etc.
  2. App (pod) starts to return 500 for GET /health to let the readinessProbe (Service) know that it's not ready to receive more requests.
  3. Kubernetes readinessProbe checks GET /health and after (failureThreshold * periodSeconds) it stops redirecting traffic to the app (because it continuously returns 500)
  4. App waits (failureThreshold * periodSeconds) before it starts to shut down - to make sure that the Service is getting notified via the readinessProbe failures
  5. App starts graceful shutdown
  6. App first closes the server, while the DB connections are still live
  7. App closes the databases after the server is closed
  8. App exits the process
  9. Kubernetes force kills the application after 30s (SIGKILL) if it's still running (in an optimal case it doesn't happen)

In our case, the Kubernetes livenessProbe won't kill the app before the graceful shutdown happens, because it needs to wait (failureThreshold * periodSeconds) to do so. This means that the livenessProbe threshold should be larger than the readinessProbe threshold. This way the graceful stop happens around 4s after SIGTERM, while the force kill would only happen 30s after it.

Node.js Monitoring and Debugging from the Experts of RisingStack

Compare different application revisions using Trace!
Learn more

How to achieve it?

For this we need to do two things. First, we need to let the readinessProbe know after SIGTERM that we are not ready anymore:

'use strict'

const db = require('./db')  
const promiseTimeout = require('./promiseTimeout')  
const state = { isShutdown: false }  
const TIMEOUT_IN_MILLIS = 900

process.on('SIGTERM', function onSigterm () {  
  state.isShutdown = true
})

function get (req, res) {  
  // SIGTERM already happened
  // app is not ready to serve more requests
  if (state.isShutdown) {
    res.writeHead(500)
    return res.end('not ok')
  }

  // something cheap but tests the required resources
  // timeout because we would like to log before livenessProbe KILLS the process
  promiseTimeout(db.ping(), TIMEOUT_IN_MILLIS)
    .then(() => {
      // success health
      res.writeHead(200)
      return res.end('ok')
    })
    .catch(() => {
      // broken health
      res.writeHead(500)
      return res.end('not ok')
    })
}

module.exports = {  
  get: get
}

The second thing is that we have to delay the teardown process - as a sane default you can use the time needed for two failed readinessProbe checks: failureThreshold: 2 * periodSeconds: 2 = 4s

// failureThreshold: 2, periodSeconds: 2 => wait at least 2 * 2 = 4 seconds
const READINESS_PROBE_DELAY = 2 * 2 * 1000

process.on('SIGTERM', function onSigterm () {
  console.info('Got SIGTERM. Graceful shutdown start', new Date().toISOString())

  // Wait a little bit to give enough time for the Kubernetes readiness probe to fail
  // (we are not ready to serve more traffic)
  // Don't worry, the livenessProbe won't kill it until (failureThreshold: 3) => 30s
  setTimeout(gracefulStop, READINESS_PROBE_DELAY) // gracefulStop: the shutdown function shown earlier
})

You can find the full example here:
https://github.com/RisingStack/kubernetes-graceful-shutdown-example

How to validate it?

Let's test our graceful shutdown by sending high traffic to our pods and releasing a new version in the meantime (recreating all of the pods).

Test case

$ ab -n 100000 -c 20 http://localhost:myport

Other than this, you need to change an environment variable in the Deployment to re-create all pods during the ab benchmarking.

AB output

Document Path:          /  
Document Length:        3 bytes

Concurrency Level:      20  
Time taken for tests:   172.476 seconds  
Complete requests:      100000  
Failed requests:        0  
Total transferred:      7800000 bytes  
HTML transferred:       300000 bytes  
Requests per second:    579.79 [#/sec] (mean)  
Time per request:       34.495 [ms] (mean)  
Time per request:       1.725 [ms] (mean, across all concurrent requests)  
Transfer rate:          44.16 [Kbytes/sec] received  

Application log output

Got SIGTERM. Graceful shutdown start 2016-10-16T18:54:59.208Z  
Request after sigterm: / 2016-10-16T18:54:59.217Z  
Request after sigterm: / 2016-10-16T18:54:59.261Z  
...
Request after sigterm: / 2016-10-16T18:55:00.064Z  
Request after sigterm: /health?type=readiness 2016-10-16T18:55:00.820Z  
HEALTH: NOT OK  
Request after sigterm: /health?type=readiness 2016-10-16T18:55:02.784Z  
HEALTH: NOT OK  
Request after sigterm: /health?type=liveness 2016-10-16T18:55:04.781Z  
HEALTH: NOT OK  
Request after sigterm: /health?type=readiness 2016-10-16T18:55:04.800Z  
HEALTH: NOT OK  
Server is shutting down... 2016-10-16T18:55:05.210Z  
Successful graceful shutdown 2016-10-16T18:55:05.212Z  

Benchmark result

Success!

Zero failed requests: you can see in the app log that the Service stopped sending traffic to the pod before we disconnected from the DB and killed the app.

Common gotchas

The following mistakes can still prevent your app from doing a proper graceful shutdown:

Keep-alive connections

Kubernetes doesn't hand over keep-alive connections properly. :/

This means that requests from agents with a keep-alive header will still be routed to the pod.

This tricked me at first when I benchmarked with autocannon or Google Chrome (they use keep-alive connections).

Keep-alive connections prevent closing your server in time. To force the exit of a process, you can use the server-destroy module. Once it has run, you can be sure that all the ongoing requests are served. Alternatively, you can add timeout logic to your server.close(cb).
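A minimal sketch of wiring in server-destroy, based on the module's documented enableDestroy / server.destroy() API:

const http = require('http')
const enableDestroy = require('server-destroy')

const server = http.createServer(function (req, res) {
  res.end('ok')
})

enableDestroy(server) // track the open sockets so they can be torn down later
server.listen(3000)

process.on('SIGTERM', function onSigterm () {
  // destroy() closes the server and terminates the lingering keep-alive connections too
  server.destroy(function onDestroyed () {
    process.exit(0)
  })
})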

Docker signaling

It’s quite possible that your application doesn't receive signals correctly when it runs inside a Docker container.

For example in our Alpine image: CMD ["node", "src"] works, CMD ["npm", "start"] doesn't. It simply doesn't pass the SIGTERM to the node process. The issue is probably related to this PR: https://github.com/npm/npm/pull/10868

Alternatively, you can use dumb-init to fix broken Docker signaling.

Takeaway

Always make sure that your application stops correctly: it releases all of the resources and helps to hand over the traffic to the new version of your app.

Check out our example repository with Node.js and Kubernetes:
https://github.com/RisingStack/kubernetes-graceful-shutdown-example

"An app stops correctly if it releases all resources & hands over the traffic to your new app." via @RisingStack

Click To Tweet

If you have any questions or thoughts about this topic, find me in the comment section below!


How the module system, CommonJS & require works

How the module system, CommonJS & require works

In the third chapter of Node.js at Scale you are about to learn how the Node.js module system & CommonJS work, and what require does under the hood.

With Node.js at Scale we are creating a collection of articles focusing on the needs of companies with bigger Node.js installations, and developers who already learned the basics of Node.

CommonJS to the rescue

The JavaScript language didn’t have a native way of organizing code before the ES2015 standard. Node.js filled this gap with the CommonJS module format. In this article we will learn how the Node.js module system works, how you can organize your modules, and what the new ES standard means for the future of Node.js.

"#JavaScript didn't have a mature module system before #nodejs. That gap was filled with #commonjs" via @RisingStack

Click To Tweet

What is the module system?

Modules are the fundamental building blocks of the code structure. The module system allows you to organize your code, hide information and only expose the public interface of a component using module.exports. Every time you use the require call, you are loading another module.

The simplest example can be the following using CommonJS:

// add.js
function add (a, b) {  
  return a + b
}

module.exports = add  

To use the add module we have just created, we have to require it.

// index.js
const add = require('./add')

console.log(add(4, 5))  
//9

Under the hood, add.js is wrapped by Node.js this way:

(function (exports, require, module, __filename, __dirname) {
  function add (a, b) {
    return a + b
  }

  module.exports = add
})

This is why you can access the global-like variables like require and module. It also ensures that your variables are scoped to your module rather than the global object.

"Modules are the fundamental building blocks of the code structure." via @RisingStack #nodejs

Click To Tweet

How does require work?

The module loading mechanism in Node.js caches the modules on the first require call. This means that every time you use require('awesome-module') you will get the same instance of awesome-module, which ensures that the modules are singleton-like and have the same state across your application.

You can load native modules and path references from your file system, or installed modules. If the identifier passed to the require function is not a native module or a file reference (beginning with /, ../, ./ or similar), then Node.js will look for installed modules. It will walk your file system looking for the referenced module in the node_modules folders, starting from the directory of the current module and moving up through the parent directories until it finds the module or reaches the root of the file system.
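You can observe this caching behaviour yourself: requiring the same module twice gives back the very same object, and require.cache lets you inspect the cached entries (the counter module below is just for illustration):

// counter.js
let count = 0

module.exports = {
  increment: function () {
    return ++count
  }
}

// index.js
const counterA = require('./counter')
const counterB = require('./counter')

counterA.increment()
console.log(counterB.increment()) // 2 - both variables point to the same cached instance
console.log(counterA === counterB) // true

// the cache is keyed by the resolved file path
console.log(Object.keys(require.cache))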

Node.js Monitoring and Debugging from the Experts of RisingStack

Build performant applications using Trace
Learn more
Require under the hood - module.js

The module dealing with module loading in the Node core is called module.js, and can be found in lib/module.js in the Node.js repository.

The most important functions to check here are the _load and _compile functions.

Module._load

This function checks whether the module is in the cache already - if so, it returns the exports object.

If the module is native, it calls the NativeModule.require() with the filename and returns the result.

Otherwise, it creates a new module for the file and saves it to the cache. Then it loads the file contents before returning its exports object.

Module._compile

The compile function runs the file contents in the correct scope or sandbox, as well as exposing helper variables like require, module or exports to the file.

How require works in Node.js How Require Works - From James M. Snell

How to organize the code?

In our applications, we need to find the right balance of cohesion and coupling when creating modules. The desirable scenario is to achieve high cohesion and loose coupling of the modules.

A module must be focused only on a single part of the functionality to have high cohesion. Loose coupling means that the modules should not have a global or shared state. They should only communicate by passing parameters, and they are easily replaceable without touching your broader codebase.

"The desirable scenario is to achieve high cohesion and loose coupling of the modules." via @RisingStack #nodejs

Click To Tweet

We usually export named functions or constants in the following way:

'use strict'

const CONNECTION_LIMIT = 0

function connect () { /* ... */ }

module.exports = {  
  CONNECTION_LIMIT,
  connect
}

What’s in your node_modules?

The node_modules folder is the place where Node.js looks for modules. npm v2 and npm v3 install your dependencies differently. You can find out what version of npm you are using by executing:

npm --version  

npm v2

npm 2 installs all dependencies in a nested way: your primary package dependencies go into your node_modules folder, and each of them has its own node_modules folder containing its dependencies.

npm v3

npm3 attempts to flatten these secondary dependencies and install them in the root node_modules folder. This means that you can’t tell by looking at your node_modules which packages are your explicit or implicit dependencies. It is also possible that the installation order changes your folder structure because npm 3 is non-deterministic in this manner.


You can make sure that your node_modules directory is always the same by installing packages only from a package.json. In this case, it installs your dependencies in alphabetical order, which also means that you will get the same folder tree. This is important because the modules are cached using their path as the lookup key. Each package can have its own child node_modules folder, which might result in multiple instances of the same package and of the same module.

How to handle your modules?

There are two main ways for wiring modules. One of them is using hard coded dependencies, explicitly loading one module into another using a require call. The other method is to use a dependency injection pattern, where we pass the components as a parameter or we have a global container (known as IoC, or Inversion of Control container), which centralizes the management of the modules.

We can allow Node.js to manage the module life cycle by using hard coded module loading. It organizes your packages in an intuitive way, which makes understanding and debugging easy.

Dependency Injection is rarely used in a Node.js environment, although it is a useful concept. The DI pattern can result in an improved decoupling of the modules. Instead of explicitly defining dependencies for a module, they are received from the outside. Therefore they can be easily replaced with modules having the same interfaces.

Let’s see an example for DI modules using the factory pattern:

class Car {  
  constructor (options) {
    this.engine = options.engine
  }

  start () {
    this.engine.start()
  }
}

function create (options) {  
  return new Car(options)
}

module.exports = create  
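With this factory, the caller decides which engine implementation the car receives, which makes it easy to pass in a fake engine in a test, for example. The ./car file name and the engine objects below are just assumptions for illustration:

// index.js
const createCar = require('./car') // the factory module shown above

// any object with a start() method satisfies the Car's expectations
const realEngine = {
  start: function () {
    console.log('vroom')
  }
}

const car = createCar({ engine: realEngine })
car.start() // 'vroom'

// in a test, a fake engine can be injected without touching the Car module
const fakeEngine = {
  started: false,
  start: function () {
    fakeEngine.started = true
  }
}

const testCar = createCar({ engine: fakeEngine })
testCar.start()
console.log(fakeEngine.started) // true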

The ES2015 module system

As we saw above, the CommonJS module system uses a runtime evaluation of the modules, wrapping them into a function before the execution. The ES2015 modules don’t need to be wrapped since the import/export bindings are created before evaluating the module. This incompatibility is the reason why there is currently no JavaScript runtime that supports the ES modules. There was a lot of discussion about the topic and a proposal is in DRAFT state, so hopefully we will have support for it in future Node versions.
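Purely for comparison - as noted above, Node.js did not support it natively at the time of writing - this is what the earlier add module would look like with ES2015 module syntax:

// add.js (ES2015 module syntax)
export function add (a, b) {
  return a + b
}

// index.js
import { add } from './add'

console.log(add(4, 5)) // 9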

To read an in-depth explanation of the biggest differences between CommonJS and the ESM, read the following article by James M Snell.


Next up

I hope this article contained valuable information about the module system and how require works. If you have any questions or insights on the topic, please share them in the comments. In the next chapter of the Node.js at Scale series, we are going to take a deep dive and learn about the event loop.