Building a Node.js App with TypeScript Tutorial

This tutorial teaches you how to build, structure, test and debug a Node.js application written in TypeScript. To do so, we use an example project that you can access at any time.


Managing large-scale JavaScript projects can be challenging, as you need to guarantee that the pieces fit together. You can use unit tests, types (which JavaScript does not really have), or the two in combination to solve this issue.

This is where TypeScript comes into the picture. TypeScript is a typed superset of JavaScript that compiles to plain JavaScript.

In this article you will learn:

  • what TypeScript is,
  • what the benefits of using TypeScript are,
  • how you can set up a project to start developing using it:
    • how to add linters,
    • how to write tests,
    • how to debug applications written in TypeScript

This article won't go into the details of using the TypeScript language itself; it focuses on how you can build Node.js applications with it. If you are looking for an in-depth TypeScript tutorial, I recommend checking out the TypeScript Gitbook.

The benefits of using TypeScript

As we already discussed, TypeScript is a superset of JavaScript. It gives you the following benefits:

  • optional static typing, with emphasis on optional (it makes porting JavaScript applications to TypeScript easy),
  • as a developer, you can start using ECMAScript features that are not supported by the current V8 engine by using build targets,
  • use of interfaces,
  • great tooling with instruments like IntelliSense.

Getting started with TypeScript & Node

TypeScript is a static type checker for JavaScript. This means that it will check for issues in your codebase using the information available on different types. Example: a String will have a toLowerCase() method, but not a parseInt() method. Of course, the type system of TypeScript can be extended with your own type definitions.
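
As a small sketch, a custom interface used as a type annotation could look like this (the User shape below is made up for illustration):

interface User {
  name: string
  age: number
}

function isAdult (user: User): boolean {
  return user.age >= 18
}

// compiles fine
isAdult({ name: 'Node Hero', age: 21 })

// compile-time error, as the age property is missing
// isAdult({ name: 'Node Hero' })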

As TypeScript is a superset of JavaScript, you can start using it by literally just renaming your .js files to .ts, so you can introduce TypeScript gradually to your teams.

Note: TypeScript won't do anything at runtime; it works only at compile time. You will run pure JavaScript files.


To get started with TypeScript, grab it from npm:

$ npm install -g typescript

Let's write our first TypeScript file! It will simply greet the person it gets as a parameter:

// greeter.ts
function greeter(person: string) {  
  return `Hello ${person}!`
}

const userName = 'Node Hero'

console.log(greeter(userName))  

One thing you may have already noticed is the string type annotation, which tells the TypeScript compiler that the greeter function expects a string as its parameter.

Let's try to compile it!

tsc greeter.ts  

First, let's take a look at the compiled output! As you can see, there was no major change, only that the type annotations were removed:

function greeter(person) {  
    return "Hello " + person + "!";
}
var userName = 'Node Hero';  
console.log(greeter(userName));  

What would happen if you changed userName to a number? As you could guess, you would get a compilation error:

greeter.ts(10,21): error TS2345: Argument of type '3' is not assignable to parameter of type 'string'.  

Tutorial: Building a Node.js app with TypeScript

1. Set up your development environment

To build applications using TypeScript, make sure you have Node.js installed on your system. This article will use Node.js 8.

We recommend installing Node.js using nvm, the Node.js version manager. With this utility application, you can have multiple Node.js versions installed on your system, and switching between them is only a command away.

# install nvm
curl -o- https://raw.githubusercontent.com/creationix/nvm/v0.33.2/install.sh | bash

# install node 8
nvm install 8

# to make node 8 the default
nvm alias default 8  

Once you have Node.js 8 installed, you should create a directory where your project will live. After that, create your package.json file using:

npm init  

2. Create the project structure

When using TypeScript, it is recommended to put all your files under an src folder.

At the end of this tutorial, we will end up with the following project structure:

Node.js TypeScript Tutorial - Example Application Project Structure

Let's start by adding the App.ts file - this will be the file where your web server logic will be implemented, using express.

In this file, we are creating a class called App, which will encapsulate our web server. It has a private method called mountRoutes, which mounts the routes served by the server. The express instance is reachable through the public express property.

import * as express from 'express'

class App {  
  public express

  constructor () {
    this.express = express()
    this.mountRoutes()
  }

  private mountRoutes (): void {
    const router = express.Router()
    router.get('/', (req, res) => {
      res.json({
        message: 'Hello World!'
      })
    })
    this.express.use('/', router)
  }
}

export default new App().express  

We are also creating an index.ts file, so the web server can be fired up:

import app from './App'

const port = process.env.PORT || 3000

app.listen(port, (err) => {  
  if (err) {
    return console.log(err)
  }

  return console.log(`server is listening on ${port}`)
})

With this - at least in theory - we have a functioning server. To actually make it work, we have to compile our TypeScript code to JavaScript.

For more information on how to structure your project, read our Node.js project structuring article.

3. Configuring TypeScript

You can pass options to the TypeScript compiler either by using the CLI or a special file called tsconfig.json. As we would like to use the same settings for different tasks, we will go with the tsconfig.json file.

By using this configuration file, we are telling TypeScript things like the build target (which can be ES5, ES6 or ES7 at the time of this writing), what module system to expect, where to put the built JavaScript files, and whether it should create source-maps as well.

{
  "compilerOptions": {
    "target": "es6",
    "module": "commonjs",
    "outDir": "dist",
    "sourceMap": true
  },
  "files": [
    ".[email protected]/mocha/index.d.ts",
    ".[email protected]/node/index.d.ts"
  ],
  "include": [
    "src/**/*.ts"
  ],
  "exclude": [
    "node_modules"
  ]
}

Once you have added this TypeScript configuration file, you can build your application using the tsc command.

If you do not want to install TypeScript globally, just add it to your project's dependencies and create an npm script for it: "tsc": "tsc".

This works because npm scripts look for the binary in the ./node_modules/.bin folder and add it to the PATH when running scripts. You can then access tsc using npm run tsc, and pass options to it using this syntax: npm run tsc -- --all (this will list all the available options for TypeScript).
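
As a sketch, the relevant part of your package.json could look like this (the TypeScript version is only an example):

{
  "scripts": {
    "tsc": "tsc"
  },
  "devDependencies": {
    "typescript": "^2.4.0"
  }
}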



4. Add ESLint

As with most projects, you want to have linters to check for style issues in your code. TypeScript is no exception.

To use ESLint with TypeScript, you have to add an extra package, a parser, so ESLint can understand TypeScript as well: typescript-eslint-parser. Once you have installed it, you have to set it as the parser for ESLint:

# .eslintrc.yaml
---
  extends: airbnb-base
  env:
    node: true
    mocha: true
    es6: true
  parser: typescript-eslint-parser
  parserOptions:
    sourceType: module
    ecmaFeatures: 
      modules: true

Once you run eslint src --ext ts, you will get the same errors and warnings for your TypeScript files that you are used to:

Node.js TypeScript Tutorial - Console Errors

5. Testing your application

Testing your TypeScript-based applications is essentially the same as testing any other Node.js application.

The only gotcha is that you have to compile your application before actually running the tests. Achieving this is very straightforward; you can simply do it with: tsc && mocha dist/**/*.spec.js.
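
As an example, a minimal spec could look like the sketch below (it assumes mocha, supertest and their type definitions are installed, and that the file lives next to App.ts as src/App.spec.ts):

// src/App.spec.ts
import * as request from 'supertest'
import app from './App'

describe('GET /', () => {
  it('responds with a greeting', () => {
    return request(app)
      .get('/')
      .expect(200, { message: 'Hello World!' })
  })
})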

For more on testing, check out our Node.js testing tutorial.

6. Build a Docker image

Once you have your application ready, you most probably want to deploy it as a Docker image. The only extra steps you need to take are:

  • build the application (compile from TypeScript to JavaScript),
  • start the Node.js application from the built source.

FROM risingstack/alpine:3.4-v6.9.4-4.2.0

ENV PORT 3001

EXPOSE 3001

COPY package.json package.json  
RUN npm install

COPY . .  
RUN npm run build

CMD ["node", "dist/"]  

7. Debug using source-maps

As we enabled generating source-maps, we can use them to find bugs in our application. To start looking for issues, start your Node.js process the following way:

node --inspect dist/  

This will output something like the following:

To start debugging, open the following URL in Chrome:  
    chrome-devtools://devtools/remote/serve_file/@…84980/inspector.html?experiments=true&v8only=true&ws=127.0.0.1:9229/23cd0c34-3281-49d9-81c8-8bc3e0bc353a
server is listening on 3000  

To actually start the debugging process, open up your Google Chrome browser and browse to chrome://inspect. A remote target should already be there, just click inspect. This will bring up the Chrome DevTools.

Here, you will instantly see the original source, and you can start putting breakpoints and watchers on the TypeScript source code.

Node.js TypeScript Tutorial - Debugging in Chrome

The source-map support only works with Node.js 8 and higher.

The Complete Node.js TypeScript Tutorial

You can find the complete Node.js TypeScript starter application on GitHub.

Let us know in the issues, or here in the comments, what you would change!

Node.js + MySQL Example: Handling 100's of GigaBytes of Data

Through this Node.js & MySQL example project, we will take a look at how you can efficiently handle billions of rows that take up hundreds of gigabytes of storage space.

My secondary goal with this article is to help you decide if Node.js + MySQL is a good fit for your needs, and to provide help with implementing such a solution.

The actual code we will use throughout this blogpost can be found on GitHub.

Why Node.js and MySQL?

We use MySQL to store the distributed tracing data of the users of our Node.js Monitoring & Debugging Tool called Trace.

We chose MySQL because, at the time of the decision, Postgres was not really good at updating rows, while for us, updating immutable data would have been unreasonably complex. When handling this much data, people often reach for NoSQL databases instead.

Unfortunately, these solutions are not ACID compliant, which makes them difficult to use when data consistency is extremely important.

However, with good indexing and proper planning, MySQL can be just as suitable for the task as the above-mentioned NoSQL alternatives.

MySQL has several storage engines. InnoDB is the default one and comes with the most features. However, one should take into account that InnoDB tables are immutable, meaning every ALTER TABLE statement will copy all the data into a new table. This makes matters worse when the need arises to migrate an already existing database.

If you have nominal values, each having a lot of associated data - e.g. each of your users has millions of products and you have tons of users - it is probably easiest to create a table for each of them and give them names like <user_id>_<entity_name>. This way you can reduce the size of individual tables significantly.

Also, getting rid of a user's data in case of an account removal is an O(1) operation. This is very important, because if you need to remove a large amount of rows from big tables, MySQL may decide to use the wrong index or not to use indexes at all.

It does not help either that you cannot use index hints for DELETEs. You might need to ALTER your table to remove your data, but that would mean copying each row to a new table.

Creating tables for each user clearly adds complexity, but it may be a big win when it comes to removing users or similar entities with a huge amount of associated data.

However, before going for dynamically created tables, you should try deleting rows in chunks as it may help as well and results in less added complexity. Of course, if you have data coming in faster than you can delete, you might get stuck with the aforementioned solution.
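
A minimal sketch of chunked deletion could look like the statement below (the table name, column and batch size are placeholders); you keep re-running it until it affects zero rows:

DELETE FROM events
WHERE created_at < '2017-05-01'
LIMIT 10000;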

But what if your tables are still huge after partitioning them by users and you need to delete outdated rows as well? You still have data coming in faster than you can remove it. In this case, you should try MySQL's built-in table partitioning. It comes in handy when you need to cut your tables by values that are defined on an ordinal or continuous scale, such as a creation timestamp.

Table partitioning with MySQL

With MySQL, a partitioned table works as if it were multiple tables, but you can use the same interface you are used to, and no additional logic is needed from the application's side. This also means you can drop partitions as if you dropped tables.

The documentation is good, but pretty verbose as well (after all this is not a simple topic), so let's take a quick look at how you should create a partitioned table.

The way we handled our partitions was taken from Rick James's post on the topic. He also gives quite some insight on how you should plan your tables.

CREATE TABLE IF NOT EXISTS tbl (  
      id INTEGER NOT NULL AUTO_INCREMENT,
      data VARCHAR(255) NOT NULL,
      created_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
      PRIMARY KEY (id, created_at)
    )

PARTITION BY RANGE (TO_DAYS(created_at)) (  
        PARTITION start        VALUES LESS THAN (0),
        PARTITION from20170514 VALUES LESS THAN (TO_DAYS('2017-05-15')),
        PARTITION from20170515 VALUES LESS THAN (TO_DAYS('2017-05-16')),
        PARTITION from20170516 VALUES LESS THAN (TO_DAYS('2017-05-17')),
        PARTITION future       VALUES LESS THAN MAXVALUE
    );

It is nothing unusual until the part PARTITION BY RANGE.

In MySQL, you can partition by RANGE, LIST, COLUMNS, HASH and KEY; you can read about them in the documentation. Notice that the partitioning column must be part of the primary key and of every unique index on the table.

The ones starting with from<date> should be self-explanatory. Each partition holds values for which the created_at column is less than the date of the next day. This also means that from20170514 holds all data that is older than 2017-05-15, so this is the partition that we will drop when we perform the cleanup.

The future and start partitions need some explanation: future holds the values for the days we have not yet defined. So if we cannot run repartitioning in time, all data that arrives on 2017-05-17 or later will end up there, making sure we don't lose any of it. start serves as a safety net as well. We expect all rows to have a DATETIME created_at value; however, we need to be prepared for possible errors. If for some reason a row ends up with NULL there, it will land in the start partition, serving as a sign that we have some debugging to do.

When you use partitioning, MySQL keeps the data in separate parts of the disk as if they were separate tables, and organizes your data automatically based on the partitioning key.

There are some restrictions to be taken into account though:

  • Query cache is not supported.
  • Foreign keys are not supported for partitioned InnoDB tables.
  • Partitioned tables do not support FULLTEXT indexes or searches.

There are a lot more, but these are the ones that we felt the most constraining after adopting partitioned tables at RisingStack.

If you want to create a new partition, you need to reorganize an existing one and split it to fit your needs:

ALTER TABLE tbl  
    REORGANIZE PARTITION future INTO (
        PARTITION from20170517 VALUES LESS THAN (TO_DAYS('2017-05-18')),
        PARTITION from20170518 VALUES LESS THAN (TO_DAYS('2017-05-19')),
        PARTITION future VALUES LESS THAN MAXVALUE
);

Dropping partitions takes an ALTER TABLE statement, yet it runs as if you dropped a table:

ALTER TABLE tbl  
    DROP PARTITION from20170517, from20170518;

As you can see, you have to include the actual names and descriptions of the partitions in the statements. They cannot be dynamically generated by MySQL, so you have to handle this in the application logic. That's what we'll cover next.

Table partitioning example with Node.js & MySQL

Let's see the actual solution. For the examples here, we will use knex, which is a query builder for JavaScript. In case you are familiar with SQL, you shouldn't have any problem understanding the code.

First, let's create the table:

const dedent = require('dedent')  
const _ = require('lodash')  
const moment = require('moment')
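
// Note: a configured knex instance, the target table name (tableName) and the
// Table namespace object are assumed to be created elsewhere in the module, e.g.
// const knex = require('knex')({ client: 'mysql', connection: { /* ... */ } })
// const tableName = 'tbl'
// const Table = {}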

const MAX_DATA_RETENTION = 7  
const PARTITION_NAME_DATE_FORMAT = 'YYYYMMDD'

Table.create = function () {  
  return knex.raw(dedent`
    CREATE TABLE IF NOT EXISTS \`${tableName}\` (
      \`id\` INTEGER NOT NULL AUTO_INCREMENT,
      \`data\` VARCHAR(255) NOT NULL,
      \`created_at\` DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
      PRIMARY KEY (\`id\`, \`created_at\`)
    )
    PARTITION BY RANGE ( TO_DAYS(\`created_at\`)) (
      PARTITION \`start\` VALUES LESS THAN (0),
      ${Table.getPartitionStrings()}
      PARTITION \`future\` VALUES LESS THAN MAXVALUE
    );
  `)
}

Table.getPartitionStrings = function () {  
  const days = _.range(MAX_DATA_RETENTION - 2, -2, -1)
  const partitions = days.map((day) => {
    const tomorrow = moment().subtract(day, 'day').format('YYYY-MM-DD')
    const today = moment().subtract(day + 1, 'day').format(PARTITION_NAME_DATE_FORMAT)
    return `PARTITION \`from${today}\` VALUES LESS THAN (TO_DAYS('${tomorrow}')),`
  })
  return partitions.join('\n')
}

It is practically the same statement we saw earlier, but we have to create the names and descriptions of partitions dynamically. That's why we created the getPartitionStrings method.

The first row is:

const days = _.range(MAX_DATA_RETENTION - 2, -2, -1)  

MAX_DATA_RETENTION - 2 = 5 creates a sequence from 5 to -2 (last value exclusive) -> [ 5, 4, 3, 2, 1, 0, -1 ]. Then we subtract these values from the current time to create the name of the partition (today) and its limit (tomorrow). The order is vital, as MySQL throws an error if the values to partition by do not grow constantly in the statement.
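
For example, assuming the current date is 2017-05-16 and the MAX_DATA_RETENTION of 7 days, getPartitionStrings would produce something like:

PARTITION `from20170510` VALUES LESS THAN (TO_DAYS('2017-05-11')),
PARTITION `from20170511` VALUES LESS THAN (TO_DAYS('2017-05-12')),
PARTITION `from20170512` VALUES LESS THAN (TO_DAYS('2017-05-13')),
PARTITION `from20170513` VALUES LESS THAN (TO_DAYS('2017-05-14')),
PARTITION `from20170514` VALUES LESS THAN (TO_DAYS('2017-05-15')),
PARTITION `from20170515` VALUES LESS THAN (TO_DAYS('2017-05-16')),
PARTITION `from20170516` VALUES LESS THAN (TO_DAYS('2017-05-17')),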

Large Scale Data Removal Example with MySQL and Node.js

Now let's take a step by step look at data removal. You can see the whole code here.

The first method, removeExpired, gets the list of current partitions, then passes it on to repartition.

const _ = require('lodash')

Table.removeExpired = function (dataRetention) {  
  return Table.getPartitions()
    .then((currentPartitions) => Table.repartition(dataRetention, currentPartitions))
}

Table.getPartitions = function () {  
  return knex('information_schema.partitions')
    .select(knex.raw('partition_name as name'), knex.raw('partition_description as description')) // description holds the day of partition in mysql days
    .where('table_schema', dbName)
    .andWhere('partition_name', 'not in', [ 'start', 'future' ])
    .then((partitions) => partitions.map((partition) => ({
      name: partition.name,
      description: partition.description === 'MAX_VALUE' ? 'MAX_VALUE' : parseInt(partition.description)
    })))
}

Table.repartition = function (dataRetention, currentPartitions) {  
  const partitionsThatShouldExist = Table.getPartitionsThatShouldExist(dataRetention, currentPartitions)

  const partitionsToBeCreated = _.differenceWith(partitionsThatShouldExist, currentPartitions, (a, b) => a.description === b.description)
  const partitionsToBeDropped = _.differenceWith(currentPartitions, partitionsThatShouldExist, (a, b) => a.description === b.description)

  const statement = dedent
    `${Table.reorganizeFuturePartition(partitionsToBeCreated)}
    ${Table.dropOldPartitions(partitionsToBeDropped)}`

  return knex.raw(statement)
}

First, we select all currently existing partitions from the information_schema.partitions table that is maintained by MySQL.

Then we create all the partitions that should exist for the table. If A is the set of partitions that exist and B is the set of partitions that should exist, then

partitionsToBeCreated = B \ A

partitionsToBeDropped = A \ B.
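
To illustrate with made-up descriptions (the numbers below are not real MySQL day values):

const _ = require('lodash')

const A = [ { name: 'p100', description: 100 }, { name: 'p101', description: 101 }, { name: 'p102', description: 102 } ]
const B = [ { name: 'p101', description: 101 }, { name: 'p102', description: 102 }, { name: 'p103', description: 103 } ]

// B \ A -> to be created
_.differenceWith(B, A, (a, b) => a.description === b.description)
// [ { name: 'p103', description: 103 } ]

// A \ B -> to be dropped
_.differenceWith(A, B, (a, b) => a.description === b.description)
// [ { name: 'p100', description: 100 } ]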

getPartitionsThatShouldExist creates set B.

Table.getPartitionsThatShouldExist = function (dataRetention, currentPartitions) {  
  const days = _.range(dataRetention - 2, -2, -1)
  const oldestPartition = Math.min(...currentPartitions.map((partition) => partition.description))
  return days.map((day) => {
    const tomorrow = moment().subtract(day, 'day')
    const today = moment().subtract(day + 1, 'day')
    if (Table.getMysqlDay(today) < oldestPartition) {
      return null
    }

    return {
      name: `from${today.format(PARTITION_NAME_DATE_FORMAT)}`,
      description: Table.getMysqlDay(tomorrow)
    }
  }).filter((partition) => !!partition)
}

Table.getMysqlDay = function (momentDate) {  
  return momentDate.diff(moment([ 0, 0, 1 ]), 'days') // mysql dates are counted since 0 Jan 1 00:00:00
}

The creation of partition objects is quite similar to the creation of the CREATE TABLE ... PARTITION BY RANGE statement. It is also vital to check if the partition we are about to create is older than the current oldest partition: it is possible that we need to change the dataRetention over time.

Take this scenario for example:

Imagine that your users start out with 7 days of data retention, but have an option to upgrade it to 10 days. In the beginning, the user has partitions that cover days in the following order: [ start, -7, -6, -5, -4, -3, -2, -1, future ]. After a month or so, the user decides to upgrade. The missing partitions in this case are: [ -10, -9, -8, 0 ].

At cleanup, the current script would try to reorganize the future partition for the missing partitions appending them after the current ones.

Creating partitions for days older than -7 does not make sense in the first place, because that data was meant to be thrown away anyway. It would also lead to a partition list that looks like [ start, -7, -6, -5, -4, -3, -2, -1, -10, -9, -8, 0, future ], which isn't monotonically increasing, thus MySQL will throw an error and the cleanup will fail.

MySQL's TO_DAYS(date) function calculates the number of days passed since year 0 January 1, so we replicate this in JavaScript.

Table.getMysqlDay = function (momentDate) {  
  return momentDate.diff(moment([ 0, 0, 1 ]), 'days')
}

Now that we have the partitions that have to be dropped and the partitions that have to be created, let's first create our new partition for the new day.

Table.reorganizeFuturePartition = function (partitionsToBeCreated) {  
  if (!partitionsToBeCreated.length) return '' // there should be only one every day, and it is run hourly, so ideally 23 times a day it should be a noop
  const partitionsString = partitionsToBeCreated.map((partitionDescriptor) => {
    return `PARTITION \`${partitionDescriptor.name}\` VALUES LESS THAN (${partitionDescriptor.description}),`
  }).join('\n')

  return dedent`
    ALTER TABLE \`${tableName}\`
      REORGANIZE PARTITION future INTO (
        ${partitionsString}
        PARTITION \`future\` VALUES LESS THAN MAXVALUE
      );`
}

We simply prepare a statement for the new partition(s) to be created.

We run this script hourly just to make sure nothing goes astray and we are able to perform the cleanup properly at least once a day.

So the first thing to check is if there's a partition to be created at all. This should happen only at the first run, then be a noop 23 times a day.

We also have to drop the outdated partitions.

Table.dropOldPartitions = function (partitionsToBeDropped) {  
  if (!partitionsToBeDropped.length) return ''
  let statement = `ALTER TABLE \`${tableName}\`\nDROP PARTITION\n`
  statement += partitionsToBeDropped.map((partition) => {
    return partition.name
  }).join(',\n')
  return statement + ';'
}

This method creates the same ALTER TABLE ... DROP PARTITION statement we saw earlier.

And finally, everything is ready for the reorganization.

  const statement = dedent
    `${Table.reorganizeFuturePartition(partitionsToBeCreated)}
    ${Table.dropOldPartitions(partitionsToBeDropped)}`

  return knex.raw(statement)

Wrapping it up

As you can see, contrary to popular belief, ACID compliant DBMS solutions such as MySQL can be used when you are handling large amounts of data, so you don't necessarily need to give up the features of transactional databases.

However, table partitioning comes with quite a few restrictions, meaning you are cut off from using all the power InnoDB provides for keeping your data consistent. You might also have to handle in the app logic what would otherwise be available, such as foreign key constraints or full-text searches.

I hope this post helps you decide whether MySQL is a good fit for your needs and helps you implement your solution. Until next time: Happy engineering!

If you have any Node + MySQL questions, let me know in the comments below!

Mastering the Node.js CLI & Command Line Options

Node.js comes with a lot of CLI options to expose built-in debugging and to modify how V8, the JavaScript engine, works.

In this post, we have collected the most important CLI commands to help you become more productive.

Accessing Node.js CLI Options

To get a full list of all available Node.js CLI options in your current distribution of Node.js, you can access the manual page from the terminal using:

$ man node

Usage: node [options] [ -e script | script.js ] [arguments]  
       node debug script.js [arguments] 

Options:  
  -v, --version         print Node.js version
  -e, --eval script     evaluate script
  -p, --print           evaluate script and print result
  -c, --check           syntax check script without executing
...

As you can see in the first usage section, you have to provide the options before the script you want to run.

Take the following file:

console.log(new Buffer(100))  

To take advantage of the --zero-fill-buffers option, you have to run your application using:

$ node --zero-fill-buffers index.js

This way the application will produce the correct output, instead of random memory garbage:

<Buffer 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ... >  

CLI Options

Now that we have seen how to instruct Node.js to use CLI options, let's see what other options there are!

--version or -v

Using the node --version, or short, node -v, you can print the version of Node.js you are using.

$ node -v
v6.10.0  

--eval or -e

Using the --eval option, you can run JavaScript code right from your terminal. The modules which are predefined in REPL can also be used without requiring them, like the http or the fs module.

$ node -e 'console.log(3 + 2)'
5  

--print or -p

The --print option works the same way as --eval; however, it prints the result of the expression. To achieve the same output as in the previous example, we can simply leave out the console.log:

$ node -p '3 + 2'
5  

--check or -c

Available since v4.2.0

The --check option instructs Node.js to check the syntax of the provided file, without actually executing it.

Take the following example again:

console.log(new Buffer(100)  

As you can see, a closing ) is missing. Once you run this file using node index.js, it will produce the following output:

/Users/gergelyke/Development/risingstack/mastering-nodejs-cli/index.js:1
(function (exports, require, module, __filename, __dirname) { console.log(new Buffer(100)
                                                                                        ^
SyntaxError: missing ) after argument list  
    at Object.exports.runInThisContext (vm.js:76:16)
    at Module._compile (module.js:542:28)
    at Object.Module._extensions..js (module.js:579:10)
    at Module.load (module.js:487:32)
    at tryModuleLoad (module.js:446:12)
    at Function.Module._load (module.js:438:3)
    at Module.runMain (module.js:604:10)
    at run (bootstrap_node.js:394:7)

Using the --check option you can check for the same issue, without executing the script, using node --check index.js. The output will be similar, except you won't get the stack trace, as the script never ran:

/Users/gergelyke/Development/risingstack/mastering-nodejs-cli/index.js:1
(function (exports, require, module, __filename, __dirname) { console.log(new Buffer(100)
                                                                                        ^
SyntaxError: missing ) after argument list  
    at startup (bootstrap_node.js:144:11)
    at bootstrap_node.js:509:3

The --check option can come in handy when you want to see if your script is syntactically correct, without executing it.


--inspect[=host:port]

Available since v6.3.0

Using node --inspect will activate the inspector on the provided host and port. If they are not provided, the default is 127.0.0.1:9229. The debugging tools attached to Node.js instances communicate via a TCP port using the Chrome Debugging Protocol.

--inspect-brk[=host:port]

Available since v7.6.0

The --inspect-brk option has the same functionality as the --inspect option; however, it pauses the execution at the first line of the user script.

$ node --inspect-brk index.js 
Debugger listening on port 9229.  
Warning: This is an experimental feature and could change at any time.  
To start debugging, open the following URL in Chrome:  
    chrome-devtools://devtools/bundled/inspector.html?experiments=true&v8only=true&ws=127.0.0.1:9229/86dd44ef-c865-479e-be4d-806d622a4813

Once you have run this command, just copy and paste the URL you got to start debugging your Node.js process.

--zero-fill-buffers

Available since v6.0.0

Node.js can be started using the --zero-fill-buffers command line option to force all newly allocated Buffer instances to be automatically zero-filled upon creation. The reason to do so is that newly allocated Buffer instances can contain sensitive data.

It should only be used when it is necessary to enforce that newly created Buffer instances cannot contain sensitive data, as zero-filling has a significant impact on performance.

Also note, that some Buffer constructors got deprecated in v6.0.0:

  • new Buffer(array)
  • new Buffer(arrayBuffer[, byteOffset [, length]])
  • new Buffer(buffer)
  • new Buffer(size)
  • new Buffer(string[, encoding])

Instead, you should use Buffer.alloc(size[, fill[, encoding]]), Buffer.from(array), Buffer.from(buffer), Buffer.from(arrayBuffer[, byteOffset[, length]]) and Buffer.from(string[, encoding]).
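
For example, a quick sketch of the safe alternatives:

// allocated and zero-filled by default
const safe = Buffer.alloc(100)

// content is copied from the given input
const fromString = Buffer.from('hello', 'utf8')
const fromArray = Buffer.from([0x68, 0x65, 0x6c, 0x6c, 0x6f])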

You can read more on the security implications of the Buffer module on the Snyk blog.

--prof-process

Using the --prof-process option, Node.js processes the output of the V8 profiler.

To use it, first you have to run your applications using:

node --prof index.js  

Once you have run it, a new file with the isolate- prefix will be placed in your working directory.

Then, you have to run the Node.js process with the --prof-process option:

node --prof-process isolate-0x102001600-v8.log > output.txt  

This file will contain metrics from the V8 profiler, like how much time was spent in the C++ layer, or in the JavaScript part, and which function calls took how much time. Something like this:

[C++]:
   ticks  total  nonlib   name
     16   18.4%   18.4%  node::ContextifyScript::New(v8::FunctionCallbackInfo<v8::Value> const&)
      4    4.6%    4.6%  ___mkdir_extended
      2    2.3%    2.3%  void v8::internal::String::WriteToFlat<unsigned short>(v8::internal::String*, unsigned short*, int, int)
      2    2.3%    2.3%  void v8::internal::ScavengingVisitor<(v8::internal::MarksHandling)1, (v8::internal::LoggingAndProfiling)0>::ObjectEvacuationStrategy<(v8::internal::ScavengingVisitor<(v8::internal::MarksHandling)1, (v8::internal::LoggingAndProfiling)0>::ObjectContents)1>::VisitSpecialized<24>(v8::internal::Map*, v8::internal::HeapObject**, v8::internal::HeapObject*)

[Summary]:
   ticks  total  nonlib   name
      1    1.1%    1.1%  JavaScript
     70   80.5%   80.5%  C++
      5    5.7%    5.7%  GC
      0    0.0%          Shared libraries
     16   18.4%          Unaccounted

To get a full list of Node.js CLI options, check out the official documentation here.


V8 Options

You can print all the available V8 options using the --v8-options command line option.

Currently, V8 exposes more than 100 command line options - here we just picked a few to showcase some of the functionality they can provide. Some of these options can drastically change how V8 behaves, so use them with caution!

--harmony

With the harmony flag, you can enable all completed harmony features.

--max_old_space_size

With this option, you can set the maximum size of the old space on the heap, which directly affects how much memory your process can allocate.

This setting can come in handy when you run in low memory environments.
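
For example, to let the process use roughly 4 GB of old space (the value is given in megabytes):

node --max_old_space_size=4096 index.js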

--optimize_for_size

With this option, you can instruct V8 to optimize the memory space for size - even if the application gets slower.

Just like the previous option, it can be useful in low memory environments.


Environment Variables

NODE_DEBUG=module[,…]

Setting this environment variable enables the core modules to print debug information. You can run the previous example like this to get debug information on the module core component (instead of module, you can go for http, fs, etc...):

$ NODE_DEBUG=module node index.js

The output will be something like this:

MODULE 7595: looking for "/Users/gergelyke/Development/risingstack/mastering-nodejs-cli/index.js" in ["/Users/gergelyke/.node_modules","/Users/gergelyke/.node_libraries","/Users/gergelyke/.nvm/versions/node/v6.10.0/lib/node"]  
MODULE 7595: load "/Users/gergelyke/Development/risingstack/mastering-nodejs-cli/index.js" for module "."  

NODE_PATH=path

Using this setting, you can add extra paths for the Node.js process to search modules in.
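
For example (the lib directory below is just an illustration), modules placed in that folder become resolvable without relative paths:

# require('my-module') inside index.js will now also look in ./lib
$ NODE_PATH=./lib node index.js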

OPENSSL_CONF=file

Using this environment variable, you can load an OpenSSL configuration file on startup.

For a full list of supported environment variables, check out the official Node.js docs.

Let's Contribute to the CLI related Node Core Issues!

As you can see, the CLI is a really useful tool which becomes better with each Node version!

If you'd like to contribute to its advancement, you can help by checking out the currently open issues at https://github.com/nodejs/node/labels/cli !

Node.js Command Line Interface CLI Issues

Digital Transformation with the Node.js Stack

In this article, we explore the 9 main areas of digital transformation and show what the benefits of implementing Node.js are. At the end, we'll lay out a Digital Transformation Roadmap to help you get started with this process.

Note that implementing Node.js is not the goal of a digital transformation project - it is just a great tool that opens up possibilities that any organization can take advantage of.


Digital transformation is achieved by using modern technology to radically improve the performance of business processes, applications, or even whole enterprises.

One of the available technologies that enable companies to go through a major performance shift is Node.js and its ecosystem. It is a tool that grants improvement opportunities that organizations should take advantage of:

  • Increased developer productivity,
  • DevOps or NoOps practices,
  • and shipping software to production in a brief time using the proxy approach,

just to mention a few.

The 9 Areas of Digital Transformation

Digital Transformation projects can improve a company in nine main areas. The following elements were identified as a result of MIT research on digital transformation, in which 157 executives from 50 companies (typically with $1 billion or more in annual sales) were interviewed.

#1. Understanding your Customers Better

Companies are heavily investing in systems to understand specific market segments and geographies better. They have to figure out what leads to customer happiness and customer dissatisfaction.

Many enterprises are building analytics capabilities to better understand their customers. Information derived this way can be used for data-driven decisions.

#2. Achieving Top-Line Growth

Digital transformation can also be used to enhance in-person sales conversations. Instead of paper-based presentations or slides, salespeople can use great looking, interactive presentations, like tablet-based presentations.

Understanding customers better helps enterprises to transform and improve the sales experience with more personalized sales and customer service.

#3. Building Better Customer Touch Points

Customer service can be improved tremendously with new digital services, for example by introducing new channels for communication. Instead of going to a local branch of a business, customers can talk to support through Twitter or Facebook.

Self-service digital tools can be developed which save time for the customer while saving money for the company.

#4. Process Digitization

With automation, companies can focus their employees on more strategic tasks, innovation or creativity rather than repetitive efforts.

#5. Worker Enablement

Virtualization of individual work (where the work process is separated from the location of work) has become an enabler of knowledge sharing. Information and expertise are accessible in real time for frontline employees.

#6. Data-Driven Performance Management

With the proper analytical capabilities, decisions can be made on real data and not on assumptions.

Digital transformation is changing the way strategic decisions are made. With new tools, strategic planning sessions can include more stakeholders, not just a small group.

#7. Digitally Extended Businesses

Many companies extend their physical offerings with digital ones. Examples include:

  • news outlets augmenting their print offering with digital content,
  • FMCG companies extending to e-commerce.

#8. New Digital Businesses

Companies are not just extending their current offerings through digital transformation, but also coming up with new digital products that complement the traditional ones. Examples may include connected devices, like GPS trackers that can now report activity to the cloud and provide value to customers through recommendations.

#9. Digital Globalization

Global shared services, like shared finance or HR enable organizations to build truly global operations.


The Digital Transformation Benefits of Implementing Node.js

Organizations are looking for the most effective way of digital transformation - among these companies, Node.js is becoming the de facto technology for building out digital capabilities.

Node.js is a JavaScript runtime built on Chrome's V8 JavaScript engine. Node.js uses an event-driven, non-blocking I/O model that makes it lightweight and efficient.

In other words: Node.js offers you the possibility to write servers using JavaScript with an incredible performance.

Increased Developer Productivity with Same-Language Stacks

When PayPal started using Node.js, they reported a 2x increase in productivity compared to the previous Java stack. How is that even possible?

npm, the Node.js package manager, has an incredible number of modules that can be used instantly. This saves a lot of development effort for the development team.

Secondly, as Node.js applications are written using JavaScript, front-end developers can also easily understand what's going on and make necessary changes.

This saves you valuable time again as developers will use the same language on the entire stack.

100% Business Availability, even with Extreme Load

Each year, around 1.5 billion dollars are spent online in the US on a single day: Black Friday.

It is crucial that your site can keep up with the traffic. This is why Walmart, one of the biggest retailers, is using Node.js to serve 500 million pageviews on Black Friday without a hitch.

Fast Apps = Satisfied Customers

As your velocity increases because of the productivity gains, you can ship features/products sooner. Products that run faster result in better user experience.

Kissmetrics' study showed that 40% of people abandon a website that takes more than 3 seconds to load, and 47% of consumers expect a web page to load in 2 seconds or less.

To read more on the benefits of using Node.js, you can download our Node.js is Enterprise Ready ebook.


Your Digital Transformation Roadmap with Node.js

As with most new technologies introduced to a company, it’s worth taking baby-steps first with Node.js as well. As a short framework for introducing Node.js, we recommend the following steps:

  • building a Node.js core team,
  • picking a small part of the application to be rewritten/extended using Node.js,
  • extending the scope of the project to the whole organization.

Step 1 - Building your Node.js Core Team

The core Node.js team will consist of people with JavaScript experience on both the backend and the frontend. It's not crucial that the backend engineers have any Node.js experience; the important aspect is the vision they bring to the team.

Introducing Node.js is not just about JavaScript - members of the operations team also have to join the core team.

The introduction of Node.js to an organization does not stop at getting good at Node.js itself - it also means adding modern DevOps or NoOps practices, including but not limited to continuous integration and delivery.

Step 2 - Embracing The Proxy Approach

To incrementally replace old systems or to extend their functionality easily, your team can use the proxy approach.

For the features or functions you want to replace, create a small and simple Node.js application and proxy some of your load to the newly built Node.js application. This proxy does not necessarily have to be written in Node.js. With this approach, you can easily benefit from modularized, service-oriented architecture.

Another way to use proxies is to write them in Node.js and make them talk to the legacy systems. This way you have the option to optimize the data being sent. PayPal was one of the first adopters of Node.js at scale, and they started with this proxy approach as well.

The biggest advantages of these solutions are that you can put Node.js into production in a short amount of time, measure your results, and learn from them.

Step 3 - Measure Node.js, Be Data-Driven

For the successful introduction of Node.js during a digital transformation project, it is crucial to set up a series of benchmarks to compare the results between the legacy system and the new Node.js applications. These data points can be response times, throughput or memory and CPU usage.

Orchestrating The Node.js Stack

As mentioned previously, introducing Node.js does not stop at getting good at Node.js itself - introducing continuous integration and delivery is a crucial point as well.

Also, from an operations point of view, it is important to add containers to ship applications with confidence.

For orchestration, to operate the containers containing the Node.js applications we encourage companies to adopt Kubernetes, an open-source system for automating deployment, scaling, and management of containerized applications.

RisingStack and Digital Transformation with Node.js

RisingStack enables amazing companies to succeed with Node.js and related technologies to stay ahead of the competition. We have provided professional Node.js development and consulting services since the early days of Node.js, helping companies like Lufthansa or Cisco thrive with this technology.

10 Best Practices for Writing Node.js REST APIs

In this article we cover best practices for writing Node.js REST APIs, including topics like naming your routes, authentication, black-box testing & using proper cache headers for these resources.

One of the most popular use cases for Node.js is writing RESTful APIs. Still, while we help our customers find issues in their applications with Trace, our Node.js monitoring tool, we constantly see that developers have a lot of problems with REST APIs.

I hope these best-practices we use at RisingStack can help:

#1 - Use HTTP Methods & API Routes

Imagine that you are building a Node.js RESTful API for creating, updating, retrieving or deleting users. For these operations, HTTP already has an adequate toolset: POST, PUT, GET, PATCH and DELETE.

As a best practice, your API routes should always use nouns as resource identifiers. Speaking of the user's resources, the routing can look like this:

  • POST /user or PUT /user/:id to create a new user,
  • GET /user to retrieve a list of users,
  • GET /user/:id to retrieve a user,
  • PATCH /user/:id to modify an existing user record,
  • DELETE /user/:id to remove a user.
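
As an illustration, a minimal sketch of such routes with Express could look like this (the handler bodies are just placeholders):

const express = require('express')
const router = express.Router()

router.post('/user', (req, res) => { /* create a new user */ })
router.get('/user', (req, res) => { /* retrieve a list of users */ })
router.get('/user/:id', (req, res) => { /* retrieve a single user */ })
router.patch('/user/:id', (req, res) => { /* modify an existing user record */ })
router.delete('/user/:id', (req, res) => { /* remove a user */ })

module.exports = router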

#2 - Use HTTP Status Codes Correctly

If something goes wrong while serving a request, you must set the correct status code for that in the response:

  • 2xx, if everything was okay,
  • 3xx, if the resource was moved,
  • 4xx, if the request cannot be fulfilled because of a client error (like requesting a resource that does not exist),
  • 5xx, if something went wrong on the API side (like an exception happened).

If you are using Express, setting the status code is as easy as res.status(500).send({error: 'Internal server error happened'}). Similarly with Restify: res.status(201).

For a full list, check the list of HTTP status codes

#3 - Use HTTP headers to Send Metadata

To attach metadata about the payload you are about to send, use HTTP headers. Headers like these can carry information on:

  • pagination,
  • rate limiting,
  • or authentication.

A list of standardized HTTP headers can be found here.

If you need to set any custom metadata in your headers, it used to be a best practice to prefix them with X. For example, if you were using CSRF tokens, it was a common (but non-standard) way to name them X-Csrf-Token. However, with RFC 6648 this convention was deprecated. New APIs should make their best effort not to use header names that can conflict with other applications. For example, OpenStack prefixes its headers with OpenStack:

OpenStack-Identity-Account-ID  
OpenStack-Networking-Host-Name  
OpenStack-Object-Storage-Policy  

Note that the HTTP standard does not define any size limit on the headers; however, Node.js (as of writing this article) imposes an 80KB size limit on the headers object for practical reasons.

" Don't allow the total size of the HTTP headers (including the status line) to exceed HTTP_MAX_HEADER_SIZE. This check is here to protect embedders against denial-of-service attacks where the attacker feeds us a never-ending header that the embedder keeps buffering."

From the Node.js HTTP parser

#4 - Pick the right framework for your Node.js REST API

It is important to pick the framework that suits your use-case the most.

Express, Koa or Hapi

Express, Koa and Hapi can be used to create browser applications, and as such, they support templating and rendering - just to name a few features. If your application needs to provide the user-facing side as well, it makes sense to go for them.

Restify

On the other hand, Restify focuses on helping you build REST services. It exists to let you build "strict" API services that are maintainable and observable. Restify also comes with automatic DTrace support for all your handlers.

Restify is used in production in major applications like npm or Netflix.

#5 - Black-Box Test your Node.js REST APIs

One of the best ways to test your REST APIs is to treat them as black boxes.

Black-box testing is a method of testing where the functionality of an application is examined without the knowledge of its internal structures or workings. So none of the dependencies are mocked or stubbed, but the system is tested as a whole.

One of the modules that can help you with black-box testing Node.js REST APIs is supertest.

A simple test case which checks if a user is returned using the test runner mocha can be implemented like this:

const request = require('supertest')  
const app = require('./app') // the app that exposes the API (path is an example)

describe('GET /user/:id', function() {  
  it('returns a user', function() {
    // newer mocha versions accept promises as well
    return request(app)
      .get('/user')
      .set('Accept', 'application/json')
      .expect(200, {
        id: '1',
        name: 'John Math'
      })
  })
})

You may ask: how does the data get populated into the database which serves the REST API?

In general, it is a good approach to write your tests in a way that they make as few assumptions about the state of the system as possible. Still, in some scenarios you can find yourself in a spot when you need to know what is the state of the system exactly, so you can make assertions and achieve higher test coverage.

So based on your needs, you can populate the database with test data in one of the following ways:

  • run your black-box test scenarios on a known subset of production data,
  • populate the database with crafted data before the test cases are run.

Of course, black-box testing does not mean that you don't have to do unit testing, you still have to write unit tests for your APIs.


#6 - Do JWT-Based, Stateless Authentication

As your REST APIs must be stateless, so must your authentication layer. For this, JWT (JSON Web Token) is ideal.

JWT consists of three parts:

  • Header, containing the type of the token and the hashing algorithm
  • Payload, containing the claims
  • Signature (JWT does not encrypt the payload, just signs it!)

Adding JWT-based authentication to your application is very straightforward:

const koa = require('koa')  
const jwt = require('koa-jwt')

const app = koa()

app.use(jwt({  
  secret: 'very-secret' 
}))

// Protected middleware
app.use(function *(){  
  // content of the token will be available on this.state.user
  this.body = {
    secret: '42'
  }
})

After that, the API endpoints are protected with JWT. To access the protected endpoints, you have to provide the token in the Authorization header field.

curl --header "Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiYWRtaW4iOnRydWV9.TJVA95OrM7E2cBab30RMHrHDcEfxjoYZgeFONFh7HgQ" my-website.com  

One thing that you could notice is that the JWT module does not depend on any database layer. This is the case because all JWT tokens can be verified on their own, and they can also contain time to live values.

Also, you always have to make sure that all your API endpoints are only accessible through a secure connection using HTTPS.

In a previous article, we explained web authentication methods in detail - I recommend checking it out!

#7 - Use Conditional Requests

Conditional requests are HTTP requests which are executed differently depending on specific HTTP headers. You can think of these headers as preconditions: if they are met, the requests will be executed in a different way.

These headers try to check whether a version of a resource stored on the server matches a given version of the same resource. Because of this reason, these headers can be:

  • the timestamp of the last modification,
  • or an entity tag, which differs for each version.

These headers are:

  • Last-Modified (to indicate when the resource was last modified),
  • Etag (to indicate the entity tag),
  • If-Modified-Since (used with the Last-Modified header),
  • If-None-Match (used with the Etag header),

Let's take a look at an example!

The client below did not have any previous versions of the doc resource, so neither the If-Modified-Since nor the If-None-Match header was applied when the resource was requested. Then, the server responds with the Etag and Last-Modified headers properly set.

Node.js RESTful API with conditional request, without previous versions

From the MDN Conditional request documentation

The client can set the If-Modified-Since and If-None-Match headers once it tries to request the same resource - since it has a version now. If the response would be the same, the server simply responds with the 304 - Not Modified status and does not send the resource again.

Node.js RESTful API with conditional request, with previous versions

From the MDN Conditional request documentation
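
As a quick sketch with curl (the URL and the ETag value below are placeholders):

# first request: no preconditions, the server returns the resource
# along with ETag and Last-Modified headers
curl -i https://example.com/doc

# subsequent request: send the stored validator back;
# the server answers 304 Not Modified if the resource is unchanged
curl -i -H 'If-None-Match: "33a64df5"' https://example.com/doc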

#8 - Embrace Rate Limiting

Rate limiting is used to control how many requests a given consumer can send to the API.

To tell your API users how many requests they have left, set the following headers:

  • X-Rate-Limit-Limit, the number of requests allowed in a given time interval
  • X-Rate-Limit-Remaining, the number of requests remaining in the same interval,
  • X-Rate-Limit-Reset, the time when the rate limit will be reset.
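
As an illustration, here is a minimal sketch of an Express middleware that sets these headers by hand (the limit, time window and in-memory counting are simplifying assumptions; a ready-made package is the better choice in production):

const WINDOW_MS = 15 * 60 * 1000 // 15-minute window
const LIMIT = 100                // allowed requests per window
const hits = new Map()           // naive in-memory counter, keyed by IP

function rateLimitHeaders (req, res, next) {
  const now = Date.now()
  const entry = hits.get(req.ip) || { count: 0, reset: now + WINDOW_MS }
  if (now > entry.reset) {
    entry.count = 0
    entry.reset = now + WINDOW_MS
  }
  entry.count += 1
  hits.set(req.ip, entry)

  res.set('X-Rate-Limit-Limit', String(LIMIT))
  res.set('X-Rate-Limit-Remaining', String(Math.max(LIMIT - entry.count, 0)))
  res.set('X-Rate-Limit-Reset', String(Math.ceil(entry.reset / 1000)))

  if (entry.count > LIMIT) {
    return res.status(429).send({ error: 'Too many requests' })
  }
  next()
}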

Most HTTP frameworks support it out of the box (or with plugins). For example, if you are using Koa, there is the koa-ratelimit package.

Note that the time window can vary between API providers - for example, GitHub uses an hour for that, while Twitter uses 15 minutes.

#9 - Create a Proper API Documentation

You write APIs so that others can use them and benefit from them. Providing API documentation for your Node.js REST APIs is crucial.

The following open-source projects can help you with creating documentation for your APIs:

Alternatively, if you want to use a hosted product, you can go for Apiary.

#10 - Don't Miss The Future of APIs

In the past years, two major query languages for APIs arose - namely GraphQL from Facebook and Falcor from Netflix. But why do we even need them?

Imagine the following RESTful resource request:

/org/1/space/2/docs/1/collaborators?include=email&page=1&limit=10

This can get out of hand quite easily - as you'd like to get the same response format for all your models all the time. This is where GraphQL and Falcor can help.

About GraphQL

GraphQL is a query language for APIs and a runtime for fulfilling those queries with your existing data. GraphQL provides a complete and understandable description of the data in your API, gives clients the power to ask for exactly what they need and nothing more, makes it easier to evolve APIs over time, and enables powerful developer tools. - Read more here.

About Falcor

Falcor is the innovative data platform that powers the Netflix UIs. Falcor allows you to model all your backend data as a single Virtual JSON object on your Node server. On the client, you work with your remote JSON object using familiar JavaScript operations like get, set, and call. If you know your data, you know your API. - Read more here.

Amazing REST APIs for Inspiration

If you are about to start developing a Node.js REST API or creating a new version of an older one, we have collected four real-life examples that are worth checking out:

I hope that now you have a better understanding of how APIs should be written using Node.js. Please let me know in the comments if you miss anything!

Concurrency and Parallelism: Understanding I/O

With this article, we are launching a series of posts targeting developers who want to learn or refresh their knowledge about writing concurrent applications in general. The series will focus on well-known and widely adopted concurrency patterns in different programming languages, platforms, and runtimes.

In the first episode of this series, we’ll start from the ground up: Operating systems handle our applications' I/O, so it’s essential to understand the principles.


Concurrent code has a bad reputation of being notoriously easy to screw up. One of the world's most infamous software disasters was caused by a race condition. A programmer error in the Therac-25 radiation therapy device resulted in the death of four people.

Data races are not the only problem, though: inefficient locking, starvation, and a myriad of other problems arise. I remember from university that even the seemingly trivial, innocent-looking task of writing a thread-safe singleton proved to be quite challenging because of these nuances.

No wonder that throughout the past decades many concurrency-related patterns have emerged to abstract away the complexity and snip the possibilities of errors. Some have arisen as a straightforward consequence of the properties of an application area, like event loops and dispatchers in window managers, GUI toolkits, and browsers; while others succeeded in creating more general approaches applicable to a broad range of use cases, like Erlang's actor system.

My experience is that after a brief learning period, most developers can write highly concurrent, good-quality code in Node.js, which is also free from race conditions. Although nothing is stopping us from creating data races, this happens far less frequently than in programming languages or platforms that expose threads, locks and shared memory as their main concurrency abstraction. I think it's mainly due to the more functional style of creating a data flow (e.g. promises) instead of imperatively synchronizing (e.g. with locks) concurrent computations.

However, to reason about the "whats and whys," it is best to start from the ground up, which I think is the OS level. It's the OS that does the hard work of scheduling our applications and interleaving them with I/O, so it is essential that we understand the principles. Then we discuss concurrency primitives and patterns and finally arrive at frameworks.

Let the journey begin!

Intro to Concurrency and Parallelism

Before diving into the OS-level details, let's take a second to clarify what concurrency is exactly.

What's the difference between concurrency and parallelism?

Concurrency is a much broader, more general problem than parallelism. If you have tasks that have inputs and outputs, and you want to schedule them so that they produce correct results, you are solving a concurrency problem.

Take a look at this diagram:

Concurrency & Parallelism: Diagram of Tasks with Dependencies

It shows a data flow with input and output dependencies. Here tasks 2, 3 and 4 can run concurrently after 1. There is no specific order between them, so we have multiple alternatives for running them sequentially. Showing only two of them:

Concurrent data flow paths, executed sequentially: 1 → 2 → 3 → 4 → 5 and 1 → 3 → 2 → 4 → 5

Alternatively, these tasks can run in parallel, e.g. on another processor core, another processor, or an entirely separate computer.

On these diagrams, thread means a computation carried out on a dedicated processor core, not an OS thread, as OS threads are not necessarily parallel. How else could you run a multithreaded web server with dedicated threads for hundreds of connections?

Paralellized Task Diagram

It's not rocket science, but what I wanted to show on these diagrams is that running concurrent tasks in parallel can reduce the overall computation time. The results will remain correct as long as the partial order shown on the above data flow graph is respected. However, if we only have one thread, the different orders are apparently equivalent, at least regarding the overall time.

If we only have one processor, why do we even bother with writing concurrent applications? The processing time will not get shorter, and we add the overhead of scheduling. As a matter of fact, any modern operating system will also slice up the concurrent tasks and interleave them, so each of the slices will run for a short time.

There are various reasons for this.

  • We humans like to interact with the computer in real time, e.g. as I type this text, I want to see it appearing on the screen immediately, while at the same time listening to my favorite tracklist and getting notifications about my incoming emails. Just imagine that you could not drag a window while a movie keeps playing in it.

  • Not all operations are carried out on the computer's CPU. If you want to write to an HDD for example, a lot of time is spent seeking the position, writing the sectors, etc., and the intervening time can be spent doing something else. The same applies to virtually every I/O, even to computations carried out on the GPU.

These require the operating system kernel to run tasks in an interleaved manner, referred to as time-sharing. This is a very important property of modern operating systems. Let's see the basics of it.

Processes and threads

A process - quite unsurprisingly - is a running instance of a computer program. It is what you see in the task manager of your operating system or top.

A process consists of allocated memory, which holds the program code, its data, a heap for dynamic memory allocations, and a lot more. However, it is not the unit of multi-tasking in desktop operating systems.

A thread is the default unit - the task - of CPU usage. Code executed in a single thread is what we usually refer to as sequential or synchronous execution.

Threads are supported by nearly all operating systems (hence the multithreaded qualifier) and can be created with system calls. They have their own call stacks, virtual CPU and (often) local storage but share the application's heap, data, codebase and resources (such as file handles) with the other threads in the same process.

They also serve as the unit of scheduling in the kernel. For this reason, we call them kernel threads, clarifying that they are native to the operating system and scheduled by the kernel, which distinguishes them from user-space threads, also called green threads, which are scheduled by some user space scheduler such as a library or VM.

Kernel Processes and Threads

Most desktop and server operating system kernels use preemptive schedulers, as do the Linux, macOS and Windows kernels. We can assume that threads are preemptively scheduled, distinguishing them from their non-preemptive (cooperative) counterparts, called fibers. This preemptive scheduling is the reason that a hanging process doesn't stall the whole computer.

The hanging process' time slices are interleaved with other processes' and the OS' code, so the system as a whole remains responsive.

“Preemption is the act of temporarily interrupting a task being carried out by a computer system, without requiring its cooperation, and with the intention of resuming the task at a later time.” - Wikipedia

Context switching (switching between threads) is done at frequent intervals by the kernel, creating the illusion that our programs are running in parallel, whereas in reality, they are running concurrently but sequentially, in short slices. Multi-core processors arrived pretty late to the commodity market: funny that Intel's first dual-core processor was released in 2005, while multitasking OSes had already been in wide use for at least 20 years.

CPU vs. I/O

Programs usually don't only consist of numeric, arithmetic and logic computations; in fact, a lot of the time they merely write something to the file system, do network requests or access peripherals such as the console or an external device.

While the first kind of workload is CPU intensive, the latter requires performing I/O for the majority of the time.

CPU bound:

  • scientific computation
  • (in-memory) data analysis
  • simulations

I/O bound:

  • reading from / writing to disk
  • accessing camera, microphone, other devices
  • reading from / writing to network sockets
  • reading from stdin

Doing I/O is a kernel space operation, initiated with a system call, so it results in a privilege context switch.

When an I/O operation is requested with a blocking system call, we are talking about blocking I/O.

This can deteriorate concurrency under certain implementations, concretely those that use many-to-one mapping. This means that all threads in a process share a common kernel thread, which implies that every thread is blocked when one does blocking I/O (because of the above-mentioned switch to kernel mode).

No wonder that modern OSes don't do this. Instead, they use one-to-one mapping, i.e. map a kernel thread to each user-space thread, allowing another thread to run when one makes a blocking system call, which means that they are unaffected by the above adverse effect.

I/O flavors: Blocking vs. non-blocking, sync vs. async

Doing I/O usually consists of two distinct steps:

  • checking the device:

    • blocking: waiting for the device to be ready, or
    • non-blocking: e.g. polling periodically until ready, then
  • transmitting:

    • synchronous: executing the operation (e.g. read or write) initiated by the program, or
    • asynchronous: executing the operation as a response to an event from the kernel (event driven)

You can mix the two steps in every fashion. I'll skip delving into technical details I don't possess; instead, let me just draw an analogy.

Recently I moved to a new flat, so that's where the analogy comes from. Imagine that you have to pack your things and transfer them to your new apartment. This is how it is done with different types of I/O:


Illustration for synchronous blocking I/O

Synchronous, blocking I/O: Start moving right away, possibly getting blocked by traffic on the road. For multiple turns, you are required to repeat the first two steps.


Illustration for synchronous non-blocking I/O

Synchronous, non-blocking I/O: Periodically check the road for traffic, and only move stuff when it's clear. Between the checks, you can do anything else you want rather than wasting your time on the road being blocked by others. For multiple turns, you are required to repeat the first three steps.


Illustration for asynchronous non-blocking I/O

Asynchronous, non-blocking I/O: Hire a moving company. They will ask you periodically if there's anything left to move, then you give them some of your belongings. Between their interruptions, you can do whatever you want. Finally, they notify you when they are done.


Which model suits you the best depends on your application, the complexity you dare to tackle, your OS's support, etc.

Synchronous, blocking I/O has wide support with long established POSIX interfaces and is the most widely understood and easy to use. Its drawback is that you have to rely on thread-based concurrency, which is sometimes undesirable:

  • every thread allocated uses up resources
  • more and more context switching will happen between them
  • the OS has a maximum number of threads.

That's why modern web servers shifted to the async non-blocking model and advocate using a single-threaded event loop for the network interface to maximize throughput. Because the underlying OS APIs are currently platform-specific and quite challenging to use, there are a couple of libraries providing an abstraction layer over them. You can find the list at the end of the article.

If you want to know more about the details of different I/O models, read this detailed article about boosting performance using asynchronous IO!
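
To connect this back to Node.js, here is a quick sketch of how the difference looks from application code, using the built-in fs module:

const fs = require('fs')

// blocking (synchronous): the whole process waits until the file is read,
// nothing else - not even other requests - can run in the meantime
const pkgSync = fs.readFileSync('./package.json', 'utf-8')
console.log('sync read finished, length:', pkgSync.length)

// non-blocking (asynchronous): the read is initiated, the callback runs later,
// and the event loop is free to do other work in between
fs.readFile('./package.json', 'utf-8', (err, data) => {
  if (err) {
    return console.error(err)
  }
  console.log('async read finished, length:', data.length)
})

console.log('this line runs before the async read finishes')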

Busy-waiting, polling and the event loop

Busy-waiting is the act of repeatedly checking a resource, such as I/O, for availability in a tight loop. The absence of the tight loop is what distinguishes polling from busy-waiting.

It's better shown than said:

// tight-loop example
while(pthread_mutex_trylock(&my_mutex) == EBUSY) { }  
// mutex is unlocked
do_stuff();  
// polling example
while(pthread_mutex_trylock(&my_mutex) == EBUSY) {  
  sleep(POLL_INTERVAL);
}
// mutex is unlocked
do_stuff();  

The difference between the two versions of the code is apparent. The sleep function puts the current thread of execution to sleep, yielding control to the kernel to schedule something else to run.

It is also obvious that both of them offer a technique for turning non-blocking code into blocking code, because control won't get past the loop until the mutex becomes free. This means that do_stuff is blocked.

Let's say we have more of these mutexes, or any arbitrary I/O device that can be polled. We can invert the control flow by assigning handlers to be called when the resource is ready. If we periodically check the resources in the loop and execute the associated handlers on completion, we have created what is called an event loop.

// illustrative sketch: pending_event_t, completed_event_t, poll() and handle()
// are this example's own helpers, not standard APIs
pending_event_t *pendings;
completed_event_t *completeds;
struct timespec start, end;
size_t completed_ev_size, pending_ev_size, i;
long loop_quantum_us;
long wait_us;

// loop while we have pending events that are not yet completed
while (pending_ev_size) {
  clock_gettime(CLOCK_MONOTONIC, &start);
  // check whether the pending events have completed already
  for (i = 0; i < pending_ev_size; ++i) {
    poll(&pendings, &pending_ev_size, &completeds, &completed_ev_size);
  }
  // handle completed events, the handlers might add more pending events
  for (i = 0; i < completed_ev_size; ++i) {
    handle(&completeds, &completed_ev_size, &pendings, &pending_ev_size);
  }
  // sleep for the rest of the loop quantum to avoid busy waiting
  clock_gettime(CLOCK_MONOTONIC, &end);
  wait_us = loop_quantum_us - ((end.tv_sec - start.tv_sec) * 1000000 + (end.tv_nsec - start.tv_nsec) / 1000);
  if (wait_us > 0) {
    usleep(wait_us);
  }
}

This kind of control inversion takes some time getting used to. Different frameworks expose various levels of abstractions over it. Some only provide an API for polling events, while others use a more opinionated mechanism like an event loop or a state machine.

TCP server example

The following example will illustrate the differences between working with synchronous, blocking and asynchronous, non-blocking network I/O. It is a dead-simple TCP echo server. After the client connects, every line is echoed back to the socket until the client writes "bye".

Single threaded

The first version uses the standard POSIX procedures of sys/socket.h. The server is single-threaded: it waits until a client connects.

/*  Wait for a connection, then accept() it  */
if ((conn_s = accept(list_s, NULL, NULL)) < 0) { /* exit w err */ }  

Then it reads each line from the socket and echoes it back until the client closes the connection or sends the word "bye" on a line:

bye = 0;

// read from socket and echo back until client says 'bye'
while (!bye) {  
    read_line_from_socket(conn_s, buffer, MAX_LINE - 1);
    if (!strncmp(buffer, "bye\n", MAX_LINE - 1)) bye = 1;
    write_line_to_socket(conn_s, buffer, strlen(buffer));
}

if (close(conn_s) < 0) { /* exit w err */ }  

animation showing the single threaded server

As you can see on the gif, this server is not concurrent at all. It can handle only one client at a time. If another client connects, it has to wait until the preceding one closes the connection.

Multi-threaded

Introducing concurrency without replacing the synchronous blocking networking API calls is done with threads. This is shown in the second version. The only difference between this and the single-threaded version is that here we create a thread for each of the connections.

A real-life server would use thread pools of course.

/*  Wait for a connection, then accept() it  */
if ((conn_s = accept(list_s, NULL, NULL)) < 0) { /* exit w err */ }  
args = malloc(sizeof(int));  
memcpy(args, &conn_s, sizeof(int));  
pthread_create(&thrd, NULL, &handle_socket, args);  

animation showing the multi threaded server

This finally enables us to serve multiple clients at the same time. Hurray!

Single threaded, concurrent

Another way to create a concurrent server is to use libuv. It exposes asynchronous non-blocking I/O calls and an event loop. Although by using it, our code will be coupled to this library, I still find it better than using obscure, platform-dependent APIs. The implementation is still quite complex.

Once we have initialized our TCP server, we register a listener, handle_socket, for incoming connections.

uv_listen((uv_stream_t*) &tcp, SOMAXCONN, handle_socket);  

In that handler, we can accept the socket and register a reader for incoming chunks.

uv_accept(server, (uv_stream_t*) client);  
uv_read_start((uv_stream_t*) client, handle_alloc, handle_read);  

Whenever a chunk is ready and there is data, we register a write handler handle_write that echoes the data back to the socket.

uv_write(write_req, client, &write_ctx->buf, 1, handle_write);  

Otherwise, if the client said bye or we reached EOF, we close the connection. You can see that programming this way is very tedious and error-prone (I definitely made some bugs myself, although I copied a large portion of it). Data created in one function often has to be available somewhere in its continuation (a handler created in the function, but called asynchronously later), which requires manual memory management. I advise you against using libuv directly, unless you are well versed in C programming.

animation showing the single threaded uv-server

Next episode: Concurrency patterns, futures, promises and so on

We've seen how to achieve concurrency in the lowest levels of programming. Take your time to play with the examples. Also, feel free to check out this list I prepared for you:

  • Boost.Asio
    • C++
    • network and low-level I/O.
    • Boost Software License
  • Seastar
    • C++
    • network and filesystem I/O, multi-core support, fibers. Used by the ScyllaDB project.
    • APL 2.0
  • libuv
    • C
    • network and filesystem I/O, threading and synchronization primitives. Used by Node.js.
    • MIT
  • Netty
    • Java
    • network I/O
    • APL 2.0
  • mio
    • Rust
    • network I/O. It is used by the high-level tokio and rotor networking libraries.
    • MIT
  • Twisted
    • Python
    • network I/O
    • MIT

In the next chapter, we continue with some good ol' concurrency patterns and new ones as well. We will see how to use futures and promises for threads and continuations and will also talk about the reactor and proactor design patterns.

If you have any comments or questions about this topic, please let me know in the comment section below.

Yarn vs npm - The State of Node.js Package Managers


With the v7.4 release, npm 4 became the bundled, default package manager for Node.js. In the meantime, Facebook released their own package manager solution, called Yarn.

Let's take a look at the state of Node.js package managers, what they can do for you, and when you should pick which one!

Yarn - the new kid on the block

Fast, reliable and secure dependency management - this is the promise of Yarn, the new dependency manager created by the engineers of Facebook.

But can Yarn live up to the expectations?

Yarn - the node.js package manager

Installing Yarn

There are several ways of installing Yarn. If you have npm installed, you can just install Yarn with npm:

npm install yarn --global  

However, the recommended way by the Yarn team is to install it via your native OS package manager - if you are on a Mac, it will probably be brew:

brew update  
brew install yarn  

Yarn Under the Hood

Yarn has a lot of performance and security improvements under the hood. Let's see what these are!

Offline cache

When you install a package using Yarn (using yarn add packagename), it places the package on your disk. During the next install, this package will be used instead of sending an HTTP request to get the tarball from the registry.

Your cached module will be put into ~/.yarn-cache, prefixed with the registry name and postfixed with the module's version.

This means that if you install the 4.4.5 version of express with Yarn, it will be put into ~/.yarn-cache/npm-express-4.4.5.


Deterministic Installs

Yarn uses lockfiles (yarn.lock) and a deterministic install algorithm. We can say goodbye to the "but it works on my machine" bugs.

The lockfile looks something like this:

Yarn lockfile

It contains the exact version numbers of all your dependencies - just like with an npm shrinkwrap file.

License checks

Yarn comes with a handy license checker, which can become really powerful in case you have to check the licenses of all the modules you depend on.

yarn licenses

Potential issues/questions

Yarn is still in its early days, so it’s no surprise that there are some questions arising when you start using it.

What’s going on with the default registry?

By default, the Yarn CLI uses a different registry, and not the original one: https://registry.yarnpkg.com. So far there is no explanation on why it does not use the same registry.

Does Facebook have plans to make incompatible API changes and split the community?

Contributing back to npm?

One of the most logical questions that can come up when talking about Yarn is: why don't you talk with the CLI team at npm and work together?

If the problem is speed, I am sure all npm users would like to get those improvements as well.

When it comes to deterministic installs, instead of coming up with a new lockfile, npm-shrinkwrap.json could have been fixed.

Why the strange versioning?

In the world of Node.js and npm, versions start at 1.0.0.

At the time of writing this article, Yarn is at 0.18.1.

Is something missing to make Yarn stable? Does Yarn simply not follow semver?

npm 4

npm is the default package manager we all know, and it has been bundled with each Node.js release since v7.4.

Updating npm

To start using npm version 4, you just have to update your current CLI version:

npm install npm -g  

At the time of writing this article, this command will install npm version 4.1.1, which was released on 12/11/2016. Let's see what changed in this version!

Changes since version 3

  • npm search is now reimplemented to stream results, and sorting is no longer supported,
  • npm scripts no longer prepend the path of the node executable used to run npm before running scripts,
  • prepublish has been deprecated - you should use prepare from now on,
  • npm outdated returns 1 if it finds outdated packages,
  • partial shrinkwraps are no longer supported - the npm-shrinkwrap.json is considered a complete manifest,
  • Node.js 0.10 and 0.12 are no longer supported,
  • a new command, npm doctor, was added, which diagnoses the user's environment and recommends solutions for potential npm-related problems

As you can see, the team at npm was quite busy as well - both npm and Yarn made great progress in the past months.

Conclusion

It is great to see a new, open-source npm client - no doubt, a lot of effort went into making Yarn great!

Hopefully, we will see the improvements of Yarn incorporated into npm as well, so users of both will benefit from each other's improvements.

Yarn vs. npm - Which one to pick?

If you are working on proprietary software, it does not really matter which one you use. With npm, you can use npm-shrinkwrap.json, while with Yarn you can use yarn.lock.

The team at Yarn published a great article on why lockfiles should be committed all the time, I recommend checking it out: https://yarnpkg.com/blog/2016/11/24/lockfiles-for-all


Node.js Interview Questions and Answers (2017 Edition)


Two years ago we published our first article on common Node.js Interview Questions and Answers. Since then a lot of things improved in the JavaScript and Node.js ecosystem, so it was time to update it.

Important Disclaimers

It is never a good practice to judge someone just by questions like these, but these can give you an overview of the person's experience in Node.js.

But obviously, these questions do not give you the big picture of someone's mindset and thinking.

I think that a real-life problem can show a lot more of a candidate's knowledge - so we encourage you to do pair programming with the developers you are going to hire.

Finally and most importantly: we are all humans, so make your hiring process as welcoming as possible. These questions are not meant to be used as "Questions & Answers" but just to drive the conversation.

Node.js Interview Questions for 2017

  • What is an error-first callback?
  • How can you avoid callback hells?
  • What are Promises?
  • What tools can be used to assure consistent style? Why is it important?
  • When should you use npm and when Yarn?
  • What's a stub? Name a use case!
  • What's a test pyramid? Give an example!
  • What's your favorite HTTP framework and why?
  • How can you secure your HTTP cookies against XSS attacks?
  • How can you make sure your dependencies are safe?

The Answers

What is an error-first callback?

Error-first callbacks are used to pass both errors and data. You have to pass the error as the first parameter, and it has to be checked to see if something went wrong. Additional arguments are used to pass data.

fs.readFile(filePath, function(err, data) {  
  if (err) {
    // handle the error, the return is important here
    // so execution stops here
    return console.log(err)
  }
  // use the data object
  console.log(data)
})

How can you avoid callback hells?

There are lots of ways to solve the issue of callback hells:
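
A minimal sketch of one common approach - flattening the nesting by chaining Promises (readUser and getOrders are hypothetical helpers, standing in for real database or API calls):

// hypothetical helpers that return Promises
function readUser (id) {
  return Promise.resolve({ id: id, name: 'Node Hero' })
}

function getOrders (user) {
  return Promise.resolve([{ userId: user.id, item: 'book' }])
}

// instead of nesting callbacks level after level, chain the steps
readUser(1)
  .then((user) => getOrders(user))
  .then((orders) => console.log(orders))
  .catch((err) => console.error(err))

Named functions, control-flow libraries, generators and (in newer runtimes) async/await serve the same goal: keeping the code flat and the error handling in one place.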

What are Promises?

Promises are a concurrency primitive, first described in the 80s. Now they are part of most modern programming languages to make your life easier. Promises can help you better handle async operations.

An example can be the following snippet, which after 100ms prints out the result string to the standard output. Also, note the catch, which can be used for error handling. Promises are chainable.

new Promise((resolve, reject) => {  
  setTimeout(() => {
    resolve('result')
  }, 100)
})
  .then(console.log)
  .catch(console.error)

What tools can be used to assure consistent style? Why is it important?

When working in a team, consistent style is important, so team members can modify more projects easily, without having to get used to a new style each time.

Also, it can help eliminate programming issues using static analysis.

Tools that can help:

If you’d like to be even more confident, I suggest you to learn and embrace the JavaScript Clean Coding principles as well!


What's a stub? Name a use case!

Stubs are functions/programs that simulate the behaviors of components/modules. Stubs provide canned answers to function calls made during test cases.

An example can be writing a file, without actually doing so.

var fs = require('fs')

var writeFileStub = sinon.stub(fs, 'writeFile', function (path, data, cb) {  
  return cb(null)
})

expect(writeFileStub).to.be.called  
writeFileStub.restore()  

What's a test pyramid? Give an example!

A test pyramid describes the ratio of how many unit tests, integration tests and end-to-end tests you should write.

An example for an HTTP API may look like this:

  • lots of low-level unit tests for models (dependencies are stubbed),
  • fewer integration tests, where you check how your models interact with each other (dependencies are not stubbed),
  • even fewer end-to-end tests, where you call your actual endpoints (dependencies are not stubbed).

What's your favorite HTTP framework and why?

There is no right answer for this. The goal here is to understand how deeply the candidate knows the framework she/he uses, and whether she/he can explain the pros and cons of picking it.

When are background/worker processes useful? How can you handle worker tasks?

Worker processes are extremely useful if you'd like to do data processing in the background, like sending out emails or processing images.

There are lots of options for this like RabbitMQ or Kafka.

How can you secure your HTTP cookies against XSS attacks?

XSS occurs when the attacker injects executable JavaScript code into the HTML response.

To mitigate these attacks, you have to set flags on the set-cookie HTTP header:

  • HttpOnly - this attribute is used to help prevent attacks such as cross-site scripting since it does not allow the cookie to be accessed via JavaScript.
  • secure - this attribute tells the browser to only send the cookie if the request is being sent over HTTPS.

So it would look something like this: Set-Cookie: sid=<cookie-value>; HttpOnly. If you are using Express with the cookie-session middleware, this works by default.
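
As a minimal sketch with Express, assuming you set the session cookie yourself with res.cookie (the sid value below is a placeholder - generate and store a real session id instead):

const express = require('express')
const app = express()

app.get('/login', (req, res) => {
  const sessionId = 'generated-session-id' // placeholder for a real session id
  res.cookie('sid', sessionId, {
    httpOnly: true, // not accessible from JavaScript
    secure: true    // only sent over HTTPS
  })
  res.send('logged in')
})

app.listen(3000)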

How can you make sure your dependencies are safe?

When writing Node.js applications, ending up with hundreds or even thousands of dependencies can easily happen.
For example, if you depend on Express, you depend on 27 other modules directly, and of course on their dependencies as well, so manually checking all of them is not an option!

The only option is to automate the update / security audit of your dependencies. For that there are free and paid options:

Node.js Interview Puzzles

The following part of the article is useful if you’d like to prepare for an interview that involves puzzles, or tricky questions.

What's wrong with the code snippet?

new Promise((resolve, reject) => {  
  throw new Error('error')
}).then(console.log)

The Solution

There is no catch after the then, so the error will be a silent one - there will be no indication that an error was thrown.

To fix it, you can do the following:

new Promise((resolve, reject) => {  
  throw new Error('error')
}).then(console.log).catch(console.error)

If you have to debug a huge codebase, and you don't know which Promise can potentially hide an issue, you can use the unhandledRejection hook. It will print out all unhandled Promise rejections.

process.on('unhandledRejection', (err) => {  
  console.log(err)
})

What's wrong with the following code snippet?

function checkApiKey (apiKeyFromDb, apiKeyReceived) {  
  if (apiKeyFromDb === apiKeyReceived) {
    return true
  }
  return false
}

The Solution

When you compare security credentials it is crucial that you don't leak any information, so you have to make sure that you compare them in fixed time. If you fail to do so, your application will be vulnerable to timing attacks.

But why does it work like that?

V8, the JavaScript engine used by Node.js, tries to optimize the code you run from a performance point of view. It starts comparing the strings character by character, and once a mismatch is found, it stops the comparison operation. So the more characters the attacker gets right, the longer the comparison takes.

To solve this issue, you can use the npm module called cryptiles.

function checkApiKey (apiKeyFromDb, apiKeyReceived) {  
  return cryptiles.fixedTimeComparison(apiKeyFromDb, apiKeyReceived)
}

What's the output of the following code snippet?

Promise.resolve(1)  
  .then((x) => x + 1)
  .then((x) => { throw new Error('My Error') })
  .catch(() => 1)
  .then((x) => x + 1)
  .then((x) => console.log(x))
  .catch(console.error)

The Answer

The short answer is 2 - however with this question I'd recommend asking the candidates to explain what will happen line-by-line to understand how they think. It should be something like this:

  1. A new Promise is created, that will resolve to 1.
  2. The resolved value is incremented with 1 (so it is 2 now), and returned instantly.
  3. The resolved value is discarded, and an error is thrown.
  4. The error is discarded, and a new value (1) is returned.
  5. Execution did not stop after the catch: since the exception was already handled, the chain continued, and a new, incremented value (2) is returned.
  6. The value is printed to the standard output.
  7. This line won't run, as there was no exception.

A day may work better than questions

Spending at least half a day with your possible next hire is worth more than a thousand of these questions.

Once you do that, you will better understand if the candidate is a good cultural fit for the company and has the right skill set for the job.

Do you miss anything? Let us know!

What was the craziest interview question you had to answer? What's your favorite question / puzzle to ask? Let us know in the comments! :)


Node.js Best Practices - How to Become a Better Developer in 2017


A year ago we wrote a post on How to Become a Better Node.js Developer in 2016 which was a huge success - so we thought now it is time to revisit the topics and prepare for 2017!

In this article, we will go through the most important Node.js best practices for 2017, topics that you should care about and educate yourself in. Let’s start!

Node.js Best Practices for 2017

Use ES2015

Last year we advised you to use ES2015 - however, a lot has changed since.

Back then, Node.js v4 was the LTS version, and it had support for 57% of the ES2015 functionality. A year passed and ES2015 support grew to 99% with Node v6.

If you are on the latest Node.js LTS version, you don't need Babel anymore to use the whole feature set of ES2015. But even so, on the client side you'll probably still need it!

For more information on which Node.js version supports which ES2015 features, I'd recommend checking out node.green.
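
As a small sketch, here are a few ES2015 features you can already run natively on Node v6, without Babel:

'use strict'

// const/let, destructuring, arrow functions, default parameters and
// template literals all work out of the box on Node.js v6
const { join } = require('path')

const greet = (name = 'Node Hero') => `Hello ${name}!`

console.log(greet())
console.log(join(__dirname, 'package.json'))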

Use Promises

Promises are a concurrency primitive, first described in the 80s. Now they are part of most modern programming languages to make your life easier.

Imagine the following example code that reads a file, parses it, and prints the name of the package. Using callbacks, it would look something like this:

const fs = require('fs')

fs.readFile('./package.json', 'utf-8', function (err, data) {
  if (err) {
    return console.log(err)
  }

  let pkg
  try {
    pkg = JSON.parse(data)
  } catch (ex) {
    return console.log(ex)
  }
  console.log(pkg.name)
})

Wouldn't it be nice to rewrite the snippet into something more readable? Promises help you with that:

fs.readFileAsync('./package.json').then(JSON.parse).then((data) => {  
  console.log(data.name)
})
.catch((e) => {
  console.error('error reading/parsing file', e)
})

Of course, for now, the fs API does not have a readFileAsync method that returns a Promise. To make it work, you have to wrap it with a helper like Bluebird's promisifyAll.
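
If you'd rather not pull in a library, here is a minimal, hand-rolled sketch of the same idea - readFileAsync below is our own wrapper, not part of the fs API:

const fs = require('fs')

// wrap the callback-based fs.readFile into a function that returns a Promise
function readFileAsync (path, options) {
  return new Promise((resolve, reject) => {
    fs.readFile(path, options, (err, data) => {
      if (err) {
        return reject(err)
      }
      resolve(data)
    })
  })
}

readFileAsync('./package.json', 'utf-8')
  .then(JSON.parse)
  .then((pkg) => console.log(pkg.name))
  .catch((e) => console.error('error reading/parsing file', e))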

Use the JavaScript Standard Style

When it comes to code style, it is crucial to have a company-wide standard, so when you have to change projects, you can be productive starting from day zero, without having to get used to a new style each time because of different presets.

At RisingStack we have incorporated the JavaScript Standard Style in all of our projects.

Node.js best practices - The Standard JS Logo

With Standard, there are no decisions to make and no .eslintrc, .jshintrc, or .jscsrc files to manage. It just works. You can find the Standard rules here.





Use Docker - Containers are Production Ready in 2017!

You can think of Docker images as deployment artifacts - Docker containers wrap up a piece of software in a complete filesystem that contains everything it needs to run: code, runtime, system tools, system libraries – anything you can install on a server.

But why should you start using Docker?

  • it enables you to run your applications in isolation,
  • as a consequence, it makes your deployments more secure,
  • Docker images are lightweight,
  • they enable immutable deployments,
  • and with them, you can mirror production environments locally.

To get started with Docker, head over to the official getting started tutorial. Also, for orchestration we recommend checking out our Kubernetes best practices article.

Monitor your Applications

If something breaks in your Node.js application, you should be the first one to know about it, not your customers.

One of the newer open-source solutions that can help you achieve this is Prometheus. Prometheus is an open-source systems monitoring and alerting toolkit originally built at SoundCloud. The only downside of Prometheus is that you have to set it up and host it yourself.

If you are looking for an out-of-the-box solution with support, Trace by RisingStack is a great solution developed by us.

Trace will help you with

  • alerting,
  • memory and CPU profiling in production systems,
  • distributed tracing and error searching,
  • performance monitoring,
  • and keeping your npm packages secure!

Node.js Best Practices for 2017 - Use Trace and Profiling

Use Messaging for Background Processes

If you are using HTTP for sending messages, then whenever the receiving party is down, all your messages are lost. However, if you pick a persistent transport layer, like a message queue to send messages, you won't have this problem.

If the receiving service is down, the messages will be kept, and can be processed later. If the service is not down, but there is an issue, processing can be retried, so no data gets lost.

An example: you'd like to send out thousands of emails. In this case, you would just have to put some basic information on the queue, like the target email address and the first name, and a background worker could easily put together the email's content and send it out.

What's really great about this approach is that you can scale it whenever you want, and no traffic will be lost. If you see that there are millions of emails to be sent out, you can add extra workers, and they can consume the very same queue.
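
As a minimal sketch of the email example - assuming a RabbitMQ instance running locally and the amqplib client - the producer only enqueues the bare minimum of data, and a worker (typically a separate process) consumes it:

const amqp = require('amqplib')

// producer: put the data needed to build the email on the queue
amqp.connect('amqp://localhost')
  .then((connection) => connection.createChannel())
  .then((channel) => {
    return channel.assertQueue('emails').then(() => {
      const job = { to: 'jane@example.com', firstName: 'Jane' }
      channel.sendToQueue('emails', Buffer.from(JSON.stringify(job)))
    })
  })
  .catch(console.error)

// worker: consume the queue and send the emails
amqp.connect('amqp://localhost')
  .then((connection) => connection.createChannel())
  .then((channel) => {
    return channel.assertQueue('emails').then(() => {
      return channel.consume('emails', (msg) => {
        const job = JSON.parse(msg.content.toString())
        // put together and send out the email here
        console.log('sending email to', job.to)
        channel.ack(msg)
      })
    })
  })
  .catch(console.error)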

You have lots of options for messaging queues:

Use the Latest LTS Node.js version

To get the best of the two worlds (stability and new features) we recommend using the latest LTS (long-term support) version of Node.js. As of writing this article, it is version 6.9.2.

To easily switch Node.js version, you can use nvm. Once you installed it, switching to LTS takes only two commands:

nvm install 6.9.2  
nvm use 6.9.2  


Use Semantic Versioning

We conducted a Node.js Developer Survey a few months ago, which allowed us to get some insights on how people use semantic versioning.

Unfortunately, we found out that only 71% of our respondents use semantic versioning when publishing/consuming modules. This number should be higher in our opinion - everyone should use it! Why? Because updating packages without semver can easily break Node.js apps.

Node.js Best Practices for 2017 - Semantic versioning survey results

Versioning your application / modules is critical - your consumers must know if a new version of a module is published and what needs to be done on their side to get the new version.

This is where semantic versioning comes into the picture. Given a version number MAJOR.MINOR.PATCH, increment the:

  • MAJOR version when you make incompatible API changes,
  • MINOR version when you add functionality (without breaking the API), and
  • PATCH version when you make backwards-compatible bug fixes.

npm also uses SemVer when installing your dependencies, so when you publish modules, always make sure to respect it. Otherwise, you can break other people's applications!
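
As a quick illustration, using the semver package (which npm itself relies on for resolving version ranges):

const semver = require('semver')

// "^1.2.3" allows MINOR and PATCH updates, but not a new MAJOR version
console.log(semver.satisfies('1.3.0', '^1.2.3')) // true
console.log(semver.satisfies('2.0.0', '^1.2.3')) // false

// "~1.2.3" allows PATCH updates only
console.log(semver.satisfies('1.2.9', '~1.2.3')) // true
console.log(semver.satisfies('1.3.0', '~1.2.3')) // false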

Secure Your Applications

Securing your users' and customers' data should be one of your top priorities in 2017. In 2016 alone, hundreds of millions of user accounts were compromised as a result of low security.

To get started with Node.js Security, read our Node.js Security Checklist, which covers topics like:

  • Security HTTP Headers,
  • Brute Force Protection,
  • Session Management,
  • Insecure Dependencies,
  • or Data Validation.

After you’ve embraced the basics, check out my Node Interactive talk on Surviving Web Security with Node.js!

Learn Serverless

Serverless started with the introduction of AWS Lambda. Since then, it has been growing fast, with a blooming open-source community.

In the next years, serverless will become a major factor for building new applications. If you'd like to stay on the edge, you should start learning it today.

One of the most popular solutions is the Serverless Framework, which helps in deploying AWS Lambda functions.

Attend and Speak at Conferences and Meetups

Attending conferences and meetups are great ways to learn about new trends, use-cases or best practices. Also, it is a great forum to meet new people.

To take it one step forward, I'd like to encourage you to speak at one of these events as well!

As public speaking is tough, and “imagine everyone's naked” is the worst advice, I'd recommend checking out speaking.io for tips on public speaking!

Become a better Node.js developer in 2017

As 2017 will be the year of Node.js, we’d like to help you get the most out of it!

We just launched a new study program called "Owning Node.js" which helps you to become confident in:

  • Async Programming with Node.js
  • Creating servers with Express
  • Using Databases with Node
  • Project Structuring and building scalable apps



If you have any questions about the article, find me in the comments section!

The 10 Most Important Node.js Articles of 2016


2016 was an exciting year for Node.js developers. I mean - just take a look at this picture:

Every Industry has adopted Node.js

Looking back through the 6-year-long history of Node.js, we can tell that our favorite framework has finally matured to be used by the greatest enterprises, from all around the world, in basically every industry.

Another piece of great news is that Node.js is the biggest open-source platform ever - with 15 million+ downloads/month and more than a billion package downloads/week. Contributions have risen as well: by now, more than 1,100 developers have built Node.js into the platform it is today.

To summarize this year, we collected the 10 most important articles we recommend to read. These include the biggest scandals, events, and improvements surrounding Node.js in 2016.

Let's get started!

#1: How one developer broke Node, Babel and thousands of projects in 11 lines of JavaScript

Programmers were shocked looking at broken builds and failed installations after Azer Koçulu unpublished more than 250 of his modules from NPM in March 2016 - breaking thousands of modules, including Node and Babel.

Koçulu deleted his code because one of his modules was called Kik - same as the instant messaging app - so the lawyers of Kik claimed brand infringement, and then NPM took the module away from him.

"This situation made me realize that NPM is someone’s private land where corporate is more powerful than the people, and I do open source because Power To The People." - Azer Koçulu

One of Azer's modules was left-pad, which padded out the left-hand side of strings with zeroes or spaces. Unfortunately, thousands of modules depended on it.

You can read the rest of this story in The Register's great article, with updates on the outcome of this event.

#2: Facebook partners with Google to launch a new JavaScript package manager

In October 2016, Facebook & Google launched Yarn, a new package manager for JavaScript.

The reason? There were a couple of fundamental problems with npm for Facebook's workflow.

  • At Facebook’s scale npm didn’t quite work well.
  • npm slowed down the company’s continuous integration workflow.
  • Checking all of the modules into a repository was also inefficient.
  • npm is, by design, nondeterministic — yet Facebook’s engineers needed a consistent and reliable system for their DevOps workflow.

Instead of hacking around npm’s limitations, Facebook wrote Yarn from scratch:

  • Yarn does a better job at caching files locally.
  • Yarn is also able to parallelize some of its operations, which speeds up the install process for new modules.
  • Yarn uses lockfiles and a deterministic install algorithm to create consistent file structures across machines.
  • For security reasons, Yarn does not allow developers who write packages to execute other code that’s needed as part of the install process.

Yarn, which promises to even give developers that don’t work at Facebook’s scale a major performance boost, still uses the npm registry and is essentially a drop-in replacement for the npm client.

You can read the full article with the details on TechCrunch.

#3: Debugging Node.js with Chrome DevTools

New support for Node.js debuggability landed in Node.js master in May.

To use the new debugging tool, you have to

  • nvm install node
  • Run Node with the inspect flag: node --inspect index.js
  • Open the provided URL you got, starting with “chrome-devtools://..”

Read the great tutorial from Paul Irish to get all the features and details right!

#4: How I built an app with 500,000 users in 5 days on a $100 server

Jonathan Zarra, the creator of GoChat for Pokémon GO, reached 1 million users in 5 days. Zarra had a hard time paying for the servers (around $4,000 / month) that were necessary to host 1M active users.

He never thought to get this many users. He built this app as an MVP, caring about scalability later. He built it to fail.

Zarra was already talking to VCs to grow and monetize his app. Since he built the app as an MVP, he thought he could care about scalability later.

He was wrong.

Thanks to its poor design, GoChat was unable to scale to this many users and went down. A lot of users were lost, along with a lot of money.

500,000 users in 5 days on $100/month server

Erik Duindam, the CTO of Unboxd has been designing and building web platforms for hundreds of millions of active users throughout his whole life.

Frustrated by the poor design and sad fate of Zarra's GoChat, Erik decided to build his own solution, GoSnaps: The Instagram/Snapchat for Pokémon GO.

Erik was able to build a scalable MVP with Node.js in 24 hours, which could easily handle 500k unique users.

The whole setup ran on one medium Google Cloud server of $100/month, plus (cheap) Google Cloud Storage for the storage of images - and it was still able to perform exceptionally well.

GoSnap - The Node.js MVP that can Scale

How did he do it? Well, you can read the full story for the technical details:


#5: Getting Started with Node.js - The Node Hero Tutorial Series

The aim of the Node Hero tutorial series is to help novice developers to get started with Node.js and deliver software products with it!

Node Hero - Getting started with Node.js

You can find the full table of contents below:

  1. Getting started with Node.js
  2. Using NPM
  3. Understanding async programming
  4. Your first Node.js HTTP server
  5. Node.js database tutorial
  6. Node.js request module tutorial
  7. Node.js project structure tutorial
  8. Node.js authentication using Passport.js
  9. Node.js unit testing tutorial
  10. Debugging Node.js applications
  11. Node.js Security Tutorial
  12. Deploying Node.js application to a PaaS
  13. Monitoring Node.js Applications

#6: Using RabbitMQ & AMQP for Distributed Work Queues in Node.js

This tutorial helps you to use RabbitMQ to coordinate work between work producers and work consumers.

Unlike Redis, RabbitMQ's sole purpose is to provide a reliable and scalable messaging solution with many features that are not present or hard to implement in Redis.

RabbitMQ is a server that runs locally, or in some node on the network. The clients can be work producers, work consumers or both, and they will talk to the server using a protocol named Advanced Messaging Queueing Protocol (AMQP).

You can read the full tutorial here.

#7: Node.js, TC-39, and Modules

James M Snell, IBM Technical Lead for Node.js, attended his first TC-39 meeting in late September.

The reason?

One of the newer JavaScript language features defined by TC-39 — namely, Modules — has been causing the Node.js core team a bit of trouble.

James and Bradley Farias (@bradleymeck) have been trying to figure out how to best implement support for ECMAScript Modules (ESM) in Node.js without causing more trouble and confusion than it would be worth.

ECMAScript modules vs. CommonJS

Because of the complexity of the issues involved, sitting down face to face with the members of TC-39 was deemed to be the most productive path forward.

The full article discusses what they found and understood from this conversation.

#8: The Node.js Developer Survey & its Results

We at Trace by RisingStack conducted a survey during 2016 Summer to find out how developers use Node.js.

The results show that MongoDB, RabbitMQ, AWS, Jenkins, Docker and Amazon Container Services are the go-to choices for developing, containerizing and shipping Node.js applications.

The results also reveal Node.js developers' major pain point: debugging.

Node.js Survey - How do you identify issues in your app? Using logs.

You can read the full article with the Node.js survey results and graphs here.

#9: The Node.js Foundation Pledges to Manage Node Security Issues with New Collaborative Effort

The Node Foundation announced at Node.js Interactive North America that it will oversee the Node.js Security Project which was founded by Adam Baldwin and previously managed by ^Lift.

As part of the Node.js Foundation, the Node.js Security Project will provide a unified process for discovering and disclosing security vulnerabilities found in the Node.js module ecosystem. Governance for the project will come from a working group within the foundation.

The Node.js Foundation will take over the following responsibilities from ^Lift:

  • Maintaining an entry point for ecosystem vulnerability disclosure;
  • Maintaining a private communication channel for vulnerabilities to be vetted;
  • Vetting participants in the private security disclosure group;
  • Facilitating ongoing research and testing of security data;
  • Owning and publishing the base dataset of disclosures, and
  • Defining a standard for the data, which tool vendors can build on top of, and which security vendors can add data and value to as well.

You can read the full article discussing every detail on The New Stack.

#10: The Node.js Maturity Checklist

The Node.js Maturity Checklist gives you a starting point to understand how well Node.js is adopted in your company.

The checklist follows your adoption through establishing company culture, teaching your employees, setting up your infrastructure, writing code and running the application.

You can find the full Node.js Maturity Checklist here.