Node.js at Scale - npm Best Practices

Node Hero was a Node.js tutorial series focused on teaching the most essential Node.js best practices, so that anyone can start developing applications with it.

With our new series, called Node.js at Scale, we are creating a collection of articles focusing on the needs of companies with bigger Node.js installations, and of developers who have already learned the basics of Node.

In the first chapter of Node.js at Scale you are going to learn the best practices on using npm as well as tips and tricks that can save you a lot of time on a daily basis.

Upcoming chapters for the Node.js at Scale series:

  • Using npm
    • npm Tricks and Best Practices (you are reading it now)
    • SemVer and Module Publishing
    • Dependency Management
  • Node.js Internals Deep Dive
    • The Event Loop
    • Garbage Collection
    • Writing Native Modules
  • Building
    • Structuring Node.js Applications
    • Clean Code
    • Handling Async
    • Event sourcing
    • Command Query Responsibility Segregation
  • Testing
    • Unit testing
    • End-to-end testing
  • Node.js in Production
    • Monitoring Node.js Applications
    • Debugging Node.js Applications
    • Profiling Node.js Applications
  • Microservices
    • Request Signing
    • Distributed Tracing
    • API Gateways

npm Best Practices

npm install is the most common way of using the npm cli - but it has a lot more to offer! In this chapter of Node.js at Scale you will learn how npm can help you during the full lifecycle of your application - from starting a new project through development and deployment.

#0 Know your npm

Before diving into the topics, let's look at a few commands that show which version of npm you are running and what commands are available.

npm versions

To get the version of the npm cli you are actively using, you can do the following:

$ npm --version
2.13.2  

npm can return a lot more than just its own version - it can return the version of the current package, the Node.js version you are using and OpenSSL or V8 versions:

$ npm version
{ bleak: '1.0.4',
  npm: '2.15.0',
  ares: '1.10.1-DEV',
  http_parser: '2.5.2',
  icu: '56.1',
  modules: '46',
  node: '4.4.2',
  openssl: '1.0.2g',
  uv: '1.8.0',
  v8: '4.5.103.35',
  zlib: '1.2.8' }

npm help

Like most CLI toolkits, npm has great built-in help functionality as well. Descriptions and synopses are always available; these are essentially man pages.

$ npm help test
NAME  
       npm-test - Test a package

SYNOPSIS  
           npm test [-- <args>]

           aliases: t, tst

DESCRIPTION  
       This runs a package's "test" script, if one was provided.

       To run tests as a condition of installation, set the npat config to true.

#1 Start new projects with npm init

When starting a new project, npm init can help you a lot by interactively creating a package.json file. It will prompt you with questions, for example about the project's name or description. However, there is a quicker solution!

$ npm init --yes

If you use npm init --yes, it won't prompt for anything; it just creates a package.json with your defaults. To set these defaults, you can use the following commands:

npm config set init.author.name YOUR_NAME  
npm config set init.author.email YOUR_EMAIL  
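
With those defaults in place, a package.json generated by npm init --yes will look something like this (the name comes from the directory you run it in, and the exact fields depend on your npm version, so treat this as an illustration):

{
  "name": "my-project",
  "version": "1.0.0",
  "description": "",
  "main": "index.js",
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "author": "YOUR_NAME <YOUR_EMAIL>",
  "license": "ISC"
}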

#2 Finding npm packages

Finding the right packages can be quite challenging - there are hundreds of thousands of modules you can choose from. We know this from experience, and developers participating in our latest Node.js survey also told us that selecting the right npm package is frustrating. Let's try to pick a module that helps us send HTTP requests!

One website that makes this task a lot easier is npms.io. It shows metrics like quality, popularity and maintenance. These are calculated based on whether a module has outdated dependencies, whether it has linters configured, whether it is covered with tests, and when the most recent commit was made.

finding npm packages

#3 Investigate npm packages

Once we picked our module (which will be the request module in our example), we should take a look at the documentation, and check out the open issues to get a better picture of what we are going to require into our application. Don’t forget that the more npm packages you use, the higher the risk of having a vulnerable or malicious one. If you’d like to read more on npm-related security risks, read our related guideline.

If you'd like to open the homepage of the module from the cli you can do:

$ npm home request

To check open issues or the publicly available roadmap (if there’s any), you can try this:

$ npm bugs request

Alternatively, if you'd just like to check a module's git repository, type this:

$ npm repo request

#4 Saving dependencies

Once you have found the package you want to include in your project, you have to install and save it. The most common way of doing that is by using npm install request.

If you'd like to take that one step forward and automatically add it to your package.json file, you can do:

$ npm install request --save

npm saves your dependencies with the ^ prefix by default. This means that during the next npm install, the latest version without a major version bump will be installed. To change this behaviour, you can:

$ npm config set save-prefix='~'

In case you'd like to save the exact version, you can try:

$ npm config set save-exact true
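
To make the difference concrete, here is how the same dependency could end up in your package.json under each setting (the version number is just an example):

"request": "^2.74.0"   (default ^ prefix: any 2.x.x release at or above 2.74.0)
"request": "~2.74.0"   (with save-prefix='~': only 2.74.x patch releases)
"request": "2.74.0"    (with save-exact: exactly this version)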

#5 Lock down dependencies

Even if you save modules with exact version numbers as shown in the previous section, you should be aware that most npm module authors don't. That's totally fine: they do it to get patches and features automatically.

This can easily become problematic for production deployments: it's possible to have different versions locally than in production if someone has just released a new version in the meantime. The problem arises when this new version contains a bug that affects your production system.

To solve this issue, you may want to use npm shrinkwrap. It will generate an npm-shrinkwrap.json that contains not just the exact versions of the modules installed on your machine, but also the version of its dependencies, and so on. Once you have this file in place, npm install will use it to reproduce the same dependency tree.
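
As a rough illustration, a heavily simplified (and made-up) npm-shrinkwrap.json entry looks like this, with each dependency listing its own "dependencies" the same way so the whole tree is locked down:

{
  "name": "my-server",
  "version": "1.0.0",
  "dependencies": {
    "request": {
      "version": "2.74.0",
      "from": "request@^2.74.0",
      "resolved": "https://registry.npmjs.org/request/-/request-2.74.0.tgz"
    }
  }
}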

#6 Check for outdated dependencies

To check for outdated dependencies, npm comes with a built-in tool: the npm outdated command. You have to run it in the directory of the project you'd like to check.

$ npm outdated
Package                   Current Wanted  Latest Location
conventional-changelog    0.5.3   0.5.3   1.1.0  @risingstack/docker-node  
eslint-config-standard    4.4.0   4.4.0   6.0.1  @risingstack/docker-node  
eslint-plugin-standard    1.3.1   1.3.1   2.0.0  @risingstack/docker-node  
rimraf                    2.5.1   2.5.1   2.5.4  @risingstack/docker-node  

Once you maintain more projects, it can become an overwhelming task to keep all your dependencies up to date in each of your projects. To automate this task, you can use Greenkeeper which will automatically send pull requests to your repositories once a dependency is updated.

#7 No devDependencies in production

Development dependencies are called development dependencies for a reason - you don't have to install them in production. This makes your deployment artifacts smaller and more secure, as you will have fewer modules in production that can have security problems.

To install production dependencies only, run this:

$ npm install --production

Alternatively, you can set the NODE_ENV environment variable to production:

$ NODE_ENV=production npm install
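
For reference, runtime and development dependencies live in separate sections of package.json, and only the first one is installed by the commands above (the module names and versions are just examples, added with npm install request --save and npm install mocha --save-dev respectively):

"dependencies": {
  "request": "^2.74.0"
},
"devDependencies": {
  "mocha": "^3.0.2"
}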

#8 Secure your projects and tokens

If you use npm with a logged-in user, your npm token will be placed in the .npmrc file. As a lot of developers store dotfiles on GitHub, sometimes these tokens get published by accident. Currently, there are thousands of results when searching for the .npmrc file on GitHub, and a huge percentage of them contain tokens. If you have dotfiles in your repositories, double check that your credentials are not pushed!

Another source of possible security issues are the files that are published to npm by accident. By default, npm respects the .gitignore file, and files matching those rules won't be published. However, if you add a .npmignore file, it will override the content of .gitignore - the two are not merged.
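
To illustrate why this matters, imagine a project with both files (the contents are made up):

# .gitignore
credentials.json
*.log

# .npmignore - once this file exists, npm ignores the rules in .gitignore
*.log

Because the two files are not merged, credentials.json is no longer excluded, so it would be published to npm along with your package.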

#9 Developing packages

When developing packages locally, you usually want to try them out with one of your projects before publishing them to npm. This is where npm link comes to the rescue.

What npm link does is create a symlink in the global folder that points to the package where npm link was executed.

You can run npm link package-name from another location to create a symbolic link from the globally installed package-name to the node_modules directory of the current folder.

Let's see it in action!

# create a symlink to the global folder
/projects/request $ npm link

# link request to the current node_modules
/projects/my-server $ npm link request

# after running this project, the require('request') 
# will include the module from projects/request

Next up on Node.js at Scale: SemVer and Module Publishing

The next article in the Node.js at Scale series will be a SemVer deep dive, covering how to publish modules.

Let me know if you have any questions in the comments!

Case Study: Finding a Node.js Memory Leak in Ghost

At RisingStack, we have been using Ghost from the very beginning, and we love it! As of today we have more than 125 blogposts, with thousands of unique visitors every day, and with 1.5 million pageviews in 2016 overall.

In this post I’m going to share the story of how we discovered a Node.js memory leak in ghost@0.10.0, and what role Trace played in the process of detecting and fixing it.

What's Ghost?

Just a blogging platform

Node.js Memory Leak - The Ghost blogging platforms logo

Ghost is a fully open-source publishing platform written entirely in JavaScript. It uses Node.js for the backend, Ember.js for the admin side and Handlebars.js to power the rendering.

Ghost is actively developed - in the last 30 days, it had 10 authors with 66 commits to the master branch. The project's roadmap can be found here: https://trello.com/b/EceUgtCL/ghost-roadmap.

You can open an account at https://ghost.org/ and start writing instantly - or you can host your own version of Ghost, just like we do.

Our Ghost Deployment

Firstly, I'd like to give you a quick overview of how we deploy and use Ghost in production at RisingStack. We use Ghost as an npm module, required into a bigger project, something like this:

// adding Trace to monitor the blog
require('@risingstack/trace')
const path = require('path')  
const ghost = require('ghost')

ghost({  
  config: path.join(__dirname, 'config.js')
}).then(function (ghostServer) {
  ghostServer.start()
})

Deployments are done using Circle CI which creates a Docker image, pushes it to a Docker registry and deploys it to a staging environment. If everything looks good, the updates are moved to the production blog you are reading now. As a backing database, the blog uses PostgreSQL.

The Node.js Memory Leak

As we like to keep our dependencies up-to-date, we updated to ghost@0.10.0 as soon as it came out. Once we did this, our alerts started to fire, as memory usage started to grow:

Node.js Memory leak in ghost - Trace memory metrics

Luckily, we had alerts set up for memory usage in Trace, which notified us that something was not right. As Trace integrates seamlessly with Opsgenie and Pagerduty, we could have set up alerts for those channels as well.

We set up alerts for the blog service at 180 and 220 MB, because it usually consumes around 150 MB when everything's all right.

Setting up alerts for Node.js memory leaks in Trace

What was even better is that alerting was set up in a way that triggered actions at the collector level. What does this mean? It means that Trace could create a memory heapdump automatically, without human intervention. Once we started to investigate the issue, the memory heapdump was already available in the Profiler section of Trace, in a format supported by the Google Chrome DevTools.

This enabled us to start looking at the problem instantly, and as it happened in the production system, instead of trying to reproduce the issue in a local development environment.

Also, as we could take multiple heapdumps from the application itself, we could compare them using the comparison view of the DevTools.

Memory heapshot comparison with Trace and Chrome's Devtools

How do you use the comparison view to find the source of a problem? In the picture above, you can see that I compared the heapdump that Trace automatically collected when the alert was triggered with a heapdump requested earlier, when everything was fine with the service.

What you have to look for is the #Delta column, which shows +772 in our case. This means that at the time our high memory usage alert was triggered, the heapdump contained 772 extra objects. At the bottom of the picture you can see what these elements were, and that they have something to do with the lodash module.

Figuring this out otherwise would be extremely challenging since you’d have to reproduce the issue in a local environment - which is tricky if you don’t even know what caused it.

Should I update? Well..

The final cause of the leak was found by Katharina Irrgang, a core Ghost contributor. To check out the whole thread, take a look at the GitHub issue: https://github.com/TryGhost/Ghost/issues/7189. A fix was shipped with 0.10.1 - but updating to it caused another issue: slow response times.

Slow Response Times

Once we upgraded to the new version, we ran into a new problem: our blog's response time started to degrade. The 95th percentile grew from 100 ms to almost 300 ms. It instantly triggered the alerts we had set for response times.

Slow response time graph from Trace

To investigate the slow response times, we started to take CPU profiles using Trace. For now, we are still investigating the exact reason, but so far we suspect something is off with how moment.js is used.

CPU profile analysis with Trace

We will update this post once we have found out why it happens.

Conclusion

I hope this article helped you to figure out what to do in case you're experiencing memory leaks in your Node.js applications. If you'd like to get memory heapdumps automatically in a case like this, connect your services with Trace and enable alerting just like we did earlier.

If you have any additional questions, you can reach me in the comments section!

Node.js Interactive Europe 2016 Recap

Node.js Interactive Europe took place on 15-16 September 2016 in Amsterdam, The Netherlands. The two days were packed with great talks - let's see what it was about!

Amsterdam is the capital of the Kingdom of the Netherlands, with almost 2.5 million people living in the metropolitan area. It was founded as a small fishing village in the 12th century, then became the most important port during the 17th century. Nowadays the canals of Amsterdam and the Defence Line are on the UNESCO World Heritage List.

Node.js Interactive location: Amsterdam

Kev, Peter and Gergely from RisingStack were at the conference - Peter talked about how we killed the monolith behind Trace, while Gergely was a panelist at the “Node.js in Europe” and “Node.js Education” panel discussions.

The First Day

The first day started with keynotes from Mikeal Rogers, James Snell, Ashley Williams and Bert Belder.

The State of Node.js - Mikeal Rogers

Mikeal talked about the current state of Node.js adoption, and let us know that Node.js is everywhere. It started out as a server-side framework in 2009, but it is now used in IoT devices, mobiles and on the desktop as well.

Node.js Interactive Amsterdam - Mikeal Rodgers keynote

The project currently has 87 active contributors. Thanks to them, Node.js v7.0.0 will be released in October. This release won't be an LTS version, so it will only be maintained until July 2017, with Node.js v8.0.0 being published in April 2017.

From the community side, we learned that there is a new initiative called NodeTogether. NodeTogether's aim is to improve the diversity of the Node community by bringing people from underrepresented groups together to learn Node.js.

Node.js Interactive Amsterdam - Opening Keynote

On the more technical side, we learned that the inspector will become part of the core itself using node --inspect.

How the Event Loop Works - Bert Belder

Next, Bert Belder told us how the event loop works in his insightful talk. We are excited to see the video of this talk again!

Node.js Interactive Amsterdam - How the Event Loop works by Bert Belder

(Illustration from Bert's slides)

The Road Forward - Tracy Hinds

Tracy's topic was on how to make Node more inclusive and diverse. To create a sense of welcome, barriers to access and biases must be surfaced. People must feel safe enough to contribute to the Node.js project.

Node.js and Microservices - Yunong Xiao (Netflix) and Peter Marton (RisingStack)

Then came the microservices talks from Yunong J Xiao and Peter Marton.

Yunong (Netflix) talked about Slaying the monolith, while Peter (RisingStack) talked about Breaking down the monolith. Both talks arrived at the conclusion that API gateways are the way to build microservices.

Node.js Interactive Amsterdam - Yunong Xiao from Netflix

Node.js Interactive Amsterdam - Peter Marton on Trace by RisingStack

The Post-Mortem Diagnostics Group

After lunch, Yunong Xiao and Michael Dawson talked about the post-mortem diagnostics group, and nodereport was just announced! Nodereport is an add-on for Node.js, delivered as an npm module, which provides a human-readable diagnostic summary report written to a file.

IoT and Containers with Node.js

During the afternoon there were great talks on IoT, Web Standards, Robots and panel talks on Node.js in Europe, IoT and Containers. Containers are popular in the Node.js Community, which was clearly visible from the number of talks on the topic, or from the results of our Node.js Survey we conducted this summer.

Second day

The second day started with a great talk on security by Paul Millhiam, a lead developer at WildWorks whose job is to keep millions of kids safe using Node.js. We learned how data validation is directly linked to security, and how code can be secure and stable to work with.

Eugine Isratti’s session described how anyone can use serverless computing to achieve high performance without huge effort or resource allocation. Eugine’s work at Mitoc Group demonstrated how developers can design full-stack applications at low cost and with low maintenance.

We also saw great talks on blockchain and on how to build offline applications, followed by more container talks and sessions on making native add-ons.

The day closed with panel discussions on learning Node.js and best practices on how to contribute to the core.

Node.js Interactive Amsterdam Venue

After the closing keynote, the audience headed to Beer.js, hosted in a nearby pub by NodeSchool Amsterdam. It was a great opportunity to talk with other attendees as well as the speakers.

Saturday and Sunday

The Collaboration Summit took place during the weekend. The Summit's goal was to get new developers involved in areas through a series of hands-on workshops led by existing contributors. The hands-on workshops led into "sprints" where new developers tackled real problems in the code base with the support of the mentors in attendance. It was great to see new contributors opening pull requests instantly!

Node.js Interactive - Next Up

Node.js Interactive Europe was a great conference to meet the people who enable us to use Node.js with ease, and to learn how others use it, what challenges they face and what problems they solve. There were a lot of great conversations to join, in a friendly and pleasant atmosphere. We hope to see you there next year!

The next Node Interactive event will take place in Austin, starting on 29 November. For more information and tickets, visit http://events.linuxfoundation.org/events/node-interactive.


How Developers use Node.js - Survey Results

RisingStack, the provider of Trace (a next-gen Node.js debugging and performance monitoring solution) and a silver member of the Node.js Foundation, conducted a survey during the summer of 2016 to find out how developers use Node.js and what technologies they prefer with it. This article summarizes the results.

The results show that MongoDB, RabbitMQ, AWS, Jenkins, Docker and Amazon Container Services are the go-to choices for developing, containerizing and shipping Node.js applications.

The survey also let us find out about various aspects of developing with Node.js, such as choices for async control flow, debugging, continuous integration or finding packages. The results also reveal Node developers' major pain point: debugging.

The survey was open for 35 days, from 11 July until 15 August 2016. During this period, 1126 Node.js developers completed it. 55% of them have more than two years of Node.js experience, while 26% have used Node for between one and two years. 20% work at a publicly traded company, and 7% at a Fortune 500 enterprise.

Technologies used with Node.js

MongoDB became the go-to database

Node.js Survey - What databases are you using? MongoDB wins.

According to the results, MongoDB is clearly the go-to database for Node.js developers. Roughly ⅔ of our respondents claimed that they use MongoDB with their Node.js applications. It's also worth noting that the popularity of Redis increases significantly with the experience of Node engineers. This trend is also true in the case of PostgreSQL and ElasticSearch.

Node.js Survey - Database usage and developer experience

Redis leads as a caching solution, but a lot of developers still don’t cache at all

Node.js Survey - What do you use for caching? Redis wins.

Half of our respondents said that they use Redis for caching, but a staggering 45% stated that they don’t use any. Cross-referencing the answers with developer experience shows that the popularity of Redis is quite high amongst long-time Node users, compared to engineers with less than one year of Node.js experience.

Node.js Survey - Caching usage and developer experience

The popularity of messaging systems is still low

According to our survey, 58% of Node.js developers don't use any messaging system. This means that developers either rarely use messaging in their microservices systems, use REST APIs instead, or don’t have a sophisticated system in place.

Node.js Survey - What messaging systems are you using? RabbitMQ wins.

Those who do use messaging systems answered that they mostly use RabbitMQ (24% of all respondents). If we only look at the responses of people who use messaging systems, RabbitMQ beats the rest of the existing solutions by far.

Node.js Survey - Messaging system usage

Node.js apps are most likely running on AWS

According to our survey, 43% of Node.js developers use AWS for running their applications, but running their own datacenter is popular as well (34%), especially amongst enterprises (nearly 50% of them have their own datacenters) - but this is no surprise.

Node.js Survey - Where do you run your Node.js apps? AWS.

What’s interesting, though, is that Heroku and DigitalOcean are competing neck and neck to become the second biggest cloud platform for Node.js. According to our data, DigitalOcean is more popular with smaller companies (under 50 employees), while Heroku stays strong as an enterprise solution as well.

Node.js Survey - Running apps and company size

Docker dominates in the Node community

Currently, Docker containers are the go-to solution for most Node.js developers (47% of all respondents claimed to use it, which is 73% of all container tech users in the survey). Docker seems to be equally popular across all company sizes - but more experienced developers (the ones with over one year of experience) appear to use it much more.

Node.js Survey - What container techs or VMs are you using? Docker.

64% of the respondents said that they use some container technology - which means that the popularity of containers has risen from 45% in the last major Node.js survey, a significant 20 percentage point increase since January 2016.

Node.js Survey - Container techs and developer experience.

Amazon Container Service is the first choice for running containers

Node.js Survey - How do you run your containers? Amazon Container Service wins.

While Amazon Container Service leads as the choice for running containers with Node.js, it’s worth noting that Kubernetes is already at 25% according to our survey, and it seems to be especially popular with enterprise Node.js developers.

Node.js Survey - Running Amazon and Kubernetes with company size

Node.js development

Configuration files are used more often than environment variables

The majority of Node developers (59% vs. 38%) prefer configuration files over environment variables. Only 29 respondents (3%) stated that they use both.

Node.js Survey - Environment variables or config files? Config files wins.

Using only configuration files suggests a possible security problem, since it implies that credentials are stored in the repositories. If you have the credentials to production systems in GitHub, you can quickly run into trouble with rogue developers.

Using environment variables is highly recommended for secrets - while config files can still be used for general settings.
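
A minimal sketch of that split could look like this (the file and variable names are only illustrative):

// config.js - non-secret settings can live in the repository
module.exports = {
  port: 3000,
  // the secret comes from the environment, never from the repo
  dbPassword: process.env.DB_PASSWORD
}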

Promises lead with async control flow

In Node.js, most of the core libraries work with callbacks. The results show that Node.js users are leaning towards using promises now.

Node.js Survey - What do you use for async control flow? Promises wins.

Around half a year ago there was a pull request in the core Node.js repository asking for async functions to return a native Promise. The answer to this was: “A Promises API doesn’t make sense for core right now because it's too early in the evolution of V8-based promises and their relationship to other ES* features. There is tiny interest within the TC in exploring this in the core in the short-term.”

Maybe it’s time to revisit the issue - since the demand is present.

Developers trust console.log for debugging

console.log is leading the race amongst debugging solutions like the Node Inspector, the built-in debugger and the debug module. Around ¾ of Node developers use it for finding errors in their applications - while much more sophisticated solutions are available as well.

Node.js Survey - How do you debug your applications? Using the console.log

A closer look at the data shows that more experienced developers are also leaning towards the Node Inspector and the debug module.

Node.js Survey - Debugging applications and developer experience

APMs are still quite unpopular in the Node.js community

According to the responses in our survey, only ¼ of Node.js developers use APMs - application performance monitoring tools - to identify issues in their applications. However, trends in the dataset suggest that APM usage grows with company size and developer experience.

Node.js Survey - How do you identify issues in your app? Using logs.

SaaS CIs still have a low market share in the Node community

According to the answers in our survey, using shell scripts is the most popular way of pushing code to staging or production environments - but Jenkins clearly wins among continuous delivery and integration platforms so far, and is becoming more popular as company size increases.

Node.js Survey - What do you use to push code or containers? Shell scripts win.

Node.js developers rarely update dependencies

Frequently updating dependencies is highly recommended for Node.js applications, since around 15% of npm packages carry a known vulnerability and 76% of Node shops use vulnerable dependencies, according to a recent survey.

Node.js Survey - How often do you update dependencies? Less frequently than a month.

Updating dependencies less frequently than every week exposes applications to severe attacks all the time. According to our survey, 45% of Node.js developers update dependencies less frequently than once a month, and 27% of them update dependencies month by month. Only 28% answered that they update dependencies at least every week.

These numbers correlate neither with company size nor with developer experience.

Node.js developers Google for their packages

According to our survey, the majority of developers use Google to find packages and decide which one to use. Although the popularity of the npmjs.org/npms.io search platforms is 56% amongst our respondents, the data shows that it goes up to almost 70% for experienced developers (more than four years of Node development)! Preference increases with experience in this case.

Node.js Survey - How do you decide what package to pick? People mostly Google for them.

Junior Node.js developers don’t know what semantic versioning is

Although 71% of our respondents use semantic versioning when publishing or consuming modules, this number should be higher in our opinion. Everyone should use semantic versioning, since npm works with semver! Updating packages without using it can easily break Node.js applications.
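
As a quick reminder of what a semver version number encodes (the version below is made up):

1.4.2
^ ^ ^
| | +-- PATCH: backwards-compatible bug fixes
| +---- MINOR: backwards-compatible new features
+------ MAJOR: breaking, incompatible changes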

Node.js Survey - Do you use semantic versioning? Mostly yes.

If we dig deeper into the dataset, we can see that around half of the Node developers with less than a year of experience don’t know what semver is or don’t use it - while advanced developers embrace it at a much higher rate.

Node.js teams introduce new tools and technologies very fast

According to our survey, 35% of Node developers can introduce new technologies, tools or products to their companies in a few days, and 29% in just a few weeks.

Node.js Survey - How much time is needed to introduce new technologies, tools or products to your company? A few weeks.

If we investigate the data more thoroughly, a not-so-surprising pattern emerges: the time needed to introduce new tech/tools gradually increases with the size of a company.

Debugging is the most severe pain-point for developing with Node.js

We also asked Node developers about their biggest pain points regarding development. The top answers were:

  • Debugging / Profiling / Performance Monitoring
  • Callbacks and Callback hell
  • Understanding Async programming
  • Dependency management
  • Lack of conventions/best practices
  • Structuring
  • Bad documentation
  • Finding the right packages

Conclusion

Developing with Node.js is still an interesting and ever-changing experience. We'd like to thank the engineers who took the time to answer our questions, and we hope that the information presented in this article is valuable for the whole Node community.

The full dataset is going to be released and linked in this blogpost in a few days.


Writing a JavaScript Framework - Introduction to Data Binding, beyond Dirty Checking

This is the fourth chapter of the Writing a JavaScript framework series. In this chapter, I am going to explain the dirty checking and the accessor data binding techniques and point out their strengths and weaknesses.

The series is about an open-source client-side framework, called NX. During the series, I explain the main difficulties I had to overcome while writing the framework. If you are interested in NX please visit the home page.

The series includes the following chapters:

  1. Project structuring
  2. Execution timing
  3. Sandboxed code evaluation
  4. Data binding introduction (current chapter)
  5. Data binding with ES6 Proxies
  6. Custom elements
  7. Client side routing

An introduction to data binding

Data binding is a general technique that binds data sources from the provider and consumer together and synchronizes them.

This is a general definition, which outlines the common building blocks of data binding techniques.

  • A syntax to define the provider and the consumer.
  • A syntax to define which changes should trigger synchronization.
  • A way to listen to these changes on the provider.
  • A synchronizing function that runs when these changes happen. I will call this function the handler() from now on.

The above steps are implemented in different ways by different data binding techniques. The upcoming sections are about two such techniques, namely dirty checking and the accessor method. Both have their strengths and weaknesses, which I will briefly discuss after introducing them.

Dirty checking

Dirty checking is probably the most well-known data binding method. It is simple in concept, and it doesn't require complex language features, which makes it a nice candidate for legacy usage.

The syntax

Defining the provider and the consumer doesn't require any special syntax, just plain JavaScript objects.

const provider = {  
  message: 'Hello World'
}
const consumer = document.createElement('p')  

Synchronization is usually triggered by property mutations on the provider. Properties that should be observed for changes must be explicitly mapped with their handler().

observe(provider, 'message', message => {  
  consumer.innerHTML = message
})

The observe() function simply saves the (provider, property) -> handler mapping for later use.

function observe (provider, prop, handler) {  
  // assumes provider._handlers exists and the provider is registered
  // in the providers array that digest() iterates over
  provider._handlers[prop] = handler
}

With this, we have a syntax for defining the provider and the consumer and a way to register handler() functions for property changes. The public API of our library is ready; now comes the internal implementation.

Listening on changes

Dirty checking is called dirty for a reason. It runs periodical checks instead of listening on property changes directly. Let's call this check a digest cycle from now on. A digest cycle iterates through every (provider, property) -> handler entry added by observe() and checks if the property value changed since the last iteration. If it did change, it runs the handler() function. A simple implementation would look like below.

function digest () {  
  providers.forEach(digestProvider)
}

function digestProvider (provider) {  
  for (let prop in provider._handlers) {
    // run the handler only if the property value changed since the last digest
    if (provider._prevValues[prop] !== provider[prop]) {
      provider._prevValues[prop] = provider[prop]
      provider._handlers[prop](provider[prop])
    }
  }
}

The digest() function needs to be run from time to time to ensure a synchronized state.
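
As a minimal sketch of the missing wiring (the registerProvider() helper and the 100 ms interval below are assumptions for illustration, not part of the snippets above), you need a registry of providers and something that keeps calling digest():

const providers = []

// a hypothetical helper that prepares a plain object for observe() and digest()
function registerProvider (provider) {
  provider._handlers = {}
  provider._prevValues = {}
  providers.push(provider)
  return provider
}

// naive scheduling: run a digest cycle every 100 milliseconds
setInterval(digest, 100)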

The accessor technique

The accessor technique is the now trending one. It is a bit less widely supported as it requires the ES5 getter/setter functionality, but it makes up for this in elegance.

The syntax

Defining the provider requires special syntax. The plain provider object has to be passed to the observable() function, which transforms it into an observable object.

const provider = observable({  
  greeting: 'Hello',
  subject: 'World'
})
const consumer = document.createElement('p')  

This small inconvenience is more than compensated by the simple handler() mapping syntax. With dirty checking, we would have to define every observed property explicitly like below.

observe(provider, 'greeting', greeting => {  
  consumer.innerHTML = greeting + ' ' + provider.subject
})

observe(provider, 'subject', subject => {  
  consumer.innerHTML = provider.greeting + ' ' + subject
})

This is verbose and clumsy. The accessor technique can automatically detect the used provider properties inside the handler() function, which allows us to simplify the above code.

observe(() => {  
  consumer.innerHTML = provider.greeting + ' ' + provider.subject
})

The implementation of observe() is different from the dirty checking one. It just executes the passed handler() function and flags it as the currently active one while it is running.

let activeHandler

function observe(handler) {  
  activeHandler = handler
  handler()
  activeHandler = undefined
}

Note that we exploit the single-threaded nature of JavaScript here by using the single activeHandler variable to keep track of the currently running handler() function.

Listening on changes

This is where the 'accessor technique' name comes from. The provider is augmented with getters/setters, which do the heavy lifting in the background. The idea is to intercept the get/set operations of the provider properties in the following way.

  • get: If there is an activeHandler running, save the (provider, property) -> activeHandler mapping for later use.
  • set: Run all handler() functions, which are mapped with the (provider, property) pair.

The accessor data binding technique.

The following code demonstrates a simple implementation of this for a single provider property.

function observableProp (provider, prop) {  
  let value = provider[prop]
  Object.defineProperty(provider, prop, {
    get () {
      // while a handler() runs, record that it depends on this property
      if (activeHandler) {
        provider._handlers[prop] = activeHandler
      }
      return value
    },
    set (newValue) {
      value = newValue
      const handler = provider._handlers[prop]
      if (handler) {
        activeHandler = handler
        handler()
        activeHandler = undefined
      }
    }
  })
}

The observable() function mentioned in the previous section walks the provider properties recursively and converts all of them into observables with the above observableProp() function.

function observable (provider) {  
  for (let prop in provider) {
    observableProp(provider, prop)
    if (typeof provider[prop] === 'object') {
      observable(provider[prop])
    }
  }
  // the registry used by the getters/setters added in observableProp()
  provider._handlers = {}
  return provider
}

This is a very simple implementation, but it is enough for a comparison between the two techniques.

Comparison of the techniques

In this section, I will briefly outline the strengths and weaknesses of dirty checking and the accessor technique.

Syntax

Dirty checking requires no syntax to define the provider and consumer, but mapping the (provider, property) pair with the handler() is clumsy and not flexible.

The accessor technique requires the provider to be wrapped by the observable() function, but the automatic handler() mapping makes up for this. For large projects with data binding, it is a must-have feature.

Performance

Dirty checking is notorious for its bad performance. It has to check every (provider, property) -> handler entry possibly multiple times during every digest cycle. Moreover, it has to grind even when the app is idle, since it can't know when the property changes happen.

The accessor method is faster, but performance could be unnecessarily degraded for big observable objects. Replacing every property of the provider with accessors is usually overkill. A solution would be to build the getter/setter tree dynamically when needed, instead of doing it ahead of time in one batch. Alternatively, a simpler solution is wrapping the unneeded properties with a noObserve() function that tells observable() to leave that part untouched. This sadly introduces some extra syntax.
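
Since noObserve() is only mentioned as an idea here, the few lines below are just one possible shape of it (the _noObserve flag is an assumption, not how NX actually implements it):

function noObserve (obj) {
  // flag the object so that observable() skips this subtree
  obj._noObserve = true
  return obj
}

// inside observable(), skip flagged subtrees:
// if (provider[prop] && provider[prop]._noObserve) continue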

Flexibility

Dirty checking naturally works with both expando (dynamically added) and accessor properties.

The accessor technique has a weak spot here. Expando properties are not supported, because they are left out of the initial getter/setter tree. This causes issues with arrays, for example, but it can be fixed by manually running observableProp() after adding a new property. Getter/setter properties are not supported either, since accessors can't be wrapped by accessors again. A common workaround for this is using a computed() function instead of a getter. This introduces even more custom syntax.

Timing alternatives

Dirty checking doesn't give us much freedom here since we have no way of knowing when the actual property changes happen. The handler() functions can only be executed asynchronously, by running the digest() cycle from time to time.

Getters/setters added by the accessor technique are triggered synchronously, so we have freedom of choice. We may decide to run the handler() right away, or save it in a batch that is executed asynchronously later. The first approach gives us the advantage of predictability, while the latter allows for performance enhancements by removing duplicates.
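
A small sketch of the batched variant, assuming handlers are collected in a Set (which removes duplicates within a batch) and flushed asynchronously in a microtask:

const pendingHandlers = new Set()

function queueHandler (handler) {
  if (pendingHandlers.size === 0) {
    // schedule a single asynchronous flush for this batch
    Promise.resolve().then(flushHandlers)
  }
  pendingHandlers.add(handler)
}

function flushHandlers () {
  pendingHandlers.forEach(handler => handler())
  pendingHandlers.clear()
}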

About the next article

In the next article, I will introduce the nx-observe data binding library and explain how to replace ES5 getters/setters with ES6 Proxies to eliminate most of the accessor technique's weaknesses.

Conclusion

If you are interested in the NX framework, please visit the home page. Adventurous readers can find the NX source code in this GitHub repository.

I hope you found this a good read, see you next time when I’ll discuss data binding with ES6 Proxies!

If you have any thoughts on the topic, please share them in the comments.