Announcing Free Node.js Monitoring & Debugging with Trace

Today, we’re excited to announce that Trace, our Node.js monitoring & debugging tool, is now free for open-source projects.

What is Trace?

We launched Trace a year ago with the intention of helping developers who are looking for a Node.js-specific APM that is easy to use and helps with the most difficult aspects of building Node projects, like:

  • finding memory leaks in a production environment
  • profiling CPU usage to find bottlenecks
  • tracing distributed call-chains
  • avoiding security leaks & bad npm packages

... and so on.

Node.js Monitoring with Trace by RisingStack - Performance Metrics chart

Why are we giving it away for free?

We use a ton of open-source technology every day, and we are also the maintainers of some.

We know from experience that developing an open-source project is hard work, which requires a lot of knowledge and persistence.

Trace will save a lot of time for those who use Node for their open-source projects.

How to get started with Trace?

  1. Visit trace.risingstack.com and sign up - it's free.
  2. Connect your app with Trace.
  3. Head over to this form and tell us a little bit about your project.

Done. Your open-source project will be monitored for free.

If you need help with Node.js Monitoring & Debugging...

Just drop us a tweet at @RisingStack if you have any additional questions about the tool or the process.

If you'd like to read a little bit more about the topic, I recommend reading our previous article, The Definitive Guide for Monitoring Node.js Applications.

One more thing

Alongside making Trace available for open-source projects, we're announcing our new line of business at RisingStack:

Commercial Node.js support, aimed at enterprises with Node.js applications running in a production environment.

RisingStack now helps to bootstrap and operate Node.js apps - no matter which stage of their life cycle they are in.


Disclaimer: We retain the exclusive right to accept or deny your application to use Trace by RisingStack for free.

Node.js War Stories: Debugging Issues in Production

In this article, you can read stories from Netflix, RisingStack & nearForm about Node.js issues in production - so you can learn from our mistakes and avoid repeating them. You'll also learn what methods we used to debug these Node.js issues.

Special shoutout to Yunong Xiao of Netflix, Matteo Collina of nearForm & Shubhra Kar from Strongloop for helping us with their insights for this post!


At RisingStack, we have accumulated tremendous experience running Node apps in production over the past four years - thanks to our Node.js consulting, training and development business.

Like the Node teams at Netflix & nearForm, we have picked up the habit of always writing thorough postmortems, so the whole team (and now the whole world) can learn from the mistakes we made.

Netflix & Debugging Node: Know your Dependencies

Let's start with a slowdown story from Yunong Xiao, which happened with our friends at Netflix.

The trouble started with the Netflix team noticing that their applications' response times increased progressively - the latency of some of their endpoints grew by 10ms every hour.

This was also reflected in the growing CPU usage.

Request latencies for each region over time - photo credit: Netflix

At first, they started to investigate whether the request handler was responsible for slowing things down.

After testing it in isolation, it turned out that the request handler had a constant response time of around 1ms.

So the problem was not there, and they started to suspect that it was deeper in the stack.

The next things Yunong & the Netflix team tried were CPU flame graphs and Linux Perf Events.

Flame graph of the Netflix slowdown - photo credit: Netflix

What you can see in the flame graph above is that

  • it has high stacks (which means a lot of function calls)
  • and the boxes are wide (meaning we are spending quite some time in those functions).

After further inspection, the team found that Express's router.handle and router.handle.next have lots of references.

The Express.js source code reveals a couple of interesting tidbits:

  • Route handlers for all endpoints are stored in one global array.
  • Express.js recursively iterates through and invokes all handlers until it finds the right route handler.

Before revealing the solution to this mystery, we have to get one more detail:

Netflix's codebase contained periodic code that ran every 6 minutes, grabbed new route configs from an external resource, and updated the application's route handlers to reflect the changes.

This was done by deleting old handlers and adding new ones. Accidentally, it also added the same static handler over and over again - even before the API route handlers. As it turned out, this caused the extra 10ms of response time every hour.
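
Here is a minimal sketch (our reconstruction, not Netflix's actual code) of the pattern that caused it - the periodic refresh keeps appending the same handler, so Express's global handler array grows without bound and every request has to walk more and more functions:

const express = require('express')
const app = express()

function refreshRouteHandlers () {
  // the old handlers were supposed to be removed here - instead, the same
  // static handler gets appended again on every refresh, even before the
  // API route handlers
  app.use(function staticHandler (req, res, next) {
    next()
  })
}

// grab new route configs and "update" the handlers every 6 minutes
setInterval(refreshRouteHandlers, 6 * 60 * 1000)

app.get('/api', function (req, res) {
  res.send('ok')
})

app.listen(3000)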

Takeaways from Netflix's Issue

  • Always know your dependencies - you have to fully understand them before going into production with them.
  • Observability is key - flame graphs helped the Netflix engineering team to get to the bottom of the issue.

Read the full story here: Node.js in Flames.




RisingStack CTO: "Crypto takes time"

You may have already heard the story of how we broke down the monolithic infrastructure of Trace (our Node.js monitoring solution) into microservices from our CTO, Peter Marton.

The issue we'll talk about now is a slowdown which affected Trace in production:

The very first versions of Trace ran on a PaaS and used the public cloud to communicate with our other services.

To ensure the integrity of our requests, we decided to sign all of them. To do so, we went with Joyent's HTTP signing library. What's really great about it is that the request module supports HTTP signatures out of the box.
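
As a sketch, signing outgoing requests with the request module looks roughly like this - the URL, keyId, and key path are placeholders, not our actual configuration:

var fs = require('fs')
var request = require('request')

request({
  url: 'https://internal-service.example.com/api/v1/collect', // hypothetical internal endpoint
  method: 'POST',
  json: { payload: 'data' },
  httpSignature: {
    keyId: 'trace-collector',                     // hypothetical key identifier
    key: fs.readFileSync('/etc/keys/private.pem') // hypothetical private key path
  }
}, function (err, response, body) {
  if (err) throw err
  console.log(response.statusCode, body)
})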

This solution was not only expensive, but it also had a bad impact on our response times.

The network delay built up our response times - photo: Trace

As you can see on the graph above, the given endpoint had a response time of 180ms - however, 100ms of that was just the network delay between the two services.

As the first step, we migrated from the PaaS provider to Kubernetes. We expected our response times to improve a lot, as we could leverage internal networking.

We were right - our latency improved.

However, we expected better results - and a much bigger drop in our CPU usage. The next step was to do CPU profiling, just like the team at Netflix:

crypto sign function taking up cpu time

As you can see on the screenshot, the crypto.sign function takes up most of the CPU time, consuming 10ms on each request. To solve this, you have two options:

  • if you are running in a trusted environment, you can drop request signing,
  • if you are in an untrusted environment, you can scale up your machines to have stronger CPUs.

Takeaways from Peter Marton

  • Latency between your services has a huge impact on user experience - whenever you can, leverage internal networking.
  • Crypto can take a LOT of time.

nearForm: Don't block the Node.js Event Loop

React is more popular than ever. Developers use it for both the frontend and the backend, or they even take a step further and use it to build isomorphic JavaScript applications.

However, rendering React pages can put some heavy load on the CPU, as rendering complex React components is CPU bound.

When your Node.js process is rendering, it blocks the event loop because of its synchronous nature.

As a result, the server can become entirely unresponsive - requests accumulate, which puts even more load on the CPU.

What can be even worse is that requests whose clients have long since disconnected will still be served - putting load on the Node.js application for nothing, as Matteo Collina of nearForm explains.

It is not just React, but string operations in general. If you are building JSON REST APIs, you should always pay attention to JSON.parse and JSON.stringify.

As Shubhra Kar from Strongloop (now Joyent) explained, parsing and stringifying huge payloads can take a lot of time as well (blocking the event loop in the meantime).

function requestHandler (req, res) {
  const body = req.rawBody
  let parsedBody
  try {
    parsedBody = JSON.parse(body)
  } catch (e) {
    // return here - otherwise res.end would be called a second time below
    return res.end('Error parsing the body')
  }
  res.end('Record successfully received')
}

Simple request handler

The example above shows a simple request handler that just parses the body. For small payloads, it works like a charm - however, if the JSON's size can be measured in megabytes, the execution time can be seconds instead of milliseconds. The same applies to JSON.stringify.

To mitigate these issues, first, you have to know about them. For that, you can use Matteo's loopbench module, or Trace's event loop metrics feature.

With loopbench, you can return a 503 status code to the load balancer if the request cannot be fulfilled - to do so, check the instance.overLimit property. This way ELB or NGINX can retry the request on a different backend, so it may still be served.
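
A minimal sketch of this, closely following loopbench's documented usage (the port and headers are arbitrary):

var http = require('http')
var loopbench = require('loopbench')() // samples the event loop delay

var server = http.createServer(function (req, res) {
  if (loopbench.overLimit) {
    // the event loop is lagging too much - shed this request so the
    // load balancer can retry it on a different backend
    res.statusCode = 503
    res.setHeader('Retry-After', '2')
    return res.end()
  }
  res.end('hello world')
})

server.listen(3000)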

Once you know about the issue and understand it, you can start working on fixing it - either by leveraging Node.js streams or by tweaking the architecture you are using.
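
For example, here is a hedged sketch of the streaming approach with the JSONStream module - the 'records.*' path is an assumed payload shape, not a universal setting:

var JSONStream = require('JSONStream')

function requestHandler (req, res) {
  req
    .pipe(JSONStream.parse('records.*')) // parse the body incrementally
    .on('data', function (record) {
      // each record is handled as soon as it is parsed, so the event
      // loop is never blocked by one huge JSON.parse call
    })
    .on('error', function () {
      res.statusCode = 400
      res.end('Error parsing the body')
    })
    .on('end', function () {
      res.end('Records successfully received')
    })
}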

Takeaways from nearForm

  • Always pay attention to CPU-bound operations - the more you have, the more pressure you put on your event loop.
  • String operations are CPU-heavy operations.

Debugging Node.js Issues in Production

I hope these examples from Netflix, RisingStack & nearForm will help you debug your Node.js apps in production.

If you'd like to learn more, I recommend checking out our recent posts, which will help you deepen your Node knowledge.

If you have any questions, please let us know in the comments!

Node Hero - Monitoring Node.js Applications

This article is the 13th part of the tutorial series called Node Hero - in these chapters, you can learn how to get started with Node.js and deliver software products using it.

In the last article of the series, I’m going to show you how to do Node.js monitoring and how to find advanced issues in production environments.

The Importance of Node.js Monitoring

Getting insights into production systems is critical when you are building Node.js applications! You have an obligation to constantly detect bottlenecks and figure out what slows your product down.

An even greater issue is handling and preempting downtime. You must be notified as soon as it happens, preferably before your customers start to complain. Based on these needs, proper monitoring should give you at least the following features and insights into your application's behavior:

  • Profiling on a code level: You have to understand how much time it takes to run each function in a production environment, not just locally.

  • Monitoring network connections: If you are building a microservices architecture, you have to monitor network connections and lower delays in the communication between your services.

  • Performance dashboard: Knowing and constantly seeing the most important performance metrics of your application is essential to have a fast, stable production system.

  • Real-time alerting: For obvious reasons, if anything goes down, you need to get notified immediately. This means that you need tools that can integrate with Pagerduty or Opsgenie - so your DevOps team won’t miss anything important.

"Getting insights into production systems is critical when you are building #nodejs applications" via @RisingStack

Click To Tweet

Server Monitoring versus Application Monitoring

One concept developers often confuse is monitoring servers versus monitoring the applications themselves. As we tend to do a lot of virtualization, these concepts should be treated separately - a single server can host dozens of applications.

Let’s go through the major differences!

Server Monitoring

Server monitoring is responsible for the host machine. It should be able to help you answer the following questions:

  • Does my server have enough disk space?
  • Does it have enough CPU time?
  • Does my server have enough memory?
  • Can it reach the network?

For server monitoring, you can use tools like Zabbix.

Application Monitoring

Application monitoring, on the other hand, is responsible for the health of a given application instance. It should let you know the answers to the following questions:

  • Can an instance reach the database?
  • How many requests does it handle?
  • What are the response times for the individual instances?
  • Can my application serve requests? Is it up?

For application monitoring, I recommend using our tool called Trace. What else? :)

We developed it to be an easy-to-use and efficient tool that you can use to monitor and debug applications from the moment you start building them, up to the point when you have a huge production app with hundreds of services.

How to Use Trace for Node.js Monitoring

To get started with Trace, head over to https://trace.risingstack.com and create your free account!

Once you have registered, follow these steps to add Trace to your Node.js applications. It only takes a minute - and these are the steps you should perform:

Start Node.js monitoring with these steps

Easy, right? If everything went well, you should see that the service you connected has just started sending data to Trace:

Reporting service in Trace for Node.js Monitoring

#1: Measure your performance

As the first step of monitoring your Node.js application, I recommend heading over to the metrics page and checking out the performance of your services.

Basic Node.js performance metrics

  • You can use the response time panel to check out median and 95th percentile response data. It helps you figure out when and why your application slows down and how it affects your users.
  • The throughput graph shows requests per minute (rpm) for status code categories (200-299, 300-399, 400-499, 500+). This way you can easily separate healthy and problematic HTTP requests within your application.
  • The memory usage graph shows how much memory your process uses. It’s quite useful for recognizing memory leaks and preempting crashes.

Advanced Node.js Monitoring Metrics

If you’d like to see special Node.js metrics, check out the garbage collection and event loop graphs. Those can help you to hunt down memory leaks. Read our metrics documentation.

#2: Set up alerts

As I mentioned earlier, you need a proper alerting system in action for your production application.

Go to the alerting page of Trace and click on Create a new alert.

  • The most important thing to do here is to set up downtime and memory alerts. Trace will notify you via email, Slack, Pagerduty, or Opsgenie, and you can use webhooks as well.

  • I recommend setting up the alert we call Error rate by status code to know about HTTP requests with 4XX or 5XX status codes. These are errors you should definitely care about.

  • It can also be useful to create an alert for Response time - and get notified when your app starts to slow down.

#3: Investigate memory heapdumps

Go to the Profiler page and request a new memory heapdump, wait 5 minutes, and request another. Download them and open them in Chrome DevTools' Profiles panel. Select the second (most recent) one and click Comparison.

chrome heap snapshot for finding a node.js memory leak

With this view, you can easily find memory leaks in your application. In a previous article, I’ve written about this process in detail - you can read it here: Hunting a Ghost - Finding a Memory Leak in Node.js.

#4: CPU profiling

Profiling on the code level is essential to understand how much time your functions take to run in the actual production environment. Luckily, Trace has this area covered too.

All you have to do is head over to the CPU Profiles tab on the Profiling page. Here you can request and download a profile, which you can load into Chrome DevTools as well.

CPU profiling in Trace

Once you have loaded it, you'll be able to see a 10-second timeframe of your application and see all of your functions with timings and URLs as well.

With this data, you'll be able to figure out what slows down your application and deal with it!


The End

Update: as a sequel to Node Hero, we have started a new series called Node.js at Scale. Check it out if you are interested in more in-depth articles!

This is it.

During the 13 episodes of the Node Hero series, you learned the basics of building great applications with Node.js.

I hope you enjoyed it and improved a lot! Please share this series with your friends if you think they need it as well - and show them Trace too. It’s a great tool for Node.js development!

If you have any questions regarding Node.js monitoring, let me know in the comments section!


Node Hero - Debugging Node.js Applications

This article is the 10th part of the tutorial series called Node Hero - in these chapters, you can learn how to get started with Node.js and deliver software products using it.

In this tutorial, you are going to learn how to debug your Node.js applications using the debug module, the built-in Node.js debugger, and Chrome's developer tools.

Bugs, debugging

The terms bug and debugging have been part of engineering jargon for many decades. One of the first written mentions of bugs is the following:

It has been just so in all of my inventions. The first step is an intuition, and comes with a burst, then difficulties arise — this thing gives out and [it is] then that "Bugs" — as such little faults and difficulties are called — show themselves and months of intense watching, study and labor are requisite before commercial success or failure is certainly reached.

Thomas Edison


Debugging Node.js Applications

One of the most frequently used approaches to finding issues in Node.js applications is the heavy use of console.log for debugging.

"Console.log is efficient for debugging small snippets but we recommend better alternatives!” @RisingStack #nodejs

Let's take a look at them!

The debug module

Some of the most popular modules that you can require into your project come with the debug module. With this module, you can enable third-party modules to log to the standard output, stdout. To check whether a module is using it, take a look at the package.json file's dependency section.

To use the debug module, you have to set the DEBUG environment variable when starting your applications. You can also use the * character as a wildcard in names. The following line will print all the express-related logs to the standard output:

DEBUG=express* node app.js  

The output will look like this:

Logfile made with the Node debug module
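
You can use the same module in your own code as well. A small sketch - the myapp:server namespace is just an example name:

var debug = require('debug')('myapp:server')
var http = require('http')

http.createServer(function (req, res) {
  debug('request received: %s %s', req.method, req.url)
  res.end('hello')
}).listen(3000, function () {
  debug('listening on port 3000')
})

// start it with: DEBUG=myapp:* node app.js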


The Built-in Node.js Debugger

Node.js includes a full-featured out-of-process debugging utility, accessible via a simple TCP-based protocol and a built-in debugging client.

To start the built-in debugger, launch your application this way:

node debug app.js  

Once you have done that, you will see something like this:

using the built-in node.js debugger

Basic Usage of the Node Debugger

To navigate this interface, you can use the following commands:

  • c => continue with code execution
  • n => execute this line and go to next line
  • s => step into this function
  • o => finish function execution and step out
  • repl => allows code to be evaluated remotely

You can add breakpoints to your applications by inserting the debugger statement into your codebase.

function add (a, b) {
  debugger // execution pauses here when the app runs under the debugger
  return a + b
}

var res = add('apple', 4)


Watchers

It is possible to watch expression and variable values during debugging. On every breakpoint, each expression from the watchers list will be evaluated in the current context and displayed immediately before the breakpoint's source code listing.

To start using watchers, define them for the expressions you want to watch, like this:

watch('expression')  

To get a list of active watchers, type watchers; to unwatch an expression, use unwatch('expression').

Pro tip: you can switch running Node.js processes into debug mode by sending the SIGUSR1 signal to them. After that, you can connect the debugger with node debug -p <pid>.
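
For example, with a hypothetical process ID of 12345:

kill -USR1 12345
node debug -p 12345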

"You can switch running #nodejs processes into debug mode by sending the SIGUSR1 command to them.” via @RisingStack

Click To Tweet

To understand the full capabilities of the built-in debugger, check out the official API docs: https://nodejs.org/api/debugger.html.

The Chrome Debugger

When you start debugging complex applications, something visual can help. Wouldn’t it be great to use the familiar UI of Chrome DevTools for debugging Node.js applications as well?

node.js chrome developer tools debugging

Good news: the Chrome debug protocol has already been ported to a Node.js module and can be used to debug Node.js applications.

To start using it, you have to install node-inspector first:

npm install -g node-inspector  

Once you have installed it, you can start debugging your applications by starting them this way:

node-debug index.js --debug-brk  

(the --debug-brk flag pauses execution on the first line)

It will open up the Chrome Developer Tools, and you can start debugging your Node.js applications with it.


Next up

Debugging is not that hard after all, is it?

In the next chapter of Node Hero, you are going to learn how to secure your Node.js applications.

If you have any questions or recommendations for this topic, write them in the comments section.


Introducing Distributed Tracing for Microservices Monitoring

At RisingStack, as an enterprise Node.js development and consulting company and as passionate advocates of this technology, we have been working tirelessly for the past two years to build durable and efficient microservices architectures for our clients.

During this period, we had to face the cold fact that there aren’t proper tools able to support microservices architectures and the developers working with them. Monitoring, debugging and maintaining distributed systems is still extremely challenging.

We want to change this because doing microservices shouldn’t be so hard.

I am proud to announce that Trace, our microservices monitoring tool, has entered the Open Beta stage and is now available to use for free with Node.js services.

Trace provides:

  • A Distributed Trace view for all of your transactions with error details
  • Service Map to see the communication between your microservices
  • Metrics on CPU, memory, RPM, response time, event loop and garbage collection
  • Alerting with Slack, Pagerduty, and Webhook integration

Trace makes application-level transparency available on a large microservices system with very low overhead. It also helps you localize production issues faster, so you can debug and monitor applications with ease.

You can use Trace in any IaaS or PaaS environment, including Amazon AWS, Heroku or DigitalOcean. Our solution currently supports Node.js only, but it will be available for other languages later as well. The open beta program lasts until 1 July.

Get started with Trace for free

Read along to get details on the individual features and on how Trace works.

Distributed Tracing

The most important feature of Trace is the transaction view. By using this tool, you can visualize every transaction going through your infrastructure on a timeline - in a very detailed way.

Distributed Tracing View Trace by Risingstack

By attaching a correlation ID to each request, Trace groups the services taking part in a transaction and visualizes the exact data flow on a simple tree graph. Thanks to this, you can see the distributed call stacks and the dependencies between your microservices, and see where a request spends the most time.

This approach also lets you localize ongoing issues and show them on the graph. Trace provides detailed feedback on what caused an error in a transaction and gives you enough data to start debugging your system instantly.

Distributed Tracing with Detailed Error Message

When a service causes an error in a distributed system, usually all of the services taking part in that transaction will throw an error, and it is hard to figure out which one really caused the trouble in the first place. From now on, you won’t need to dig through log files to find the answer.

With Trace, you can instantly see what the path of a given request was, which services were involved, and what caused the error in your system.

The technology Trace uses is primarily based on Google’s Dapper whitepaper. Read the whole study to get the exact details.

Microservices Topology

Trace automatically generates a dynamic service map based on how your services communicate with each other or with databases and external APIs. In this view, we provide feedback on infrastructure health as well, so you will get informed when something begins to slow down or when a service starts to handle an increased amount of requests.

Distributed Tracing with Service Topology Map

The service topology view also allows you to immediately get a sense of how many requests your microservices handle in a given period and how high their response times are.

With this information, you can see what your application looks like and understand the behavior of your microservices architecture.

Metrics and Alerting

Trace provides critical metrics data for each of your monitored services. Other than basics like CPU usage, memory usage, throughput and response time, our tool reports event loop and garbage collection metrics as well to make microservices development and operations easier.

Distributed Tracing with Metrics and Alerting

You can create alerts and get notified when a metric passes warning or error thresholds so you can act immediately. Trace will alert you via Slack, Pagerduty, Email or Webhook.

Give Microservices Monitoring a Try

Adding Trace to your services takes just a couple of lines of code, and it can be installed and used in under two minutes.

Click to sign up for Trace

We are curious about your feedback on Trace and on the concept of distributed transaction tracking, so don’t hesitate to share your opinion in the comments section.

Hunting a Ghost - Finding a Memory Leak in Node.js

Finding a Node.js memory leak can be quite challenging - recently we had our fair share of it.

One of our client's microservices started to produce the following memory usage:


Node.js memory leak in Trace

Memory usage grabbed with Trace by RisingStack - our Node.js Performance monitoring and debugging tool

You may spend quite a few days on things like this: profiling the application and looking for the root cause. In this post, I would like to summarize which tools you can use and how, so you can learn from our experience.

The TL;DR version

In our particular case, the service was running on a small instance with only 512MB of memory. As it turned out, the application didn't leak any memory - the GC simply didn't start collecting unreferenced objects.

Why was it happening? By default, Node.js will try to use about 1.5GB of memory, which has to be capped when running on systems with less. This is the expected behavior, as garbage collection is a very costly operation.

The solution for it was adding an extra parameter to the Node.js process:

node --max_old_space_size=400 server.js --production  

Still, if the cause is not this obvious, what are your options for finding memory leaks?





Understanding V8's Memory Handling

Before diving into the techniques that you can employ to find and fix memory leaks in Node.js applications, let's take a look at how memory is handled in V8.

Definitions

  • resident set size: the portion of memory occupied by a process that is held in RAM; this contains:
    • the code itself
    • the stack
    • the heap
  • stack: contains primitive types and references to objects
  • heap: stores reference types, like objects, strings or closures
  • shallow size of an object: the size of memory that is held by the object itself
  • retained size of an object: the size of memory that is freed up once the object is deleted along with its dependent objects

How The Garbage Collector Works

Garbage collection is the process of reclaiming the memory occupied by objects that are no longer in use by the application. Usually, memory allocation is cheap while it's expensive to collect when the memory pool is exhausted.

An object is a candidate for garbage collection when it is unreachable from the root node, so not referenced by the root object or any other active objects. Root objects can be global objects, DOM elements or local variables.

The heap has two main segments: the New Space and the Old Space. The New Space is where new allocations happen; it is fast to collect garbage here, and it has a size of ~1-8MB. Objects living in the New Space are called the Young Generation. The Old Space is where objects that survived the collector in the New Space are promoted - they are called the Old Generation. Allocation in the Old Space is fast; however, collection is expensive, so it is performed infrequently.

Why is garbage collection expensive? The V8 JavaScript engine employs a stop-the-world garbage collector mechanism. In practice, it means that the program stops execution while garbage collection is in progress.

Usually, ~20% of the Young Generation survives into the Old Generation. Collection in the Old Space will only commence once it is getting exhausted. To do so, the V8 engine uses two different collection algorithms:

  • Scavenge collection, which is fast and runs on the Young Generation,
  • Mark-Sweep collection, which is slower and runs on the Old Generation.

For more information on how this works check out the A tour of V8: Garbage Collection article. For more information on general memory management, visit the Memory Management Reference.

Tools / Techniques You Can Use to Find a Memory Leak in Node.js

The heapdump module

With the heapdump module, you can create a heap snapshot for later inspection. Adding it to your project is as easy as:

npm install heapdump --save  

Then in your entry point just add:

var heapdump = require('heapdump');  

Once you are done with it, you can start collecting heapdumps either by using the $ kill -USR2 <pid> command or by calling:

heapdump.writeSnapshot(function(err, filename) {  
  console.log('dump written to', filename);
});

Once you have your snapshots, it's time to make sense of them. Make sure you capture multiple of them with some time difference so you can compare them.

Google Chrome DevTools

First, you have to load your memory snapshots into the Chrome profiler. To do so, open Chrome DevTools, go to Profiles, and load your heap snapshots.

find a node.js memory leak with chrome load profiles

Once you have loaded them, it should look something like this:

chrome heap snapshot for finding a node.js memory leak

So far so good, but what can be seen exactly in this screenshot?

One of the most important things here to notice is the selected view: Comparison. This mode enables you to compare two (or more) heap snapshots taken at different times, so you can pinpoint exactly what objects were allocated and not freed up in the meantime.

The other important tab is Retainers. It shows exactly why an object cannot be garbage collected and what is holding a reference to it. In this case, the global variable called log is holding a reference to the object itself, preventing the garbage collector from freeing up space.
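
As an illustration, a hypothetical version of the kind of leak described above:

// a long-lived global keeps a reference to data from every request
var log = []

function requestHandler (req, res) {
  // everything pushed here stays reachable from the root (the global
  // object), so the garbage collector can never free it
  log.push({ url: req.url, date: new Date() })
  res.end('ok')
}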

Low-Level Tools

mdb

The mdb utility is an extensible utility for low-level debugging and editing of the live operating system, operating system crash dumps, user processes, user process core dumps, and object files.

gcore

The gcore utility generates a core dump of a running program with process ID pid.

Putting it together

To investigate dumps, first we have to create one. You can easily do so with:

gcore `pgrep node`  

After you have it, you can search for all the JS objects on the heap using:

> ::findjsobjects

Of course, you have to take successive core dumps so that you can compare different dumps.

Once you have identified objects that look suspicious, you can analyze them using:

object_id::jsprint  

Now all you have to do is find the retainer of the object (the root):

object_id::findjsobjects -r  

This command will return the id of the retainer. Then you can use ::jsprint again to analyze the retainer.

For a detailed walkthrough, check out Yunong Xiao's talk from Netflix on how to use it:

Recommended Reading

UPDATE: Read the story of how we found a memory leak in our blogging platform by comparing heapshots with Trace and Chrome's DevTools.

Do you have additional thoughts or insights on Node.js memory leaks? Share them in the comments.

Trace - Microservice Monitoring and Debugging

Trace by RisingStack - Distributed Tracing, Service map, Alerting and Performance Monitoring for Microservices

We are happy to announce Trace, a microservice monitoring and debugging tool that empowers you to get all the metrics you need when operating microservices. Trace comes both as a free, open-source tool and as a hosted service.

Start monitoring your services

Why Trace for microservice monitoring?

Debugging and monitoring microservices can be really challenging:

  • there are no stack traces, so it's hard to debug
  • it's easy to lose track of services when dealing with a lot of them
  • detecting bottlenecks is difficult

Key Features

Trace solves these problems by adding the ability to:

  • create distributed stack traces,
  • see a topology view of your services,
  • get alerts for overwhelmed services,
  • monitor third-party services (coming soon),
  • trace heterogeneous infrastructures with languages like Java, PHP or Ruby (coming soon).

How It Works

We want to monitor the traffic of our microservices. To do this, we have to access each HTTP request-response pair to get and set information. By wrapping the http core module's request function and the Server.prototype object, we can sniff all the information we need.

Trace is mostly based on the Google Dapper white paper - we implemented the ServerReceive, ServerSend, ClientSend and ClientReceive events to monitor the lifetime of a request.

trace events

In the example above, we want to catch the very first incoming request: SR (A): Server Receive. The http.Server will emit a request event with an http.IncomingMessage and an http.ServerResponse, with the signature of:

function (request, response) { }  

In the wrapper, we can record all the information we want: the timing, the source, the requested path, or even the whole HTTP header for further investigation.

One of Trace's fundamental features is tracking whole transactions across microservice architectures. Luckily, we can do this by setting a request-id header on outgoing requests.

If our service has to call another service before it can send the response to its caller, we have to track these request-response pairs - spans - as well. A span always originates from an http.request call to an endpoint. By wrapping the http.request function, we can do the same as with http.Server.prototype, with one minor difference: here we want to pair the corresponding request and response, and assign a span-id to them.

However, the request-id just passes through the spans. To store the generated request-id, we use Continuation-Local Storage: after a request arrives and we generate the request-id, we store it in CLS, so when we call another service, we can simply get it back.
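
A hedged sketch of this idea with the continuation-local-storage module - the namespace and header names are illustrative, not necessarily what Trace uses internally:

var http = require('http')
var crypto = require('crypto')
var cls = require('continuation-local-storage')

var session = cls.createNamespace('trace')

http.createServer(function (req, res) {
  session.run(function () {
    // reuse the incoming request-id, or generate one at the edge
    var requestId = req.headers['request-id'] ||
      crypto.randomBytes(8).toString('hex')
    session.set('request-id', requestId)

    // later, when calling another service, read it back and forward it:
    // options.headers['request-id'] = session.get('request-id')
    res.end('ok')
  })
}).listen(3000)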

Create reporters

After you set up the collector by simply requiring it in your main file:

require('@risingstack/trace');

You can select a reporting method to process the collected data. You can use:

  • our Trace servers to see the transactions, your topology and services,
  • Logstash,
  • or any other custom reporter (see later).

You have to provide a trace.config.js config file, where you can declare the reporter. If you just want to see the collected data, you can use Logstash with the following config file:

/**
* Trace example config file for using with Logstash
*/

var reporters = require('@risingstack/trace/lib/reporters');
var config = {};

config.appName = 'Example Service Name';

config.reporter = reporters.logstash.create({  
  type: 'tcp',
  host: 'localhost',
  port: 12201
});

module.exports = config;  

If you start Logstash with the following command, every collected packet information will be displayed in the terminal:

logstash -e 'input { tcp { port => 12201 } } output { stdout {} }'  

Also, this approach can be really powerful when you want to tunnel these metrics into different systems, like Elasticsearch, or just store them on S3.

Adding custom reporters

If you want to use the collector with your custom reporter, you have to provide your own implementation of the reporter API. The only required method is a send method that takes the collected data and a callback as parameters.

function CustomReporter (options) {  
  // init your reporter
}

CustomReporter.prototype.send = function (data, callback) {  
  // implement the data sending,
  // don't forget to call the callback after the data sending has ended
};

function create(options) {  
  return new CustomReporter(options);
}

module.exports.create = create;  
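
Then, wiring the custom reporter in follows the same trace.config.js pattern as the Logstash example above - the file path is hypothetical:

var config = {};

config.appName = 'Example Service Name';
config.reporter = require('./my-custom-reporter').create({
  // reporter-specific options
});

module.exports = config;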

Use the Trace collector with Trace servers

If you want to enjoy all the benefits of our Trace service, you need to create an account first. After your API Key has been generated, you can use it in your config file:

/**
* Trace example config file for using with Trace servers
*/

var config = {};

config.appName = 'Example Service Name';

config.reporter = require('@risingstack/trace/lib/reporters').trace.create({
  apiKey: 'YOUR-APIKEY',
  appName: config.appName
});

module.exports = config;  

Adding Trace to your project

To use the Trace collector as a dependency of your project, use:

npm install --save @risingstack/trace

Currently, Trace supports specific Node.js and io.js versions only.

Trace-as-a-Service

If you don't want to run your own infrastructure for storing and displaying microservice metrics, we provide microservice monitoring as a service as well. This is Trace:

trace topology

trace view

Check out our tool!

Start monitoring your services