Introducing Distributed Tracing for Microservices Monitoring

RisingStack's services:

Sign up to our newsletter!

In this article:

At RisingStack, as an enterprise Node.js development and consulting company, we have been working tirelessly in the past two years to build durable and efficient microservices architectures for our clients and as being passionate advocates of this technology.

UPDATE: This article mentions Trace, RisingStack’s Node.js Monitoring platform several times. On 2017 October, Trace has been merged with Keymetrics’s APM solution. Click here to give it a try!

During this period, we had to face the cold fact that there aren’t proper tools able to support microservices architectures and the developers working with them. Monitoring, debugging and maintaining distributed systems is still extremely challenging.

We want to change this because doing microservices shouldn’t be so hard.

I am proud to announce that Trace – our microservices monitoring tool has entered the Open Beta stage and is available to use for free with Node.js services from now on.

Trace provides:

  • A Distributed Trace view for all of your transactions with error details
  • Service Map to see the communication between your microservices
  • Metrics on CPU, memory, RPM, response time, event loop and garbage collection
  • Alerting with Slack, Pagerduty, and Webhook integration

Trace makes application-level transparency available on a large microservices system with very low overhead. It will also help you to localize production issues faster to debug and monitor applications with ease.

You can use Trace in any IaaS or PaaS environment, including Amazon AWS, Heroku or DigitalOcean. Our solution currently supports Node.js only, but it will be available for other languages later as well. The open beta program lasts until 1 July.

Read along to get details on the individual features and on how Trace works.

Distributed Tracing

The most important feature of Trace is the transaction view. By using this tool, you can visualize every transaction going through your infrastructure on a timeline – in a very detailed way.

Distributed Tracing View Trace by Risingstack

By attaching a correlation ID to certain requests, Trace groups services taking part in a transaction and visualizes the exact data-flow on a simple tree-graph. Thanks to this you can see the distributed call stacks and the dependencies between your microservices and see where a request takes the most time.

This approach also lets you to localize ongoing issues and show them on the graph. Trace provides detailed feedback on what caused an error in a transaction and gives you enough data to start debugging your system instantly.

Distributed Tracing with Detailed Error Message

When a service causes an error in a distributed system, usually all of the services taking part in that transaction will throw an error, and it is hard to figure out which one really caused the trouble in the first place. From now on, you won’t need to dig through log files to find the answer.

With Trace, you can instantly see what was the path of a certain request, what services were involved, and what caused the error in your system.

The technology Trace uses is primarily based on Google’s Dapper whitepaper. Read the whole study to get the exact details.

Microservices Topology

Trace automatically generates a dynamic service map based on how your services communicate with each other or with databases and external APIs. In this view, we provide feedback on infrastructure health as well, so you will get informed when something begins to slow down or when a service starts to handle an increased amount of requests.

Distributed Tracing with Service Topology Map

The service topology view also allows you to immediately get a sense of how many requests your microservices handle in a given period and how big are their response times.

By getting this information you can see how your application looks like and understand the behavior of your microservices architecture.

Metrics and Alerting

Trace provides critical metrics data for each of your monitored services. Other than basics like CPU usage, memory usage, throughput and response time, our tool reports event loop and garbage collection metrics as well to make microservices development and operations easier.

Distributed Tracing with Metrics and Alerting

You can create alerts and get notified when a metric passes warning or error thresholds so you can act immediately. Trace will alert you via Slack, Pagerduty, Email or Webhook.

Give Microservices Monitoring a Try

Adding Trace to your services is possible with just a couple lines of code, and it can be installed and used in under two minutes.

We are curious on your feedback on Trace and on the concept of distributed transaction tracking, so don’t hesitate to express your opinion in the comment section.

Share this post

Twitter
Facebook
LinkedIn
Reddit