
Péter Márton

CTO at RisingStack, brewing beer with Node.js

CQRS Explained - Node.js at Scale

What is CQRS?

CQRS is an architectural pattern; the acronym stands for Command Query Responsibility Segregation. We can talk about CQRS when data read operations are separated from data write operations, and they happen on different interfaces.

In most CQRS systems, read and write operations use different data models, sometimes even different data stores. This kind of segregation makes it easier to scale read and write operations and to control security - but it adds extra complexity to your system.

Node.js at Scale is a collection of articles focusing on the needs of companies with bigger Node.js installations and advanced Node developers.


The level of segregation can vary in CQRS systems:

  • a single data store and separate models for reading and updating data
  • separate data stores and separate models for reading and updating data

In the simplest data store separation, we can use read-only replicas to achieve segregation.
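For instance, with PostgreSQL and the pg module, the split can be as small as two connection pools - one for the primary, one for a replica. A minimal sketch, assuming PRIMARY_DB_URL and REPLICA_DB_URL environment variables (both names are illustrative):

const { Pool } = require('pg')

// writes go to the primary, reads go to a read-only replica
const primary = new Pool({ connectionString: process.env.PRIMARY_DB_URL })
const replica = new Pool({ connectionString: process.env.REPLICA_DB_URL })

function createUser (name) {
  return primary.query('INSERT INTO users (name) VALUES ($1) RETURNING id', [name])
}

function getUser (id) {
  return replica.query('SELECT id, name FROM users WHERE id = $1', [id])
}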

Why and when to use CQRS?

In a typical data management system, all CRUD (Create Read Update Delete) operations are executed on the same interface of the entities in a single data store - like creating, updating, querying and deleting table rows in an SQL database via the same model.

CQRS really shines compared to the traditional approach (using a single model) when you build complex data models to validate and fulfil your business logic during data manipulation. Read operations can be very different from update and write operations, or much simpler - like accessing only a subset of your data.

Real world example

In our Node.js Monitoring Tool, we use CQRS to segregate saving and representing the data. For example, when you see a distributed tracing visualization on our UI, the data behind it arrived in smaller chunks from our customers' application agents to our public collector API.

In the collector API, we only do a thin validation and send the data to a messaging queue for processing. On the other end of the queue, workers consume the messages and resolve all the necessary dependencies via other services. These workers also save the transformed data to the database.

If any issue happens, we send the message back to our messaging queue with exponential backoff and a maximum retry limit. Compared to this complex data writing flow, on the representation side of the flow we only query a read-replica database and visualize the result to our customers.
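As a sketch, the retry logic in such a worker could look like this - queue, deadLetterQueue, resolveDependencies and saveTrace are hypothetical stand-ins for the real services:

const MAX_RETRIES = 5

function createWorker ({ queue, deadLetterQueue, resolveDependencies, saveTrace }) {
  return async function onMessage (message) {
    try {
      const trace = await resolveDependencies(message.payload)
      await saveTrace(trace)
    } catch (err) {
      const retryCount = (message.retryCount || 0) + 1
      if (retryCount > MAX_RETRIES) {
        // give up: park the message for later inspection
        return deadLetterQueue.publish(message)
      }
      // exponential backoff: 1s, 2s, 4s, 8s, ...
      const delay = Math.pow(2, retryCount - 1) * 1000
      return queue.publish({ ...message, retryCount }, { delay })
    }
  }
}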

(Trace by RisingStack: data processing with CQRS)

CQRS and Event Sourcing

I've seen many times that people confuse these two concepts. Both of them are heavily used in event driven infrastructures, like event driven microservices, but they mean very different things.

To read more about Event Sourcing with Examples, check out our previous Node.js at Scale article.

Reporting database - Denormalizer

In some event driven systems, CQRS is implemented in a way that the system contains one or multiple Reporting databases.

A Reporting database is an entirely different read-only storage that models and persists the data in the best format for representing it. It's okay to store it in a denormalized format to optimize it for the client's needs. In some cases, the reporting database contains only derived data, possibly from multiple data sources.

In a microservices architecture, we call a service the Denormalizer if it listens for some events and maintains a Reporting Database based on them. The client reads the Denormalizer service's Reporting Database.

An example can be that the user profile service emits a user.edit event with a { id: 1, name: 'John Doe', state: 'churn' } payload; the Denormalizer service listens to it but stores only { name: 'John Doe' } in its Reporting Database, because the client is not interested in the internal churn state of the user.
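A minimal sketch of such a Denormalizer - eventBus and reportingDb are hypothetical stand-ins for any pub/sub client and a MongoDB-like reporting store:

function startDenormalizer ({ eventBus, reportingDb }) {
  eventBus.on('user.edit', async (event) => {
    // persist only what the client reads; internal fields like `state` are dropped
    await reportingDb.collection('users').updateOne(
      { _id: event.id },
      { $set: { name: event.name } },
      { upsert: true }
    )
  })
}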

It can be hard to keep a Reporting Database in sync. Usually, we can only aim for eventual consistency.

A CQRS Node.js Example Repo

For our CQRS with Denormalizer Node.js example visit our cqrs-example GitHub repository.

Outro

CQRS is a powerful architectural pattern to segregate read and write operations and their interfaces, but it also adds extra complexity to your system. In most cases, you shouldn't use CQRS for the whole system, only for the specific parts where the complexity and scalability make it necessary.

To read more about CQRS and Reporting databases, I recommend checking out these resources:

In the next chapter of the Node.js at Scale series we'll discuss Node.js Testing and Getting TDD Right. Read on! :)

I’m happy to answer your CQRS related questions in the comments section!

Event Sourcing with Examples - Node.js at Scale

Event Sourcing is a powerful architectural pattern to handle complex application states that may need to be rebuilt, re-played, audited or debugged.

From this article you can learn what Event Sourcing is, and when you should use it. We'll also take a look at some Event Sourcing examples with code snippets.

Node.js at Scale is a collection of articles focusing on the needs of companies with bigger Node.js installations and advanced Node developers.

Event Sourcing

Event Sourcing is a software architecture pattern which makes it possible to reconstruct past states (as well as the latest state). It's achieved by storing every state change as a sequence of events.

The State of your application is like a user's account balance or subscription at a particular time. This current state may only exist in memory.

Good examples for Event Sourcing are version control systems, which store the current state as a series of diffs. The current state is your latest source code, and the events are your commits.

Why is Event Sourcing useful?

In our hypothetical example, you are working on an online money transfer site, where every customer has an account balance. Imagine that you just started a beautiful Monday morning when it suddenly turns out that you made a mistake and used a wrong currency exchange rate for the whole past week. In this case, every account that sent or received money in the last seven days is in a corrupt state.

With event sourcing, there’s no need to panic!

If your site uses event sourcing, you can revert the account balances to their previous uncorrupted state, fix the exchange rate and replay all the events up until now. That's it, your job and reputation are saved!


Other use-cases

You can use events to audit or debug state changes in your system. They can also be useful for handling SaaS subscriptions. In a usual subscription based system, your users can buy a plan, upgrade it, downgrade it, pro-rate a current price, cancel a plan, apply a coupon, and so on... A good event log can be very useful to figure out what happened.

So with event sourcing you can:

  • Rebuild states completely
  • Replay states from a specific time
  • Reconstruct the state of a specific moment for a temporary query

What is an Event?

An Event is something that happened in the past. An Event is not a snapshot of a state at a specific time; it's the action itself with all the information that's necessary to replay it.

Events should be simple objects which describe some action that occurred. They should be immutable and stored in an append-only way. Their immutable, append-only nature makes them suitable to use as audit logs too.

This is what makes it possible to undo and redo events or even replay them from a specific timestamp.
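To make the immutable, append-only nature concrete, here is a minimal in-memory sketch (a real system would persist the log in a database):

function createEventStore () {
  const events = []
  return {
    append (event) {
      // freeze the event: once stored, it can never be modified
      events.push(Object.freeze({ ...event }))
    },
    // replay support: return every event from a given time on
    eventsSince (time) {
      return events.filter((event) => event.time >= time)
    }
  }
}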

Be careful with External Systems!

Like any software pattern, Event Sourcing can be challenging at some points as well.

The external systems that your application communicates with are usually not prepared for event sourcing, so you should be careful when you replay your events. I’m sure that you don’t wish to charge your customers twice or send all welcome emails again.

To solve this challenge, you should handle replays in your communication layers!
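One way to do that is to guard side-effecting calls with a replay flag. A hedged sketch, where transport and isReplaying are hypothetical:

function createMailer ({ transport, isReplaying }) {
  return {
    sendWelcomeEmail (user) {
      if (isReplaying()) {
        // during a replay we only rebuild state - skip the side effect
        return Promise.resolve()
      }
      return transport.send({ to: user.email, template: 'welcome' })
    }
  }
}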

Command Sourcing

Command Sourcing is a different approach from Event Sourcing - make sure you don’t mix ‘em up by accident!

Event Sourcing:

  • Persist only changes in state
  • Replay can be side-effect free

Command Sourcing:

  • Persist Commands
  • Replay may trigger side-effects

Example for Event Sourcing

In this simple example, we will apply Event Sourcing for our accounts:

// current account states (how it looks in our DB now)
const accounts = {  
  account1: { balance: 100 },
  account2: { balance: 50 }
}
// past events (should be persisted somewhere, for example in a DB)
const events = [  
  { type: 'open', id: 'account1', balance: 150, time: 0 },
  { type: 'open', id: 'account2', balance: 0, time: 1 },
  { type: 'transfer', fromId: 'account1', toId: 'account2', amount: 50, time: 2 }
]

Let's rebuild the latest state from scratch, using our event log:

// complete rebuild
const accounts = events.reduce((accounts, event) => {
  if (event.type === 'open') {
    // the account entry has to be created when we replay the `open` event
    accounts[event.id] = { balance: event.balance }
  } else if (event.type === 'transfer') {
    accounts[event.fromId].balance -= event.amount
    accounts[event.toId].balance += event.amount
  }
  return accounts
}, {})

Undo the latest event:

// undo last event: take it off the log
// and apply its inverse to the current state
const lastEvent = events.pop()
if (lastEvent.type === 'open') {
  delete accounts[lastEvent.id]
} else if (lastEvent.type === 'transfer') {
  accounts[lastEvent.fromId].balance += lastEvent.amount
  accounts[lastEvent.toId].balance -= lastEvent.amount
}

Query accounts state at a specific time:

// query specific time
function getAccountsAtTime (time) {
  return events.reduce((accounts, event) => {
    // skip events that happened after the queried time
    if (event.time > time) {
      return accounts
    }

    if (event.type === 'open') {
      accounts[event.id] = { balance: event.balance }
    } else if (event.type === 'transfer') {
      accounts[event.fromId].balance -= event.amount
      accounts[event.toId].balance += event.amount
    }
    return accounts
  }, {})
}

const accounts = getAccountsAtTime(1)  

Learning more

For more detailed examples, you can check out our Event Sourcing Example repository.

For a more general and deeper understanding of Event Sourcing, I recommend reading these articles:

In the next part of the Node.js at Scale series, we’ll learn about Command Query Responsibility Segregation. Make sure you check back in a week!

If you have any questions on this topic, please let me know in the comments section below!

Graceful shutdown with Node.js and Kubernetes

This article helps you to understand what graceful shutdown is, what its main benefits are, and how you can set up the graceful shutdown of a Kubernetes application. We'll discuss how you can validate and benchmark this process, and what the most common mistakes are that you should avoid.

Graceful shutdown

We can speak about the graceful shutdown of our application when all of the resources it used and all of the traffic and/or data processing it handled are closed and released properly.

It means that no database connection remains open and no ongoing request fails because we stop our application.

"Graceful shutdown: when all of the resources & data processing are closed and released properly." via @RisingStack

Click To Tweet

Possible scenarios for a graceful web server shutdown:

  1. App gets notification to stop (received SIGTERM)
  2. App lets the load balancer know that it's not ready for newer requests
  3. App serves all the ongoing requests
  4. App releases all of the resources correctly: DB, queue, etc.
  5. App exits with "success" status code (process.exit())

This article goes deep into shutting down web servers properly, but you should also apply these techniques to your worker processes: it's highly recommended to stop consuming queues on SIGTERM and finish the current task/job.
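A sketch of that worker behavior - queue and processJob are hypothetical stand-ins for your actual consumer client and job handler:

function runWorker ({ queue, processJob }) {
  let currentJob = Promise.resolve()

  queue.on('job', (job) => {
    currentJob = processJob(job)
  })

  process.on('SIGTERM', () => {
    // stop taking new jobs, let the in-flight one finish, then exit
    queue.stopConsuming()
    currentJob
      .then(() => process.exit(0))
      .catch(() => process.exit(1))
  })
}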

Why is it important?

If we don't stop our application correctly, we are wasting resources like DB connections and we may also break ongoing requests. An HTTP request doesn't recover automatically - if we fail to serve it, then we simply missed it.

"If we don't stop our app correctly, we're wasting resources & may also break ongoing requests." via @RisingStack

Click To Tweet

Graceful start

We should only start our application when all of the dependencies and database connections are ready to handle our traffic.

Possible scenarios for a graceful web server start:

  1. App starts (npm start)
  2. App opens DB connections
  3. App listens on port
  4. App tells the load balancer that it’s ready for requests

Graceful shutdown in a Node.js application

First of all, you need to listen for the SIGTERM signal and catch it:

process.on('SIGTERM', function onSigterm () {  
  console.info('Got SIGTERM. Graceful shutdown start', new Date().toISOString())
  // start graceful shutdown here
  shutdown()
})

After that, you can close your server, then close your resources and exit the process:

function shutdown () {
  server.close(function onServerClosed (err) {
    if (err) {
      console.error(err)
      process.exit(1)
    }

    closeMyResources(function onResourcesClosed (err) {
      // handle the error, then exit: non-zero if the teardown failed
      process.exit(err ? 1 : 0)
    })
  })
}

Sounds easy right? Maybe a little bit too easy.

What about the load balancer? How will it know that your app is not ready to receive further requests anymore? What about keep-alive connections? Will they keep the server open for a longer time? What if my app gets a SIGKILL in the meantime?

Graceful shutdown with Kubernetes

If you'd like to learn a little bit about Kubernetes, you can read our Moving a Node.js app from PaaS to Kubernetes Tutorial. For now, let's just focus on the shutdown.

Kubernetes comes with a resource called Service. Its job is to route traffic to your pods (~instances of your app). Kubernetes also comes with a thing called Deployment that describes how your applications should behave during exit, scale and deploy - and you can also define a health check here. We will combine these resources for the perfect graceful shutdown and handover during new deploys at high traffic.

We would like to see throughput charts like below with consistent rpm and no deployment side effects at all:

(Throughput metrics shown in Trace - no change at deploy)

Ok, let's see how to solve this challenge.

Setting up graceful shutdown

In Kubernetes, for a proper graceful shutdown we need to add a readinessProbe to our application’s Deployment yaml and let the Service’s load balancer know during the shutdown that we will not serve more requests so it should stop sending them. We can close the server, tear down the DB connections and exit only after that.

How does it work?

Kubernetes graceful shutdown flowchart

  1. pod receives SIGTERM signal because Kubernetes wants to stop it - because of deploy, scale, etc.
  2. App (pod) starts to return 500 for GET /health to let readinessProbe (Service) know that it's not ready to receive more requests.
  3. Kubernetes readinessProbe checks GET /health and after (failureThreshold * periodSeconds) it stops redirecting traffic to the app (because it continuously returns 500)
  4. App waits (failureThreshold * periodSeconds) before it starts to shut down - to make sure that the Service is getting notified via the readinessProbe failure
  5. App starts graceful shutdown
  6. App first closes server with live working DB connections
  7. App closes databases after the server is closed
  8. App exits process
  9. Kubernetes force kills the application after 30s (SIGKILL) if it's still running (in an optimal case it doesn't happen)

In our case, the Kubernetes livenessProbe won't kill the app before the graceful shutdown happens, because it needs to wait (failureThreshold * periodSeconds) to do it. This means that the livenessProbe threshold should be larger than the readinessProbe threshold. This way the graceful stop happens around 4s after SIGTERM, while the force kill would only happen 30s after it.


How to achieve it?

For this we need to do two things. First, we need to let the readinessProbe know after SIGTERM that we are not ready anymore:

'use strict'

const db = require('./db')  
const promiseTimeout = require('./promiseTimeout')  
const state = { isShutdown: false }  
const TIMEOUT_IN_MILLIS = 900

process.on('SIGTERM', function onSigterm () {  
  state.isShutdown = true
})

function get (req, res) {  
  // SIGTERM already happened
  // app is not ready to serve more requests
  if (state.isShutdown) {
    res.writeHead(500)
    return res.end('not ok')
  }

  // something cheap but tests the required resources
  // timeout because we would like to log before livenessProbe KILLS the process
  promiseTimeout(db.ping(), TIMEOUT_IN_MILLIS)
    .then(() => {
      // success health
      res.writeHead(200)
      return res.end('ok')
    })
    .catch(() => {
      // broken health
      res.writeHead(500)
      return res.end('not ok')
    })
}

module.exports = {  
  get: get
}

The second thing is that we have to delay the teardown process: as a sane default, you can use the time needed for two failed readinessProbe checks (failureThreshold: 2 and periodSeconds: 2, so 2 * 2 = 4 seconds).

// failureThreshold: 2, periodSeconds: 2 => 2 * 2 = 4 seconds
const READINESS_PROBE_DELAY = 2 * 2 * 1000

process.on('SIGTERM', function onSigterm () {
  console.info('Got SIGTERM. Graceful shutdown start', new Date().toISOString())

  // Wait a little bit to give enough time for the Kubernetes readiness probe to fail
  // (we are not ready to serve more traffic)
  // Don't worry, the livenessProbe won't kill it until (failureThreshold: 3) => 30s
  // gracefulStop is your shutdown logic from above
  setTimeout(gracefulStop, READINESS_PROBE_DELAY)
})

You can find the full example here:
https://github.com/RisingStack/kubernetes-graceful-shutdown-example

How to validate it?

Let's test our graceful shutdown by sending high traffic to our pods and releasing a new version in the meantime (recreating all of the pods).

Test case

$ ab -n 100000 -c 20 http://localhost:myport

Other than this, you need to change an environment variable in the Deployment to re-create all pods during the ab benchmarking.

AB output

Document Path:          /  
Document Length:        3 bytes

Concurrency Level:      20  
Time taken for tests:   172.476 seconds  
Complete requests:      100000  
Failed requests:        0  
Total transferred:      7800000 bytes  
HTML transferred:       300000 bytes  
Requests per second:    579.79 [#/sec] (mean)  
Time per request:       34.495 [ms] (mean)  
Time per request:       1.725 [ms] (mean, across all concurrent requests)  
Transfer rate:          44.16 [Kbytes/sec] received  

Application log output

Got SIGTERM. Graceful shutdown start 2016-10-16T18:54:59.208Z  
Request after sigterm: / 2016-10-16T18:54:59.217Z  
Request after sigterm: / 2016-10-16T18:54:59.261Z  
...
Request after sigterm: / 2016-10-16T18:55:00.064Z  
Request after sigterm: /health?type=readiness 2016-10-16T18:55:00.820Z  
HEALTH: NOT OK  
Request after sigterm: /health?type=readiness 2016-10-16T18:55:02.784Z  
HEALTH: NOT OK  
Request after sigterm: /health?type=liveness 2016-10-16T18:55:04.781Z  
HEALTH: NOT OK  
Request after sigterm: /health?type=readiness 2016-10-16T18:55:04.800Z  
HEALTH: NOT OK  
Server is shutting down... 2016-10-16T18:55:05.210Z  
Successful graceful shutdown 2016-10-16T18:55:05.212Z  

Benchmark result

Success!

Zero failed requests: you can see in the app log that the Service stopped sending traffic to the pod before we disconnected from the DB and killed the app.

Common gotchas

The following mistakes can still prevent your app from doing a proper graceful shutdown:

Keep-alive connections

Kubernetes doesn't handover keep-alive connections properly. :/

This means that requests from agents with a keep-alive header will still be routed to the pod.

It tricked me at first when I benchmarked with autocannon or Google Chrome (they use keep-alive connections).

Keep-alive connections prevent your server from closing in time. To force them to close, you can use the server-destroy module. Once it has run, you can be sure that all the ongoing requests are served. Alternatively, you can add timeout logic to your server.close(cb).
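A minimal sketch with server-destroy (the destroy method is added to the server by the module):

const http = require('http')
const enableDestroy = require('server-destroy')

const server = http.createServer((req, res) => res.end('ok'))
server.listen(3000)
enableDestroy(server)

process.on('SIGTERM', () => {
  // destroy() also tears down idle keep-alive sockets,
  // so the close is not held open by them
  server.destroy(() => process.exit(0))
})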

Docker signaling

It's quite possible that your application doesn't receive the signals correctly in a dockerized environment.

For example in our Alpine image: CMD ["node", "src"] works, CMD ["npm", "start"] doesn't. It simply doesn't pass the SIGTERM to the node process. The issue is probably related to this PR: https://github.com/npm/npm/pull/10868

As an alternative, you can use dumb-init to fix broken Docker signaling.

Takeaway

Always be sure that your application stops correctly: it releases all of the resources and helps hand over the traffic to the new version of your app.

Check out our example repository with Node.js and Kubernetes:
https://github.com/RisingStack/kubernetes-graceful-shutdown-example

"An app stops correctly if it releases all resources & hands over the traffic to your new app." via @RisingStack

Click To Tweet

If you have any questions or thoughts about this topic, find me in the comment section below!


Moving a Node.js app from PaaS to Kubernetes Tutorial

From this Kubernetes tutorial, you can learn how to move a Node.js app from a PaaS provider while achieving lower response times, improving security and reducing costs.


Before we jump into the story of why and how we migrated our services to Kubernetes, it's important to mention that there is nothing wrong with using a PaaS. PaaS is perfect to start building a new product, and it can also turn out to be a good solution as an application advances - it always depends on your requirements and resources.

PaaS

Trace by RisingStack, our Node.js monitoring solution, was running on one of the biggest PaaS providers for more than half a year. We chose a PaaS over other solutions because we wanted to focus more on the product instead of the infrastructure.

Our requirements were simple; we wanted to have:

  • fast deploys,
  • simple scaling,
  • zero-downtime deployments,
  • rollback capabilities,
  • environment variable management,
  • various Node.js versions,
  • and "zero" DevOps.

What we didn't want to have, but got as a side effect of using PaaS:

  • big network latencies between services,
  • lack of VPC,
  • response time peaks because of the multitenancy,
  • larger bills (pay for every single process, no matter how small it is: clock, internal API, etc.).

Since Trace is developed as a group of microservices, you can imagine how quickly the network latency and billing started to hurt us.

Kubernetes tutorial

From our PaaS experience, we knew that we were looking for a solution that requires very little DevOps effort but provides a similar flow for our developers. We didn't want to lose any of the advantages I mentioned above - however, we wanted to fix the outstanding issues.

We were looking for an infrastructure that is more configuration-based, and anyone from the team can modify it.

Kubernetes, with its configuration-focused, container-based and microservices-friendly nature, convinced us.

"Kubernetes convinced us with its configuration-focused, microservices friendly nature" via @RisingStack #kubernetes

Click To Tweet

Let me show you what I mean by these "buzzwords" in the upcoming sections.


What is Kubernetes?

Kubernetes is an open-source system for automating deployments, scaling, and management of containerized applications - kubernetes.io

I don't want to give a very deep intro about the Kubernetes elements here, but you need to know the basic ones for the upcoming parts of this post.

My definitions won't be 100% correct, but you can think of it as a PaaS to Kubernetes dictionary:

  • pod: your running containerized application together with environment variables, disk, etc.; pods are born and die quickly, e.g. at deploys,

    • in PaaS: ~currently running app
  • deployment: configuration of your application that describes the desired state (CPU, memory, env. vars, Docker image version, disks, number of running instances, deploy strategy, etc.):

    • in PaaS: ~app settings
  • secret: lets you separate your credentials from environment variables,

    • in PaaS: doesn't exist; the closest thing is a shared, separated secret environment variable for DB credentials, etc.
  • service: exposes your running pods by label(s) to other apps or to the outside world on the desired IP and port

    • in PaaS: built-in non-configurable load balancer

How to set up a running Kubernetes cluster?

You have several options here. The easiest one is to create a Container Engine cluster in Google Cloud, which is hosted Kubernetes. It's also well integrated with other Google Cloud components, like load balancers and disks.

You should also know that Kubernetes can run anywhere: AWS, DigitalOcean, Azure, etc. For more information, check out the CoreOS Kubernetes tools.

Running the application

First, we have to prepare our application to work well with Kubernetes in a Docker environment.

If you are looking for a tutorial on how to start an app from scratch with Kubernetes check out their 101 tutorial.

Node.js app in Docker container

Kubernetes is Docker-based, so first we need to containerize our application. If you are not sure how to do that, check out our previous post: Dockerizing Your Node.js Application

If you are a private NPM user, you will also find this one helpful: Using the Private NPM Registry from Docker

"Procfile" in Kubernetes

We create one Docker image for every application (Git repository). If the repository contains multiple processes, like server, worker and clock, we choose between them with an environment variable. You may find it strange, but we don't want to build and push multiple Docker images from the very same source code; it would slow down our CI.
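The selection logic can be as simple as an entry script that requires the right module - a sketch, where PROCESS_TYPE and the file layout are our own convention, not a Kubernetes feature:

// src/index.js
const processType = process.env.PROCESS_TYPE

if (processType === 'server') {
  require('./server')
} else if (processType === 'worker') {
  require('./worker')
} else if (processType === 'clock') {
  require('./clock')
} else {
  throw new Error(`Unknown process type: ${processType}`)
}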

Environments, rollback, and service-discovery

Staging, production

During our PaaS period, we named our services like trace-foo and trace-foo-staging; the only difference between the staging and the production application was the name prefix and the different environment variables. In Kubernetes, it's possible to define namespaces. Namespaces are totally independent from each other and don't share any resources like secrets, config, etc.

$ kubectl create namespace production
$ kubectl create namespace staging

Application versions

In a containerized infrastructure, each application version should be a different container image with a tag. We use the Git short hash as a Docker image tag.

foo:b37d759  
foo:f53a7cb  

To deploy a new version of your application, you only need to change the image tag in your application's deployment config - Kubernetes will do the rest.

(Deploy flow)

Any change in your deployment file is versioned, and you can roll back to any revision anytime.

$ kubectl rollout history deployment/foo
deployments "foo":  
REVISION    CHANGE-CAUSE  
1           kubectl set image deployment/foo foo=foo:b37d759  
2           kubectl set image deployment/foo foo=foo:f53a7cb  

During our deploy process, we only replace Docker image tags, which is quite fast - it only requires a couple of seconds.

Service discovery

Kubernetes has a built-in simple service discovery solution: The created services expose their hostname and port as an environment variable for each pod.

const fooServiceUrl = `http://${process.env.FOO_SERVICE_HOST}:${process.env.FOO_SERVICE_PORT}`  

If you don't need advanced discovery, you can just start using it, instead of copying your service URLs to each other's environment variables. Kind of cool, isn't it?

Production ready application

The really challenging part of jumping into a new technology is knowing what you need to be production ready. In the following sections, we will check what you should consider setting up in your app.

Zero downtime deployment and failover

Kubernetes can update your application in a way that it always keeps some pods running and deploys your changes in smaller steps - instead of stopping and starting all of them at the same time.

It's not just helpful for zero downtime deploys; it also avoids killing your whole application when you misconfigure something. Once Kubernetes detects that your new pods are unhealthy, your mistake stops escalating to the rest of the running pods.

Kubernetes supports several strategies to deploy your applications. You can check them in the Deployment strategy documentation.

Graceful stop

It’s not mainly related to Kubernetes, but it’s impossible to have a good application lifecycle without starting and stopping your process in a proper way.

Start server

const server = MyServer()

// start listening only after every dependency is connected
Promise.all([
  db1.connect(),
  db2.connect()
])
  .then(() => server.listen(3000))

Graceful server stop

process.on('SIGTERM', () => {
  // assumes server.close() returns a promise (e.g. a promisified server)
  server.close()
    .then(() => Promise.all([
      db1.disconnect(),
      db2.disconnect()
    ]))
    .then(() => process.exit(0))
    .catch(() => process.exit(1))
})

Liveness probe (health check)

In Kubernetes, you should define a health check (liveness probe) for your application. With this, Kubernetes will be able to detect when your application needs to be restarted.

Web server health check

You have multiple options to check your application's health, but I think the easiest one is to create a GET /healthz endpoint and check your application's logic / DB connections there. It's important to mention that every application is different; only you can know what checks are necessary to make sure it's working.

app.get('/healthz', function (req, res, next) {
  // check my health
  // -> return next(new Error('DB is unreachable'))
  res.sendStatus(200)
})

The matching liveness probe in the Deployment config:

livenessProbe:
    httpGet:
      # Path to probe; should be cheap, but representative of typical behavior
      path: /healthz
      port: 3000
    initialDelaySeconds: 30
    timeoutSeconds: 1

Worker health check

For our workers, we also set up a very small HTTP server with the same /healthz endpoint, which checks different, worker-specific criteria with the same liveness probe. We do this to have company-wide consistent health check endpoints.
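Such a worker health server can be just a few lines - a sketch, where checkWorkerHealth is a hypothetical, promise-returning, worker-specific check:

const http = require('http')

function startHealthServer (checkWorkerHealth, port) {
  return http.createServer((req, res) => {
    if (req.url !== '/healthz') {
      res.writeHead(404)
      return res.end()
    }
    checkWorkerHealth()
      .then(() => { res.writeHead(200); res.end('ok') })
      .catch(() => { res.writeHead(500); res.end('not ok') })
  }).listen(port)
}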

Readiness probe

The readiness probe is similar to the liveness probe (health check), but it only makes sense for web servers. It tells the Kubernetes Service (~load balancer) whether the traffic can be redirected to the specific pod.

It is essential to avoid any service disruption during deploys and other issues.

readinessProbe:  
    httpGet:
      # You can use the /healthz or something else
      path: /healthz
      port: 3000
    initialDelaySeconds: 30
    timeoutSeconds: 1

Logging

For logging, you can choose from different approaches, like adding sidecar containers to your application which collect your logs and send them to custom logging solutions, or you can go with the built-in Google Cloud one. We selected the built-in one.

To be able to parse the built-in log levels (severity) on Google Cloud, you need to log in a specific format. You can achieve this easily with the winston-gke module.

// setup logger
const logger = require('winston')
const winstonGke = require('winston-gke')
logger.remove(logger.transports.Console)
winstonGke(logger, config.logger.level)

// usage
logger.info('I\'m a potato', { foo: 'bar' })
logger.warning('So warning')
logger.error('Such error')
logger.debug('My debug log')

If you log in the specific format, Kubernetes will automatically merge your log messages with the container, deployment, etc. meta information and Google Cloud will show it in the right format.

Your application's first log message has to be in the right format, otherwise Google Cloud won't start to parse it correctly.

To achieve this, we made our npm start silent - npm start -s in the Dockerfile: CMD ["npm", "start", "-s"]

Monitoring

We monitor our applications with Trace, which is optimized from scratch to monitor and visualize microservice architectures. The service map view of Trace helped us a lot during the migration to understand which application communicates with which one, and what the database and external dependencies are.

(Services in our infrastructure)

Since Trace is environment independent, we didn't have to change anything in our codebase, and we could use it to validate the migration and our expectations about the positive performance changes.

(Stable and fast response times after the migration)

Example

Check out our complete example repository for Node.js with Kubernetes and CircleCI:
https://github.com/RisingStack/kubernetes-nodejs-example

Tooling

Continuous deployment with CI

It's possible to update your Kubernetes deployment with a JSON patch, or to update only the image tag. After you have a working kubectl on your CI machine, you only need to run this command:

$ kubectl --namespace=staging set image deployment/foo foo=foo:GIT_SHORT_SHA

Debugging

In Kubernetes, it's possible to run a shell inside any container - it's this easy:

$ kubectl get pod

NAME           READY     STATUS    RESTARTS   AGE  
foo-37kj5   1/1       Running   0          2d

$ kubectl exec foo-37kj5 -i -t -- sh
# whoami       
root  

Another useful thing is to check the pod events with:

$ kubectl describe pod foo-37kj5

You can also get the log messages of any pod with:

$ kubectl logs foo-37kj5

Code piping

At our PaaS provider, we liked code piping between staging and production infrastructure. In Kubernetes we missed this, so we built our own solution.

It's a simple npm library which reads the current image tag from staging and sets it on the production deployment config.

Because the Docker container is the very same, only the environment variable changes.
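Our library isn't shown here, but the core idea can be sketched with two kubectl calls - the deployment and container names are illustrative:

const { execSync } = require('child_process')

function promoteToProduction (deployment, container) {
  // read the image tag currently running on staging
  const image = execSync(
    `kubectl --namespace=staging get deployment ${deployment} ` +
    `-o jsonpath='{.spec.template.spec.containers[0].image}'`
  ).toString().trim()

  // set the very same image on the production deployment
  execSync(
    `kubectl --namespace=production set image deployment/${deployment} ${container}=${image}`
  )
}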

SSL termination (https)

Kubernetes services are not exposed as https by default, but you can easily change this. To do so, read how to expose your applications with TLS in Kubernetes.

Conclusion

To summarize our experience with Kubernetes: we are very satisfied with it.

"Kubernetes helped us to improve our response times and failover + reduced our costs" via @RisingStack #kubernetes

Click To Tweet

We improved our applications' response times in our microservice architecture. We also managed to raise security to the next level with the private network (VPC) between apps.

Also, we reduced our costs and improved failover with the built-in rolling update strategy and the liveness and readiness probes.

If you are in a state where you need to think about your infrastructure's future, you should definitely take Kubernetes into consideration!

If you have questions about migrating to Kubernetes from a PaaS, feel free to post them in the comment section.

React.js Best Practices for 2016

2015 was the year of React with tons of new releases and developer conferences dedicated to the topic all over the world. For a detailed list of the most important milestones of last year, check out our React in 2015 wrap up.

The most interesting question for 2016: How should we write an application and what are the recommended libraries?

As a developer working for a long time with React.js I have my own answers and best practices, but it's possible that you won’t agree on everything with me. I’m interested in your ideas and opinions: please leave a comment so we can discuss them.

If you are just getting started with React.js, check out our React.js tutorial, or the React howto by Pete Hunt.

Dealing with data

Handling data in a React.js application is super easy, but challenging at the same time.
This is because you can pass properties to a React component in a lot of ways to build a rendering tree from them; however, it's not always obvious how you should update your view.

2015 started with the releases of different Flux libraries and continued with more functional and reactive solutions.


Let's see where we are now:

Flux

According to our experience, Flux is often overused (meaning that people use it even when they don't need it).

Flux provides a clean way to store and update your application's state and trigger rendering when it's needed.

Flux can be useful for the app's global state: things like managing the logged in user, the state of a router, or the active account. But it can quickly turn into a pain if you start to manage your temporary or local data with it.

We don’t recommend using Flux for managing route-related data like /items/:itemId. Instead, just fetch it and store it in your component's state. In this case, it will be destroyed when your component goes away.
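A sketch of that pattern, assuming react-router-style params and a hypothetical /api/items endpoint (fetch comes from the browser or isomorphic-fetch):

import React from 'react'

class ItemPage extends React.Component {
  constructor (props) {
    super(props)
    // route-related data lives in component state, not in Flux
    this.state = { item: null }
  }

  componentDidMount () {
    fetch(`/api/items/${this.props.params.itemId}`)
      .then((res) => res.json())
      .then((item) => this.setState({ item }))
  }

  render () {
    if (!this.state.item) {
      return <div>Loading...</div>
    }
    return <div>{this.state.item.name}</div>
  }
}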

If you need more info about Flux, The Evolution of Flux Frameworks is a great read.

Use redux

Redux is a predictable state container for JavaScript apps.

If you think you need Flux or a similar solution you should check out redux and Dan Abramov's Getting started with redux course to quickly boost your development skills.

Redux evolves the ideas of Flux but avoids its complexity by taking cues from Elm.

Keep your state flat

APIs often return nested resources. These can be hard to deal with in a Flux or Redux-based architecture. We recommend flattening them with a library like normalizr and keeping your state as flat as possible.

Hint for pros:

const data = normalize(response, arrayOf(schema.user))

state = _.merge(state, data.entities)  

(we use isomorphic-fetch to communicate with our APIs)

Use immutable states

Shared mutable state is the root of all evil - Pete Hunt, React.js Conf 2015

An immutable object is an object whose state cannot be modified after it is created.

Immutable objects can save us all a headache and improve the rendering performance with their reference-level equality checks. Like in the shouldComponentUpdate:

shouldComponentUpdate(nextProps) {
  // instead of a deep object comparison
  return this.props.immutableFoo !== nextProps.immutableFoo
}

How to achieve immutability in JavaScript?
The hard way is to be careful and write code like the example below, which you should always check in your unit tests with deep-freeze-node (freeze before the mutation and verify the result after it).

return {  
  ...state,
  foo
}

return arr1.concat(arr2)  

Believe me, these were the obvious examples.
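For completeness, here is how such a unit test might look - a sketch assuming deep-freeze-node's default export and a reducer under test:

import deepFreeze from 'deep-freeze-node'
import reducer from './reducer' // the reducer under test (illustrative path)

it('does not mutate the previous state', () => {
  const state = deepFreeze({ items: ['a'] })

  // a mutating reducer would throw here, because `state` is frozen
  const nextState = reducer(state, { type: 'ADD_ITEM', item: 'b' })

  expect(nextState.items).to.be.eql(['a', 'b'])
})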

The less complicated, but also less natural, way is to use Immutable.js.

import { fromJS } from 'immutable'

const state = fromJS({ bar: 'biz' })
const newState = state.set('bar', 'baz')

Immutable.js is fast, and the idea behind it is beautiful. I recommend watching the Immutable Data and React video by Lee Byron even if you don't want to use it. It will give you deep insight into how it works.

Observables and reactive solutions

If you don't like Flux/Redux or just want to be more reactive, don't be disappointed! There are other solutions to deal with your data. Here is a short list of the libraries you are probably looking for:

  • cycle.js ("A functional and reactive JavaScript framework for cleaner code")
  • rx-flux ("The Flux architecture with RxJS")
  • redux-rx ("RxJS utilities for Redux.")
  • mobservable ("Observable data. Reactive functions. Simple code.")

Routing

Almost every client side application has some routing. If you are using React.js in a browser, you will reach the point when you should pick a library.

Our chosen one is the react-router by the excellent rackt community. Rackt always ships quality resources for React.js lovers.

To integrate react-router, check out their documentation. But what's more important here: if you use Flux/Redux, we recommend keeping your router's state in sync with your store/global state.

Synchronized router states will help you to control router behaviors by Flux/Redux actions and read router states and parameters in your components.

Redux users can simply do it with the redux-simple-router library.

Code splitting, lazy loading

Only a few webpack users know that it's possible to split your application's code to separate the bundler's output into multiple JavaScript chunks:

require.ensure([], () => {  
  const Profile = require('./Profile.js')
  this.setState({
    currentComponent: Profile
  })
})

It can be extremely useful in large applications, because the user's browser doesn't have to download rarely used code, like the profile page, after every deploy.

Having more chunks causes more HTTP requests - but that's not a problem with HTTP/2's multiplexing.

Combined with chunk hashing, you can also optimize your cache hit ratio after code changes.

The next version of react-router will help a lot in code splitting.

For the future of react-router check out this blog post by Ryan Florence: Welcome to Future of Web Application Delivery.

Components

A lot of people are complaining about JSX. First of all, you should know that it’s optional in React.

At the end of the day, it will be compiled to JavaScript with Babel. You can write JavaScript instead of JSX, but it feels more natural to use JSX while you are working with HTML.
Especially because even less technical people could still understand and modify the required parts.

JSX is a JavaScript syntax extension that looks similar to XML. You can use a simple JSX syntactic transform with React. - JSX in depth

If you want to read more about JSX check out the JSX Looks Like An Abomination - But it’s Good for You article.

Use Classes

React works well with ES2015 classes.

class HelloMessage extends React.Component {  
  render() {
    return <div>Hello {this.props.name}</div>
  }
}

We prefer higher order components over mixins, so for us, leaving createClass behind was more a syntactical question than a technical one. We believe there is nothing wrong with using createClass over React.Component and vice-versa.

PropType

If you still don't check your properties, you should start 2016 by fixing this. It can save you hours, believe me.

MyComponent.propTypes = {  
  isLoading: PropTypes.bool.isRequired,
  items: ImmutablePropTypes.listOf(
    ImmutablePropTypes.contains({
      name: PropTypes.string.isRequired,
    })
  ).isRequired
}

Yes, it's possible to validate Immutable.js properties as well with react-immutable-proptypes.

Higher order components

Now that mixins are dead and not supported in ES6 Class components we should look for a different approach.

What is a higher order component?

PassData({ foo: 'bar' })(MyComponent)  

Basically, you compose a new component from your original one and extend its behaviour. You can use it in various situations, like authentication: requireAuth({ role: 'admin' })(MyComponent) (check for a user in the higher component and redirect if the user is not logged in), or connecting your component with the Flux/Redux store.
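A sketch of what requireAuth could look like - the user prop and the react-router-style router.push are assumptions:

import React from 'react'

function requireAuth ({ role }) {
  return function wrap (Component) {
    return class Authenticated extends React.Component {
      componentDidMount () {
        const { user, router } = this.props
        if (!user || user.role !== role) {
          // redirect if the user is missing or lacks the required role
          router.push('/login')
        }
      }

      render () {
        return <Component {...this.props} />
      }
    }
  }
}

// usage: wrap any component that only admins should see
const AdminPage = requireAuth({ role: 'admin' })(MyComponent)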

At RisingStack, we also like to separate data fetching and controller-like logic to higher order components and keep our views as simple as possible.

Testing

Testing with good test coverage must be an important part of your development cycle. Luckily, the React.js community came up with excellent libraries to help us achieve this.

Component testing

One of our favorite libraries for component testing is enzyme by AirBnb. With its shallow rendering feature you can test the logic and rendering output of your components, which is pretty amazing. It still cannot replace your selenium tests, but you can step up to a new level of frontend testing with it.

it('simulates click events', () => {  
  const onButtonClick = sinon.spy()
  const wrapper = shallow(
    <Foo onButtonClick={onButtonClick} />
  )
  wrapper.find('button').simulate('click')
  expect(onButtonClick.calledOnce).to.be.true
})

Looks neat, doesn't it?

Do you use chai as your assertion library? You will like chai-enzyme!

Redux testing

Testing a reducer should be easy: it responds to incoming actions and turns the previous state into a new one:

it('should set token', () => {  
  const nextState = reducer(undefined, {
    type: USER_SET_TOKEN,
    token: 'my-token'
  })

  // immutable.js state output
  expect(nextState.toJS()).to.be.eql({
    token: 'my-token'
  })
})

Testing actions is simple until you start to use async ones. For testing async redux actions, we recommend checking out redux-mock-store - it can help a lot.

it('should dispatch action', (done) => {  
  const getState = {}
  const action = { type: 'ADD_TODO' }
  const expectedActions = [action]

  const store = mockStore(getState, expectedActions, done)
  store.dispatch(action)
})

For deeper redux testing visit the official documentation.

Use npm

Although React.js works well without code bundling, we recommend using Webpack or Browserify to have the power of npm. npm is full of quality React.js packages, and it can help to manage your dependencies in a nice way.

(Please don’t forget to reuse your own components, it’s an excellent way to optimize your code.)

Bundle size

This question is not React-related but because most people bundle their React application I think it’s important to mention it here.

While you are bundling your source code, always be aware of your bundle’s file size. To keep it at the minimum you should consider how you require/import your dependencies.

Check the following code snippet; the two different ways can make a huge difference in the output:

import { concat, sortBy, map, sample } from 'lodash'

// vs.
import concat from 'lodash/concat';  
import sortBy from 'lodash/sortBy';  
import map from 'lodash/map';  
import sample from 'lodash/sample';  

Check out the Reduce Your bundle.js File Size By Doing This One Thing for more details.

We also like to split our code into at least vendors.js and app.js, because vendors update less frequently than our code base.
By hashing the output file names (chunkhash in webpack) and caching them for the long term, we can dramatically reduce the size of the code that returning visitors need to download. Combined with lazy loading, you can imagine how optimal it can be.
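A sketch of such a setup along webpack 1 lines, with CommonsChunkPlugin and [chunkhash] in the file names (entries and paths are illustrative):

// webpack.config.js
const path = require('path')
const webpack = require('webpack')

module.exports = {
  entry: {
    app: './src/index.js',
    // rarely changing third-party code goes into its own chunk
    vendors: ['react', 'lodash']
  },
  output: {
    path: path.join(__dirname, 'dist'),
    // [chunkhash] changes only when the chunk's content changes,
    // so unchanged chunks stay cached across deploys
    filename: '[name].[chunkhash].js'
  },
  plugins: [
    new webpack.optimize.CommonsChunkPlugin('vendors', 'vendors.[chunkhash].js')
  ]
}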

If you are new to Webpack, check out this excellent React webpack cookbook.

Component-level hot reload

If you have ever written a single page application with livereload, you probably know how annoying it is when you are working on something stateful and the whole page reloads when you hit save in your editor. You have to click through the application again, and you will go crazy repeating this a lot.

With React, it's possible to reload a component while keeping its states - boom, no more pain!

To setup hot reload check out the react-transform-boilerplate.

Use ES2015

I mentioned that we use JSX in our React.js components, which we transpile with Babel.js.

Babel can do much more, and it also makes it possible to write ES6/ES2015 code for today's browsers. At RisingStack, we use ES2015 features on both the server and the client side, which are available in the latest LTS Node.js version.

Linters

Maybe you already use a style guide for your JavaScript code, but did you know that there are style guides for React as well? We highly recommend picking one and starting to follow it.

At RisingStack, we also run our linters on the CI system and on git push as well. Check out pre-push or pre-commit.

We use JavaScript Standard Style for JavaScript with eslint-plugin-react to lint our React.js code.

(That's right, we do not use semicolons anymore.)

GraphQL and Relay

GraphQL and Relay are relatively new technologies. At RisingStack, we don't use them in production for now - we're just keeping our eyes open.

We wrote a library called graffiti, which is a MongoDB ORM for Relay that makes it possible to create a GraphQL server from your existing mongoose models.
If you would like to learn these new technologies, we recommend checking it out and playing with it.

Takeaway from these React.js Best Practices

Some of the highlighted techniques and libraries are not React.js related at all - always keep your eyes open and check what others in the community are doing. The React community was inspired a lot by the Elm architecture in 2015.

If you know about other essential React.js tools that people should use in 2016, let us know in the comments!

React in 2015 - Retrospection

In detail

2015 was the year of React

React had an amazing year with tons of new releases and developer conferences, thanks to the contributions of the open-source community and enterprise adopters. As a result, React is used by companies like Facebook, Yahoo, Imgur, Mozilla, Airbnb, Netflix, Sberbank and many more. For a more detailed list, check out this collection: Sites Using React

If you are not familiar with React, check out our in-depth tutorial series: The React.js Way

2015 January

2015 February

2015 March

2015 May

2015 June

2015 July

2015 August

2015 October

2015 November

2015 December

Read more about React

If you can't get enough of React head over to our related articles page.

Do you miss anything from the timeline? Let us know in the comments.

GraphQL Overview - Getting Started with GraphQL and Node.js

We've just released Graffiti: it transforms your existing models into a GraphQL schema. Here is how.

ReactEurope happened last week in the beautiful city of Paris. As expected and long-awaited, Facebook released their implementation of the GraphQL draft.

What is GraphQL?

GraphQL is a query language created by Facebook in 2012 which provides a common interface between the client and the server for data fetching and manipulations.

The client asks for various data from the GraphQL server via queries. The response format is described in the query and defined by the client instead of the server: these are called client-specified queries.
The structure of the data is not hardcoded as in traditional REST APIs - this makes retrieving data from the server more efficient for the client.

For example, the client can ask for linked resources without defining new API endpoints. With the following GraphQL query, we can ask for user-specific fields and the linked friends resource as well.

{
  user(id: 1) {
    name
    age
    friends {
      name
    }
  }
}

In a resource based REST API it would look something like:

GET /users/1 and GET /users/1/friends  


or

GET /users/1?include=friends.name  

GraphQL overview

It's important to mention that GraphQL is not language specific - it's just a specification between the client and the server. Any client should be able to communicate with any server if they speak the common language: GraphQL.

Key concepts of the GraphQL query language are:

  • Hierarchical
  • Product‐centric
  • Strong‐typing
  • Client‐specified queries
  • Introspective

I would like to highlight strong-typing here, which means that GraphQL introduces an application-level type system. It's a contract between the client and the server, so your server may use different internal types in the background. The only thing that matters here is that the GraphQL server must be able to receive GraphQL queries, decide whether they are syntactically correct, and provide the described data for them.

For more details on the concept of GraphQL check out the GraphQL specification.

Where is it useful?

GraphQL helps where your client needs a flexible response format to avoid extra queries and/or massive data transformation with the overhead of keeping them in sync.

Using a GraphQL server makes it very easy for a client side developer to change the response format without any change on the backend.

With GraphQL, you can describe the required data in a more natural way. It can speed up development, because in application structures like top-down rendering in React, the required data is more similar to your component structure.

Check out our previous query and how similar it is to the following component structure:

<App>  
  <User>
    <Friend/>
    <Friend/>
  </User>
</App>  

Differences with REST

REST APIs are resource based. Basically what you do is that you address your resources like GET /users/1/friends, which is a unique path for them. It tells you very well that you are looking for the friends of the user with id=1.

The advantages of REST APIs are that they are cacheable, and their behaviour is obvious.

The disadvantage is that it's hard to specify and implement advanced requests with includes, excludes and especially with linked resources. I think you have already seen requests like:
GET /users/1/friends/1/dogs/1?include=user.name,dog.age

This is exactly the problem that GraphQL wants to solve. If you have user and dog types and their relations are defined, you can write any kind of query to get your data.

You will have the following queries out of the box:

  • get name of the user with id=1
{
 user(id: 1) {
   name
 }
}
  • get names for friends of the user with id=1
{
 user(id: 1) {
   friends {
     name
   }
 }
}
  • get age and friends of the user with id=1
{
 user(id: 1) {
   age
   friends {
     name
   }
 }
}
  • get names of the dogs of the friends of the user with id=1 :)
{
 user(id: 1) {
   friends {
     dogs {
       name
     }
   }
 }
}

Simple right? Implement once, re-use it as much as possible.

GraphQL queries

You can do two types of operations with GraphQL:

  • queries, when you fetch (read) data from your server, and
  • mutations, when you manipulate (create, update, delete) your data

GraphQL queries look like JSON objects without values:

// a json object
{
  "user": "name"
}
// a graphql query
{
  user {
    name
  }
}

I already showed some queries for getting data from the GraphQL server, but what else can we do?

We can write named queries:

{
  findUser(id: 1)
}

you can pass parameters to your query:

query findUser($userId: String!) {  
  findUser(id: $userId) {
    name
  }
}

With the combination of these building blocks and with the static typing we can write powerful client specified queries. So far so good, but how can we modify our data? Let's see the next chapter for mutations.

GraphQL mutations

With GraphQL mutations, you can manipulate data:

mutation updateUser($userId: String! $name: String!) {  
  updateUser(id: $userId name: $name) {
    name
  }
}

With this, you can manipulate your data and retrieve the response in the required format at the same time - pretty powerful, isn't it?

The recommendation here is to give your mutations meaningful names to avoid future inconsistencies. I recommend using names like createUser, updateUser or removeUser.

GraphQL through HTTP

You can send GraphQL queries through HTTP:

  • GET for querying
  • POST for mutation

Caching GraphQL requests

Caching can work the same way with GET queries, as you would do it with a classic HTTP API. The only exception is when having a very complex query - in that case you may want to send that as a POST and use caching on a database/intermediary level.

Other Transport layers

HTTP is just one option - GraphQL is transport independent, so you can use it with websockets or even mqtt.

GraphQL example with Node.js server

The Facebook engineering team open-sourced a GraphQL reference implementation in JavaScript. I recommend checking their implementation to have a better picture about the possibilities of GraphQL.

They started with the JavaScript implementation and also published an npm library to make GraphQL generally available. We can start playing with it and build a simple GraphQL Node.js server with MongoDB. Are you in? ;)

The GraphQL JS library provides a resolve function for the schemas:

user: {  
  type: userType,
  args: {
    id: {
      name: 'id',
      type: new GraphQLNonNull(GraphQLString)
    }
  },
  resolve: (root, {id}) => {
    return User.findById(id);
  }
}

The only thing we have to do here is to provide the data for the specific resolve functions. These functions are called by GraphQL JS in parallel.

We can generate a projection for our MongoDB query in the following way:

// builds a MongoDB projection (e.g. { name: 1, age: 1 }) from the fields
// requested in the query AST
function getProjection (fieldASTs) {
  return fieldASTs.selectionSet.selections.reduce((projections, selection) => {
    projections[selection.name.value] = 1;

    return projections;
  }, {});
}

and use it like:

resolve: (root, {id}, source, fieldASTs) => {
  // fetch only the fields the client asked for
  var projections = getProjection(fieldASTs);
  return User.findById(id, projections);
}

This helps to optimize the amount of data fetched from our database.
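
To see how the pieces fit together, here is a minimal, self-contained sketch with the graphql npm package; the schema and the hard-coded data are illustrative, not the actual server linked below:

var graphql = require('graphql');

// an illustrative user type with a single field
var userType = new graphql.GraphQLObjectType({
  name: 'User',
  fields: {
    name: { type: graphql.GraphQLString }
  }
});

var schema = new graphql.GraphQLSchema({
  query: new graphql.GraphQLObjectType({
    name: 'Query',
    fields: {
      user: {
        type: userType,
        args: {
          id: { type: new graphql.GraphQLNonNull(graphql.GraphQLString) }
        },
        resolve: (root, {id}) => {
          // a real implementation would call e.g. User.findById(id)
          return { name: 'John' };
        }
      }
    }
  })
});

// execute a query against the schema
graphql.graphql(schema, '{ user(id: "1") { name } }')
  .then((result) => console.log(result.data)); // { user: { name: 'John' } }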

Check out the Node.js implementation with MongoDB for more details: https://github.com/RisingStack/graphql-server

Take a look at Graffiti: it transforms your existing models into a GraphQL schema.

The React.js Way: Flux Architecture with Immutable.js

This article is the second part of the "The React.js Way" blog series. If you are not familiar with the basics, I strongly recommend you to read the first article: The React.js Way: Getting Started Tutorial.

In the previous article, we discussed the concept of the virtual DOM and how to think in the component way. Now it's time to combine them into an application and figure out how these components should communicate with each other.

Components as functions

The really cool thing about a single component is that you can think about it like a function in JavaScript. When you call a function with parameters, it returns a value. Something similar happens with a React.js component: you pass properties, and it returns the rendered DOM. If you pass different data, you will get different responses. This makes components extremely reusable and handy to combine into an application. This idea comes from functional programming, which is beyond the scope of this article. If you are interested, I highly recommend reading Mikael Brevik's Functional UI and Components as Higher Order Functions blog post to get a deeper understanding of the topic.
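
A quick illustrative sketch of the analogy (not from the original post):

// a function maps arguments to a return value...
function greeting(name) {
  return 'Hello ' + name;
}

// ...and a component maps properties to rendered DOM the same way:
// same props in, same output out
class Greeting extends React.Component {
  render() {
    return <div>Hello {this.props.name}</div>;
  }
}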

Top-down rendering

Ok, it's cool that we can combine our components easily to form an app, but it doesn't make any sense without data. We discussed last time that with React.js, your app's structure is a hierarchy with a root node where you can pass the data as a parameter and see how your app responds to it through the components. You pass the data at the top, and it goes down from component to component: this is called top-down rendering.

React.js component hierarchy

It's great that we pass the data at the top and it goes down via the components' properties, but how can we notify a component at a higher level in the hierarchy if something should change? For example, when the user presses a button?
We need something that stores the actual state of our application, something that we can notify when the state should change. The new state should be passed to the root node, and the top-down rendering should kick in again to generate (re-render) the new output (DOM) of our application. This is where Flux comes into the picture.

Flux architecture

You may have already heard about Flux architecture and the concept of it.
I’m not going to give a very detailed overview about Flux in this article; I've already done it earlier in the Flux inspired libraries with React post.

Application architecture for building user interfaces - Facebook flux

A quick reminder: Flux is a unidirectional data flow concept where you have a Store which contains the actual state of your application as pure data. It can emit events when it changes and let your application's components know what should be re-rendered. It also has a Dispatcher, which is a centralized hub and creates a bridge between your app and the Store. It has actions that you can call from your app, and it emits events for the Store. The Store is subscribed to those events and changes its internal state when necessary. Easy, right? ;)

Flux architecture

PureRenderMixin

Where are we with our current application? We have a data store that contains the actual state. We can communicate with this store and pass data to our app, which responds to the incoming state with the rendered DOM. It's really cool, but sounds like lots of rendering (it is). Remember the component hierarchy and top-down rendering - everything responds to the new data.

I mentioned earlier that the virtual DOM optimizes DOM manipulations nicely, but it doesn't mean that we shouldn't help it and minimize its workload. For this, we have to tell the component whether it should be re-rendered for the incoming properties or not, based on the new and the current properties. In the React.js lifecycle, you can do this with shouldComponentUpdate.

React.js luckily has a mixin called PureRenderMixin, which compares the new incoming properties with the previous ones and stops rendering when they are the same. It uses the shouldComponentUpdate method internally.
That's nice, but PureRenderMixin can't compare objects properly. It checks reference equality (===), which will be false for different objects with the same data:

boolean shouldComponentUpdate(object nextProps, object nextState)

If shouldComponentUpdate returns false, then render() will be skipped until the next state change. (In addition, componentWillUpdate and componentDidUpdate will not be called.)

var a = { foo: 'bar' };  
var b = { foo: 'bar' };

a === b; // false  

The problem here is that the components will be re-rendered for the same data if we pass it as a new object (because of the different object reference). But mutating the original object won't fly either, because then the references stay equal even though the data changed:

var a = { foo: 'bar' };  
var b = a;  
b.foo = 'baz';  
a === b; // true  

Sure, it wouldn't be hard to write a mixin that does deep object comparison instead of reference checking, but React.js calls shouldComponentUpdate frequently, and deep comparison is expensive: you should avoid it.

I recommend checking out Facebook's Advanced Performance with React.js article.

Immutability

The problem escalates quickly if our application state is a single, big, nested object, like our Flux store.
We would like to keep the object reference the same when the object doesn't change, and get a new object when it does. This is exactly what Immutable.js does.

Immutable data cannot be changed once created, leading to much simpler application development, no defensive copying, and enabling advanced memoization and change detection techniques with simple logic.

Check the following code snippet:

var stateV1 = Immutable.fromJS({  
  users: [
    { name: 'Foo' },
    { name: 'Bar' }
  ]
});

var stateV2 = stateV1.updateIn(['users', 1], function () {  
  return Immutable.fromJS({
    name: 'Barbar'
  });
});

stateV1 === stateV2; // false  
stateV1.getIn(['users', 0]) === stateV2.getIn(['users', 0]); // true  
stateV1.getIn(['users', 1]) === stateV2.getIn(['users', 1]); // false  

As you can see, we can use === to compare our objects by reference, which means we have a super fast way for object comparison, and it's compatible with React's PureRenderMixin. Accordingly, we should write our entire application with Immutable.js: our Flux Store should be an immutable object, and we should pass immutable data as properties to our components.
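
For example, with immutable properties an update check can be a simple reference comparison - a sketch of what PureRenderMixin does shallowly for every prop:

// assuming this.props.users is an Immutable.js structure
shouldComponentUpdate(nextProps) {
  // re-render only when the users structure was actually updated
  return this.props.users !== nextProps.users;
}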

Now let's go back to the previous code snippet for a second and imagine that our application component hierarchy looks like this:

User state

You can see that only the red ones will be re-rendered after the state change, because the others have the same reference as before. It means the root component and one of the users will be re-rendered.

With immutability, we optimized the rendering path and supercharged our app. Combined with the virtual DOM, it makes the "React.js way" a blazing fast application architecture.

Learn more about how persistent immutable data structures work and watch the Immutable Data and React talk from the React.js Conf 2015.

Check out the example repository with ES6, Flux architecture, and Immutable.js:
https://github.com/RisingStack/react-way-immutable-flux

The React.js Way: Getting Started Tutorial

Update: the second part is out! Learn more about the React.js way in the second part of the series: Flux Architecture with Immutable.js.

Now that the popularity of React.js is growing blazing fast and lots of interesting stuff is coming out, my friends and colleagues started asking me how they can get started with React and how they should think in the React way.

React.js Tutorial Google Trends (Google search trends for React in programming category, Initial public release: v0.3.0, May 29, 2013)

However, React is not a framework; there are concepts, libraries and principles that turn it into a fast, compact and beautiful way to program your app on the client and server side as well.

In this two-part React.js tutorial blog series, I am going to explain these concepts and give recommendations on what to use and how. We will cover ideas and technologies like:

  • ES6 React
  • virtual DOM
  • Component-driven development
  • Immutability
  • Top-down rendering
  • Rendering path and optimization
  • Common tools/libs for bundling, ES6, request making, debugging, routing, etc.
  • Isomorphic React

And yes, we will write code. I would like to make it as practical as possible.
All the snippets and post-related code are available in the RisingStack GitHub repository.

This article is the first of the two. Let's jump in!

Repository:
https://github.com/risingstack/react-way-getting-started

1. Getting Started with the React.js Tutorial

If you are already familiar with React and you understand the basics, like the concept of the virtual DOM and thinking in components, then this React.js tutorial is probably not for you. We will discuss intermediate topics in the upcoming parts of this series. It will be fun; I recommend checking back later.

Is React a framework?

In a nutshell: no, it's not.
Then what the hell is it, and why is everybody so keen on using it?

React is the "View" in the application, a fast one. It also provides different ways to organize your templates and gets you to think in components. In a React application, you should break down your site, page or feature into smaller pieces of components. It means that your site will be built from a combination of different components. These components are built on top of other components, and so on. When a problem becomes challenging, you can break it down into smaller ones and solve it there. You can also reuse components somewhere else later. Think of them like Lego bricks. We will discuss component-driven development more deeply later in this article.

React also has this virtual DOM thing, which makes rendering super fast while keeping it easily understandable and controllable at the same time. You can combine this with the idea of components and have the power of top-down rendering. We will cover this topic in the second article.

Ok, I admit, I still didn't answer the question. We have components and fast rendering - but why is it a game changer? Because React is mainly a concept, and only secondly a library.
There are already several libraries following these ideas - doing it faster or slower - but slightly differently. Like every programming concept, React has its own solutions, tools and libraries turning it into an ecosystem. In this ecosystem, you have to pick your own tools and build your own ~framework. I know it sounds scary, but believe me, you already know most of these tools; we will just connect them to each other, and later you will be very surprised how easy it is. For example, for dependencies we won't use any magic, rather Node's require and npm. For pub-sub, we will use Node's EventEmitter, and so on.

(Facebook announced Relay their framework for React at the React.js Conf in January 2015. But it's not available yet. The date of the first public release is unknown.)

Are you excited already? Let's dig in!

The Virtual DOM concept in a nutshell

To track down model changes and apply them to the DOM (alias rendering), we have to be aware of two important things:

  1. when data has changed,
  2. which DOM element(s) should be updated.

For change detection (1), React uses an observer model instead of dirty checking (continuously checking the model for changes). That's why it doesn't have to calculate what has changed - it knows immediately. This reduces the calculations and makes the app smoother. But the really cool idea here is how it manages the DOM manipulations:

For the DOM changing challenge (2), React builds a tree representation of the DOM in memory and calculates which DOM elements should change. DOM manipulation is heavy, and we would like to keep it to a minimum. Luckily, React tries to keep as many DOM elements untouched as possible. Since the necessary DOM manipulations can be calculated faster on the object representation, the cost of the actual DOM changes is reduced nicely.


Since React's diffing algorithm uses the tree representation of the DOM and re-calculates whole subtrees when their parent gets modified (marked as dirty), you should be aware of your model changes, because the whole subtree will be re-rendered then.
Don't be sad; later we will optimize this behavior together. (Spoiler: with shouldComponentUpdate() and Immutable.js.)

React.js Tutorial React re-render (source: React’s diffing algorithm - Christopher Chedeau)

How to render on the server too?

Given that this DOM representation uses a fake, in-memory DOM, it's possible to render the HTML output on the server side as well (without JSDom, PhantomJS etc.). React is also smart enough to recognize that the markup is already there (from the server) and will only add the event handlers on the client side.
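
A minimal sketch of the idea with the React API of this era (the Express app and the App component are illustrative):

var React = require('react');
var express = require('express');
var App = require('./src/app'); // a hypothetical root component

var app = express();

app.get('/', function (req, res) {
  // render the component tree to an HTML string - no browser DOM needed
  var markup = React.renderToString(React.createElement(App, {
    users: [{ name: 'John' }]
  }));
  res.send('<div id="app">' + markup + '</div>' +
    '<script src="bundle.js"></script>');
});

app.listen(3000);

On the client, rendering the same component over this markup only attaches the event handlers.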

Interesting: React's rendered HTML markup contains data-reactid attributes, which help React track DOM nodes.

Useful links, other virtual DOM libraries

Component-driven development

It was one of the most difficult parts for me to pick up when I was learning React.
In the component-driven development, you won't see the whole site in one template.
In the beginning you will probably think that it sucks. But I'm pretty sure that later you will recognize the power of thinking in smaller pieces and work with less responsibility. It makes things easier to understand, to maintain and to cover with tests.

How should I imagine it?

Check out the picture below. This is a possible component breakdown of a feature/site. Each of the bordered areas with different colors represents a single type of component. According to this, you have the following component hierarchy:

  • FilterableProductTable

What should a component contain?

First of all, it's wise to follow the single responsibility principle and, ideally, design your components to be responsible for only one thing. When your component starts to feel like it's doing too much, you should consider breaking it down into smaller ones.

Since we are talking about component hierarchy, your components will use other components as well. But let's see the code of a simple component in ES5:

var HelloComponent = React.createClass({  
    render: function() {
        return <div>Hello {this.props.name}</div>;
    }
});

But from now on, we will use ES6. ;)
Let’s check out the same component in ES6:

class HelloComponent extends React.Component {  
  render() {
    return <div>Hello {this.props.name}</div>;
  }
}

JS, JSX

As you can see, our component is a mix of JS and HTML code. Wait, what? HTML in my JavaScript? Yes, probably you think it's strange, but the idea here is to have everything in one place. Remember, single responsibility. It makes a component extremely flexible and reusable.

In React, it's possible to write your component in pure JS like:

  render () {
    return React.createElement("div", null, "Hello ",
        this.props.name);
  }

But I think it's not very comfortable to write your HTML this way. Luckily, we can write it in the JSX syntax (a JavaScript extension) which lets us write HTML inline:

  render () {
    return <div>Hello {this.props.name}</div>;
  }

What is JSX?
JSX is an XML-like syntax extension to ECMAScript. JSX and HTML syntax are similar, but they differ at some points. For example, the HTML class attribute is called className in JSX. For more differences and deeper knowledge, check out Facebook's HTML Tags vs. React Components guide.
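
For example, attaching a CSS class to our component's markup looks like this in JSX:

  render () {
    // class is a reserved word in JavaScript, so JSX uses className
    return <div className="greeting">Hello {this.props.name}</div>;
  }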

Because JSX is not supported in browsers by default (maybe someday), we have to compile it to JS. I'll write about how to use JSX in the Setup section later. (By the way, Babel can also transpile JSX to JS.)

Useful links about JSX:
- JSX in depth
- Online JSX compiler
- Babel: How to use the react transformer.

What else can we add?

Each component can have an internal state, some logic, event handlers (for example: button clicks, form input changes), and it can also have inline style. Basically, everything that is needed for proper displaying.

You can see {this.props.name} in the code snippet. It means we can pass properties to our components when we are building our component hierarchy, like: <MyComponent name="John Doe" />
It makes the component reusable and makes it possible to pass our application state from the root component down to the child components, through the whole application, always passing just the necessary part of the data.

Check this simple React app snippet below:

class UserName extends React.Component {  
  render() {
    return <div>name: {this.props.name}</div>;
  }
}

class User extends React.Component {  
  render() {
    return <div>
        <h1>City: {this.props.user.city}</h1>
        <UserName name={this.props.user.name} />
      </div>;
  }
}

var user = { name: 'John', city: 'San Francisco' };  
React.render(<User user={user} />, mountNode);

Useful links for building components:
- Thinking in React

React loves ES6

ES6 is here and there is no better place for trying it out than your new shiny React project.

React wasn't born with ES6 syntax; the support came in January this year, in version v0.13.0.

However, explaining ES6 deeply is not in the scope of this article; we will use some of its features, like classes, arrows, consts and modules. For example, we will inherit our components from the React.Component class.

Given that ES6 is only partly supported by browsers, we will write our code in ES6 but transpile it to ES5 later, making it work in every modern browser, even those without ES6 support.
To achieve this, we will use the Babel transpiler. It has a nice, compact intro to the supported ES6 features; I recommend checking it out: Learn ES6

Useful links about ES6
- Babel: Learn ES6
- React ES6 announcement

Bundling with Webpack and Babel

I mentioned earlier that we will involve tools you are already familiar with and build our application from a combination of those. The first tool, which might be well known, is Node.js's module system and its package manager, npm. We will write our code in the "node style" and require everything we need. React is available as a single npm package.
This way, our component will look like this:

// would be in ES5: var React = require('react/addons');
import React from 'react/addons';

class MyComponent extends React.Component { ... }

// would be in ES5: module.exports = MyComponent;
export default MyComponent;  

We are going to use other npm packages as well. Most npm packages make sense on the client side too; for example, we will use debug for debugging and superagent for composing requests.
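
A quick, hypothetical sketch of those two packages in use on the client:

// hypothetical usage of debug and superagent
import debugFactory from 'debug';
import request from 'superagent';

const debug = debugFactory('app:client');

request.get('/api/users')
  .end((err, res) => {
    if (err) return debug('request failed: %s', err.message);
    debug('fetched %d users', res.body.length);
  });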

Now we have a dependency system from Node (more accurately, ES6 modules) and we have a solution for almost everything via npm. What's next? We should pick our favorite libraries for our problems and bundle them up for the client as a single codebase. To achieve this, we need a solution for making them run in the browser.

This is the point where we should pick a bundler. Two of the most popular solutions today are the Browserify and Webpack projects. We are going to use Webpack, because my experience is that the React community prefers it. However, I'm pretty sure that you can do the same with Browserify as well.

How does it work?

Webpack bundles our code and the required packages into the output file(s) for the browser. Since we are using JSX and ES6, which we would like to transpile to ES5, we have to place a JSX and ES6-to-ES5 transpiler into this flow as well. Actually, Babel can do both for us. Let's just use that!

We can do that easily, because Webpack is configuration-oriented.

What do we need for this? First, we need to install the necessary modules (start with npm init if you don't have a package.json file yet).

Run the following commands in your terminal (Node.js or io.js and npm are necessary for this step):

npm install --save-dev webpack  
npm install --save-dev babel  
npm install --save-dev babel-loader  

After that, we create the webpack.config.js file for Webpack (it's ES5, since we don't have an ES6 transpiler for the Webpack configuration file itself):

var path = require('path');

module.exports = {
  // the entry point of the application
  entry: path.resolve(__dirname, '../src/client/scripts/client.js'),
  output: {
    // the bundled file for the browser
    path: path.resolve(__dirname, '../dist'),
    filename: 'bundle.js'
  },

  module: {
    loaders: [
      {
        // run every .js file under src/ through Babel (ES6 and JSX to ES5)
        test: /src\/.+\.js$/,
        exclude: /node_modules/,
        loader: 'babel'
      }
    ]
  }
};

If we did it right, our application starts at ./src/client/scripts/client.js and is bundled to ./dist/bundle.js when you run the webpack command.

After that, you can just include the bundle.js script into your index.html and it should work:
<script src="bundle.js"></script>

(Hint: you can serve your site with node-static. Install the module with npm install -g node-static, and start it with static . to serve your folder's content at 127.0.0.1:8080.)

Project setup

Now we have installed and configured Webpack and Babel properly.
As in every project, we need a project structure.

Folder structure

I prefer to follow the project structure below:

config/  
    app.js
    webpack.js (js config over json -> flexible)
src/  
  app/ (the React app: runs on server and client too)
    components/
      __tests__ (Jest test folder)
      AppRoot.jsx
      Cart.jsx
      Item.jsx
    index.js (just to export app)
    app.js
  client/  (only browser: attach app to DOM)
    styles/
    scripts/
      client.js
    index.html
  server/
    index.js
    server.js
.gitignore
.jshintrc
package.json  
README.md  

The idea behind this structure is to separate the React app from the client and server code, since our React app can run on both the client and the server side (an isomorphic app - we will dive deep into this in an upcoming blog post).

How to test my React app

When we are moving to a new technology, one of the most important questions should be testability. Without good test coverage, you are playing with fire.

Ok, but which testing framework to use?
My experience is that testing a front-end solution always works best with a test framework made by the same creators. Accordingly, I started to test my React apps with Jest. Jest is a test framework by Facebook and has many great features that I won't cover in this article.

I think it's more important to talk about the way of testing a React app. Luckily, the single responsibility principle forces our components to do only one thing, so we should test only that thing. Pass the properties to the component, trigger the possible events, and check the rendered output. Sounds easy, because it is.

For a more practical example, I recommend checking out the Jest React.js tutorial.

Test JSX and ES6 files

To test our ES6 syntax and JSX files, we have to transform them for Jest. Jest has a config variable (scriptPreprocessor) where you can define a preprocessor for that.
First we should create the preprocessor, and after that pass its path to Jest. You can find a working example of a Babel Jest preprocessor in our repository.
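
A minimal sketch of such a preprocessor (our repository has the working version; this one assumes the Babel 5.x API installed above):

// preprocessor.js
var babel = require('babel');

module.exports = {
  process: function (src, filename) {
    // transpile our ES6/JSX sources, leave node_modules untouched
    if (filename.indexOf('node_modules') === -1) {
      return babel.transform(src, { filename: filename }).code;
    }
    return src;
  }
};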

Jest also has an example for React ES6 testing.

(The Jest config goes into the package.json.)
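
For example (a sketch, assuming the preprocessor file above sits in the project root):

"jest": {
  "scriptPreprocessor": "<rootDir>/preprocessor.js"
}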

Takeaway

In this article, we examined why React is fast and scalable, and how different its approach is. We went through how React handles rendering, what component-driven development is, and how to set up and organize your project. These are the very basics.

In the upcoming "The React way" articles we are going to dig deeper.

I still believe that the best way to learn a new programming approach is to start developing and writing code.
That's why I would like to ask you to write something awesome, and also to spend some time checking out the official React website, especially the guides section. It's an excellent resource; the Facebook developers and the React community did an awesome job with it.

Next up

If you liked this article, subscribe to our newsletter for more. The remaining parts of The React way post series are coming soon. We will cover topics like:

  • immutability
  • top-down rendering
  • Flux
  • isomorphic way (common app on client and server)

Feel free to check out the repository:
https://github.com/RisingStack/react-way-getting-started


Flux inspired libraries with React

There are lots of Flux or Flux-inspired libraries out there: they try to solve different kinds of problems, but which one should you use? This article tries to give an overview of the different approaches.

What is Flux? (the original)

An application architecture for React utilizing a unidirectional data flow. - flux

Ok, but why?

Flux tries to avoid the complex cross-dependencies between your modules (like in MVC) and realize a simple, one-way data flow. This helps you to write scalable applications and avoid side effects in your application.

Read more about this and about the key properties of Flux architecture at Fluxxor's documentation.

Original flux

Facebook's original Flux has four main components:
a singleton Dispatcher, Stores, Actions and Views (or controller-views)

Dispatcher

From the Flux overview:

The dispatcher is the central hub that manages all data flow in a Flux application.

In details:

It is essentially a registry of callbacks into the stores.
Each store registers itself and provides a callback. When the dispatcher responds to an action, all stores in the application are sent the data payload provided by the action via the callbacks in the registry.

Actions

Actions can have a type and a payload. They can be triggered by the Views or by the Server (an external source). Actions trigger Store updates.

Facts about Actions:

  • Actions should be descriptive:

    The action (and the component generating the action) doesn't know how to perform the update, but describes what it wants to happen. - Semantic Actions

  • But an Action shouldn't perform another Action: No Cascading Actions

  • About Action dispatches

    Action dispatches and their handlers inside the stores are synchronous. All asynchronous operations should trigger an action dispatch that tells the system about the result of the operation - Enforced Synchrony

Later you will see that Actions can be implemented and used in different ways.
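
A minimal sketch with Facebook's flux npm package to make this concrete (the action type and payload are illustrative):

var Dispatcher = require('flux').Dispatcher;
var dispatcher = new Dispatcher();

// a store registers a callback and handles only the actions it cares about
dispatcher.register(function (action) {
  if (action.type === 'createUser') {
    // update the store's internal state here, then emit a change event
  }
});

// an action is descriptive: it says what happened, not how to update
dispatcher.dispatch({ type: 'createUser', payload: { name: 'John' } });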

Stores

Stores contain the application state and logic.

Every Store receives every action from the Dispatcher, but a single store handles only the specified events. For example, the User store handles only user-specific actions like createUser and ignores the other actions.

After the Store has handled the Action and updated itself, it broadcasts a change event. This event will be received by the Views.

A Store shouldn't be updated externally; the update of the Store should be triggered internally as a response to an Action: Inversion of Control.

Views

Views are subscribed to one or multiple Stores and handle the store change event.
When a store change event is received, the View gets the data from the Store via the Store's getter functions. Then the View renders with the new data.

Steps:
1. Store change event received
2. Get data from the Store via getters
3. Render view
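
A sketch of these steps in a React controller-view (UserStore is hypothetical, with an EventEmitter-style interface and a getter):

var UserList = React.createClass({
  getInitialState: function () {
    return { users: UserStore.getUsers() };
  },
  componentDidMount: function () {
    // 1. subscribe to the store's change event
    UserStore.on('change', this.handleChange);
  },
  componentWillUnmount: function () {
    UserStore.removeListener('change', this.handleChange);
  },
  handleChange: function () {
    // 2. get the data via getters, 3. setState triggers the re-render
    this.setState({ users: UserStore.getUsers() });
  },
  render: function () {
    return <ul>{this.state.users.map(function (user) {
      return <li key={user.id}>{user.name}</li>;
    })}</ul>;
  }
});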

FB Flux

You can find several Flux implementations on GitHub; the most popular libraries are the following:

Beyond Flux

Lots of people think that Flux could be more reactive, and I can only agree with them.
Flux is a unidirectional data flow, which is very similar to event streams.

Now let's see some other ways to have something Flux-like while being functional reactive at the same time.

Reflux

Reflux has refactored Flux to be a bit more dynamic and be more Functional Reactive Programming (FRP) friendly - refluxjs

Reflux is a more reactive Flux implementation by @spoike because he found the original one confusing and broken at some points: Deconstructing ReactJS's Flux

The biggest difference between Flux and Reflux is that there is no centralized dispatcher.

Actions are functions that can pass a payload when called. Actions are listenable, and Stores can subscribe to them. In Reflux, every action acts as a dispatcher.

Reflux

Reflux provides mixins for React to listen to store changes easily.
It supports both async and sync actions. It's also easy to handle async errors with Reflux.
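
A minimal sketch of the Reflux pattern (the action and store names are illustrative):

var Reflux = require('reflux');

// actions are listenable functions - no central dispatcher
var actions = Reflux.createActions(['createUser']);

var userStore = Reflux.createStore({
  init: function () {
    // the store subscribes to the action directly
    this.listenTo(actions.createUser, this.onCreateUser);
  },
  onCreateUser: function (name) {
    // update internal state, then notify the subscribed components
    this.trigger({ name: name });
  }
});

// calling the action dispatches it
actions.createUser('John');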

In Reflux, stores can listen to other stores serially or in parallel, which sounds useful, but it increases the cross-dependencies between your stores. I'm afraid you can easily find yourself in the middle of a circular dependency.

A problem arises if we create circular dependencies. If Store A waits for Store B, and B waits for A, then we'll have a very bad situation on our hands. - flux

Update

There is a circular dependency check for some cases in reflux implemented and is usually not an issue as long as you think of data flows with actions as initiators of data flows and stores as transformations. - @spoike

rx-flux

The Flux architecture allows you to think your application as an unidirectional flow of data, this module aims to facilitate the use of RxJS Observable as basis for defining the relations between the different entities composing your application. - rx-flux

rx-flux is a newcomer and uses RxJS, the reactive extension, to implement a unidirectional data flow.

rx-flux is more similar to Reflux than to the original Flux (from the readme):

  • A store is an RxJS Observable that holds a value
  • An action is a function and an RxJS Observable
  • A store subscribes to an action and updates its value accordingly.
  • There is no central dispatcher.

When the Stores and Actions are RxJS Observables, you can use the power of Rx to handle your application's business logic in a Functional Reactive way, which can be extremely useful in asynchronous situations.
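
A sketch of the underlying idea in plain RxJS 4-style code (not the rx-flux API itself; operator signatures vary between Rx versions):

var Rx = require('rx');

// an action is something you can call and something you can observe:
// a Subject fits both roles
var addUser = new Rx.Subject();

// a store is an observable holding the accumulated state
var users = addUser
  .scan(function (state, user) { return state.concat(user); }, [])
  .startWith([]);

users.subscribe(function (state) {
  // re-render the views with the new state here
  console.log('users:', state);
});

addUser.onNext({ name: 'John' });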

If you don't like Rx, there are similar projects with Bacon.js like fluxstream or react-bacon-flux-poc.

If you like the concept of FRP, I recommend reading @niklasvh's article about how he combined Immutable.js and Bacon.js to get a more reactive Flux implementation: Flux inspired reactive data flow using React and Bacon.js
niklasvh's example code for lazy people: flux-todomvc

Omniscient

Omniscient is a really different approach compared to Flux. It uses the power of Facebook's Immutable.js to speed up rendering. It renders only when the data has really changed. This kind of optimized calling of the (React) render function can help us build performant web applications.

Rendering is already optimized in React with the concept of the virtual DOM, but React still checks for DOM diffs, which is also computation-heavy. With Omniscient, you can reduce the React calls and avoid the diff calculations.

What? / An example:
Imagine the following scenario: the user's name is changed. What will happen in Flux, and what in Omniscient?
In Flux, every user-related view component will be re-rendered, because they are all subscribed to the user Store, which broadcasts a change event.
In Omniscient, only the components that use the user's name cursor will be re-rendered.
Of course, it's possible to divide Flux into multiple Stores, but in most cases it doesn't make any sense to store the name in a separate store.

Omniscient is for React, but actually it's just a helper for React, and the real power comes from Immstruct, which can be used without Omniscient, together with other libraries like virtual-dom.

What Omniscient does may not be trivial at first. I think this todo example helps the most.

You can find a more complex demo here: demo-reactions

It would be interesting to hear which companies are using Omniscient in production.
If you do, I would love to hear from you!

Further reading

The State of Flux
Flux inspired reactive data flow using React and Bacon.js
Deconstructing ReactJS's Flux
React + RxJS + Angular 2.0's di.js TodoMVC Example by @joelhooks