Graceful shutdown with Node.js and Kubernetes


This article helps you understand what graceful shutdown is, what its main benefits are, and how you can set up the graceful shutdown of a Kubernetes application. We’ll discuss how you can validate and benchmark this process, and which common mistakes you should avoid.

Graceful shutdown

We can speak about the graceful shutdown of our application when all of the resources it used and all of the traffic and/or data processing it handled are closed and released properly.

It means that no database connection remains open and no ongoing request fails because we stop our application.

Possible scenarios for a graceful web server shutdown:

  1. App gets notification to stop (received SIGTERM)
  2. App lets the load balancer know that it’s not ready for newer requests
  3. App served all the ongoing requests
  4. App releases all of the resources correctly: DB, queue, etc.
  5. App exits with “success” status code (process.exit())

This article goes deep on shutting down web servers properly, but you should also apply these techniques to your worker processes: it’s highly recommended to stop consuming queues on SIGTERM and to finish the current task/job.
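For a worker, that advice could be sketched roughly like this. The queue is simulated with a plain array and the job duration is made up, so treat it as an illustration of the shutdown order, not a real queue client:

```javascript
'use strict'

// Sketch: on SIGTERM stop consuming new jobs, finish the one in
// flight, then exit. The queue is simulated with an array.
let stopping = false
let inFlight = null

function handleJob (job) {
  // pretend each job takes 50ms of work
  inFlight = new Promise((resolve) => setTimeout(() => {
    console.log('finished job:', job)
    inFlight = null
    resolve()
  }, 50))
  return inFlight
}

function consume (queue) {
  // pull the next job only while we are not stopping
  if (stopping || queue.length === 0) return
  handleJob(queue.shift()).then(() => consume(queue))
}

process.on('SIGTERM', () => {
  stopping = true // take no further jobs
  // drain the in-flight job (if any), then exit with "success"
  Promise.resolve(inFlight).then(() => process.exit(0))
})

consume(['a', 'b', 'c'])
```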

Why is it important?

If we don’t stop our application correctly, we are wasting resources like DB connections and we may also break ongoing requests. An HTTP request doesn’t recover automatically – if we fail to serve it, then we simply missed it.

Graceful start

We should only start our application when all of the dependencies and database connections are ready to handle our traffic.

Possible scenarios for a graceful web server start:

  1. App starts (npm start)
  2. App opens DB connections
  3. App listens on port
  4. App tells the load balancer that it’s ready for requests

Graceful shutdown in a Node.js application

First of all, you need to listen for the SIGTERM signal and catch it:

process.on('SIGTERM', function onSigterm () {
  console.info('Got SIGTERM. Graceful shutdown start', new Date().toISOString())
  // start graceful shutdown here
  shutdown()
})

After that, you can close your server, then close your resources and exit the process:

function shutdown () {
  server.close(function onServerClosed (err) {
    if (err) {
      console.error(err)
      process.exit(1)
    }

    closeMyResources(function onResourcesClosed (err) {
      // error handling
      process.exit(0)
    })
  })
}
Sounds easy right? Maybe a little bit too easy.

What about the load balancer? How will it know that your app is not ready to receive further requests anymore? What about keep-alive connections? Will they keep the server open for a longer time? What if my app gets a SIGKILL in the meantime?

Graceful shutdown with Kubernetes

If you’d like to learn a little bit about Kubernetes, you can read our Moving a Node.js app from PaaS to Kubernetes Tutorial. For now, let’s just focus on the shutdown.

Kubernetes comes with a resource called Service. Its job is to route traffic to your pods (~instances of your app). Kubernetes also comes with a resource called Deployment that describes how your application should behave during exit, scale and deploy – and you can also define a health check here. We will combine these resources for the perfect graceful shutdown and handover during new deploys at high traffic.

We would like to see throughput charts like below with consistent rpm and no deployment side effects at all:

[Figure: Throughput metrics shown in Trace by RisingStack – no change at deploy]

Ok, let’s see how to solve this challenge.

Setting up graceful shutdown

In Kubernetes, for a proper graceful shutdown we need to add a readinessProbe to our application’s Deployment yaml and let the Service’s load balancer know during the shutdown that we will not serve more requests so it should stop sending them. We can close the server, tear down the DB connections and exit only after that.
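As an illustration, the probe section of such a Deployment could look like the snippet below. The port, path and threshold values here are assumptions, chosen to match the timings used later in this article:

```yaml
# readiness fails after failureThreshold * periodSeconds = 2 * 2 = 4s,
# while liveness would only kill the pod after 3 * 10 = 30s
readinessProbe:
  httpGet:
    path: /health
    port: 3000
  periodSeconds: 2
  failureThreshold: 2
  initialDelaySeconds: 5
livenessProbe:
  httpGet:
    path: /health
    port: 3000
  periodSeconds: 10
  failureThreshold: 3
  initialDelaySeconds: 5
```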

How does it work?

[Figure: Kubernetes graceful shutdown flowchart]
  1. pod receives SIGTERM signal because Kubernetes wants to stop it – because of deploy, scale, etc.
  2. App (pod) starts to return 500 for GET /health to let the readinessProbe (Service) know that it’s not ready to receive more requests.
  3. Kubernetes readinessProbe checks GET /health and after (failureThreshold * periodSeconds) it stops routing traffic to the app (because it continuously returns 500)
  4. App waits (failureThreshold * periodSeconds) before it starts to shut down – to make sure that the Service has been notified via the readinessProbe failure
  5. App starts graceful shutdown
  6. App first closes server with live working DB connections
  7. App closes databases after the server is closed
  8. App exits process
  9. Kubernetes force kills the application after 30s (SIGKILL) if it’s still running (in an optimal case it doesn’t happen)

In our case, the Kubernetes livenessProbe won’t kill the app before the graceful shutdown happens because it needs to wait (failureThreshold * periodSeconds) to do it.

This means that the livenessProbe threshold should be larger than the readinessProbe threshold. This way the graceful stop happens around 4s after SIGTERM, while the force kill would only happen 30s after it.

How to achieve it?

To achieve this, we need to do two things. First, we need to let the readinessProbe know after SIGTERM that we are not ready anymore:

'use strict'

const db = require('./db')
const promiseTimeout = require('./promiseTimeout')
const TIMEOUT_IN_MILLIS = 900
const state = { isShutdown: false }

process.on('SIGTERM', function onSigterm () {
  state.isShutdown = true
})

function get (req, res) {
  // SIGTERM already happened
  // app is not ready to serve more requests
  if (state.isShutdown) {
    res.writeHead(500)
    return res.end('not ok')
  }

  // something cheap but tests the required resources
  // timeout because we would like to log before livenessProbe KILLS the process
  promiseTimeout(db.ping(), TIMEOUT_IN_MILLIS)
    .then(() => {
      // success health
      return res.end('ok')
    })
    .catch(() => {
      // broken health
      res.writeHead(500)
      return res.end('not ok')
    })
}

module.exports = {
  get: get
}
The second thing is that we have to delay the teardown process – as a sane default, you can use the time needed for two failed readinessProbe checks: failureThreshold: 2 * periodSeconds: 2 = 4s.

const READINESS_PROBE_DELAY = 2 * 2 * 1000 // failureThreshold: 2, periodSeconds: 2

process.on('SIGTERM', function onSigterm () {
  console.info('Got SIGTERM. Graceful shutdown start', new Date().toISOString())

  // Wait a little bit to give enough time for the Kubernetes readiness probe to fail
  // (we are not ready to serve more traffic)
  // Don't worry, livenessProbe won't kill it until (failureThreshold: 3) => 30s
  setTimeout(gracefulStop, READINESS_PROBE_DELAY)
})

You can find the full example here:

How to validate it?

Let’s test our graceful shutdown by sending high traffic to our pods and releasing a new version in the meantime (recreating all of the pods).

Test case

$ ab -n 100000 -c 20 http://localhost:myport

Other than this, you need to change an environment variable in the Deployment (for example with kubectl set env) to re-create all pods during the ab benchmark.

AB output

Document Path:          /
Document Length:        3 bytes

Concurrency Level:      20
Time taken for tests:   172.476 seconds
Complete requests:      100000
Failed requests:        0
Total transferred:      7800000 bytes
HTML transferred:       300000 bytes
Requests per second:    579.79 [#/sec] (mean)
Time per request:       34.495 [ms] (mean)
Time per request:       1.725 [ms] (mean, across all concurrent requests)
Transfer rate:          44.16 [Kbytes/sec] received

Application log output

Got SIGTERM. Graceful shutdown start 2016-10-16T18:54:59.208Z
Request after sigterm: / 2016-10-16T18:54:59.217Z
Request after sigterm: / 2016-10-16T18:54:59.261Z
Request after sigterm: / 2016-10-16T18:55:00.064Z
Request after sigterm: /health?type=readiness 2016-10-16T18:55:00.820Z
Request after sigterm: /health?type=readiness 2016-10-16T18:55:02.784Z
Request after sigterm: /health?type=liveness 2016-10-16T18:55:04.781Z
Request after sigterm: /health?type=readiness 2016-10-16T18:55:04.800Z
Server is shutting down... 2016-10-16T18:55:05.210Z
Successful graceful shutdown 2016-10-16T18:55:05.212Z

Benchmark result


Zero failed requests: you can see in the app log that the Service stopped sending traffic to the pod before we disconnected from the DB and killed the app.

Common gotchas

The following mistakes can still prevent your app from doing a proper graceful shutdown:

Keep-alive connections

Kubernetes doesn’t hand over keep-alive connections properly. :/

This means that requests from agents with a keep-alive header will still be routed to the pod.

It tricked me at first when I benchmarked with autocannon or Google Chrome (they use keep-alive connections).

Keep-alive connections prevent your server from closing in time. To force the exit of a process, you can use the server-destroy or stoppable modules. Once one of them has run, you can be sure that all the ongoing requests are served. Alternatively, you can add timeout logic to your server.close(cb).

UPDATE: server-destroy cuts running connections without allowing us to define a grace period, essentially defeating the whole purpose.

Docker signaling

It’s quite possible that your application doesn’t receive the signals correctly in a dockerized environment.

For example in our Alpine image: CMD ["node", "src"] works, CMD ["npm", "start"] doesn’t. It simply doesn’t pass the SIGTERM to the node process. The issue is probably related to this PR:

As an alternative, you can use dumb-init to fix broken Docker signaling.
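For illustration, an Alpine-based Dockerfile running the app under dumb-init could look like the sketch below. The base image tag and paths are assumptions:

```dockerfile
FROM node:alpine

# dumb-init runs as PID 1 and forwards SIGTERM to the node process
RUN apk add --no-cache dumb-init

WORKDIR /app
COPY . .
RUN npm install --production

ENTRYPOINT ["dumb-init", "--"]
CMD ["node", "src"]
```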


Conclusion

Always be sure that your application stops correctly: it releases all of the resources and helps to hand over the traffic to the new version of your app.

Check out our example repository with Node.js and Kubernetes:

If you have any questions or thoughts about this topic, find me in the comment section below!
