News, events

Download & Update Node.js to the Latest Version! Node v21.1.0 Current / LTS v20.9.0 Direct Links

Node 20 is the active LTS version: it stays in active LTS until 22 Oct 2024 and will be supported until 30 Apr 2026. Node 21 became the Current version in October 2023: https://blog.risingstack.com/nodejs-21/

Node 21 (Current): released 2023 Oct 17; maintenance from 2024 Apr 1; end of life 2024 Jun
Node 20 (LTS): released 2023 Apr 18; maintenance from 2024 Oct 22; end of life 2026 Apr 30; latest release 20.9.0

In this article below, you’ll find changelogs and download / update information regarding Node.js!

Node.js LTS & Current Download for macOS:

Node.js LTS & Current Download for Windows:

For other downloads like Linux libraries, source code, Docker images, etc., please visit https://nodejs.org/en/download/

Node.js Release Schedule:


Node.js v21 is the next Current version!

The latest major version of Node.js has just been released with a few interesting new experimental features and a lot of fixes and optimizations. You can find our highlights in this article: https://blog.risingstack.com/nodejs-21/

  • Built-in WebSocket client:
    A browser-compatible WebSocket implementation has been added to Node.js with this new release as an experimental feature. You can give it a go using the --experimental-websocket flag. The current implementation allows for opening and closing of websocket connections and sending data.

  • flush option for the writeFile type filesystem functions:
    Up until now, it was possible for data to not be flushed immediately to permanent storage when a write operation completed successfully, allowing read operations to get stale data. In response, a flush option has been added to the fs module file writing functions that, when enabled, forces data to be flushed at the end of a successful write operation using sync.

  • Addition of a global navigator Object:
    This new release also introduces a global navigator object as a step towards better web interoperability. We can now access hardware concurrency information through navigator.hardwareConcurrency, the only property currently implemented on the object.

  • Array grouping:
    A new static method, groupBy(), has been added to Object and Map. It groups the items of a given iterable according to a provided callback function.

  • Additional changes:
    • Both the fetch and the webstreams modules are now marked as stable after receiving a few changes with this version.
    • A host of performance improvements as usual with any new release.
    • WebAssembly gets extended const expressions.
    • Another new experimental flag, --experimental-default-type, has been added that allows setting the default module type to ESM.
    • The globalPreload hook has been removed; its functionality is replaced by register and initialize.
    • Glob patterns are now supported in the test runner.

Learn More Node.js from RisingStack

At RisingStack we’ve been writing JavaScript / Node tutorials for the community for the past 5 years. If you’re a beginner to Node.js, we recommend checking out our Node Hero tutorial series! The goal of this series is to help you get started with Node.js and make sure you understand how to write an application using it.

See all chapters of the Node Hero tutorial series:
  1. Getting Started with Node.js
  2. Using NPM
  3. Understanding async programming
  4. Your first Node.js HTTP server
  5. Node.js database tutorial
  6. Node.js request module tutorial
  7. Node.js project structure tutorial
  8. Node.js authentication using Passport.js
  9. Node.js unit testing tutorial
  10. Debugging Node.js applications
  11. Node.js Security Tutorial
  12. How to Deploy Node.js Applications
  13. Monitoring Node.js Applications

As a sequel to Node Hero, we have completed another series called Node.js at Scale – which focuses on advanced Node / JavaScript topics. Take a look!

Node.js 21 is here with WebSocket

The latest major version of Node.js has just been released with a few interesting new experimental features and a lot of fixes and optimizations. Below you can find our highlights from the release notes.

Built-in WebSocket client

A browser-compatible WebSocket implementation has been added to Node.js with this new release as an experimental feature. You can give it a go using the --experimental-websocket flag. The current implementation allows for opening and closing WebSocket connections and sending data. There are four events available for use: open, close, message and error, so the basics are covered. It’s pretty exciting to see an out-of-the-box WebSocket implementation coming to Node, as it could spare us the inclusion of yet another library in projects that need bidirectional communication. Be sure to give it a go and give your feedback to the developers!
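A minimal sketch of a client that wires up the four events mentioned above. The function name openClient, the injectable WebSocketImpl parameter and the logging callback are assumptions made for illustration; on Node 21 the global WebSocket only exists when the process is started with --experimental-websocket.

```javascript
// Run with: node --experimental-websocket client.js   (flag needed on Node 21)
// openClient and its parameters are hypothetical names used for this sketch.
function openClient(url, onLog = console.log, WebSocketImpl = globalThis.WebSocket) {
  if (typeof WebSocketImpl !== 'function') {
    throw new Error('WebSocket is not available; start Node with --experimental-websocket');
  }

  const socket = new WebSocketImpl(url);

  // Send a greeting as soon as the connection is established.
  socket.addEventListener('open', () => {
    onLog('open');
    socket.send('hello');
  });

  // Log the first incoming message, then close the connection.
  socket.addEventListener('message', (event) => {
    onLog(`message: ${event.data}`);
    socket.close();
  });

  socket.addEventListener('close', () => onLog('close'));
  socket.addEventListener('error', () => onLog('error'));

  return socket;
}
```

Calling openClient('ws://localhost:8080') against a local echo server (the address is a placeholder) should log open, the echoed message and close, in that order.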

A flush option for the writeFile type filesystem functions

Up until now, it was possible for data to not be flushed immediately to permanent storage when a write operation completed successfully, allowing read operations to get stale data. In response, a flush option has been added to the fs module file writing functions that, when enabled, forces data to be flushed at the end of a successful write operation using sync. This feature is not enabled by default, so make sure to include { flush: true } in the options if you’d like to use it.

Here is the list of functions the flush option has been added to:

  • filehandle.createWriteStream
  • fsPromises.writeFile
  • fs.createWriteStream
  • fs.writeFile
  • fs.writeFileSync
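As a small sketch of how the option is passed, the snippet below writes a file with { flush: true } and reads it back. The helper name saveDurably and the temporary file name are made up for this example; only the flush option itself comes from the release.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical helper: write the file and ask Node to flush it to storage
// once the write has completed successfully.
function saveDurably(filePath, contents) {
  fs.writeFileSync(filePath, contents, { flush: true });
}

// Use a throwaway file in the OS temp directory for the demonstration.
const target = path.join(os.tmpdir(), `flush-demo-${process.pid}.txt`);
saveDurably(target, 'saved with flush\n');

const readBack = fs.readFileSync(target, 'utf8');
fs.unlinkSync(target); // clean up the demo file
console.log(JSON.stringify(readBack));
```

The asynchronous variants take the option the same way, for example as the third argument of fsPromises.writeFile.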

Addition of a global navigator Object

This new release also introduces a global navigator object as a step towards better web interoperability. We can now access hardware concurrency information through navigator.hardwareConcurrency, the only property currently implemented on the object. While this might not seem like a huge change for now, we can assume more and more functionality will be implemented with time, until we have the whole suite of information window.navigator provides in browser environments. This would spare us having to decide between using process and navigator in code that is meant to run both in a browser and in Node.js.
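One way this could be used is to size a worker pool from the same property in both environments. The sketch below is only an illustration: the function name logicalCpuCount and the fallback to os.cpus() for Node versions without a global navigator are assumptions, not something the release prescribes.

```javascript
const os = require('node:os');

// Hypothetical helper: prefer navigator.hardwareConcurrency when it exists,
// otherwise fall back to counting the CPUs reported by the os module.
function logicalCpuCount() {
  const fromNavigator = globalThis.navigator?.hardwareConcurrency;
  if (Number.isInteger(fromNavigator) && fromNavigator > 0) {
    return fromNavigator;
  }
  // os.cpus() can be empty in some restricted environments, so never go below 1.
  return Math.max(1, os.cpus().length);
}

console.log(`Logical CPUs available: ${logicalCpuCount()}`);
```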

Array grouping

A new static method, groupBy(), has been added to Object and Map. It groups the items of a given iterable according to a provided callback function. The returned value contains an entry for each group, whose value is an array with the items that belong to the group. In the case of Object, the keys of the returned object will be strings, while the version on Map can have any kind of key.
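To illustrate the difference between the two variants, here is a short sketch. The inventory data and the helper names groupByType and groupByStock are invented for the example; the feature check at the end is there because older Node versions do not ship these methods.

```javascript
// Sample data, made up for illustration.
const inventory = [
  { name: 'asparagus', type: 'vegetable', quantity: 9 },
  { name: 'banana', type: 'fruit', quantity: 5 },
  { name: 'cherry', type: 'fruit', quantity: 12 },
];

// Object.groupBy: the callback's return value becomes a property name.
function groupByType(items) {
  return Object.groupBy(items, (item) => item.type);
}

// Map.groupBy: keys are kept as they are, so objects can serve as keys.
const LOW = { label: 'low stock' };
const OK = { label: 'enough stock' };
function groupByStock(items) {
  return Map.groupBy(items, (item) => (item.quantity < 6 ? LOW : OK));
}

// Only run the demo where both methods are available.
if (typeof Object.groupBy === 'function' && typeof Map.groupBy === 'function') {
  console.log(groupByType(inventory));
  console.log(groupByStock(inventory).get(LOW));
}
```

Here groupByType returns an object with fruit and vegetable properties, while groupByStock returns a Map whose keys are the LOW and OK objects themselves.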

Additional changes

  • Both the fetch and the webstreams modules are now marked as stable after receiving a few changes with this version.
  • A host of performance improvements as usual with any new release.
  • WebAssembly gets extended const expressions.
  • Another new experimental flag, --experimental-default-type, has been added that allows setting the default module type to ESM.
  • The globalPreload hook has been removed; its functionality is replaced by register and initialize.
  • Glob patterns are now supported in the test runner.

Don’t forget that Node.js 16 has reached its end of life, so if you’re still using this version, you should upgrade to one of the newer LTS versions as soon as possible! The currently active LTS releases are 18 and 20, with version 22, which will also become an LTS version, scheduled for release in April 2024. You can find more information about the release schedule here.

The Best JavaScript Frameworks: Pros and Cons Explained

There are a lot of different JavaScript frameworks out there, and it can be tough to keep track of them all. In this article, we’ll focus on the most popular ones, and explore why they’re either loved or disliked by developers.


React

React is a JavaScript library for building user interfaces. It is maintained by Facebook and a community of individual developers and companies. React can be used as a base in the development of single-page or mobile applications. However, React is only concerned with rendering data to the DOM, so creating React apps usually requires additional libraries for state management, routing, and interaction with an API. React is also used for building reusable UI components. In that sense, it works much like a JavaScript framework such as Angular or Vue. However, React components are typically written in a declarative manner rather than with imperative code, which makes them easier to read and debug. Because of this, many developers prefer to use React for building UI components even if they are not using it as their entire front-end framework.


Pros:

  • React is fast and efficient because it uses a virtual DOM to keep direct manipulation of the real DOM to a minimum.
  • React is easy to learn because of its declarative syntax and clear documentation.
  • React components are reusable, making code maintenance easier.


Cons:

  • React has a steep learning curve because it is a complex JavaScript library.
  • React is not a full-fledged framework, and so it requires the use of additional libraries for many tasks.


Next.js

Next.js is a JavaScript framework that enables server-side rendering for React applications. This means that Next.js can render your React application on the server before sending it to the client. This has several benefits. First, it allows you to pre-render components so that they are already available on the client when the user requests them. Second, it enables better SEO for your React application by allowing crawlers to index your content more easily. Finally, it can improve performance by reducing the amount of work that the client has to do in order to render the page.

Here’s why developers like Next.js: 

  • Next.js makes it easy to get started with server-side rendering without having to do any configuration.
  • Next.js automatically code splits your application so that each page is only loaded when it is requested, which can improve performance.


And here’s what some developers dislike about it:

  • If you’re not careful, Next.js can make your application codebase more complex and harder to maintain.
  • Some developers find the built-in features of Next.js to be opinionated and inflexible.


Vue.js

Vue.js is an open-source JavaScript framework for building user interfaces and single-page applications. Unlike other frameworks such as React and Angular, Vue.js is designed to be lightweight and easy to use. The Vue.js library can be used in conjunction with other libraries and frameworks, or can be used as a standalone tool for creating front-end web applications. One of the key features of Vue.js is its two-way data binding, which automatically updates the view when the model changes, and vice versa. This makes it an ideal choice for building dynamic user interfaces. In addition, Vue.js comes with a number of built-in features such as a templating system, a reactivity system, and an event bus. These features make it possible to create sophisticated applications without having to rely on third-party libraries. As a result, Vue.js has become one of the most popular JavaScript frameworks in recent years.


Pros:

  • Vue.js is easy to learn due to its small size and clear documentation.
  • Vue.js components are reusable, which makes code maintenance easier.
  • Vue.js applications are very fast due to the virtual DOM and async component loading.


Cons:

  • While Vue.js is easy to pick up, mastering all of its features takes considerably more time.
  • Vue.js does not have as many libraries and tools available as some of the other frameworks.


Angular

Angular is a framework for building web applications with JavaScript, HTML, and TypeScript. Angular is created and maintained by Google. Angular provides two-way data binding, so that changes to the model are automatically propagated to the view. It also provides a declarative syntax that makes it easy to build dynamic UIs. Finally, Angular provides a number of useful built-in services, such as HTTP request handling, and support for routing and templates.


Pros:

  • Angular has a large community and many libraries and tools available.
  • Angular is easy to learn due to its well-organized documentation and clear syntax.


Cons:

  • While Angular is easy to pick up, mastering all of its features takes considerably more time.
  • Angular is not as lightweight as some of the other frameworks.


Svelte

In a nutshell, Svelte is a JavaScript framework similar to React, Vue, or Angular. However, where frameworks such as React and Vue use virtual DOM (Document Object Model) diffing to figure out what changed between views, Svelte is a compiler: at build time it turns components into code that updates the DOM directly. This means that it only updates the parts of the DOM that have changed, making for a more efficient rendering process. In addition, Svelte also includes some built-in optimizations, such as automatically batching DOM updates and code-splitting. These features make Svelte a good choice for high-performance applications.


Pros:

  • Svelte has built-in optimizations that other frameworks do not, such as code-splitting.
  • Svelte is easy to learn due to its clear syntax and well-organized documentation.


Cons:

  • While Svelte is easy to pick up, mastering all of its features takes considerably more time.
  • Svelte does not have as many libraries and tools available as some of the other frameworks.


Gatsby

Gatsby is a free and open-source framework based on React that helps developers build blazing fast websites and apps. It uses cutting-edge technologies to make the process of building websites and applications more efficient. One of its key features is its ability to prefetch resources so that they are available instantaneously when needed. This makes Gatsby websites extremely fast and responsive. Another benefit of using Gatsby is that it allows developers to use GraphQL to query data from any source, making it easy to build complex data-driven applications. In addition, Gatsby comes with a number of plugins that make it even easier to use, including ones for SEO, analytics, and image optimization. All of these factors make Gatsby an extremely popular choice for building modern websites and applications.


Pros:

  • Gatsby websites are extremely fast and responsive due to the use of prefetching.
  • Gatsby makes it easy to build complex data-driven applications due to its support for GraphQL.
  • Gatsby comes with a number of plugins that make it even easier to use.


Cons:

  • While Gatsby is easy to use, mastering all of its features takes considerably more time.
  • Gatsby does not have as many libraries and tools available as some of the other frameworks.


Nuxt.js

Nuxt.js is a progressive framework for building JavaScript applications. It is based on Vue.js and comes with a set of tools and libraries that make it easy to create universal applications that can be rendered on both the server side and the client side. Nuxt.js also provides a way to handle asynchronous data and routing, which makes it perfect for building highly interactive applications. In addition, Nuxt.js comes with a CLI tool that makes it easy to scaffold new projects and build, run, and test them. With Nuxt.js, you can create impressive JavaScript applications that are fast, reliable, and scalable.


Pros:

  • Nuxt.js is easy to use and extend.
  • Nuxt.js applications are fast and responsive due to server-side rendering.


Cons:

  • While Nuxt.js is easy to use, mastering all of its features takes considerably more time.
  • Nuxt.js does not have as many libraries and tools available as some of the other frameworks.


Ember.js

Ember.js is known for its convention over configuration approach, which makes it easier for developers to get started with the framework. It also features built-in libraries for common tasks such as data persistence and routing, which makes development faster. Although Ember.js has a steep learning curve, it provides developers with a lot of flexibility and power to create rich web applications. If you’re looking for a front-end JavaScript framework to build SPAs, Ember.js is definitely worth considering.


Pros:

  • Ember.js uses convention over configuration, which makes it easier to get started with the framework.
  • Ember.js has built-in libraries for common tasks such as data persistence and routing.
  • Ember.js provides developers with a lot of flexibility and power to create rich web applications.


Cons:

  • Ember.js has a steep learning curve.
  • Ember.js does not have as many libraries and tools available as some of the other frameworks.


Backbone.js

Backbone.js is a lightweight JavaScript library that allows developers to create single-page applications. It is based on the Model-View-Controller (MVC) architecture, which means that it separates data and logic from the user interface. This makes code more maintainable and scalable, and makes it easier to create complex applications. Backbone.js also includes a number of features that make it ideal for developing mobile applications, such as its ability to bind data to HTML elements and its support for touch events. As a result, Backbone.js is a popular choice for developers who want to create fast and responsive applications.


Pros:

  • Backbone.js is lightweight and only a library, not a complete framework.
  • Backbone.js is easy to learn and use.
  • Backbone.js is very extensible with many third-party libraries available.


Cons:

  • Backbone.js does not offer as much built-in functionality as some of the other frameworks.
  • Backbone.js has a smaller community than some of the other frameworks.


In conclusion, while there are many different JavaScript frameworks to choose from, the most popular ones remain relatively stable. Each has its own benefits and drawbacks that developers must weigh when making a decision about which one to use for their project. While no framework is perfect, each has something to offer that can make development easier or faster. 

Everyone should consider the specific needs of their project when choosing a framework, as well as the skills of their team and the amount of time they have to devote to learning a new framework. By taking all of these factors into account, you can choose the best JavaScript framework for your project!

ChatGPT use case examples for programming

If you’re reading this post, you probably already know enough about large language models and other “AI” tools, so we can skip the intro.

Despite the fact that the “AI is going to take our jobs” discourse proved to be an effective tool in the clickbait content creator’s toolbelt, I will not go down this road.

Instead of contributing to the moral panic about the supposedly inevitable replacement of white collar jobs, or pretending to be offended by a chatbot, I’ll help our readers to consider GPT-based products as tools that could be useful in a professional webdev setting. 

To do so, I asked some of my colleagues about their experiences of using GPT and various mutations of it – to help you get a more grounded understanding of their utility.

If you have an experience that you think is worth sharing with the RisingStack community, please share it through this form.

I’ll drop the results / best ones in the article later on!

Daniel’s ‘Code GPT’ VS Code plugin review

I’ve been pretty satisfied with GitHub Copilot. It does the job well, and it is priced reasonably. Still, after depleting the free tier, I decided to look for an open source alternative.

TabNine is an honorable mention here, and a well established player, but based on my previous experience (about two years ago, mind you), it is clunky. Nowhere near the breeze of a dev experience you get from Copilot.

But take heart, there is a staggering number of plugins out there for VS Code if you look for AI-based coding assistants.

At the time of writing this, Code GPT is the winner by number of downloads and number of (positive) votes, so I decided to give it a go. You can choose from a range of OpenAI and Cohere models, with GPT-3 being the default.


1. Code generation from comment prompts

The suggestions are relevant and of good quality. The plugin doesn’t offer code completion on the fly, unlike Copilot; instead, it communicates with you in a new IDE pane that it opens automatically. I like this feature, since I can pick the parts I like from the suggestion without bloating the code I’m working on and having to delete the irrelevant lines. This behavior comes in handy with the other features as well. Let’s see those.

2. Unit test generation

While the results are often far from complete, it saves me a lot of boilerplate code. It is also handy in reminding me of cases that I might otherwise have forgotten. For this feature to work well, set your max token length to at least 1000 in the Settings, since a comprehensive test suite usually ends up quite verbose, and you’ll only get part of it with a tight quota.

3. Find Problems

Your code review buddy. Once I feel I’m done with my work, a quick scan before committing doesn’t take long. While it is often outright wrong about the ‘issues’ it points out, it doesn’t take long to scan through the suggestions and catch mistakes before your real-life reviewer does.

4. Refactor

Save your team lead some time for extra credit, and run Refactor against your code. Don’t expect miracles, but it often catches stuff that managed to sneak under your radar. Note: the default max token length won’t cut it here either.

5. Document and Explain

Although they are listed as two separate features in the documentation, they achieve essentially the same thing: they provide a high-level natural language description of what the highlighted piece of code does. I tend to use them less often, but they are nice to have.

6. Ask CodeGPT

I left it for last, but this is the most flexible feature of this plugin. With the right prompt it can do everything mentioned above, and more. Convert your .js to .ts, generate a README.md file from code, as suggested in the documentation, or just go ahead and ask for a recipe for a delicious apple pie, like you would from ChatGPT 🥧

My Conclusion:

Code GPT offers many features that Copilot doesn’t, but lacks the thing Copilot is best at: inline code completion. So if you want to get the most out of AI, just use both, as these two tools complement each other really nicely.

Code GPT might come in handy if you’re just getting started with a new language or framework. The Explain feature helps double-check your gut feeling, or gives you the missing hint in the right direction.

Bump up your max token length to at least 1000, c’mon, it’s only 2¢ 😉

An interesting alternative I might try in the future is the ‘ChatGPT’ plugin (from either Tim Kmecl or Ali Gencay) that claims to use the unofficial ChatGPT API, with all its superpowers.

Further reading:

– official site: https://www.codegpt.co/

– GitHub Copilot vs ChatGPT: https://dev.to/ruppysuppy/battle-of-the-giants-github-copilot-vs-chatgpt-4oac

– List of GitHub Copilot alternatives: https://www.tabnine.com/blog/github-copilot-alternatives/

Olga on writing Mongo queries with ChatGPT

I have used ChatGPT for more effective coding. For example, it was really helpful with enhancing Mongo queries for more complex use cases: it suggested specific stages that worked for the use case, and it would definitely have taken me more time to research which stage and/or operator is ideal for the query.

However, all the answers it produces should be checked and not used blindly. I have not yet come across a case where the answer it provided didn’t need modification (though maybe that is because I didn’t use it for easy things).

I have also noticed that if a question posted to ChatGPT includes many different parameters, in a lot of cases it will not take them all into consideration, so one has to continue the conversation and ensure all parameters are considered in the solution.

Akos on using ChatGPT instead of StackOverflow

I have been using ChatGPT since its inception and have found it to be a valuable tool in my daily work. With ChatGPT, I no longer have to spend hours searching and Googling for regex patterns on Stack Overflow. Instead, I simply write down what I want with the regex, and the tool returns the result, saving me a significant amount of time and effort.

In addition to regex, I have also found ChatGPT to be a valuable tool when working on scrapers. Dealing with deeply nested selectors can be a challenge, and understanding how they work with scraping tools can take hours of research. But with ChatGPT, I can simply paste an example HTML and ask the tool to select what I want, saving me even more time and effort.

However, it is important to use ChatGPT in moderation. Overusing the tool could lead to a decline in my problem-solving skills and make me too dependent on it. By setting limits, I can still benefit from ChatGPT’s time-saving features while maintaining my critical thinking skills.

In conclusion, ChatGPT has been a powerful tool in my work, saving me time and effort in regex and scraping tasks. However, it is important to use it in moderation and not let it compromise my problem-solving skills.

Me, navigating the hallucinations of ChatGPT

I work with a lot of text: business documents, project proposals, contracts, articles, marketing materials, you name it. I’ve been fascinated by the emergence of “AI” tools in 2022 and both the entertainment factor and the usefulness they can provide.

All the big areas (text, voice, image) had huge releases that made the internet a very interesting place again after the boring years of the web3 / crypto / nft / metaverse nonsense. Suddenly I felt the rush of discovering the internet in the early 2000s. “Wow, I can do that?!” was a sentence that got repeated a lot this year.

The DALL-E 2 / Stable Diffusion / Whisper releases were more than enough to shake up the web, but ChatGPT was the one that really disrupted it. It was the first good conversational “AI” that everyone could use, and for free (for now). At times it can be so convincing that people forget that it is just a very good bullshitter. For example, when I ask it about RisingStack, I get the following answer:

“RisingStack is a software development and consulting company that specializes in building scalable, high-performance systems using Node.js, Kubernetes, and other modern web technologies. The company was founded in 2013 and is headquartered in Budapest, Hungary, with additional offices in London, United Kingdom, and New York, United States. RisingStack offers a range of services, including software development, architecture design, cloud infrastructure management, performance optimization, and training. They also develop open-source tools such as Trace, a Node.js performance monitoring and debugging tool, and Pino, a Node.js logging library. RisingStack has worked with a variety of clients, ranging from small startups to large enterprises, across a range of industries, including fintech, healthcare, e-commerce, and media. They are known for their expertise in Node.js and their commitment to delivering high-quality software solutions.”

Most of this text is usable, but it contains a lot of outright lies. It is important to keep in mind that GPT hallucinates. It has no idea about what is true and what is not. It pretends to know things, but it’s just making things up word by word. In this case: RisingStack was founded in 2014, and we never had a London office. Trace was sunset like 6 years ago, and Pino has nothing to do with us.

Anyways, I find it really useful when I need to generate / rephrase / improve text. It is only valid as a Google replacement if you can confidently tell if it’s right or wrong, so “geepeetee-ing” something is not really that helpful right now.

I have already used it to write contracts, blog posts (not this one though), and business proposals. It also brought in new clients, as just in the past couple of weeks we used it to:

  • Automatically generate product names and descriptions for webshops
  • Create easy-read text for children with disabilities
  • Perform sentiment analysis and write answers automatically to customer reviews

Currently ChatGPT has a lame writing style by default. It’s very formulaic. I’ve seen so much of it that I believe I can spot it 8 out of 10 times right away. It lies a lot, and I wasn’t able to get anything useful out of it on guitar-related topics, despite the fact that the training material probably has a couple million tabs in it.

Anyways, here are my not-so-hot takes about it:

  • You really need to carefully double check everything you generate. On the surface most of it might look good enough, but that’s just making it easier for everyone to get lazy with it.
  • “AI” won’t replace jobs. Instead, it will just improve productivity. As Photoshop is a better brush, GPT should be thought of as a better text/code editor. Most office jobs are about collaboration anyways, not typing on a keyboard.
  • Artists won’t get replaced en masse. You won’t be able to prompt an engine to generate artwork in de Goya’s style if cave paintings are the apex of your visual art knowledge. Taste will be very important to stand out when the web gets flooded with endless mediocre “art”. Also...
  • It will be interesting to see how the “poisoning the well” problem will affect these models. The continuous retraining of the “AI” on already “AI generated” content will cause a big decline in the quality of these services if they can’t filter it out... all while they are working on making the generated content so good that it gets mistaken for genuine human creation.
  • It’s a bit scary to think about how Microsoft will dominate this space through its OpenAI investment. Despite the genius branding, it is not open at all, and will cost a lot of money without serious competitors or general access to free-to-use alternatives (like Stable Diffusion for images).
  • Most of the coverage GPT gets nowadays is about people gaming the engine to finally say something “bad”, then pretending to be offended, or even scared of it! This kind of AI ethics/alignment discourse is incredibly dull and boring, imho.
  • The adversarial aspect is very interesting, though. Poisoning the training data of generally available chatbots will be a prime trolling activity, while convincing chatbots to spill their carefully crafted secret sauce prompts is something that needs to be continuously prevented.

At first I was skeptical about prompt engineering as an emerging “profession”, but seeing how building products on top of GPT-3 requires proper prompting and safeguards to make the end result consistently useful for end users, I can see it happening. Also, when you build something LLM-driven, you need to be aware that hostile users, trolls, competitors, etc. will try to game your product to ramp up your cloud costs or cause reputational harm.

This tweet really gets it.

This is already happening. Most of the “AI-driven” products are just purpose-built custom prompters calling GPT-3 through an API, repackaged with a fancy UI.

Anyways, I’m looking forward to seeing what kind of GPT driven products we’ll make for our clients, and how the internet will change in general.

I’m curious about your experience with ChatGPT, so please share it with me through this short form!



AI Development Tools Compared – The Differences You’ll Need to Know

There are many different types of AI development tools available, but not all of them are created equal. Some tools are more suited for certain tasks than others, and it’s important to select the right tool for the job.

Choosing the wrong tool can lead to frustration and wasted time, so it’s important to do your research before you start coding. Common types of AI development tools include cloud-based platforms, open-source software, and low-code development tools.

Cloud-based platforms are typically the most user-friendly and allow you to build sophisticated models quickly. They offer a wide variety of features, such as data analysis tools, natural language processing capabilities, automated machine learning model creation, and pre-trained models that can be used for various tasks.

Open-source software offers a great deal of flexibility and the ability to customize your AI model for specific tasks. However, using open source software requires coding knowledge and experience and is best suited for more experienced developers. 

Low code development tools allow you to create AI applications without having to write code. These tools allow developers of any skill level to quickly and easily create AI applications, eliminating the need for coding knowledge or experience. 

Of course, there are occasional overlaps, like cloud platforms using open-source technologies – but to find out all the similarities and differences, we’ll need to examine them. Let’s explore each one in further detail:

Detectron2

Detectron2 is Facebook’s state-of-the-art object detection and segmentation library. It features a number of pre-trained models and baselines that can be used for a variety of tasks, and it also has CUDA bindings that allow it to run on GPUs for even faster training. Compared to its predecessor, Detectron2 is much faster to train and can achieve better performance on a variety of benchmarks. It is also open source and written in Python, making it easy to use and extend. Overall, Detectron2 is an excellent choice for any object detection or segmentation task.

The fact that it is built on PyTorch makes it very easy to share models between different use cases. For example, a model that is developed for research purposes can be quickly transferred to a production environment. This makes Detectron2 ideal for organizations that need to move quickly and efficiently between different use cases. In addition, the library’s ability to handle large-scale datasets makes it a good fit for organizations that need to process large amounts of data.


Caffe

Caffe is a deep learning framework for model building and optimisation. It was originally focused on vision applications, but it is now branching out into other areas such as sequences, reinforcement learning, speech, and text. Caffe is written in C++ and CUDA, with interfaces for Python and MATLAB. The community has built a number of models which are available at https://github.com/BVLC/caffe/wiki/Model-Zoo.

It features fast, well-tested code and a seamless switch between CPU and GPU that is controlled by a single flag, so a model trained on a GPU machine can also run on hardware without a CUDA-capable GPU. This makes it a versatile tool for deep learning researchers and practitioners. The Caffe framework is also open source, so anyone can contribute to its development.

Caffe offers the model definitions, optimization settings, and pre-trained weights so you can start right away. The BVLC models are licensed for unrestricted use, so you can use them in your own projects.


Keras

Keras is a deep learning framework that enables fast experimentation. It is based on Python and supports multiple backends, including TensorFlow, CNTK, and Theano. Keras includes specific tools for computer vision (KerasCV) and natural language processing (KerasNLP). Keras is open source; current releases use the Apache 2.0 license, while earlier versions were released under the MIT license.

The idea behind Keras is to provide a consistent interface to a range of different neural network architectures, allowing for easy and rapid prototyping. It is also possible to run Keras models on top of other lower-level frameworks such as MXNet, Deeplearning4j, TensorFlow or Theano. Keras, like other similar tools, has the advantage of being able to run on both CPU and GPU devices with very little modification to the code.

In addition, Keras includes a number of key features such as support for weight sharing and layer reuse, which can help to improve model performance and reduce training time.


CUDA Toolkit

The CUDA toolkit is a powerful set of tools from NVIDIA for running code on GPUs. It includes compilers, libraries, and other necessary components for developing GPU-accelerated applications. The toolkit supports programming in Python, C, and C++, and it makes it easy to take advantage of the massive parallel computing power of GPUs. With the CUDA toolkit, you can accelerate your code to run orders of magnitude faster than on a CPU alone. Whether you’re looking to speed up machine learning algorithms or render complex 3D graphics, the CUDA toolkit can help you get the most out of your NVIDIA GPUs.

In the context of fraud detection, the CUDA toolkit can be used to train graph neural networks (GNNs) on large datasets in an efficient manner. This allows GNNs to learn from more data, which can lead to improved performance. In addition, the CUDA toolkit can be used to optimize the inference process, which is important for real-time applications such as fraud detection. Many techniques struggle with fraud detection because they cannot easily identify patterns that span multiple transactions. However, GNNs are well-suited to this task due to their ability to aggregate information from the local neighborhood of a transaction. This enables them to identify larger patterns that may be missed by traditional methods.
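
To make "aggregating information from the local neighborhood" more concrete, here is a minimal sketch of one aggregation round in plain Python. It is only an illustration: the transaction names, the single "risk" number per transaction, and the edges are made up, and a real GNN would use learned weights and run on a GPU through a deep learning framework rather than a Python loop.

```python
from collections import defaultdict

def aggregate_neighbors(features, edges):
    """One round of mean aggregation over an undirected graph."""
    neighbors = defaultdict(list)
    for a, b in edges:
        neighbors[a].append(b)
        neighbors[b].append(a)

    updated = {}
    for node, value in features.items():
        # Include the node itself so isolated nodes keep their own value.
        group = [value] + [features[n] for n in neighbors[node]]
        updated[node] = sum(group) / len(group)
    return updated

# Hypothetical data: a risk score per transaction, and edges linking
# transactions that share a card or a device.
features = {"t1": 0.0, "t2": 1.0, "t3": 1.0, "t4": 0.0}
edges = [("t1", "t2"), ("t1", "t3")]
smoothed = aggregate_neighbors(features, edges)
```

After one round, t1 picks up risk from the two suspicious transactions it is linked to, while the unconnected t4 stays unchanged. That is the kind of multi-transaction pattern the paragraph above refers to.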


TensorFlow

TensorFlow is an open-source platform for machine learning that offers a full pipeline from model building to deployment. It has a large collection of pre-trained models and supports a broad range of programming languages and platforms, including JavaScript, Python, Android, Swift, C++, and Objective-C. TensorFlow uses the Keras API and also supports CUDA for accelerated training on NVIDIA GPUs. In addition to providing tools for developers to build and train their own models, TensorFlow also offers a wide range of resources such as tutorials and guides.

TensorFlow.js is a powerful tool that can be used to solve a variety of problems. In the consumer packaged goods (CPG) industry, one of the most common problems is real-time and offline SKU detection. This problem is often caused by errors in manually inputting data, such as when a product is scanned at a store or when an order is placed online. TensorFlow.js can be used to create a solution that would automatically detect and correct these errors in real time, as well as provide offline support for cases where a connection is not available. This can greatly improve the efficiency of the CPG industry and reduce the amount of waste caused by incorrect data input.


PyTorch

PyTorch is a powerful machine learning framework that allows developers to create sophisticated applications for computer vision, audio processing, and time series analysis. The framework is based on the popular Python programming language, and comes with a large number of libraries and frameworks for easily creating complex models and algorithms. PyTorch also supports bindings for C++ and Java, making it a great option for cross-platform development. In addition, the framework includes CUDA support for accelerated computing on NVIDIA GPUs. Finally, PyTorch comes with a huge collection of pre-trained models that can be used for quickly building sophisticated applications.

PyTorch’s ease of use and flexibility make it a popular choice for researchers and developers alike, with examples covering reinforcement learning, image classification, and natural language processing among the more common use cases. As a result, it is no surprise that the framework has been gaining popularity in recent years. Thanks to its many features and benefits, PyTorch looks poised to become the go-to framework for deep learning in the years to come.

Apache MXNet

MXNet is an open-source deep learning framework that allows you to define, train, and deploy deep neural networks on a wide array of devices, from cloud infrastructure to mobile devices. It’s scalable, allowing for fast model training, and supports a flexible programming model and multiple languages.

It’s built on a dynamic dependency scheduler that automatically parallelizes both symbolic and imperative operations on the fly. A graph optimization layer makes symbolic execution fast and memory efficient.

The MXNet library is portable and lightweight. It’s accelerated with the NVIDIA Pascal™ GPUs and scales across multiple GPUs and multiple nodes, allowing you to train models faster. Whether you’re looking to build state-of-the-art models for image classification, object detection, or machine translation, MXNet is the tool for you.


Horovod

Horovod is a distributed training framework for deep learning that supports TensorFlow, Keras, PyTorch, and Apache MXNet. It is designed to make distributed training easy to use and efficient. Horovod uses the Message Passing Interface (MPI) to communicate between nodes, and each node runs a copy of the training script. The framework handles the details of communication and synchronization between nodes so that users can focus on their model. Horovod also includes a number of optimizations to improve performance, such as automatically fusing small tensors together and using hierarchical allreduce to reduce network traffic.

For Uber’s data scientists, the process of installing TensorFlow was made even more challenging by the fact that different teams were using different releases of the software. The team wanted to find a way to make it easier for all teams to use the ring-allreduce algorithm, without requiring them to upgrade to the latest version of TensorFlow or apply patches to their existing versions. The solution was to create a stand-alone package called Horovod. This package allowed the team to cut the time required for installation from about an hour to a few minutes, depending on the hardware. As a result, Horovod has made it possible for Uber’s data scientists to spend less time installing software and more time doing what they do best.
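
The ring-allreduce algorithm mentioned above can be illustrated with a small single-process simulation. This is a sketch of the general technique, not Horovod's implementation: the function name is made up, the "workers" are plain Python lists, and it assumes the vector length is divisible by the number of workers.

```python
def ring_allreduce(worker_data):
    """Simulate a sum ring-allreduce over equal-length lists, one per worker."""
    n = len(worker_data)
    length = len(worker_data[0])
    assert length % n == 0, "sketch assumes the vector splits evenly into n chunks"
    size = length // n
    # chunks[w][c] is worker w's local copy of chunk c.
    chunks = [[list(d[c * size:(c + 1) * size]) for c in range(n)]
              for d in worker_data]

    # Phase 1, scatter-reduce: each worker sends one chunk to its right-hand
    # neighbour, which adds it to its own copy. After n - 1 steps, worker w
    # holds the complete sum of chunk (w + 1) % n.
    for step in range(n - 1):
        sends = [(w, (w - step) % n) for w in range(n)]
        payloads = [chunks[w][c][:] for w, c in sends]  # snapshot: sends are simultaneous
        for (w, c), payload in zip(sends, payloads):
            dst = (w + 1) % n
            chunks[dst][c] = [a + b for a, b in zip(chunks[dst][c], payload)]

    # Phase 2, allgather: the completed chunks travel around the ring until
    # every worker has all of them.
    for step in range(n - 1):
        sends = [(w, (w + 1 - step) % n) for w in range(n)]
        payloads = [chunks[w][c][:] for w, c in sends]
        for (w, c), payload in zip(sends, payloads):
            chunks[(w + 1) % n][c] = payload

    return [[x for c in range(n) for x in chunks[w][c]] for w in range(n)]

# Three hypothetical workers, each holding a local gradient vector.
result = ring_allreduce([[1, 2, 3], [10, 20, 30], [100, 200, 300]])
```

Each worker only ever talks to its neighbour in the ring and sends one chunk per step, which is why the approach scales well as workers are added.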

Oracle AI

Oracle AI is a suite of artificial intelligence services that can be used to build, train, and deploy models. The services include natural language processing, chatbots and customer support, text-to-speech, speech-to-text, object detection for images, and data mining. Oracle AI offers pre-configured VMs with access to GPUs. The service can be used to build models for anomaly detection, analytics, and data mining.

Children’s Medical Research Institute (CMRI) is a not-for-profit organisation dedicated to improving the health of children through medical research. CMRI moved to Oracle Cloud Infrastructure (OCI) as its preferred cloud platform. This move has helped the institute take advantage of big data and machine learning capabilities to automate routine database tasks, database consolidation, operational reporting, and batch data processing. Overall, the switch to OCI has been a positive move for CMRI, and one that is sure to help the institute continue its important work.


H2O.ai

H2O is a powerful open-source AI platform that is used by companies all over the world to improve their customer support, marketing, and data mining efforts. The software provides a wide range of features that make it easy to collect and analyze customer data, identify anomalies, and create chatbots that can provide an engaging customer experience. H2O is constantly evolving, and the company behind it is always introducing new features and improvements.

For example, H2O can be used to create an intelligent cash management system that predicts cash demand and helps to optimize ATM operations. It can also help information security teams reduce risk by identifying potential threats and vulnerabilities in real time. In addition, H2O.ai can be used to transform auditing from quarterly to real-time, driving audit quality, accuracy and reliability.

Alibaba Cloud

Alibaba Cloud is a leading provider of cloud computing services. Its products include machine learning, natural language processing, data mining, and analytics. Alibaba Cloud’s machine learning platform offers a variety of pre-created algorithms that can be used for tasks such as data mining, anomaly detection, and predictive maintenance. The platform also provides tools for training and deploying machine learning models. Alibaba Cloud’s natural language processing products offer APIs for text analysis, voice recognition, and machine translation. The company’s data mining and analytics products provide tools for exploring and analyzing data. Alibaba Cloud also offers products for security, storage, and networking.

Alibaba, the world’s largest online and mobile commerce company, uses intelligent recommendation algorithms to drive sales using personalized customer search suggestions on its Tmall homepage and mobile app. The system takes into account a customer’s purchase history, browsing behavior, and social interactions when making recommendations. Alibaba has found that this approach leads to increased sales and higher customer satisfaction. In addition to search suggestions, the system also provides personalized product recommendations to customers based on their past behavior. This has resulted in increased sales and engagement on the platform. Alibaba is constantly tweaking and improving its algorithms to ensure that it is providing the most relevant and useful data to its users.

IBM Watson

IBM Watson is a powerful artificial intelligence system that has a range of applications in business and industry. One of the most important functions of Watson is its ability to process natural language. This enables it to understand human conversation and respond in a way that sounds natural. This capability has been used to develop chatbots and customer support systems that can replicate human conversation. In addition, Watson’s natural language processing capabilities have been used to create marketing campaigns that can target specific demographics. Another key application of Watson is its ability to detect anomalies. This makes it an essential tool for monitoring systems and identifying potential problems. As a result, IBM Watson is a versatile and valuable artificial intelligence system with a wide range of applications.

IBM Watson is employed in nearly every industry vertical, as well as in specialized application areas such as cybersecurity. This technology is often used by a company’s data analytics team, but Watson has become so user friendly that it is also easily used by end users such as physicians or marketers.

Azure AI

Azure AI is a suite of services from Microsoft that helps you build, optimize, train, and deploy models. You can use it for object detection in images and video, natural language processing, chatbots and customer support, text-to-speech, speech-to-text, data mining and analytics, and anomaly detection. Azure AI also provides pre-configured virtual machines so you can get started quickly and easily. Whether you’re an experienced data scientist or just getting started with machine learning, Azure AI can help you achieve your goals.

With the rapid pace of technological advancement, it is no surprise that the aviation industry is constantly evolving. One of the leading companies at the forefront of this change is Airbus. The company has unveiled two new innovations that utilize Azure AI solutions to revolutionize pilot training and predict aircraft maintenance issues.

Google AI

Google AI is a broad set of tools and services that helps you build, train, and deploy models, as well as take advantage of pre-trained models. You can use it to detect objects in images and video, to perform natural language processing tasks for chatbots or customer support, to translate text, and to convert text-to-speech or speech-to-text. Additionally, Google AI can be used for data mining and analytics, as well as for anomaly detection. All of these services are hosted on Google Cloud Platform, which offers a variety of options for GPU-accelerated computing, pre-configured virtual machines, and TensorFlow hosting.

UPS and Google Cloud Platform were able to develop routing software that has had a major impact on the company’s bottom line. The software takes into account traffic patterns, weather conditions, and the location of UPS facilities, in order to calculate the most efficient route for each driver. As a result, UPS has saved up to $400 million a year, and reduced its fuel consumption by 10 million gallons. In addition, the software has helped to improve customer satisfaction by ensuring that packages are delivered on time.


AWS AI

Amazon Web Services offers a variety of AI services to help developers create intelligent applications. With pre-trained models for common use cases, AWS AI makes it easy to get started with machine learning. For images and video, the object detection service provides accurate labels and coordinates. Natural language processing can be used for chatbots and customer support, as well as translation. Text-to-speech and speech-to-text are also available. AI-powered search provides relevant results from your data. Pattern recognition can be used for code review and monitoring, and data mining and analytics can be used for anomaly detection. AWS AI also offers hosted GPUs and pre-configured VMs.

Formula 1 is the world’s most popular motorsport, with hundreds of millions of fans worldwide. The sport has been at the forefront of technological innovation for decades, and its use of data and analytics has been central to its success. Teams have long used on-premises data centers to store and process large amounts of data, but the sport is now accelerating its transformation to the cloud. Formula 1 is moving the vast majority of its infrastructure to Amazon Web Services (AWS), and standardizing on AWS’s machine-learning and data-analytics services. This will enable Formula 1 to enhance its race strategies, data tracking systems, and digital broadcasts through a wide variety of AWS services, including Amazon SageMaker and AWS Lambda, AWS’s event-driven serverless computing service. By using these services, Formula 1 will be able to deliver new race metrics that will change the way fans and teams experience racing.


Conclusion

Choosing the right AI development tool can be difficult. This article has provided a comparison of some of the most popular tools on the market. Each tool has its own strengths and weaknesses, so it is important to decide which one will best suit your needs.

Kubernetes Interview Questions and Answers You’ll Need the Most

Are you currently preparing for a Kubernetes interview? If so, you’ll want to make sure you’re familiar with at least the questions and answers below. This article will help you demonstrate your understanding of Kubernetes concepts and how they can be applied in practice. With enough preparation, you’ll be able to confidently nail your next interview and showcase your Kubernetes skills. Let’s get started!

What is Kubernetes?

Kubernetes is an open-source system for automating the deployment, scaling, and management of containerized applications, stateless or stateful, across a cluster of nodes. It groups containers that make up an application into logical units for easy management and discovery. Kubernetes also automates the replication of the containers across multiple nodes in a cluster, as well as the healing of failed containers. Kubernetes was originally designed by Google, and is now maintained by the Cloud Native Computing Foundation.

Some of the key features of Kubernetes include:

– Provisioning and managing containers across multiple hosts

– Scheduling and deploying containers

– Orchestrating containers as part of a larger application

– Automated rollouts and rollbacks

– Handling container health and failure

– Scaling containers up and down as needed

– A large and active community that develops new features and supports users

– A variety of tools for managing storage and networking for containers

What are the main differences between Docker Swarm and Kubernetes?

Docker Swarm and Kubernetes are both container orchestration platforms. They are both designed for deploying and managing containers at scale. However, there are some key differences between the two platforms.

Docker Swarm is a native clustering solution for Docker. It is simpler to install and configure than Kubernetes. Docker Swarm also uses the same CLI and API as Docker, so it is easy to learn for users who are already familiar with Docker. However, Docker Swarm lacks some of the advanced features that Kubernetes has, such as built-in autoscaling, and it has a smaller ecosystem of extensions and tooling.

Kubernetes is a more complex system than Docker Swarm, but it offers a richer feature set. Kubernetes is also portable across different environments, so it can be used in on-premise deployments, as well as cloud-based deployments. In addition, Kubernetes is backed by a large community of users and developers, so there is a wealth of support and documentation available.

To sum up:

-Kubernetes is more complicated to set up, but in return you get a more robust cluster and built-in autoscaling

-Docker Swarm is easy to set up, but it offers fewer features and no built-in autoscaling

What is a headless service?

A headless service is a special type of Kubernetes service that has no cluster IP address (its clusterIP is set to None). This means that Kubernetes does not load balance or proxy traffic for the service. Instead, a DNS lookup of the service name returns the IP addresses of the individual pods behind it. Headless services are useful for applications whose clients need to reach a specific instance rather than any instance, or for applications that do not require load balancing. For example, stateful applications such as databases are often run as a StatefulSet together with a headless service, which gives each instance a stable DNS name without the need for a load balancer.
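
As a rough illustration, the sketch below builds a headless Service manifest as a Python dictionary and prints it as JSON, which kubectl accepts alongside YAML. The service name, the app label, and the port are placeholders, not values from a real deployment.

```python
import json

def headless_service(name, app_label, port):
    """Build a headless Service manifest as a plain dict."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "clusterIP": "None",  # this setting is what makes the Service headless
            "selector": {"app": app_label},
            "ports": [{"port": port, "targetPort": port}],
        },
    }

# Hypothetical example: a database reachable on port 5432.
manifest = headless_service("db", "db", 5432)
print(json.dumps(manifest, indent=2))
```

The only difference from a regular Service is the clusterIP value; everything else, including the label selector, works the same way.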

What are the main components of Kubernetes architecture?

A Kubernetes cluster consists of a control plane (kube-apiserver, etcd, kube-scheduler, and kube-controller-manager) and worker nodes, each of which runs a kubelet, kube-proxy, and a container runtime. The workloads themselves are described as pods and containers. Pods are composed of one or more containers that share an IP address and port space. This means that containers within a pod can communicate with each other over localhost. Pods also provide a way to deploy applications on a cluster in a replicable and scalable way. Containers, on the other hand, are isolated from each other at the process and file system level. Each container has its own file system, which means that it can be used to package up an application so that it can be run in different environments.

What are the different management and orchestrator features in Kubernetes?

The available management and orchestrator features in Kubernetes are: 

1. Cluster management components: These components manage the Kubernetes cluster.

2. Container orchestration components: These components orchestrate the deployment and operation of containers.

3. Scheduling components: These components schedule and manage the deployment of containers on nodes in the cluster.

4. Networking components: These components provide networking capabilities for containers in the cluster.

5. Storage components: These components provide storage for containers in the cluster.

6. Security components: These components provide security for the containers in the cluster.

What is the load balancer in Kubernetes?

A load balancer is a software program that evenly distributes network traffic across a group of servers. It is used to improve the performance and availability of applications that run on multiple servers.

Specifically, a Kubernetes Service of type LoadBalancer asks the underlying cloud provider for an external load balancer and routes the incoming traffic to the pods behind the service. Load balancing can be used to provide high availability and to optimize resource utilization. It also helps to prevent overloads on individual pods and nodes.
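
The idea of spreading traffic evenly can be sketched with a simple round-robin picker. Round-robin is just one common strategy, used here for illustration; it is not a claim about which algorithm a given Kubernetes setup or cloud load balancer applies, and the backend names are made up.

```python
from collections import Counter
from itertools import cycle

def round_robin(backends, request_count):
    """Assign each request to the next backend in turn."""
    picker = cycle(backends)
    return [next(picker) for _ in range(request_count)]

# Seven requests spread over three hypothetical pods.
assignments = round_robin(["pod-a", "pod-b", "pod-c"], 7)
per_backend = Counter(assignments)
```

No backend ends up with more than one request above any other, which is the property that protects individual instances from overload.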

What is Container resource monitoring?

Container resource monitoring means keeping track of CPU, memory, and disk utilization for each container in your Kubernetes cluster. There are two main ways to monitor a Kubernetes cluster. One way is to use the built-in kubectl command-line interface: with the Metrics Server installed, kubectl top reports the current CPU and memory usage of nodes and pods. If you need to keep track of more data, such as disk usage or historical trends, you can use a third-party monitoring tool such as Datadog, New Relic, or Prometheus.
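
Kubernetes expresses CPU in millicores (for example 250m) and memory with suffixes such as Mi. The sketch below shows how such values could be turned into a utilization percentage. It is a simplified helper written for this article: the function names are made up and only a few of the suffixes Kubernetes accepts are handled.

```python
MEMORY_UNITS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30}

def parse_cpu(value):
    """Return CPU in cores: '250m' becomes 0.25, '2' becomes 2.0."""
    if value.endswith("m"):
        return int(value[:-1]) / 1000
    return float(value)

def parse_memory(value):
    """Return memory in bytes for the binary suffixes Ki, Mi and Gi."""
    for suffix, factor in MEMORY_UNITS.items():
        if value.endswith(suffix):
            return int(value[:-len(suffix)]) * factor
    return int(value)  # no suffix: plain bytes

def utilization(used, limit):
    """Usage as a percentage of the limit, rounded to one decimal place."""
    return round(100 * used / limit, 1)

# Hypothetical readings for one container and its configured limits.
cpu_pct = utilization(parse_cpu("125m"), parse_cpu("500m"))
mem_pct = utilization(parse_memory("64Mi"), parse_memory("256Mi"))
```

Comparing usage against limits like this is the basic calculation behind most dashboards and alerts, whichever monitoring tool collects the numbers.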

What is the difference between a ReplicaSet and replication controller?

In Kubernetes, both a ReplicaSet and a replication controller ensure that a desired number of pod replicas are running at all times, and replace pods that fail or are deleted.

A ReplicaSet is the newer concept that replaces replication controllers. The main difference is selector support: a replication controller can only select pods with equality-based label selectors, while a ReplicaSet also supports set-based selectors.

Both make it possible to have multiple copies of an application running in parallel, and to scale out (add more pods) or scale in (remove pods) as needed.

A ReplicaSet ensures that a specified number of pod replicas are running at any given time. However, a Deployment is a higher-level concept that manages ReplicaSets and provides declarative updates to Pods along with a lot of other useful features. Therefore, we recommend using Deployments instead of directly using ReplicaSets, unless you require custom update orchestration or don’t require updates at all.
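
One practical difference between the two is how they select pods: replication controllers match labels by equality only, while ReplicaSets also accept set-based expressions with the operators In, NotIn, Exists and DoesNotExist. The sketch below mimics both styles of matching in plain Python. The function names and labels are illustrative, and this is not the code Kubernetes itself runs.

```python
def matches_equality(labels, selector):
    """Equality-based selector: every key must have exactly the given value."""
    return all(labels.get(key) == value for key, value in selector.items())

def matches_expressions(labels, expressions):
    """Set-based selector: every expression must hold for the labels."""
    for expr in expressions:
        key, op = expr["key"], expr["operator"]
        values = expr.get("values", [])
        if op == "In" and labels.get(key) not in values:
            return False
        if op == "NotIn" and labels.get(key) in values:
            return False
        if op == "Exists" and key not in labels:
            return False
        if op == "DoesNotExist" and key in labels:
            return False
    return True

# Hypothetical pod labels.
pod = {"app": "web", "tier": "frontend"}

by_equality = matches_equality(pod, {"app": "web"})
by_set = matches_expressions(
    pod, [{"key": "tier", "operator": "In", "values": ["frontend", "backend"]}]
)
```

A set-based selector can express rules such as "tier is frontend or backend" in a single expression, which equality matching cannot do.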

What are the recommended security measures for Kubernetes?

There are a number of recommended security measures for Kubernetes, including implementing third-party authentication and authorization tools, using network segmentation to restrict access to sensitive data, and maintaining regular monitoring and auditing of the cluster.

Another key recommendation is to use role-based access control (RBAC) to limit access to the Kubernetes API. This ensures that only authorized users can make changes to the system and introduces an additional layer of protection against potential vulnerabilities or attacks.
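
As an illustration of RBAC, the sketch below defines a Role that only allows reading pods in one namespace, written as a Python dictionary that can be dumped to JSON for kubectl. The role name and namespace are placeholders, and the allows helper is a simplified check for demonstration purposes, not how the API server evaluates permissions.

```python
import json

role = {
    "apiVersion": "rbac.authorization.k8s.io/v1",
    "kind": "Role",
    "metadata": {"namespace": "default", "name": "pod-reader"},
    "rules": [
        {
            "apiGroups": [""],  # "" refers to the core API group
            "resources": ["pods"],
            "verbs": ["get", "watch", "list"],
        }
    ],
}

def allows(role, resource, verb):
    """Simplified check: does any rule list both this resource and this verb?"""
    return any(
        resource in rule["resources"] and verb in rule["verbs"]
        for rule in role["rules"]
    )

can_list = allows(role, "pods", "list")
can_delete = allows(role, "pods", "delete")
print(json.dumps(role, indent=2))
```

A Role grants nothing until it is bound to a user, group, or service account with a RoleBinding, so permissions stay opt-in.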

Node isolation is also worth mentioning. It is a process of isolating individual nodes in a Kubernetes cluster so that each node only has access to its own resources. This process is used to improve the security and performance of Kubernetes clusters by preventing malicious activity on one node from affecting other nodes. Node isolation can be achieved through a variety of means, such as using a firewall to block network traffic between nodes, or using software-defined networking to segment node traffic. By isolating nodes, Kubernetes administrators can ensure that each node in a cluster is used only for its intended purpose and that unauthorized access to resources is prevented.

Other best practices for securing Kubernetes include: 

– Restricting access to the Kubernetes API to authorized users only

– Using network firewalls to restrict access to the Kubernetes nodes from unauthorized users

– Using intrusion detection/prevention systems to detect and prevent unauthorized access to the Kubernetes nodes

– Using encryption for communications between the nodes and pods in the cluster

– Limiting which IP addresses have access to cluster resources

– Implementing regular vulnerability assessments.

Ultimately, incorporating these types of security measures into your Kubernetes deployment will help ensure the safety and integrity of your system.

What is Container Orchestration and how does it work in Kubernetes?

Container orchestration is the process of managing a group of containers as a single entity. Container orchestration systems, like Kubernetes, allow you to deploy and manage containers across a cluster of nodes. This provides a higher level of abstraction and makes it easier to manage and scale your applications.

Kubernetes supports features for container orchestration, including:

– Creating and managing containers

– Configuring and managing networking

– Configuring and managing storage

– Booting and managing VMs

– Deploying applications

– Managing workloads

– Accessing logs and monitoring resources

– Configuring security and authentication

What are the features of Kubernetes?

Kubernetes is a platform that enables users to deploy, manage and scale containerized applications. Some of its key features include:

-Declarative syntax: Kubernetes uses a declarative syntax that makes it easy to describe the desired state of an application.

-Self-healing: Kubernetes is able to automatically heal applications and nodes in the event of failures.

-Horizontal scalability: Kubernetes enables users to scale their applications horizontally, by adding or removing pod replicas as needed.

-Fault tolerance: Kubernetes is able to tolerate failures of individual nodes or pods, ensuring that applications are always available.
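
Horizontal scaling can also be automated. The Horizontal Pod Autoscaler documentation describes its core calculation as the current replica count multiplied by the ratio of the current metric to the target metric, rounded up. Here is a minimal sketch of that formula. It leaves out the tolerance band and the minimum and maximum replica bounds that the real controller also applies, and the numbers are made up.

```python
import math

def desired_replicas(current_replicas, current_metric, target_metric):
    """ceil(currentReplicas * currentMetric / targetMetric)."""
    return math.ceil(current_replicas * (current_metric / target_metric))

# Hypothetical case: 4 replicas at 90% average CPU with a 60% target.
scale_up = desired_replicas(4, 90, 60)
# Hypothetical case: 2 replicas at 30% average CPU with a 60% target.
scale_down = desired_replicas(2, 30, 60)
```

In the first case the application is scaled out to 6 replicas, and in the second it is scaled in to 1.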

What is Kube-apiserver and what’s the role of it?

The Kubernetes apiserver is a critical part of a Kubernetes deployment.

The apiserver provides a REST API for managing Kubernetes resources.

It also provides authentication and authorization for accessing those resources.

The apiserver must be secured to prevent unauthorized access to Kubernetes resources.

Use role-based access control to restrict access to specific resources.

What is a node in Kubernetes?

A node is a master or worker machine in Kubernetes. It can be a physical machine or a virtual machine.

A node is a member of a Kubernetes cluster. Each node in a Kubernetes cluster is assigned a unique ID, which is used to identify the node when communicating with the Kubernetes API.

When a new node is added to a Kubernetes cluster, the kubelet on that node contacts the Kubernetes API to register the node with the cluster. The Kubernetes API stores information about the node, including its name, its addresses, its capacity, and the labels assigned to the node.

When a node is removed from a Kubernetes cluster, its entry is deleted through the Kubernetes API, and the stored information about the node, including its labels, is removed with it.

What is kube-scheduler and what’s the role of it?

Kube-scheduler is responsible for keeping track of the state of the cluster and ensuring that all desired pods are scheduled.

In a Kubernetes cluster, the scheduler is responsible for assigning Pods to Nodes. 

When a new Pod is created, the scheduler watches for it and becomes responsible for finding the best Node for that Pod to run on. To do this, the scheduler looks at the requirements of the Pod and compares them with the capabilities of the Nodes in the cluster. The scheduler also takes into account factors such as Node utilization and available resources. By finding the best match between Pods and Nodes, the scheduler helps to ensure that Pods are running on an optimal Node. This, in turn, helps to improve the performance of the overall cluster.
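
The filter-then-score idea described above can be sketched in a few lines. This is a deliberately simplified model: the real scheduler considers many more factors, and the node names, the resource numbers, and the scoring rule (prefer the node with the most capacity left) are assumptions made for the example.

```python
def schedule(pod, nodes):
    """Pick a node name for the pod, or None if no node fits."""
    # Filter: keep nodes with enough free CPU and memory for the pod's requests.
    feasible = [
        node for node in nodes
        if node["free_cpu"] >= pod["cpu"] and node["free_mem"] >= pod["mem"]
    ]
    if not feasible:
        return None

    # Score: prefer the node with the largest share of capacity left over
    # once the pod has been placed on it.
    def score(node):
        cpu_left = (node["free_cpu"] - pod["cpu"]) / node["cpu"]
        mem_left = (node["free_mem"] - pod["mem"]) / node["mem"]
        return cpu_left + mem_left

    return max(feasible, key=score)["name"]

# Hypothetical cluster: CPU in cores, memory in GiB.
nodes = [
    {"name": "node-a", "cpu": 4, "mem": 8, "free_cpu": 1, "free_mem": 2},
    {"name": "node-b", "cpu": 4, "mem": 8, "free_cpu": 3, "free_mem": 6},
    {"name": "node-c", "cpu": 2, "mem": 4, "free_cpu": 0.5, "free_mem": 1},
]
chosen = schedule({"cpu": 1, "mem": 2}, nodes)
```

Here node-c is filtered out because it lacks free CPU, and node-b wins the scoring step because it keeps the most headroom. A pod that fits nowhere stays unscheduled, which is what a Pending pod looks like in practice.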

To get the most out of the Kubernetes scheduler, you should configure it to schedule your pods as efficiently as possible. You can do this by configuring the scheduler’s resource constraints and pod priorities.

What is Minikube?

Minikube is a single-node Kubernetes environment that you can install on your laptop. This is important because it allows you to develop and test Kubernetes applications locally, without having to deploy them to a remote cluster.

What is a Namespace in Kubernetes?

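Namespaces are a way to logically group objects in Kubernetes. A new cluster starts with a handful of namespaces, including default, kube-system, kube-public, and kube-node-lease, and objects you create without specifying one land in default. Objects in different namespaces can be subject to different access rules and resource quotas and can be managed independently.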

How can you handle incoming data from external sources (ingress traffic)?

Ingress is a Kubernetes resource that allows an organization to control how external HTTP and HTTPS traffic is routed to its services. Ingress resources are defined in a YAML file. An Ingress controller must also be deployed in the cluster to act on those resources.

Ingress controllers use the Ingress Resource Definition to determine how to route traffic to services.

Ingress controllers can use a variety of methods to route traffic, including:

  • Using a load balancer

  • Using a DNS server

  • Using a path-based routing algorithm
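To illustrate the path-based option, here is a minimal sketch of how a request could be mapped to a backend service by host and path prefix. The PathRule shape and the "longest prefix wins" tie-breaker are assumptions chosen for this example; real Ingress controllers each have their own matching rules.

```typescript
// Hypothetical rule shape, loosely modelled on an Ingress path rule.
interface PathRule {
  host: string;
  pathPrefix: string;
  service: string;
}

// True when `path` equals the prefix or continues it at a "/" boundary.
function matchesPrefix(path: string, prefix: string): boolean {
  if (prefix === "/") return true;
  const trimmed = prefix.endsWith("/") ? prefix.slice(0, -1) : prefix;
  return path === trimmed || path.startsWith(trimmed + "/");
}

// Picks the backend service for a request; the longest matching prefix wins.
function route(rules: PathRule[], host: string, path: string): string | null {
  let best: PathRule | null = null;
  for (const rule of rules) {
    if (rule.host !== host || !matchesPrefix(path, rule.pathPrefix)) continue;
    if (best === null || rule.pathPrefix.length > best.pathPrefix.length) {
      best = rule;
    }
  }
  return best === null ? null : best.service;
}
```

Matching only at "/" boundaries means that a rule for /api does not accidentally capture a path such as /apiary.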

What are federated clusters?

Federated clusters in Kubernetes allow multiple Kubernetes clusters to be interconnected, forming a larger mesh of clusters. This allows for greater scale and redundancy, as well as simplified management of multiple clusters.

Federated clusters are configured by setting up a federated control plane, and then adding other Kubernetes clusters to it. The federated control plane can then be used to manage resources in the member clusters, including:

  • The nodes in the other clusters
  • The Pods in the other clusters
  • The Services in the other clusters
  • The Secrets in the other clusters
  • The ConfigMaps in the other clusters
  • The Deployments in the other clusters
  • The ReplicationControllers in the other clusters
  • The Ingresses in the other clusters
  • The LoadBalancers in the other clusters

What is a Kubelet?

Kubelet is an agent that runs on each Kubernetes node. It communicates with the API server to learn which Pods have been assigned to its node, asks the container runtime to pull the required images and start the containers, and reports the state of the node and its Pods back to the API server.

What is Kubectl?

Kubectl is a command-line interface for Kubernetes. With kubectl, you can manage your Kubernetes clusters and applications. It runs on your local machine (or anywhere else with network access to the cluster) and talks to the cluster's API server. Kubectl can be used to create, inspect, update, and delete Kubernetes objects.

What is Kube-proxy?

Kube-proxy is a network proxy that runs on each Kubernetes node. It maintains the network rules that forward traffic addressed to a Service's IP to one of the Pods backing that Service, which is also how Kubernetes load balances traffic across those Pods. Kube-proxy is typically started automatically when the cluster is set up.

What is “K8s”?

K8s is an abbreviation for Kubernetes: the 8 stands for the eight letters between the “K” and the “s”.

How are Kubernetes and Docker related?

Kubernetes is a platform for managing containers at scale, while Docker itself is a container technology that can be used by Kubernetes.

A container infrastructure, such as Docker, allows apps to be packaged into lightweight, portable, and self-sufficient units. Kubernetes is a platform for managing and orchestrating containers at scale. Along with Kubernetes, Docker gives you the ability to deploy and manage applications at large scales.


The interview process can be daunting, but by preparing for the most commonly asked questions and understanding the basics of what Kubernetes is and does, you’ll be well on your way to acing your interview. We wish you the best of luck in your upcoming interview!

RedwoodJS vs. BlitzJS: The Future of Fullstack JavaScript Meta-Frameworks

Redwood and Blitz are two up-and-coming full-stack meta-frameworks that provide tooling for creating SPAs, server-side rendered pages, and statically generated content, providing a CLI to generate end-to-end scaffolds. I’ve been waiting for a worthy Rails replacement in JavaScript since who-knows-when. This article is an overview of the two, and while I’ve given more breadth to Redwood (as it differs from Rails a great deal), I personally prefer Blitz.

As the post ended up being quite lengthy, below, we provide a comparison table for the hasty ones.

A bit of history first

If you started working as a web developer in the 2010s, you might not have even heard of Ruby on Rails, even though it gave us apps like Twitter, GitHub, Urban Dictionary, Airbnb, and Shopify. Compared to the web frameworks of its time, it was a breeze to work with. Rails broke the mold of web technologies by being a highly opinionated MVC tool, emphasizing the use of well-known patterns such as convention over configuration and DRY, with the addition of a powerful CLI that created end-to-end scaffolds from model to the template to be rendered. Many other frameworks have built on its ideas, such as Django for Python, Laravel for PHP, or Sails for Node.js. Thus, arguably, it is a piece of technology just as influential as the LAMP stack before its time.

However, the fame of Ruby on Rails has faded quite a bit since its creation in 2004. By the time I started working with Node.js in 2012, the glory days of Rails were over. Twitter, built on Rails, was infamous for frequently showcasing its fail whale between 2007 and 2009. Much of it was attributed to Rails’ lack of scalability, at least according to word of mouth in my filter bubble. This Rails bashing was further reinforced when Twitter switched to Scala, even though they did not completely ditch Ruby then.

The scalability issues of Rails (and Django, for that matter) getting louder press coverage coincided with the transformation of the Web too. More and more JavaScript ran in the browser. Webpages became highly interactive WebApps, then SPAs. Angular.js revolutionized that too when it came out in 2010. Instead of the server rendering the whole webpage by combining the template and the data, we wanted to consume APIs and handle the state changes by client-side DOM updates.

Thus, full-stack frameworks fell out of favor. Development got separated between writing back-end APIs and front-end apps. And these apps could have meant Android and iOS apps too by that time, so it all made sense to ditch the server-side rendered HTML strings and send over the data in a way that all our clients could work with.

UX patterns developed as well. It wasn’t enough anymore to validate the data on the back-end, as users need quick feedback while they’re filling out bigger and bigger forms. Thus, our life got more and more complicated: we needed to duplicate the input validations and type definitions, even if we wrote JavaScript on both sides. The latter got simpler with the more widespread (re-)adoption of monorepos, as it got somewhat easier to share code across the whole system, even if it was built as a collection of microservices. But monorepos brought their own complications, not to mention distributed systems.

And ever since 2012, I have had a feeling that whatever problem we solve generates 20 new ones. You could argue that this is called “progress”, but maybe merely out of romanticism, or longing for times past when things used to be simpler, I’ve been waiting for a “Node.js on Rails” for a while now. Meteor seemed like it could be the one, but it quickly fell out of favor, as the community mostly viewed it as something that is good for MVPs but does not scale… The Rails problem all over again, but breaking down at an earlier stage of the product lifecycle. I must admit, I never even got around to trying it.

However, it seemed like we were getting there slowly but steadily. Angular 2+ embraced code generators à la Rails, alongside Next.js, so it seemed like it could be something similar. Next.js got API Routes, making it possible to handle the front-end with SSR and write back-end APIs too. But it still lacks a powerful CLI generator and has nothing to do with the data layer either. And in general, a good ORM was still missing from the equation to reach the power level of Rails. At least this last point seems to be solved with Prisma being around now.

Wait a minute. We have code generators, mature back-end and front-end frameworks, and finally, a good ORM. Maybe we have all pieces of the puzzle in place? Maybe. But first, let’s venture a bit further from JavaScript and see if another ecosystem has managed to further the legacy of Rails, and whether we can learn from it.

Enter Elixir and Phoenix

Elixir is a language built on Erlang’s BEAM and OTP, providing a nice concurrency model based on the actor model and processes, which also results in easy error handling due to the “let it crash” philosophy in contrast to defensive programming. It also has a nice, Ruby-inspired syntax, yet remains an elegant, functional language.

Phoenix is built on top of Elixir’s capabilities, first as a simple reimplementation of Rails, with a powerful code generator, a data mapping toolkit (think ORM), good conventions, and generally good dev experience, with the inbuilt scalability of the OTP.

Yeah... So far, I wouldn’t have even raised an eyebrow. Rails got more scalable over time, and I can get most of the things I need from a framework writing JavaScript these days, even if wiring it all up is still pretty much DIY. Anyhow, if I need an interactive browser app, I’ll need to use something like React (or at least Alpine.js) to do it anyway.

Boy, you can’t even start to imagine how wrong the previous statement is. While Phoenix is a full-fledged Rails reimplementation in Elixir, it has a cherry on top: your pages can be entirely server-side rendered and interactive at the same time, using its superpower called LiveView. When you request a LiveView page, the initial state gets prerendered on the server side, and then a WebSocket connection is built. The state is stored in memory on the server, and the client sends over events. The backend updates the state, calculates the diff, and sends over a highly compressed changeset to the UI, where a client-side JS library updates the DOM accordingly.
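To make that event, state, and diff loop a bit more tangible, here is a rough TypeScript sketch of the idea. It is not how LiveView is implemented (which, as far as I understand, tracks the dynamic parts of the rendered template rather than raw state); the State type, the "increment" event, and all function names are made up for illustration.

```typescript
type State = Record<string, string | number | boolean>;

// Server side: returns only the keys whose values changed between two states.
function diff(previous: State, next: State): State {
  const changes: State = {};
  for (const [key, value] of Object.entries(next)) {
    if (previous[key] !== value) changes[key] = value;
  }
  return changes;
}

// Server side: a hypothetical event handler for a counter view.
function handleEvent(state: State, event: string): State {
  if (event === "increment") {
    return { ...state, count: Number(state.count) + 1 };
  }
  return state;
}

// Client side: merges the received changes into what is currently rendered.
function applyPatch(rendered: State, changes: State): State {
  return { ...rendered, ...changes };
}

// One round trip: the client sends an event, the server answers with a small patch.
const serverState: State = { title: "Counter", count: 0 };
const nextServerState = handleEvent(serverState, "increment");
const patchOverTheWire = diff(serverState, nextServerState);
const clientView = applyPatch(serverState, patchOverTheWire);
```

The point of the sketch is that only the changed key travels over the socket, while the unchanged title never leaves the server again.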

I heavily oversimplified what Phoenix is capable of, but this section is already getting too long, so make sure to check it out yourself!

We’ve taken a detour to look at one of the best, if not the best full-stack frameworks out there. So when it comes to full-stack JavaScript frameworks, it only makes sense to achieve at least what Phoenix has achieved. Thus, what I would want to see:

  1. A CLI that can generate data models or schemas, along with their controllers/services and their corresponding pages
  2. A powerful ORM like Prisma
  3. Server-side rendered but interactive pages, made simple
  4. Cross-platform usability: make it easy for me to create pages for the browser, but I want to be able to create an API endpoint responding with JSON by just adding a single line of code.
  5. Bundle this whole thing together

With that said, let’s see whether Redwood or Blitz is the framework we have been waiting for.

BlitzJS vs. RedwoodJS comparison

What is RedwoodJS?

Redwood markets itself as THE full-stack framework for startups. It is THE framework everyone has been waiting for, if not the best thing since the invention of sliced bread. End of story, this blog post is over.

At least according to their tutorial.

I felt a sort of boastful overconfidence while reading the docs, which I personally find difficult to read. The fact that it takes a lighter tone compared to the usual, dry, technical texts is a welcome change. Still, as a text moves away from the safe, objective description of things, it also wanders into the territory of matching or clashing with the reader’s taste. 

In my case, I admire the choice but could not enjoy the result.

Still, the tutorial is worth reading through. It is very thorough and helpful. The result is also worth the… well, whatever you feel while reading it, as Redwood is also nice to work with. Its code generator does what I would expect it to do. Actually, it does even more than I expected, as it is very handy not just for setting up the app skeleton, models, pages, and other scaffolds. It even sets your app up to be deployed to different deployment targets like AWS Lambdas, Render, Netlify, Vercel.

Speaking of the listed deployment targets, I have a feeling that Redwood pushes me a bit strongly towards serverless solutions, Render being the only one in the list where you have a constantly running service. And I like that idea too: if I have an opinionated framework, it sure can have its own opinions about how and where it wants to be deployed. As long as I’m free to disagree, of course.

But Redwood has STRONG opinions not just about the deployment, but overall on how web apps should be developed, and if you don’t agree with those, well…

I want you to use GraphQL

Let’s take a look at a freshly generated Redwood app. Redwood has its own starter kit, so we don’t need to install anything, and we can get straight to creating a skeleton.

$ yarn create redwood-app --ts ./my-redwood-app

You can omit the --ts flag if you want to use plain JavaScript instead.

Of course, you can immediately start up the development server and see that you got a nice UI already with yarn redwood dev. One thing to notice, which is quite commendable in my opinion, is that you don’t need to globally install a redwood CLI. Instead, it always remains project local, making collaboration easier.

Now, let’s see the directory structure.

├── api/
├── scripts/
├── web/
├── graphql.config.js
├── jest.config.js
├── node_modules
├── package.json
├── prettier.config.js
├── README.md
├── redwood.toml
├── test.js
└── yarn.lock

We can see the regular prettier.config.js, jest.config.js, and there’s also a redwood.toml for configuring the port of the dev-server. We have an api and web directory for separating the front-end and the back-end into their own paths using yarn workspaces.

But wait, we have a graphql.config.js too! That’s right, with Redwood, you’ll write a GraphQL API. Under the hood, Redwood uses Apollo on the front-end and Yoga on the back-end, but most of it is made pretty easy using the CLI. However, GraphQL has its downsides, and if you’re not OK with the tradeoff, well, you’re shit out of luck with Redwood.

Let’s dive a bit deeper into the API.

├── api
│   ├── db
│   │   └── schema.prisma
│   ├── jest.config.js
│   ├── package.json
│   ├── server.config.js
│   ├── src
│   │   ├── directives
│   │   │   ├── requireAuth
│   │   │   │   ├── requireAuth.test.ts
│   │   │   │   └── requireAuth.ts
│   │   │   └── skipAuth
│   │   │       ├── skipAuth.test.ts
│   │   │       └── skipAuth.ts
│   │   ├── functions
│   │   │   └── graphql.ts
│   │   ├── graphql
│   │   ├── lib
│   │   │   ├── auth.ts
│   │   │   ├── db.ts
│   │   │   └── logger.ts
│   │   └── services
│   ├── tsconfig.json
│   └── types
│       └── graphql.d.ts

Here, we can see some more backend-related config files, and the debut of tsconfig.json.

  • api/db/: Here resides our schema.prisma, which tells us that Redwood, of course, uses Prisma. The src/ dir stores the bulk of our logic.
  • directives/: Stores our graphql schema directives.
  • functions/: Here are the necessary lambda functions so we can deploy our app to a serverless cloud solution (remember STRONG opinions?).
  • graphql/: Here reside our gql schemas, which can be generated automatically from our db schema.
  • lib/: We can keep our more generic helper modules here.
  • services/: If we generate a page, we’ll have a services/ directory, which will hold our actual business logic.

This nicely maps to a layered architecture, where the GraphQL resolvers function as our controller layer. We have our services, and we can either create a repository or dal layer on top of Prisma, or if we can keep it simple, then use it as our data access tool straight away.
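Here is a minimal sketch of that layering, with an in-memory array standing in for Prisma so that the example stays self-contained. None of the names below come from Redwood's generators, they are hypothetical, and real Prisma calls would be asynchronous.

```typescript
interface Article {
  id: number;
  title: string;
}

// Data access layer: an in-memory stand-in for the queries Prisma would run.
function createArticleRepository() {
  const rows: Article[] = [];
  return {
    insert(title: string): Article {
      const row: Article = { id: rows.length + 1, title };
      rows.push(row);
      return row;
    },
    findAll(): Article[] {
      return [...rows];
    },
  };
}

type ArticleRepository = ReturnType<typeof createArticleRepository>;

// Service layer: business rules, with no knowledge of GraphQL.
function createArticleService(repo: ArticleRepository) {
  return {
    createArticle(title: string): Article {
      const trimmed = title.trim();
      if (trimmed === "") throw new Error("Title must not be empty");
      return repo.insert(trimmed);
    },
    listArticles(): Article[] {
      return repo.findAll();
    },
  };
}

type ArticleService = ReturnType<typeof createArticleService>;

// Controller layer: resolver-shaped functions that unpack arguments and delegate.
function createResolvers(service: ArticleService) {
  return {
    articles: () => service.listArticles(),
    createArticle: (args: { input: { title: string } }) =>
      service.createArticle(args.input.title),
  };
}
```

Keeping the validation in the service rather than in the resolver means the same rule applies no matter which entry point calls it.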

So far so good. Let’s move to the front-end.

├── web
│   ├── jest.config.js
│   ├── package.json
│   ├── public
│   │   ├── favicon.png
│   │   ├── README.md
│   │   └── robots.txt
│   ├── src
│   │   ├── App.tsx
│   │   ├── components
│   │   ├── index.css
│   │   ├── index.html
│   │   ├── layouts
│   │   ├── pages
│   │   │   ├── FatalErrorPage
│   │   │   │   └── FatalErrorPage.tsx
│   │   │   └── NotFoundPage
│   │   │       └── NotFoundPage.tsx
│   │   └── Routes.tsx
│   └── tsconfig.json

From the config file and the package.json, we can deduce we’re in a different workspace. The directory layout and file names also show us that this is not merely a repackaged Next.js app but something completely Redwood specific.

Redwood comes with its router, which is heavily inspired by React Router. I found this a bit annoying as the dir structure-based one in Next.js feels a lot more convenient, in my opinion.

However, a downside of Redwood is that it does not support server-side rendering, only static site generation. Right, SSR is its own can of worms, and while currently you probably want to avoid it even when using Next, with the introduction of Server Components this might soon change, and it will be interesting to see how Redwood will react (pun not intended).

On the other hand, Next.js is notorious for the hacky way you need to use layouts with it (which will soon change though), while Redwood handles them as you’d expect it. In Routes.tsx, you simply need to wrap your Routes in a Set block to tell Redwood what layout you want to use for a given route, and never think about it again.

import { Router, Route, Set } from "@redwoodjs/router";
import BlogLayout from "src/layouts/BlogLayout/";

const Routes = () => {
  return (
    <Router>
      <Route path="/login" page={LoginPage} name="login" />
      <Set wrap={BlogLayout}>
        <Route path="/article/{id:Int}" page={ArticlePage} name="article" />
        <Route path="/" page={HomePage} name="home" />
      </Set>
      <Route notfound page={NotFoundPage} />
    </Router>
  );
};

export default Routes;

Notice that you don’t need to import the page components, as it is handled automatically. Why can’t we also auto-import the layouts though, as for example Nuxt 3 would? Beats me.

Another thing to note is the /article/{id:Int} part. Gone are the days when you always need to make sure to convert your integer ids if you get them from a path variable, as Redwood can convert them automatically for you, given you provide the necessary type hint.
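If you are curious what such a conversion involves, the sketch below matches a path against a pattern with a type hint and converts the captured value. It is a toy version written for this article, not Redwood's router code: only the Int hint is handled, and matchRoute and convert are hypothetical names.

```typescript
type ParamValue = string | number;

// Converts one raw path segment according to its type hint.
// Returns null when the segment does not satisfy the hint.
function convert(raw: string, type: string): ParamValue | null {
  if (type === "Int") {
    return /^-?\d+$/.test(raw) ? Number.parseInt(raw, 10) : null;
  }
  return raw; // no hint (or an unknown one): keep the string as-is
}

// Matches `path` against a pattern such as "/article/{id:Int}".
function matchRoute(
  pattern: string,
  path: string
): Record<string, ParamValue> | null {
  const patternParts = pattern.split("/").filter(Boolean);
  const pathParts = path.split("/").filter(Boolean);
  if (patternParts.length !== pathParts.length) return null;

  const params: Record<string, ParamValue> = {};
  for (let i = 0; i < patternParts.length; i++) {
    const part = patternParts[i] ?? "";
    const raw = pathParts[i] ?? "";
    const hint = /^\{(\w+)(?::(\w+))?\}$/.exec(part);
    if (hint === null) {
      // A literal segment has to match exactly.
      if (part !== raw) return null;
      continue;
    }
    const [, name = "", type = "String"] = hint;
    const value = convert(raw, type);
    if (value === null) return null;
    params[name] = value;
  }
  return params;
}
```

With this sketch, matching "/article/42" against "/article/{id:Int}" yields an id that is already a number, while "/article/abc" does not match at all.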

Now’s a good time to take a look at SSG. The NotFoundPage probably doesn’t have any dynamic content, so we can generate it statically. Just add prerender, and you’re good.

const Routes = () => {
  return (
    <Router>
      {/* ...the other routes stay as they were */}
      <Route notfound page={NotFoundPage} prerender />
    </Router>
  );
};

export default Routes;

You can also tell Redwood that some of your pages require authentication. Unauthenticated users should be redirected if they try to request it.

import { Private, Router, Route, Set } from "@redwoodjs/router";
import BlogLayout from "src/layouts/BlogLayout/";
import PostsLayout from "src/layouts/PostsLayout/";

const Routes = () => {
  return (
    <Router>
      <Route path="/login" page={LoginPage} name="login" />
      <Private unauthenticated="login">
        <Set wrap={PostsLayout}>
          {/* routes that require a logged-in user go here */}
        </Set>
      </Private>
      <Set wrap={BlogLayout}>
        <Route path="/article/{id:Int}" page={ArticlePage} name="article" />
        <Route path="/" page={HomePage} name="home" />
      </Set>
      <Route notfound page={NotFoundPage} />
    </Router>
  );
};

export default Routes;

Of course, you need to protect your mutations and queries, too. So make sure to annotate them with the pre-generated @requireAuth directive.

Another nice thing in Redwood is that you might not want to use a local auth strategy but rather outsource the problem of user management to an authentication provider, like Auth0 or Netlify-Identity. Redwood’s CLI can install the necessary packages and generate the required boilerplate automatically.

What looks strange, however, at least with local auth, is that the client makes several roundtrips to the server to get the token. More specifically, the server will be hit for each currentUser or isAuthenticated call.

Frontend goodies in Redwood

There are two things that I really loved about working with Redwood: Cells and Forms.

A cell is a component that fetches and manages its own data and state. You define the queries and mutations it will use, and then export a function for rendering the Loading, Empty, Failure, and Success states of the component. Of course, you can use the generator to create the necessary boilerplate for you.

A generated cell looks like this:

import type { ArticlesQuery } from "types/graphql";
import type { CellSuccessProps, CellFailureProps } from "@redwoodjs/web";

// The query the cell runs; its result is passed to Success as props.
export const QUERY = gql`
  query ArticlesQuery {
    articles {
      id
    }
  }
`;

export const Loading = () => <div>Loading...</div>;

export const Empty = () => <div>Empty</div>;

export const Failure = ({ error }: CellFailureProps) => (
  <div style={{ color: "red" }}>Error: {error.message}</div>
);

export const Success = ({ articles }: CellSuccessProps<ArticlesQuery>) => {
  return (
    <ul>
      {articles.map((item) => {
        return <li key={item.id}>{JSON.stringify(item)}</li>;
      })}
    </ul>
  );
};

Then you just import and use it as you would any other component, for example, on a page.

import { MetaTags } from "@redwoodjs/web";
import ArticlesCell from "src/components/ArticlesCell";

const HomePage = () => {
  return (
    <>
      <MetaTags title="Home" description="Home page" />
      <ArticlesCell />
    </>
  );
};

export default HomePage;

However! If you use SSG on pages with cells (or any dynamic content, really), only their loading state will get pre-rendered, which is not much of a help. That’s right, no getStaticProps for you if you go with Redwood.
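A simplified way to think about a cell is as a function from the query's status to one of the four exported components. The sketch below is my own approximation, not Redwood's source: the QueryResult shape, the order of the checks, and the emptiness rule are assumptions.

```typescript
interface QueryResult<T> {
  loading: boolean;
  error?: Error;
  data?: T;
}

type CellState = "Loading" | "Failure" | "Empty" | "Success";

// Treats null, undefined, and empty arrays as "no data" (assumed rule).
function isEmpty(data: unknown): boolean {
  return data == null || (Array.isArray(data) && data.length === 0);
}

// Decides which of the cell's exported components should be rendered.
function cellState<T>(result: QueryResult<T>): CellState {
  if (result.error !== undefined) return "Failure";
  if (result.loading) return "Loading";
  if (isEmpty(result.data)) return "Empty";
  return "Success";
}
```

Seen this way, the prerendering behavior is less surprising: at build time no query has finished yet, so the only state the sketch could pick is "Loading".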

The other somewhat nice thing about Redwood is the way it eases form handling, though the way they frame it leaves a bit of a bad taste in my mouth. But first, the pretty part.

import { Form, FieldError, Label, TextField } from "@redwoodjs/forms";

const ContactPage = () => {
  return (
    <Form config={{ mode: "onBlur" }}>
      <Label name="email" errorClassName="error">Email</Label>
      <TextField
        name="email"
        validation={{
          required: true,
          pattern: {
            value: /^[^@]+@[^.]+\..+$/,
            message: "Please enter a valid email address",
          },
        }}
        errorClassName="error"
      />
      <FieldError name="email" className="error" />
    </Form>
  );
};

The TextField component’s validation attribute expects an object to be passed, with a pattern against which the provided input value can be validated.

The errorClassName makes it easy to set the style of the text field and its label in case the validation fails, e.g. turning it red. The validation’s message will be printed in the FieldError component. Finally, the config={{ mode: 'onBlur' }} tells the form to validate each field when the user leaves it.
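Stripped of the React parts, the validation object boils down to something like the following. This is a hand-written sketch, not the code that runs inside Redwood's form components: the Validation type and validateField are hypothetical, as is the wording of the "required" message.

```typescript
interface PatternRule {
  value: RegExp;
  message: string;
}

interface Validation {
  required?: boolean;
  pattern?: PatternRule;
}

// Returns an error message for the field, or null when the value is valid.
function validateField(
  name: string,
  value: string,
  rules: Validation
): string | null {
  if (rules.required && value.trim() === "") {
    return `${name} is required`; // assumed wording
  }
  if (rules.pattern && value !== "" && !rules.pattern.value.test(value)) {
    return rules.pattern.message;
  }
  return null;
}

// The same rules as in the form above.
const emailRules: Validation = {
  required: true,
  pattern: {
    value: /^[^@]+@[^.]+\..+$/,
    message: "Please enter a valid email address",
  },
};
```

In "onBlur" mode, a function like this would run when the user leaves the field, and its return value is what would end up in the FieldError slot.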

The only thing that spoils the joy is the fact that this pattern is eerily similar to the one provided by Phoenix. Don’t get me wrong. It is perfectly fine, even virtuous, to copy what’s good in other frameworks. But I got used to paying homage when it’s due. Of course, it’s totally possible that the author of the tutorial did not know about the source of inspiration for this pattern. If that’s the case, let me know, and I’m happy to open a pull request to the docs, adding that short little sentence of courtesy.

But let’s continue and take a look at the whole working form.

import { MetaTags, useMutation } from "@redwoodjs/web";
import { toast, Toaster } from "@redwoodjs/web/toast";
import {
  FieldError,
  Form,
  FormError,
  Label,
  Submit,
  SubmitHandler,
  TextField,
  useForm,
} from "@redwoodjs/forms";

import {
  CreateContactMutation,
  CreateContactMutationVariables,
} from "types/graphql";

const CREATE_CONTACT = gql`
  mutation CreateContactMutation($input: CreateContactInput!) {
    createContact(input: $input) {
      id
    }
  }
`;

interface FormValues {
  name: string;
  email: string;
  message: string;
}

const ContactPage = () => {
  const formMethods = useForm();

  const [create, { loading, error }] = useMutation<
    CreateContactMutation,
    CreateContactMutationVariables
  >(CREATE_CONTACT, {
    onCompleted: () => {
      toast.success("Thank you for your submission!");
      // Clear the form once the server has accepted the submission.
      formMethods.reset();
    },
  });

  const onSubmit: SubmitHandler<FormValues> = (data) => {
    create({ variables: { input: data } });
  };

  return (
    <>
      <MetaTags title="Contact" description="Contact page" />

      <Toaster />
      <Form
        onSubmit={onSubmit}
        config={{ mode: "onBlur" }}
        error={error}
        formMethods={formMethods}
      >
        <FormError error={error} wrapperClassName="form-error" />

        <Label name="email" errorClassName="error">Email</Label>
        <TextField
          name="email"
          validation={{
            required: true,
            pattern: {
              value: /^[^@]+@[^.]+\..+$/,
              message: "Please enter a valid email address",
            },
          }}
          errorClassName="error"
        />
        <FieldError name="email" className="error" />

        <Submit disabled={loading}>Save</Submit>
      </Form>
    </>
  );
};

export default ContactPage;

Yeah, that’s quite a mouthful. But this whole thing is necessary if we want to properly handle submissions and errors returned from the server. We won’t dive deeper into it now, but if you’re interested, make sure to take a look at Redwood’s really nicely written and thorough tutorial.

Now compare this with what it would look like in Phoenix LiveView.

<.form let={f} for={@changeset} phx-change="validate" phx-submit="save">
  <%= label f, :title %>
  <%= text_input f, :title %>
  <%= error_tag f, :title %>

  <button type="submit" phx-disable-with="Saving...">Save</button>
</.form>

A lot easier to see through while providing almost the same functionality. Yes, you’d be right to call me out for comparing apples to oranges. One is a template language, while the other is JSX. Much of the logic in a LiveView happens in an Elixir file instead of the template, while JSX is all about combining the logic with the view. However, I’d argue that an ideal full-stack framework should allow me to write the validation code once for inputs, then let me simply provide the slots in the view to insert the error messages into, and allow me to set up the conditional styles for invalid inputs and be done with it. This would provide a way to write cleaner code on the front-end, even when using JSX. You could say this is against the original philosophy of React, and my argument merely shows I have a beef with it. And you’d probably be right to do so. But this is an opinion article about opinionated frameworks, after all, so that’s that.

The people behind RedwoodJS

Credit where credit is due.

Redwood was created by GitHub co-founder and former CEO Tom Preston-Werner, Peter Pistorius, David Price & Rob Cameron. Moreover, its core team currently consists of 23 people. So if you’re afraid to try out newish tools because you may never know when their sole maintainer gets tired of the struggles of working on a FOSS tool in their free time, you can rest assured: Redwood is here to stay.

Redwood: Honorable mentions


Redwood

  • also comes bundled with Storybook,
  • provides the must-have graphiql-like GraphQL Playground,
  • provides accessibility features out of the box, like the RouteAnnouncement, SkipNavLink, SkipNavContent and RouteFocus components,
  • and, of course, automatically splits your code by pages.

The last one is somewhat expected in 2022, while the accessibility features would deserve their own post in general. Still, this one is getting too long already, and we haven’t even mentioned the other contender yet.

Let’s see BlitzJS

Blitz is built on top of Next.js, and it is inspired by Ruby on Rails and provides a “Zero-API” data layer abstraction. No GraphQL, pays homage to predecessors… seems like we’re off to a good start. But does it live up to my high hopes? Sort of.

A troubled past

Compared to Redwood, Blitz’s tutorial and documentation are a lot less thorough and polished. It also lacks several convenience features: 

  • It does not really autogenerate host-specific config files.
  • Blitz cannot run a simple CLI command to set up auth providers.
  • It does not provide accessibility helpers.
  • Its code generator does not take into account the model when generating pages.

Blitz’s initial commit was made in February 2020, a bit more than half a year after Redwood’s in June 2019, and while Redwood has a sizable number of contributors, Blitz’s core team consists of merely 2-4 people. In light of all this, I think they deserve praise for their work.

But that’s not all. If you open up their docs, you’ll be greeted with a banner on top announcing a pivot.

While Blitz originally included Next.js and was built around it, Brandon Bayer and the other developers felt it was too limiting. Thus they forked it, which turned out to be a pretty misguided decision. It quickly became obvious that maintaining the fork would take a lot more effort than the team could invest.

All is not lost, however. The pivot aims to turn the initial value proposition “JavaScript on Rails with Next” into “JavaScript on Rails, bring your own Front-end Framework”. 

And I can’t tell you how relieved I am that this recreation of Rails won’t force me to use React. 

Don’t get me wrong. I love the inventiveness that React brought to the table. Front-end development has come a long way in the last nine years, thanks to React. Other frameworks like Vue and Svelte might lag behind in following the new concepts, but this also means they have more time to polish those ideas even further and provide better DevX. Or at least I find them a lot easier to work with without ever being afraid that my client-side code’s performance would grind to a standstill.

All in all, I find this turn of events a lucky blunder.

How to create a Blitz app

You’ll need to install Blitz globally (run yarn global add blitz or npm install -g blitz --legacy-peer-deps) before you create a Blitz app. That’s possibly my main woe when it comes to Blitz’s design, as this way, you cannot lock your project across all contributors to use a given Blitz CLI version and increment it when you see fit, as Blitz will automatically update itself from time to time.

Once blitz is installed, run

$ blitz new my-blitz-app

It will ask you 

  • whether you want to use TS or JS, 
  • if it should include a DB and Auth template (more on that later), 
  • if you want to use npm, yarn or pnpm to install dependencies, 
  • and if you want to use React Final Form or React Hook Form. 

Once you have answered all its questions, the CLI starts to download half of the internet, as is customary. Grab something to drink, have lunch, finish your workout session, or whatever you do to pass the time, and when you’re done, you can fire up the server by running

$ blitz dev

And, of course, you’ll see the app running and the UI telling you to run

$ blitz generate all project name:string

But before we do that, let’s look around in the project directory.

├── app/
├── db/
├── mailers/
├── node_modules/
├── public/
├── test/
├── integrations/
├── babel.config.js
├── blitz.config.ts
├── blitz-env.d.ts
├── jest.config.ts
├── package.json
├── README.md
├── tsconfig.json
├── types.ts
└── yarn.lock

Again, we can see the usual suspects: config files, node_modules, test, and the likes. The public directory — to no one’s surprise — is the place where you store your static assets. Test holds your test setup and utils. Integrations is for configuring your external services, like a payment provider or a mailer. Speaking of the mailer, that is where you can handle your mail-sending logic. Blitz generates a nice template with informative comments for you to get started, including a forgotten password email template.

As you’ve probably guessed, the app and db directories are the ones where you have the bulk of your app-related code. Now’s the time to do as the generated landing page says and run blitz generate all project name:string.

Say yes when it asks you if you want to migrate your database, and give the migration a descriptive name like add project.

Now let’s look at the db directory.

└── db/
    ├── db.sqlite
    ├── db.sqlite-journal
    ├── index.ts
    ├── migrations/
    │   ├── 20220610075814_initial_migration/
    │   │   └── migration.sql
    │   ├── 20220610092949_add_project/
    │   │   └── migration.sql
    │   └── migration_lock.toml
    ├── schema.prisma
    └── seeds.ts

The migrations directory is handled by Prisma, so it won’t surprise you if you’re already familiar with it. If not, I highly suggest trying it out on its own before you jump into using either Blitz or Redwood, as they heavily and transparently rely on it.
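Judging by the directory names above, each migration folder combines a timestamp with the name you typed in at the prompt. The helper below reproduces that naming scheme as a sketch; the use of UTC and the exact rules for turning the description into a slug are my assumptions, not something taken from Prisma's source.

```typescript
// Builds a directory name such as "20220610092949_add_project".
function migrationDirName(createdAt: Date, description: string): string {
  const pad = (n: number): string => String(n).padStart(2, "0");

  // YYYYMMDDHHMMSS, using UTC (assumption).
  const timestamp =
    String(createdAt.getUTCFullYear()) +
    pad(createdAt.getUTCMonth() + 1) +
    pad(createdAt.getUTCDate()) +
    pad(createdAt.getUTCHours()) +
    pad(createdAt.getUTCMinutes()) +
    pad(createdAt.getUTCSeconds());

  // Lowercase the description and collapse anything non-alphanumeric into "_".
  const slug = description
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "_")
    .replace(/^_+|_+$/g, "");

  return `${timestamp}_${slug}`;
}
```

Because the timestamp comes first, sorting the folders by name also sorts the migrations in the order they were created.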

Just like in Redwood’s db dir, we have our schema.prisma, and our sqlite db, so we have something to start out with. But we also have a seeds.ts and index.ts. If you take a look at the index.ts file, it merely re-exports Prisma with some enhancements, while the seeds.ts file kind of speaks for itself.

Now’s the time to take a closer look at our schema.prisma.

// This is your Prisma schema file,
// learn more about it in the docs: https://pris.ly/d/prisma-schema

datasource db {
  provider = "sqlite"
  url      = env("DATABASE_URL")
}

generator client {
  provider = "prisma-client-js"
}

// --------------------------------------

model User {
  id             Int      @id @default(autoincrement())
  createdAt      DateTime @default(now())
  updatedAt      DateTime @updatedAt
  name           String?
  email          String   @unique
  hashedPassword String?
  role           String   @default("USER")

  tokens   Token[]
  sessions Session[]
}

model Session {
  id                 Int       @id @default(autoincrement())
  createdAt          DateTime  @default(now())
  updatedAt          DateTime  @updatedAt
  expiresAt          DateTime?
  handle             String    @unique
  hashedSessionToken String?
  antiCSRFToken      String?
  publicData         String?
  privateData        String?

  user   User? @relation(fields: [userId], references: [id])
  userId Int?
}

model Token {
  id          Int      @id @default(autoincrement())
  createdAt   DateTime @default(now())
  updatedAt   DateTime @updatedAt
  hashedToken String
  type        String
  // See note below about TokenType enum
  // type        TokenType
  expiresAt   DateTime
  sentTo      String

  user   User @relation(fields: [userId], references: [id])
  userId Int

  @@unique([hashedToken, type])
}

// NOTE: It's highly recommended to use an enum for the token type
//       but enums only work in Postgres.
//       See: https://blitzjs.com/docs/database-overview#switch-to-postgre-sql
// enum TokenType {
// }

model Project {
  id        Int      @id @default(autoincrement())
  createdAt DateTime @default(now())
  updatedAt DateTime @updatedAt
  name      String
}

As you can see, Blitz starts out with the models needed for fully functional user management. Of course, it also provides all the necessary code in the app scaffold, meaning that the least amount of logic is abstracted away, and you are free to modify it as you see fit.

Below all the user-related models, we can see the Project model we created with the CLI, with automatically added id, createdAt, and updatedAt fields. One of the things that I prefer in Blitz over Redwood is that its CLI mimics Phoenix, and you can really create everything from the command line end-to-end.

This really makes it easy to move quickly, as less context switching happens between the code and the command line. Well, it would if it actually worked. While you can generate the schema properly, the generated pages, mutations, and queries always use name: string and disregard the entity type defined by the schema, unlike Redwood. There’s already an open pull request to fix this, but the Blitz team has understandably been focusing on getting v2.0 done instead of patching up the current stable branch.

That’s it for the db, let’s move on to the app directory.

└── app
    ├── api/
    ├── auth/
    ├── core/
    ├── pages/
    ├── projects/
    └── users/

The core directory contains Blitz goodies, like a predefined and parameterized Form (without Redwood’s or Phoenix’s niceties though), a useCurrentUser hook, and a Layouts directory. Blitz made it easy to persist layouts between pages, a feature that will be rendered completely unnecessary by the upcoming Next.js Layouts. This further reinforces that ditching the fork and pivoting to a toolkit was probably a difficult but necessary decision.

The auth directory contains the fully functional authentication logic we talked about earlier, with all the necessary database mutations such as signup, login, logout, and forgotten password, with their corresponding pages and a signup and login form component. The getCurrentUser query got its own place in the users directory all by itself, which makes perfect sense.

And we got to the pages and projects directories, where all the action happens.

Blitz creates a directory per model to store database queries, mutations, input validations (using zod), and model-specific components like create and update forms in one place. You will need to fiddle around in these a lot, as you will need to update them according to your actual model. This is nicely laid out in the tutorial, though. Be sure to read it, unlike me when I first tried Blitz out.

└── app/
    └── projects/
        ├── components/
        │   └── ProjectForm.tsx
        ├── mutations/
        │   ├── createProject.ts
        │   ├── deleteProject.ts
        │   └── updateProject.ts
        └── queries/
            ├── getProjects.ts
            └── getProject.ts

The pages directory, on the other hand, won’t come as a surprise if you’re already familiar with Next.

└── app/
    └── pages/
        ├── projects/
        │   ├── index.tsx
        │   ├── new.tsx
        │   ├── [projectId]/
        │   │   └── edit.tsx
        │   └── [projectId].tsx
        ├── 404.tsx
        ├── _app.tsx
        ├── _document.tsx
        ├── index.test.tsx
        └── index.tsx

A bit of explanation if you haven’t tried Next out yet: Blitz uses file-system-based routing just like Next. The pages directory is your root, and the index file is rendered when the path corresponding to a given directory is accessed. Thus when the root path is requested, pages/index.tsx will be rendered, accessing /projects will render pages/projects/index.tsx, /projects/new will render pages/projects/new.tsx and so on. 

If a filename is enclosed in []-s, it means that it corresponds to a route param. Thus /projects/15 will render pages/projects/[projectId].tsx. Unlike in Next, you access the param’s value within the page using the useParam(name: string, type?: string) hook. To access the query object, use the useRouterQuery(name: string) hook. To be honest, I never really understood why Next needs to mesh the two together.
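To make the mapping from URLs to page files more concrete, here is a minimal sketch of how such a lookup could work. It is not how Blitz or Next implement routing internally; `matchRoute`, `toSegments`, and the hard-coded `pageFiles` list are hypothetical names used only for illustration, and the file list is copied from the tree above.

```typescript
// Hypothetical sketch of file-system-based route matching; not Blitz's real router.
type RouteMatch = { file: string; params: Record<string, string> };

// Page files relative to the pages directory (taken from the tree above).
const pageFiles: string[] = [
  "index.tsx",
  "projects/index.tsx",
  "projects/new.tsx",
  "projects/[projectId].tsx",
  "projects/[projectId]/edit.tsx",
];

// Turn "projects/index.tsx" into the route segments ["projects"].
function toSegments(file: string): string[] {
  const segments = file.replace(/\.tsx$/, "").split("/");
  if (segments[segments.length - 1] === "index") segments.pop();
  return segments;
}

function matchRoute(path: string, files: string[] = pageFiles): RouteMatch | null {
  const urlSegments = path.split("/").filter((s) => s.length > 0);
  let best: RouteMatch | null = null;
  let bestDynamic = Infinity;

  for (const file of files) {
    const routeSegments = toSegments(file);
    if (routeSegments.length !== urlSegments.length) continue;

    const params: Record<string, string> = {};
    let dynamic = 0;
    let matches = true;
    for (let i = 0; i < routeSegments.length; i++) {
      const dyn = /^\[(.+)\]$/.exec(routeSegments[i]);
      if (dyn) {
        // A [param] segment accepts any value and records it.
        params[dyn[1]] = urlSegments[i];
        dynamic++;
      } else if (routeSegments[i] !== urlSegments[i]) {
        matches = false;
        break;
      }
    }
    // Prefer the candidate with the fewest dynamic segments, so that
    // /projects/new resolves to new.tsx rather than [projectId].tsx.
    if (matches && dynamic < bestDynamic) {
      best = { file, params };
      bestDynamic = dynamic;
    }
  }
  return best;
}
```

Calling `matchRoute("/projects/15")` in this sketch yields the `[projectId].tsx` file together with `{ projectId: "15" }`, which is the value a hook like useParam would hand to the page.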

When you generate pages using the CLI, all pages are protected by default. To make them public, simply delete the [PageComponent].authenticate = true line. As long as that line is in place, Blitz throws an AuthenticationError if the user is not logged in, so if you’d rather redirect unauthenticated users to your login page, you probably want to use [PageComponent].authenticate = {redirectTo: '/login'}.

In your queries and mutations, you can use the ctx context argument to call ctx.session.$authorize, or use resolver.authorize in a pipeline, to secure your data.

Finally, if you still need a proper HTTP API, you can create Express-style handler functions, using the same file-system routing as for your pages.
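The three cases for the authenticate option can be summed up in a small decision function. This is only a sketch of the behavior described here, not Blitz code: `resolvePageAccess`, `AuthenticateOption`, and `Outcome` are hypothetical names, and the real framework makes this decision internally.

```typescript
// Hypothetical model of the `authenticate` page option; not part of Blitz's API.
type AuthenticateOption = boolean | { redirectTo: string } | undefined;

type Outcome =
  | { kind: "render" }
  | { kind: "error"; name: "AuthenticationError" }
  | { kind: "redirect"; to: string };

function resolvePageAccess(authenticate: AuthenticateOption, loggedIn: boolean): Outcome {
  // Public pages and logged-in users always get the page.
  if (!authenticate || loggedIn) return { kind: "render" };
  // `authenticate = true`: an unauthenticated visit ends in an error.
  if (authenticate === true) return { kind: "error", name: "AuthenticationError" };
  // `authenticate = { redirectTo }`: send the visitor to the given page instead.
  return { kind: "redirect", to: authenticate.redirectTo };
}
```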

A possible bright future

While Blitz had a troubled past, it might have a bright future. It is definitely still in the making and not ready for widespread adoption. The idea of creating a framework-agnostic full-stack JavaScript toolkit is a versatile concept, and it is further reinforced by a good starting point: the current stable version of Blitz. I’m looking forward to seeing how the toolkit evolves over time.

Redwood vs. Blitz: Comparison and Conclusion

I set out to see whether we have a Rails, or even better, Phoenix equivalent in JavaScript. Let’s see how they measured up.

1. CLI code generator

Redwood’s CLI gets the checkmark on this one, as it is versatile and does what it needs to do. The only small drawback is that the model has to be written in the schema file first and cannot be generated.

Blitz’s CLI is still in the making, but that’s true of Blitz in general, so it’s not fair to judge it by what’s ready, only by what it will be. In that sense, Blitz would win if it were fully functional (and will once it is), as it can really generate pages end-to-end.

Verdict: Tie

2. A powerful ORM

That’s a short one. Both use Prisma, which is a powerful enough ORM.

Verdict: Tie

3. Server side rendered but interactive pages

Well, in today’s ecosystem, that might be wishful thinking. Even in Next, SSR is something you should avoid, at least until we have Server Components in React.

But which one mimics this behavior the best?

Redwood does not try to look like a Rails replacement. It has clear boundaries between front-end and back-end, demarcated by yarn workspaces. It definitely provides nice conventions and, to keep it charitable, nicely reinvented the right parts of Phoenix’s form handling. However, strictly relying on GraphQL feels like overkill. For the small apps that we start out with anyway when opting for a full-stack framework, it definitely feels awkward.

Redwood is also React exclusive, so if you prefer using Vue, Svelte or Solid, then you have to wait until someone reimplements Redwood for your favorite framework.

Blitz follows the Rails way, but the controller layer is a bit more abstract. This is understandable, though, as using Next’s file-system-based routing, a lot of things that made sense for Rails do not make sense for Blitz. And in general, it feels more natural than using GraphQL for everything. In the meantime, becoming framework agnostic makes it even more versatile than Redwood.

Moreover, Blitz is on its way to becoming framework agnostic, so even if you’d never touch React, you’ll probably be able to see its benefits in the near future.

But to honor the original criterion: Redwood provides client-side rendering and SSG (kind of), while Blitz provides SSR on top of the previous two.

Verdict: Die-hard GraphQL fans will probably want to stick with Redwood. But according to my criteria, Blitz hands down wins this one.

4. API

Blitz auto generates an API for data access that you can use if you want to, but you can explicitly write handler functions too. A little bit awkward, but the possibility is there.

Redwood maintains a hard separation between front-end and back-end, so you trivially have an API to begin with. It is a GraphQL API, though, which might be way more engineering than your needs call for.

Verdict: Tie (TBH, I feel like they both suck at this the same amount.)

Bye now!

In summary, Redwood is a production-ready, React+GraphQL-based full-stack JavaScript framework made for the edge. It does not follow the patterns laid down by Rails at all, except for being highly opinionated. It is a great tool to use if you share its sentiment, but my opinion greatly differs from Redwood’s on what makes development effective and enjoyable.

Blitz, on the other hand, follows in the footsteps of Rails and Next, and is becoming a framework agnostic, full-stack toolkit that eliminates the need for an API layer.

I hope you found this comparison helpful. Leave a comment if you agree with my conclusion and share my love for Blitz. If you don’t, argue with the enlightened ones… they say controversy boosts visitor numbers.

Argo CD Kubernetes Tutorial

Usually, when devs set up a CI/CD pipeline for an application hosted on Kubernetes, they handle both the CI and CD parts in one task runner, such as CircleCI or Travis CI. These services offer push-based updates to your deployments, which means that credentials for the code repo and the deployment target must be stored with these services. This method can be problematic if the service gets compromised, as happened to CodeShip.

Even using services such as GitLab CI and GitHub Actions requires that credentials for accessing your cluster be stored with them. If you’re employing GitOps, to take advantage of using the usual Push to repo -> Review Code -> Merge Code sequence for managing your infrastructure configuration as well, this would also mean access to your whole infrastructure.


Luckily, there are tools to help us with these issues. Two of the best known are Argo CD and Flux. They allow credentials to be stored within your Kubernetes cluster, where you have more control over their security. They also offer pull-based deployment with drift detection. Both of these tools solve the same issues, but tackle them from different angles.

Here, we’ll take a deeper look at Argo CD out of the two.

What is Argo CD

Argo CD is a continuous deployment tool that you can install into your Kubernetes cluster. It can pull the latest code from a git repository and deploy it into the cluster – as opposed to external CD services, deployments are pull-based. You can manage updates for both your application and infrastructure configuration with Argo CD. Advantages of such a setup include being able to use credentials from the cluster itself for deployments, which can be stored in secrets or a vault.


To try out Argo CD, we’ve also prepared a test project that we’ll deploy to Kubernetes hosted on DigitalOcean. You can grab the example project from our GitLab repository here: https://gitlab.com/risingstack-org/argocd-demo/

Forking the repo will allow you to make changes for yourself, and it can be set up later in Argo CD as the deployment source.

Get doctl from here:


Or, if you’re using a mac, from Homebrew:

brew install doctl

You can use any Kubernetes provider for this tutorial. The two requirements are having a Docker repository and a Kubernetes cluster with access to it. For this tutorial, we chose to go with DigitalOcean for the simplicity of its setup, but most other platforms should work just fine.

We’ll focus on using the web UI for the majority of the process, but you can also opt to use the `doctl` cli tool if you wish. `doctl` can mostly replace `kubectl` as well. `doctl` will only be needed to push our built docker image to the repo that our deployment will have access to.

Helm is a templating engine for Kubernetes. It allows us to define values separately from the structure of the yaml files, which can help with access control and managing multiple environments using the same template.
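The idea of keeping values apart from structure can be illustrated with a tiny stand-in for the templating step. This is a heavily simplified sketch, not Helm itself: `renderTemplate` and `lookup` are hypothetical helpers that only substitute `{{ .Values.some.path }}` placeholders, and the `image.repository` key is an assumed example rather than a key taken from the demo chart.

```typescript
// Hypothetical, heavily simplified stand-in for `helm template`:
// it only substitutes {{ .Values.some.path }} placeholders.
type Values = { [key: string]: string | number | boolean | Values };

// Walk a dotted path such as "image.repository" through the values object.
function lookup(values: Values, path: string): string {
  let current: string | number | boolean | Values = values;
  for (const key of path.split(".")) {
    if (typeof current !== "object" || !(key in current)) {
      throw new Error(`missing value: ${path}`);
    }
    current = current[key];
  }
  return String(current);
}

function renderTemplate(template: string, values: Values): string {
  return template.replace(/\{\{\s*\.Values\.([\w.]+)\s*\}\}/g, (_match, path: string) =>
    lookup(values, path)
  );
}

// One template (the structure) ...
const manifestTemplate = [
  "replicas: {{ .Values.replicaCount }}",
  "image: {{ .Values.image.repository }}",
].join("\n");

// ... rendered with two different sets of values (the environments).
const stagingManifest = renderTemplate(manifestTemplate, {
  replicaCount: 1,
  image: { repository: "demo-app-1" },
});
const productionManifest = renderTemplate(manifestTemplate, {
  replicaCount: 3,
  image: { repository: "demo-app-1" },
});
```

The same template produces both manifests, so only the values need to differ between environments.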

You can grab Helm here: https://github.com/helm/helm/releases

Or via Homebrew for mac users:

brew install helm

Download the latest Argo CD version from https://github.com/argoproj/argo-cd/releases/latest

If you’re using a mac, you can grab the cli tools from Homebrew:

brew install argocd

DigitalOcean Setup

After logging in, first, create a cluster using the “Create” button on the top right, and selecting Kubernetes. For the purposes of this demo, we can just go with the smallest cluster with no additional nodes. Be sure to choose a data center close to you.

Preparing the demo app

You can find the demo app in the node-app folder in the repo you forked. Use this folder for the following steps to build and push the docker image to the GitLab registry:

docker login registry.gitlab.com

docker build . -t registry.gitlab.com/<substitute repo name here>/demo-app-1

docker push registry.gitlab.com/<substitute repo name here>/demo-app-1

GitLab offers a free image registry with every git repo, even free tier ones. You can use these to store your built image, but be aware that the registry inherits the privacy setting of the git repo; you can’t change them separately.

Once the image is ready, be sure to update the values.yaml file with the correct image url and use helm to generate the resources.yaml file. You can then deploy everything using kubectl:

helm template -f "./helm/demo-app/values.yaml" "./helm/demo-app" > "./helm/demo-app/resources/resources.yaml"

kubectl apply -f helm/demo-app/resources/resources.yaml

The only purpose of these demo-app resources is to showcase the ArgoCD UI capabilities, which is why they also include an Ingress resource as a plus.

Install Argo CD into the cluster

Argo CD provides a yaml file that installs everything you’ll need, and it’s available online. The most important thing here is to make sure that you install it into the `argocd` namespace; otherwise you’ll run into errors later and Argo CD will not be usable.

kubectl create namespace argocd

kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml

From here, you can use Kubernetes port-forwarding to access the UI of Argo CD:

kubectl -n argocd port-forward svc/argocd-server 8080:443

This will expose the service on localhost:8080 – we will use the UI to set up the connection to GitLab, but it could also be done via the command line tool.

Argo CD setup

To log in on the UI, use `admin` as username, and the password retrieved by this command:

kubectl -n argocd get secret argocd-initial-admin-secret -o jsonpath="{.data.password}" | base64 -d

Once you’re logged in, connect your fork of the demo app repo from the Repositories inside the Settings menu on the left side. Here, we can choose between ssh and https authentication – for this demo, we’ll use https, but for ssh, you’d only need to set up a key pair for use.

[Image: Argo CD repo connect]

Create an API key on GitLab and use it in place of a password alongside your username to connect the repo. An API key allows for some measure of access control as opposed to using your account password.

After successfully connecting the repository, the only thing left is to set up an Application, which will take care of synchronizing the state of our deployment with that described in the GitLab repo.

[Image: Argo CD, how to set up a new application]

You’ll need to choose a branch or a tag to monitor. Let’s choose the master branch for now, as it should contain the latest stable code anyway. Setting the sync policy to automatic allows for automatic deployments when the git repo is updated, and also provides automatic pruning and self-healing capabilities.

[Image: Argo CD application setup]

Be sure to set the destination cluster to the one available in the dropdown and use the `demo` namespace. If everything is set correctly, Argo CD should now start syncing the deployment state.

Features of Argo CD

From the application view, you can now see the different parts that comprise our demo application.

[Image: Argo CD app overview]

Clicking on any of these parts allows for checking the diff of the deployed config, and the one checked into git, as well as the yaml files themselves separately. The diff should be empty for now, but we’ll see it in action once we make some changes or if you disable automatic syncing.

[Image: Argo CD container details]

You also have access to the logs from the pods here, which can be quite useful. Note, however, that logs are not retained between pod instances, so they are lost when a pod is deleted.

[Image: Argo CD container logs]

It is also possible to handle rollbacks from here by clicking on the “History and Rollback” button. Here, you can see all the different versions that have been deployed to our cluster by commit.

You can re-deploy any of them using the … menu on the top right and selecting “Redeploy”. This feature requires automatic deployment to be turned off, but you’ll be prompted to do so here.

These should cover the most important parts of the UI and what is available in Argo CD. Next up, we’ll take a look at how the deployment update happens when code changes on GitLab.

Updating the deployment

With the setup done, any changes you make to the configuration that you push to the master branch should be reflected on the deployment shortly after.

A very simple way to check out the updating process is to bump up the `replicaCount` in values.yaml to 2 (or more), and run the helm command again to generate the resources.yaml. 

Then, commit and push to master and monitor the update process on the Argo CD UI.

You should see a new event in the demo-app events, with the reason `ScalingReplicaSet`.

[Image: Argo CD scaling event]

You can double-check the result using kubectl, where you should now see two instances of the demo-app running:

kubectl -n demo get pod

There is another branch prepared in the repo, called second-app, which has another app that you can deploy, so you can see some more of the update process and diffs. It is quite similar to how the previous deployment works.

First, you’ll need to merge the second-app branch into master – this will allow the changes to be automatically deployed, as we set it up already. Then, from the node-app-2 folder, build and push the docker image. Make sure to have a different version tag for it, so we can use the same repo!

docker build . -t registry.gitlab.com/<substitute repo name here>/demo-app-2

docker push registry.gitlab.com/<substitute repo name here>/demo-app-2

You can set deployments to manual for this step, to be able to take a better look at the diff before the actual update happens. You can do this from the sync settings part of `App details`.

[Image: Argo CD sync policy]

Generate the updated resources file afterwards, then commit and push it to git to trigger the update in Argo CD:

helm template -f "./helm/demo-app/values.yaml" "./helm/demo-app" > "./helm/demo-app/resources/resources.yaml"

This should result in a diff appearing under `App details` -> `Diff` for you to check out. You can either deploy it manually or just turn auto-deploy back on.

ArgoCD also safeguards you against resource changes that drift from the latest source-controlled version of your code. Let’s try to manually scale up the deployment to 5 instances:

Get the name of the replica set:

kubectl -n demo get rs

Scale it to 5 instances:

kubectl -n demo scale --replicas=5 rs/demo-app-<number>

If you are quick enough, you can catch the changes applied on the ArgoCD Application Visualization as it tries to add those instances. However, ArgoCD will prevent this change, because it would drift from the source controlled version of the deployment. It also scales the deployment down to the defined value in the latest commit (in my example it was set to 3). 
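Conceptually, this self-healing comes down to comparing the state declared in git with the live state and planning whatever is needed to close the gap. The following is a minimal sketch of that idea, not Argo CD's actual implementation; `planSync`, `ClusterState`, and `SyncAction` are hypothetical names, and resources are reduced to a name and a replica count.

```typescript
// Hypothetical model of pull-based reconciliation with pruning and self-healing.
type ClusterState = Record<string, number>; // resource name -> replica count

type SyncAction =
  | { kind: "create"; resource: string; replicas: number }
  | { kind: "scale"; resource: string; from: number; to: number }
  | { kind: "prune"; resource: string };

// Compare the state declared in git with what is live in the cluster and
// list the actions needed to bring the cluster back in line.
function planSync(desired: ClusterState, live: ClusterState): SyncAction[] {
  const actions: SyncAction[] = [];
  for (const resource of Object.keys(desired)) {
    if (!(resource in live)) {
      actions.push({ kind: "create", resource, replicas: desired[resource] });
    } else if (live[resource] !== desired[resource]) {
      // Drift: someone changed the live resource outside of git.
      actions.push({ kind: "scale", resource, from: live[resource], to: desired[resource] });
    }
  }
  for (const resource of Object.keys(live)) {
    // Resources no longer declared in git are pruned.
    if (!(resource in desired)) actions.push({ kind: "prune", resource });
  }
  return actions;
}

// The manual scale-up from above: git says 3 replicas, the cluster has 5.
const syncPlan = planSync({ "demo-app": 3 }, { "demo-app": 5 });
```

In this sketch, `syncPlan` holds a single scale action from 5 back to 3, which corresponds to the downscale event seen in the UI.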

The downscale event can be found under the `demo-app` deployment events, as shown below:

[Image: how to scale down Kubernetes with Argo CD]

From here, you can experiment with whatever changes you’d like!

Finishing our ArgoCD Kubernetes Tutorial

This was our quick introduction to using ArgoCD, which can make your GitOps workflow safer and more convenient.

Stay tuned, as we’re planning to take a look at the other heavy-hitter next time: Flux.

This article was written by Janos Kubisch, senior engineer at RisingStack.

How to Deploy a Ceph Storage to Bare Virtual Machines

Ceph is a freely available storage platform that implements object storage on a single distributed computer cluster and provides interfaces for object-, block- and file-level storage. Ceph aims primarily for completely distributed operation without a single point of failure. Ceph storage manages data replication and is generally quite fault-tolerant. As a result of its design, the system is both self-healing and self-managing.

Ceph has loads of benefits and great features, but the main drawback is that you have to host and manage it yourself. In this post, we’ll check two different approaches of virtual machine deployment with Ceph.

Anatomy of a Ceph cluster

Before we dive into the actual deployment process, let’s see what we’ll need to fire up for our own Ceph cluster.

There are three services that form the backbone of the cluster:

  • ceph monitors (ceph-mon) maintain maps of the cluster state and are also responsible for managing authentication between daemons and clients
  • managers (ceph-mgr) are responsible for keeping track of runtime metrics and the current state of the Ceph cluster
  • object storage daemons (ceph-osd) store data, handle data replication, recovery, rebalancing, and provide some ceph monitoring information.

Additionally, we can add further parts to the cluster to support different storage solutions:

  • metadata servers (ceph-mds) store metadata on behalf of the Ceph Filesystem
  • rados gateway (ceph-rgw) is an HTTP server for interacting with a Ceph Storage Cluster that provides interfaces compatible with OpenStack Swift and Amazon S3.

There are multiple ways of deploying these services. We’ll check two of them:

  • first, using the ceph/deploy tool,
  • then a docker-swarm based vm deployment.

Let’s kick it off!

Ceph Setup

Okay, a disclaimer first. As this is not a production infrastructure, we’ll cut a couple of corners.

You should not run multiple different Ceph daemons on the same host, but for the sake of simplicity, we’ll only use 3 virtual machines for the whole cluster.

In the case of OSDs, you can run multiple of them on the same host, but using the same storage drive for multiple instances is a bad idea as the disk’s I/O speed might limit the OSD daemons’ performance.

For this tutorial, I’ve created 4 EC2 machines in AWS: 3 for Ceph itself and 1 admin node. For ceph-deploy to work, the admin node requires passwordless SSH access to the nodes and that SSH user has to have passwordless sudo privileges.

In my case, as all machines are in the same subnet on AWS, connectivity between them is not an issue. However, in other cases editing the hosts file might be necessary to ensure proper connection.

Depending on where you deploy Ceph, security groups, firewall settings or other resources have to be adjusted to open these ports:

  • 22 for SSH
  • 6789 for monitors
  • 6800:7300 for OSDs, managers and metadata servers
  • 8080 for dashboard
  • 7480 for rados gateway
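If you want to sanity-check a firewall rule set against this list, the ports can be expressed as data. The sketch below only encodes the list above; `cephPortRules` and `purposeOfPort` are hypothetical names, and nothing here talks to a real firewall or security group.

```typescript
// Ports from the list above, written as inclusive ranges.
type PortRule = { from: number; to: number; purpose: string };

const cephPortRules: PortRule[] = [
  { from: 22, to: 22, purpose: "SSH" },
  { from: 6789, to: 6789, purpose: "monitors" },
  { from: 6800, to: 7300, purpose: "OSDs, managers and metadata servers" },
  { from: 8080, to: 8080, purpose: "dashboard" },
  { from: 7480, to: 7480, purpose: "rados gateway" },
];

// Return what a port is needed for, or null if it is not on the list.
function purposeOfPort(port: number, rules: PortRule[] = cephPortRules): string | null {
  for (const rule of rules) {
    if (port >= rule.from && port <= rule.to) return rule.purpose;
  }
  return null;
}
```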

Without further ado, let’s start deployment.

Ceph Storage Deployment

Install prerequisites on all machines

$ sudo apt update
$ sudo apt -y install ntp python

For Ceph to work seamlessly, we have to make sure the system clocks are not skewed. The suggested solution is to install ntp on all machines and it will take care of the problem. While we’re at it, let’s install python on all hosts as ceph-deploy depends on it being available on the target machines.

Prepare the admin node

$ ssh -i ~/.ssh/id_rsa -A ubuntu@

As all the machines have my public key added to authorized_keys thanks to AWS, I can use ssh agent forwarding to access the Ceph machines from the admin node. The first line ensures that my local ssh agent has the proper key in use and the -A flag takes care of forwarding my key.

$ wget -q -O- 'https://download.ceph.com/keys/release.asc' | sudo apt-key add -
$ echo deb https://download.ceph.com/debian-nautilus/ $(lsb_release -sc) main | sudo tee /etc/apt/sources.list.d/ceph.list
$ sudo apt update
$ sudo apt -y install ceph-deploy

We’ll use the latest nautilus release in this example. If you want to deploy a different version, just change the debian-nautilus part to your desired release (luminous, mimic, etc.).

$ echo "StrictHostKeyChecking no" | sudo tee -a /etc/ssh/ssh_config > /dev/null


$ ssh-keyscan -H,, >> ~/.ssh/known_hosts

Ceph-deploy uses SSH connections to manage the nodes we provide. Each time you SSH to a machine that is not in the list of known_hosts (~/.ssh/known_hosts), you’ll get prompted whether you want to continue connecting or not. This interruption does not mesh well with the deployment process, so we either have to use ssh-keyscan to grab the fingerprint of all the target machines or disable the strict host key checking outright.

ip-10-0-0-124.eu-north-1.compute.internal ip-10-0-0-124
ip-10-0-0-216.eu-north-1.compute.internal ip-10-0-0-216
ip-10-0-0-104.eu-north-1.compute.internal ip-10-0-0-104

Even though the target machines are in the same subnet as our admin and they can access each other, we have to add them to the hosts file (/etc/hosts) for ceph-deploy to work properly. Ceph-deploy creates monitors by the provided hostname, so make sure it matches the actual hostname of the machines; otherwise, the monitors won’t be able to join the quorum and the deployment fails. Don’t forget to reboot the admin node for the changes to take effect.

$ mkdir ceph-deploy
$ cd ceph-deploy

As a final step of the preparation, let’s create a dedicated folder as ceph-deploy will create multiple config and key files during the process.

Deploy resources

$ ceph-deploy new ip-10-0-0-124 ip-10-0-0-216 ip-10-0-0-104

The command ceph-deploy new creates the necessary files for the deployment. Pass it the hostnames of the monitor nodes, and it will create ceph.conf and ceph.mon.keyring along with a log file.

The ceph.conf should look something like this:

fsid = 0572e283-306a-49df-a134-4409ac3f11da
mon_initial_members = ip-10-0-0-124, ip-10-0-0-216, ip-10-0-0-104
mon_host =,,
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx

It has a unique ID called fsid, the monitor hostnames and addresses and the authentication modes. Ceph provides two authentication modes: none (anyone can access data without authentication) or cephx (key based authentication).
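Since the file is just a list of key = value pairs, it is easy to inspect programmatically. The following is a minimal sketch of such a reader, not Ceph's own config parser; `parseCephConf` is a hypothetical helper, and the sample only reuses the lines shown above whose values are complete.

```typescript
// Minimal key = value reader; a sketch, not Ceph's own config parser.
function parseCephConf(text: string): Record<string, string> {
  const result: Record<string, string> = {};
  for (const rawLine of text.split("\n")) {
    const line = rawLine.trim();
    // Skip blanks, comments and section headers.
    if (line === "" || "#;[".indexOf(line.charAt(0)) !== -1) continue;
    const eq = line.indexOf("=");
    if (eq === -1) continue;
    result[line.slice(0, eq).trim()] = line.slice(eq + 1).trim();
  }
  return result;
}

const sampleConf = [
  "fsid = 0572e283-306a-49df-a134-4409ac3f11da",
  "mon_initial_members = ip-10-0-0-124, ip-10-0-0-216, ip-10-0-0-104",
  "auth_cluster_required = cephx",
].join("\n");

const parsedConf = parseCephConf(sampleConf);
// The monitor hostnames as a list, in the order they were passed to ceph-deploy new.
const monitorNames = parsedConf["mon_initial_members"].split(",").map((m) => m.trim());
```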

The other file, the monitor keyring, is another important piece of the puzzle, as all monitors must have identical keyrings in a cluster with multiple monitors. Luckily, ceph-deploy takes care of the propagation of the key file during virtual deployments.

$ ceph-deploy install --release nautilus ip-10-0-0-124 ip-10-0-0-216 ip-10-0-0-104

As you might have noticed so far, we haven’t installed ceph on the target nodes yet. We could do that one-by-one, but a more convenient way is to let ceph-deploy take care of the task. Don’t forget to specify the release of your choice, otherwise you might run into a mismatch between your admin and targets.

$ ceph-deploy mon create-initial

Finally, the first piece of the cluster is up and running! create-initial will deploy the monitors specified in ceph.conf we generated previously and also gather various key files. The command will only complete successfully if all the monitors are up and in the quorum.
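As background on the quorum itself: once the cluster is running, the monitors stay in quorum as long as a majority of them can reach each other, which is why an odd number such as 3 is a common choice. The rule can be sketched as follows; `hasQuorum` is a hypothetical helper for illustration and not something ceph-deploy exposes.

```typescript
// Majority rule for a monitor quorum: more than half of the monitors must be up.
function hasQuorum(totalMonitors: number, monitorsUp: number): boolean {
  return monitorsUp > Math.floor(totalMonitors / 2);
}

// With 3 monitors the cluster tolerates losing one of them, but not two.
const survivesOneFailure = hasQuorum(3, 2);
const survivesTwoFailures = hasQuorum(3, 1);
```

Note that create-initial is stricter than this rule: as described above, it waits for all of the listed monitors.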

$ ceph-deploy admin ip-10-0-0-124 ip-10-0-0-216 ip-10-0-0-104

Executing ceph-deploy admin will push a Ceph configuration file and the ceph.client.admin.keyring to the /etc/ceph directory of the nodes, so we can use the ceph CLI without having to provide the ceph.client.admin.keyring each time to execute a command.

At this point, we can take a peek at our cluster. Let’s SSH into a target machine (we can do it directly from the admin node thanks to agent forwarding) and run sudo ceph status.

$ sudo ceph status
  cluster:
    id:     0572e283-306a-49df-a134-4409ac3f11da
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum ip-10-0-0-104,ip-10-0-0-124,ip-10-0-0-216 (age 110m)
    mgr: no daemons active
    osd: 0 osds: 0 up, 0 in

  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 B
    usage:   0 B used, 0 B / 0 B avail

Here we get a quick overview of what we have so far. Our cluster seems to be healthy and all three monitors are listed under services. Let’s go back to the admin and continue adding pieces.

$ ceph-deploy mgr create ip-10-0-0-124

For luminous+ builds, a manager daemon is required. It’s responsible for monitoring the state of the cluster and also manages modules/plugins.

Okay, now we have all the management in place, let’s add some storage to the cluster to make it actually useful, shall we?

First, we have to find out (on each target machine) the label of the drive we want to use. To fetch the list of available disks on a specific node, run

$ ceph-deploy disk list ip-10-0-0-104

Here’s a sample output:

[Image: ceph-deploy disk list sample output]

$ ceph-deploy osd create --data /dev/nvme1n1 ip-10-0-0-124
$ ceph-deploy osd create --data /dev/nvme1n1 ip-10-0-0-216
$ ceph-deploy osd create --data /dev/nvme1n1 ip-10-0-0-104

In my case the label was nvme1n1 on all 3 machines (courtesy of AWS), so to add OSDs to the cluster I just ran these 3 commands.

At this point, our cluster is basically ready. We can run ceph status to see that our monitors, managers and OSDs are up and running. But nobody wants to SSH into a machine every time to check the status of the cluster. Luckily, there’s a pretty neat dashboard that comes with Ceph; we just have to enable it.

…Or at least that’s what I thought. The dashboard was introduced in luminous release and was further improved in mimic. However, currently we’re deploying nautilus, the latest version of Ceph. After trying the usual way of enabling the dashboard via a manager

$ sudo ceph mgr module enable dashboard

we get an error message saying Error ENOENT: all mgr daemons do not support module 'dashboard', pass --force to force enablement.

Turns out, in nautilus the dashboard package is no longer installed by default. We can check the available modules by running

$ sudo ceph mgr module ls

and as expected, dashboard is not there; it comes in the form of a separate package. So we have to install it first. Luckily, it’s pretty easy.

$ sudo apt install -y ceph-mgr-dashboard

Now we can enable it, right? Not so fast. There’s a dependency that has to be installed on all manager hosts, otherwise we get a slightly cryptic error message saying Error EIO: Module 'dashboard' has experienced an error and cannot handle commands: No module named routes.

$ sudo apt install -y python-routes

We’re all set to enable the dashboard module now. As it’s a public-facing page that requires login, we should set up a cert for SSL. For the sake of simplicity, I’ve just disabled the SSL feature. You should never do this in production; check out the official docs to see how to set up a cert properly. Also, we’ll need to create an admin user so we can log in to our dashboard.

$ sudo ceph mgr module enable dashboard
$ sudo ceph config set mgr mgr/dashboard/ssl false
$ sudo ceph dashboard ac-user-create admin secret administrator

By default, the dashboard is available on the host running the manager on port 8080. After logging in, we get an overview of the cluster status, and under the cluster menu, we get really detailed overviews of each running daemon.

[Image: Ceph storage deployment dashboard]
[Image: Ceph cluster dashboard]

If we try to navigate to the Filesystems or Object Gateway tabs, we get a notification that we haven’t configured the required resources to access these features. Our cluster can only be used as a block storage right now. We have to deploy a couple of extra things to extend its usability.

Quick detour: In case you’re looking for a company that can help you with Ceph, or DevOps in general, feel free to reach out to us at RisingStack!

Using the Ceph filesystem

Going back to our admin node, running

$ ceph-deploy mds create ip-10-0-0-124 ip-10-0-0-216 ip-10-0-0-104

will create metadata servers that will be inactive for now, as we haven’t enabled the feature yet. First, we need to create two RADOS pools, one for the actual data and one for the metadata.

$ sudo ceph osd pool create cephfs_data 8
$ sudo ceph osd pool create cephfs_metadata 8

There are a couple of things to consider when creating pools that we won’t cover here. Please consult the documentation for further details.

After creating the required pools, we’re ready to enable the filesystem feature

$ sudo ceph fs new cephfs cephfs_metadata cephfs_data

The MDS daemons will now be able to enter an active state, and we are ready to mount the filesystem. We have two options to do that, via the kernel driver or as FUSE with ceph-fuse.

Before we continue with the mounting, let’s create a user keyring that we can use in both solutions for authorization and authentication as we have cephx enabled. There are multiple restrictions that can be set up when creating a new key specified in the docs. For example:

$ sudo ceph auth get-or-create client.user mon 'allow r' mds 'allow r, allow rw path=/home/cephfs' osd 'allow rw pool=cephfs_data' -o /etc/ceph/ceph.client.user.keyring

will create a new client key with the name user and output it into ceph.client.user.keyring. It will provide write access for the MDS only to the /home/cephfs directory, and the client will only have write access within the cephfs_data pool.

Mounting with the kernel

Now let’s create a dedicated directory and then use the key from the previously generated keyring to mount the filesystem with the kernel.

$ sudo mkdir /mnt/mycephfs
$ sudo mount -t ceph /mnt/mycephfs -o name=user,secret=AQBxnDFdS5atIxAAV0rL9klnSxwy6EFpR/EFbg==

Attaching with FUSE

Mounting the filesystem with FUSE is not much different either. It requires installing the ceph-fuse package.

$ sudo apt install -y ceph-fuse

Before we run the command, we have to retrieve the ceph.conf and ceph.client.user.keyring files from the Ceph host and put them in /etc/ceph. The easiest solution is to use scp.

$ sudo scp ubuntu@ /etc/ceph/ceph.conf
$ sudo scp ubuntu@ /etc/ceph/ceph.keyring

Now we are ready to mount the filesystem.

$ sudo mkdir cephfs
$ sudo ceph-fuse -m cephfs

Using the RADOS gateway

To enable the S3 management feature of the cluster, we have to add one final piece, the rados gateway.

$ ceph-deploy rgw create ip-10-0-0-124

For the dashboard, it’s required to create a radosgw-admin user with the system flag to enable the Object Storage management interface. We also have to provide the user’s access_key and secret_key to the dashboard before we can start using it.

$ sudo radosgw-admin user create --uid=rg_wadmin --display-name=rgw_admin --system
$ sudo ceph dashboard set-rgw-api-access-key <access_key>
$ sudo ceph dashboard set-rgw-api-secret-key <secret_key>

Using the Ceph Object Storage is really easy, as RGW provides an S3-compatible interface. You can reuse your existing S3 requests and code; you just have to change the connection string, access key, and secret key.

Ceph Storage Monitoring

The dashboard we’ve deployed shows a lot of useful information about our cluster, but monitoring is not its strongest suit. Luckily Ceph comes with a Prometheus module. After enabling it by running:

$ sudo ceph mgr module enable prometheus

a wide variety of metrics will be available on the given host on port 9283 by default. To make use of this exposed data, we’ll have to set up a Prometheus instance.
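To get a feel for what Prometheus will scrape, here is a minimal sketch that parses text in the Prometheus exposition format and lists the OSDs reported as down. The sample text, the metric names in it, and the parseMetrics helper are assumptions made for illustration; check the real output of your manager host on port 9283 before relying on any particular metric.

```javascript
// Illustrative sample in the Prometheus text exposition format.
// The metric names and values are assumptions for the sake of the example.
const sample = `
# HELP ceph_health_status Cluster health status
# TYPE ceph_health_status untyped
ceph_health_status 0.0
ceph_osd_up{ceph_daemon="osd.0"} 1.0
ceph_osd_up{ceph_daemon="osd.1"} 1.0
ceph_osd_up{ceph_daemon="osd.2"} 0.0
`;

// Hypothetical helper: turns exposition text into { name, labels, value } objects.
function parseMetrics (text) {
  const samples = [];
  for (const line of text.split('\n')) {
    const trimmed = line.trim();
    // skip empty lines and HELP/TYPE comments
    if (trimmed === '' || trimmed.startsWith('#')) continue;
    const match = trimmed.match(/^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)/);
    if (!match) continue;
    const [, name, rawLabels = '', rawValue] = match;
    const labels = {};
    const labelPattern = /([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"/g;
    for (const [, key, value] of rawLabels.matchAll(labelPattern)) {
      labels[key] = value;
    }
    samples.push({ name, labels, value: Number(rawValue) });
  }
  return samples;
}

const metrics = parseMetrics(sample);
const downOsds = metrics
  .filter((m) => m.name === 'ceph_osd_up' && m.value === 0)
  .map((m) => m.labels.ceph_daemon);

console.log(downOsds); // [ 'osd.2' ]
```

In a real setup Prometheus does this parsing for you; the sketch is only meant to show what kind of data the module exposes.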

I strongly suggest running the following containers on a separate machine from your Ceph cluster. In case you are just experimenting (like me) and don’t want to use a lot of VMs, make sure you have enough memory and CPU left on your virtual machine before firing up docker, as it can lead to strange behaviour and crashes if it runs out of resources.

There are multiple ways of firing up Prometheus, probably the most convenient is with docker. After installing docker on your machine, create a prometheus.yml file to provide the endpoint where it can access our Ceph metrics.

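# /etc/prometheus.yml

scrape_configs:
  - job_name: 'ceph'
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      # placeholder: replace with the host running the Ceph manager
      - targets: ['<ceph-mgr-host>:9283']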

Then launch the container itself by running:

$ sudo docker run -p 9090:9090 -v /etc/prometheus.yml:/etc/prometheus/prometheus.yml prom/prometheus

Prometheus will start scraping our data, and it will show up on its dashboard, which we can access on port 9090 on its host machine. The Prometheus dashboard is functional, but it is not very pleasing to the eye. That’s the main reason why it’s usually paired with Grafana, which provides awesome visualizations for the data provided by Prometheus. It can be launched with Docker as well.

$ sudo docker run -d -p 3000:3000 grafana/grafana

Grafana is fantastic when it comes to visualizations, but setting up dashboards can be a daunting task. To make our lives easier, we can load one of the pre-prepared dashboards, for example this one.

[Image: Ceph storage monitoring in Grafana]

Ceph Deployment: Lessons Learned & Next Up

Ceph can be a great alternative to AWS S3 or other object storage services when you operate your service in a private cloud and running in the public cloud is simply not an option. The fact that it provides an S3-compatible interface makes it a lot easier to port other tools that were written with a “cloud first” mentality. It also plays nicely with Prometheus, so you don’t need to worry about setting up proper monitoring for it, or you can swap it for a simpler, more battle-hardened solution such as Nagios.

In this article, we deployed Ceph to bare virtual machines, but you might need to integrate it into your Kubernetes or Docker Swarm cluster. While it is perfectly fine to install it on VMs next to your container orchestration tool, you might want to leverage the services they provide when you deploy your Ceph cluster. If that is your use case, stay tuned for our next post covering Ceph, where we’ll take a look at the black magic required to use Ceph on Docker Swarm and Kubernetes.

In the next Ceph tutorial, which we’ll release next week, we’re going to take a look at valid Ceph storage alternatives with Docker or with Kubernetes.

PS: Feel free to reach out to us at RisingStack in case you need help with Ceph or Ops in general!

Async Await in Node.js – How to Master it?

In this article, you will learn how you can simplify your callback or Promise based Node.js application with async functions (async await).

Whether you’ve looked at async/await and promises in JavaScript before, but haven’t quite mastered them yet, or just need a refresher, this article aims to help you.

[Image: async/await in Node.js explained]

What are async functions in Node.js?

Async functions are available natively in Node and are denoted by the async keyword in their declaration. They always return a promise, even if you don’t explicitly write them to do so. Also, the await keyword is only available inside async functions and, in current Node versions, at the top level of ES modules – it cannot be used at the top level of a CommonJS script.
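A minimal sketch of that behaviour follows; getAnswer and failing are made-up names used only for illustration. A plain return value is wrapped in a Promise, and a thrown error turns into a rejection.

```javascript
// An async function wraps its return value in a Promise.
async function getAnswer () {
  return 42;
}

// A thrown error becomes a rejection instead of a synchronous exception.
async function failing () {
  throw new Error('boom');
}

const answerPromise = getAnswer();
console.log(answerPromise instanceof Promise); // true

answerPromise.then((value) => console.log(value)); // 42
failing().catch((err) => console.log(err.message)); // boom
```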

In an async function, you can await any Promise or catch its rejection cause.

So if you had some logic implemented with promises:

function handler (req, res) {
  return request('https://user-handler-service')
    .catch((err) => {
      logger.error('Http error', err);
      err.logged = true;
      throw err;
    })
    .then((response) => Mongo.findOne({ user: response.body.user }))
    .catch((err) => {
      !err.logged && logger.error('Mongo error', err);
      err.logged = true;
      throw err;
    })
    .then((document) => executeLogic(req, res, document))
    .catch((err) => {
      !err.logged && console.error(err);
      res.status(500).send();
    });
}

You can make it look like synchronous code using async/await:

async function handler (req, res) {
  let response;
  try {
    response = await request('https://user-handler-service');
  } catch (err) {
    logger.error('Http error', err);
    return res.status(500).send();
  }

  let document;
  try {
    document = await Mongo.findOne({ user: response.body.user });
  } catch (err) {
    logger.error('Mongo error', err);
    return res.status(500).send();
  }

  executeLogic(req, res, document);
}

Before Node 15, you only got a warning about unhandled promise rejections. Since Node 15, an unhandled rejection is raised as an uncaught exception by default, which terminates the process, so you don’t necessarily need to bother with creating a listener. Crashing is the recommended behaviour, because when you don’t handle an error, your app is in an unknown state. On older versions, you can get it either by using the --unhandled-rejections=strict CLI flag, or by implementing something like this:

process.on('unhandledRejection', (err) => {
  console.error(err);
  process.exit(1);
});

If you still run a Node release that only warns, preparing your code ahead of time for this is not a lot of effort, and it means that you don’t have to worry about it when you next update versions.

Patterns with async functions in JavaScript

There are quite a few use cases where the ability to handle asynchronous operations as if they were synchronous comes in very handy, as solving them with Promises or callbacks requires the use of complex patterns.

Since node@10.0.0, there is support for async iterators and the related for-await-of loop. These come in handy when the actual values we iterate over, and the end state of the iteration, are not known by the time the iterator method returns – mostly when working with streams. Aside from streams, there are not a lot of constructs that have the async iterator implemented natively, so we’ll cover them in another post.
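As a small taste of the syntax, here is a sketch with a hand-written async generator standing in for a stream. The produceChunks and collect helpers are hypothetical names, not part of any library.

```javascript
// Hypothetical async generator: yields each chunk after a short delay,
// standing in for a stream whose values arrive over time.
async function * produceChunks (chunks, delayMs = 10) {
  for (const chunk of chunks) {
    await new Promise((resolve) => setTimeout(resolve, delayMs));
    yield chunk;
  }
}

async function collect (asyncIterable) {
  const result = [];
  // for-await-of waits for each value before running the loop body
  for await (const chunk of asyncIterable) {
    result.push(chunk);
  }
  return result;
}

collect(produceChunks(['a', 'b', 'c']))
  .then((chunks) => console.log(chunks)); // [ 'a', 'b', 'c' ]
```

Readable streams in Node can be consumed with the same for-await-of loop, which is where this pattern is most often seen.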

Retry with exponential backoff

Implementing retry logic was pretty clumsy with Promises:

function request(url) {
  return new Promise((resolve, reject) => {
    setTimeout(() => {
      reject(`Network error when trying to reach ${url}`);
    }, 500);
  });
}

function requestWithRetry(url, retryCount, currentTries = 1) {
  return new Promise((resolve, reject) => {
    if (currentTries <= retryCount) {
      const timeout = (Math.pow(2, currentTries) - 1) * 100;
      request(url)
        .then(resolve)
        .catch((error) => {
          setTimeout(() => {
            console.log('Error: ', error);
            console.log(`Waiting ${timeout} ms`);
            // pass the outcome of the next attempt on to this promise
            requestWithRetry(url, retryCount, currentTries + 1)
              .then(resolve, reject);
          }, timeout);
        });
    } else {
      console.log('No retries left, giving up.');
      reject('No retries left, giving up.');
    }
  });
}

// example call; the URL and the retry count are placeholders
requestWithRetry('http://localhost:3000', 3)
  .then((res) => {
    console.log(res);
  })
  .catch(err => {
    console.error(err);
  });

This would get the job done, but we can rewrite it with async/await and make it a lot more simple.

function wait (timeout) {
  return new Promise((resolve) => {
    setTimeout(() => {
      resolve();
    }, timeout);
  });
}

async function requestWithRetry (url) {
  const MAX_RETRIES = 10;
  for (let i = 0; i <= MAX_RETRIES; i++) {
    try {
      return await request(url);
    } catch (err) {
      const timeout = Math.pow(2, i);
      console.log('Waiting', timeout, 'ms');
      await wait(timeout);
      console.log('Retrying', err.message, i);
    }
  }
  // every attempt failed, so reject just like the Promise version does
  throw new Error('No retries left, giving up.');
}

A lot more pleasing to the eye, isn’t it?

Intermediate values

Not as hideous as the previous example, but if you have a case where 3 asynchronous functions depend on each other the following way, then you have to choose from several ugly solutions.

functionA returns a Promise, then functionB needs that value and functionC needs the resolved value of both functionA’s and functionB’s Promise.

Solution 1: The .then Christmas tree

function executeAsyncTask () {
  return functionA()
    .then((valueA) => {
      return functionB(valueA)
        .then((valueB) => {
          return functionC(valueA, valueB)
        })
    })
}

With this solution, we get valueA from the surrounding closure of the 3rd then and valueB as the value the previous Promise resolves to. We cannot flatten out the Christmas tree as we would lose the closure and valueA would be unavailable for functionC.

Solution 2: Moving to a higher scope

function executeAsyncTask () {
  let valueA
  return functionA()
    .then((v) => {
      valueA = v
      return functionB(valueA)
    })
    .then((valueB) => {
      return functionC(valueA, valueB)
    })
}

In the Christmas tree, we used a higher scope to make valueA available as well. This case works similarly, but now we created the variable valueA outside the scope of the .then-s, so we can assign the value of the first resolved Promise to it.

This one definitely works, flattens the .then chain and is semantically correct. However, it also opens up ways for new bugs in case the variable name valueA is used elsewhere in the function. We also need to use two names — valueA and v — for the same value.

Are you looking for help with enterprise-grade Node.js Development?
Hire the Node developers of RisingStack!

Solution 3: The unnecessary array

function executeAsyncTask () {
  return functionA()
    .then(valueA => {
      return Promise.all([valueA, functionB(valueA)])
    })
    .then(([valueA, valueB]) => {
      return functionC(valueA, valueB)
    })
}

There is no other reason for valueA to be passed on in an array together with the Promise returned by functionB than to be able to flatten the tree. They might be of completely different types, so there is a high probability of them not belonging to an array at all.

Solution 4: Write a helper function

const converge = (...promises) => (...args) => {
  let [head, ...tail] = promises
  if (tail.length) {
    return head(...args)
      .then((value) => converge(...tail)(...args.concat([value])))
  } else {
    return head(...args)
  }
}

functionA()
  .then((valueA) => converge(functionB, functionC)(valueA))

You can, of course, write a helper function to hide away the context juggling, but it is quite difficult to read, and may not be straightforward to understand for those who are not well versed in functional magic.

By using async/await our problems are magically gone:

async function executeAsyncTask () {
  const valueA = await functionA();
  const valueB = await functionB(valueA);
  return functionC(valueA, valueB);
}

Multiple parallel requests with async/await

This is similar to the previous one. In case you want to execute several asynchronous tasks at once and then use their values at different places, you can do it easily with async/await:

async function executeParallelAsyncTasks () {
  const [ valueA, valueB, valueC ] = await Promise.all([ functionA(), functionB(), functionC() ]);
  // valueA, valueB and valueC are all available here
}

Without async/await, as we’ve seen in the previous example, we would either need to move these values into a higher scope or create a non-semantic array to pass them on.
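To make the difference between awaiting one by one and awaiting in parallel visible, here is a rough sketch. The delayed, sequential, parallel and timed helpers are hypothetical, and the measured times will vary from run to run.

```javascript
// Hypothetical task: resolves with `value` after `ms` milliseconds.
function delayed (value, ms) {
  return new Promise((resolve) => setTimeout(() => resolve(value), ms));
}

// Each await finishes before the next task is even started.
async function sequential () {
  const a = await delayed('A', 50);
  const b = await delayed('B', 50);
  return [a, b];
}

// Both tasks are started first, then awaited together.
async function parallel () {
  return Promise.all([delayed('A', 50), delayed('B', 50)]);
}

async function timed (label, fn) {
  const start = Date.now();
  const result = await fn();
  console.log(label, result, `took about ${Date.now() - start} ms`);
  return result;
}

timed('sequential', sequential)
  .then(() => timed('parallel', parallel))
  .catch((err) => console.error(err));
```

Both variants resolve to the same values; the sequential one should take roughly the sum of the delays, while the parallel one should take roughly as long as the slowest task.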

Array iteration methods

You can use map, filter and reduce with async functions, although they behave pretty unintuitively. Try guessing what the following scripts will print to the console:

  1. map

function asyncThing (value) {
  return new Promise((resolve) => {
    setTimeout(() => resolve(value), 100);
  });
}

async function main () {
  return [1,2,3,4].map(async (value) => {
    const v = await asyncThing(value);
    return v * 2;
  });
}

main()
  .then(v => console.log(v))
  .catch(err => console.error(err));

  2. filter

function asyncThing (value) {
  return new Promise((resolve) => {
    setTimeout(() => resolve(value), 100);
  });
}

async function main () {
  return [1,2,3,4].filter(async (value) => {
    const v = await asyncThing(value);
    return v % 2 === 0;
  });
}

main()
  .then(v => console.log(v))
  .catch(err => console.error(err));

  3. reduce

function asyncThing (value) {
  return new Promise((resolve) => {
    setTimeout(() => resolve(value), 100);
  });
}

async function main () {
  return [1,2,3,4].reduce(async (acc, value) => {
    return await acc + await asyncThing(value);
  }, Promise.resolve(0));
}

main()
  .then(v => console.log(v))
  .catch(err => console.error(err));


  1. [ Promise { <pending> }, Promise { <pending> }, Promise { <pending> }, Promise { <pending> } ]
  2. [ 1, 2, 3, 4 ]
  3. 10

If you log the returned values of the iteratee with map you will see the array we expect: [ 2, 4, 6, 8 ]. The only problem is that each value is wrapped in a Promise by the AsyncFunction.

So if you want to get your values, you’ll need to unwrap them by passing the returned array to a Promise.all:

main()
  .then(v => Promise.all(v))
  .then(v => console.log(v))
  .catch(err => console.error(err));

Originally, you would first wait for all your promises to resolve and then map over the values:

function main () {
  return Promise.all([1,2,3,4].map((value) => asyncThing(value)));
}

main()
  .then(values => values.map((value) => value * 2))
  .then(v => console.log(v))
  .catch(err => console.error(err));

This seems a bit simpler, doesn’t it?

The async/await version can still be useful if you have some long-running synchronous logic in your iteratee and another long-running async task.

This way you can start calculating as soon as you have the first value – you don’t have to wait for all the Promises to be resolved to run your computations. Even though the results will still be wrapped in Promises, those are resolved a lot faster than if you did it the sequential way.

What about filter? Something is clearly wrong…

Well, you guessed it: even though the returned values are [ false, true, false, true ], they will be wrapped in promises, which are truthy, so you’ll get back all the values from the original array. Unfortunately, all you can do to fix this is to resolve all the values and then filter them.
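One way to do that is sketched below. The filterAsync helper is a hypothetical name: it runs the async predicate for every item, waits for all the booleans, and only then filters synchronously.

```javascript
function asyncThing (value) {
  return new Promise((resolve) => {
    setTimeout(() => resolve(value), 100);
  });
}

// Hypothetical helper: evaluates the async predicate for every item,
// waits for all results, then filters synchronously on the booleans.
async function filterAsync (items, predicate) {
  const keep = await Promise.all(items.map((item) => predicate(item)));
  return items.filter((_, index) => keep[index]);
}

filterAsync([1, 2, 3, 4], async (value) => {
  const v = await asyncThing(value);
  return v % 2 === 0;
})
  .then((v) => console.log(v)) // [ 2, 4 ]
  .catch((err) => console.error(err));
```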

Reducing is pretty straightforward. Bear in mind though that you need to wrap the initial value into Promise.resolve, as the returned accumulator will be wrapped as well and has to be await-ed.

These quirks are not that surprising, as async/await is pretty clearly intended to be used for imperative code styles.

To make your .then chains more “pure” looking, you can use Ramda’s pipeP and composeP functions.

Rewriting callback-based Node.js applications

Async functions return a Promise by default, so you can rewrite any callback-based function to use Promises, then await their resolution. You can use the util.promisify function in Node.js to turn callback-based functions into Promise-based ones.
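Here is a minimal sketch of that conversion. The findUser function is made up for the example; the only requirement is that it follows the Node.js convention of calling back with (err, result).

```javascript
const util = require('util');

// Hypothetical callback-style function following the Node.js
// convention: the callback receives (err, result).
function findUser (id, callback) {
  setTimeout(() => {
    if (typeof id !== 'number') {
      return callback(new Error('id must be a number'));
    }
    callback(null, { id, name: `user-${id}` });
  }, 10);
}

// util.promisify returns a version that returns a Promise instead
const findUserAsync = util.promisify(findUser);

async function main () {
  const user = await findUserAsync(1);
  console.log(user); // { id: 1, name: 'user-1' }

  try {
    await findUserAsync('oops');
  } catch (err) {
    console.log(err.message); // id must be a number
  }
}

main();
```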

Rewriting Promise-based applications

Simple .then chains can be upgraded in a pretty straightforward way, so you can move to using async/await right away.

function asyncTask () {
  return functionA()
    .then((valueA) => functionB(valueA))
    .then((valueB) => functionC(valueB))
    .then((valueC) => functionD(valueC))
    .catch((err) => logger.error(err))
}

will turn into

async function asyncTask () {
  try {
    const valueA = await functionA();
    const valueB = await functionB(valueA);
    const valueC = await functionC(valueB);
    return await functionD(valueC);
  } catch (err) {
    logger.error(err);
  }
}

Rewriting Node.js apps with async await

  • If you liked the good old concepts of if-else conditionals and for/while loops,
  • if you believe that a try-catch block is the way errors are meant to be handled,

you will have a great time rewriting your services using async/await.

As we have seen, it can make several patterns a lot easier to code and read, so it is definitely more suitable in several cases than Promise.then() chains. However, if you are caught up in the functional programming craze of the past years, you might wanna pass on this language feature.

Are you already using async/await in production, or do you plan on never touching it? Let’s discuss it in the comments below.

Sometimes you do need Kubernetes! But how should you decide?

At RisingStack, we help companies to adopt cloud-native technologies, or if they have already done so, to get the most mileage out of them.

Recently, I was invited to Google DevFest to deliver a presentation on our experiences working with Kubernetes.

Below I talk about an online learning and streaming platform where the decision to use Kubernetes has been contested both internally and externally since the beginning of its development.

The application and its underlying infrastructure were designed to meet the needs of the regulations of several countries:

  • The app should be able to run on-premises, so students’ data could never leave a given country. Also, the app had to be available as a SaaS product as well.
  • It can be deployed as a single-tenant system where a business customer only hosts one instance serving a handful of users, but some schools could have hundreds of users.
  • Or it can be deployed as a multi-tenant system where the client is e.g. a government and needs to serve thousands of schools and millions of users.

The application itself was developed by multiple, geographically scattered teams, so a microservices architecture was justified. However, both the distributed system and the underlying infrastructure seemed to be overkill, considering that during the product’s initial entry most of its customers needed small instances.

Was Kubernetes suited for the job, or was it overkill? Did our client really need Kubernetes?

Let’s figure it out.

(Feel free to check out the video presentation, or the extended article version below!)

Let’s talk a bit about Kubernetes itself!

Kubernetes is an open-source container orchestration engine that has a vast ecosystem. If you run into any kind of problem, there’s probably a library somewhere on the internet that already solves it.

But Kubernetes also has a daunting learning curve, and initially, it’s pretty complex to manage. Cloud ops / infrastructure engineering is a complex and big topic in and of itself.

Kubernetes does not really mask away the complexity from you, but plunges you into deep water as it merely gives you a unified control plane to handle all those moving parts that you need to care about in the cloud.

So, if you’re just starting out right now, then it’s better to start with small things and not with the whole package straight away! First, deploy a VM in the cloud. Use some PaaS or FaaS solutions to play around with one of your apps. It will help you gradually build up the knowledge you need on the journey.

So you want to decide if Kubernetes is for you.

First and foremost, Kubernetes is for you if you work with containers! (It kinda speaks for itself for a container orchestration system). But you should also have more than one service or instance.


Kubernetes makes sense when you have a huge microservice architecture, or when you have dedicated instances per tenant and a lot of tenants as well.

Also, your services should be stateless, and your state should be stored in databases outside of the cluster. Another selling point of Kubernetes is the fine-grained control over the network.

And, maybe the most common argument for using Kubernetes is that it provides easy scalability.

Okay, and now let’s take a look at the flip side of it.

Kubernetes is not for you if you don’t need scalability!

If your services rely heavily on disks, then you should think twice if you want to move to Kubernetes or not. Basically, one disk can only be attached to a single node, so all the services need to reside on that one node. Therefore you lose node auto-scaling, which is one of the biggest selling points of Kubernetes.

For similar reasons, you probably shouldn’t use k8s if you don’t host your infrastructure in the public cloud. When you run your app on-premises, you need to buy the hardware beforehand and you cannot just conjure machines out of thin air. So basically, you also lose node auto-scaling, unless you’re willing to go hybrid cloud and bleed over some of your excess load by spinning up some machines in the public cloud.


If you have a monolithic application that serves all your customers and you need some scaling here and there, then cloud service providers can handle it for you with autoscaling groups.

There is really no need to bring in Kubernetes for that.

Let’s see our Kubernetes case-study!

Maybe it’s a little bit more tangible if we talk about an actual use case, where we had to go through the decision making process.


Online Learning Platform is an application that you could imagine as if you took your classroom and moved it to the internet.

You can have conference calls. You can share files as handouts, you can have a whiteboard, and you can track the progress of your students.

This project started during the first wave of the lockdowns around March, so one thing that we needed to keep in mind is that time to market was essential.

In other words: we had to do everything very, very quickly!

This product targets mostly schools around Europe, but it is now used by corporations as well.

So, we’re talking about millions of users from the point we go to the market.

The product needed to run on-premises, because governments were one of its main targets.

Initially, we were provided with a proposed infrastructure where each school would have its own VM, and all the services and all the databases would reside in those VMs.

Handling that many virtual machines, properly handling rollouts to those, and monitoring all of them sounded like a nightmare to begin with. Especially if we consider the fact that we only had a couple of weeks to go live.

After studying the requirements and the proposal, it was time to call the client to..

Discuss the proposed infrastructure.

So the conversation was something like this:

  • “Hi guys, we would prefer to go with Kubernetes because to handle stuff at that scale, we would need a unified control plane that Kubernetes gives us.”
  • "Yeah, sure, go for it."

And we were happy, but we still had a couple of questions:

  • “Could we, by any chance, host it on the public cloud?”
  • "Well, no, unfortunately. We are negotiating with European local governments and they tend to be squeamish about sending their data to the US. "

Okay, anyways, we can figure something out…

  • “But do the services need filesystem access?”
  • "Yes, they do."

Okay, crap! But we still needed to talk to the developers so all was not lost.

Let’s call the developers!

It turned out that what we were dealing with was a usual microservice-based architecture, which consisted of a lot of services talking over HTTP and messaging queues.

Each service had its own database, and most of them stored some files in Minio.


In case you don’t know it, Minio is an object storage system that implements the S3 API.

Now that we knew the fine-grained architectural layout, we gathered a few more questions:

  • “Okay guys, can we move all the files to Minio?”
  • "Yeah, sure, easy peasy."

So, we were happy again, but there was still another problem, so we had to call the hosting providers:

  • “Hi guys, do you provide hosted Kubernetes?”
  • "Oh well, at this scale, we can manage to do that!"

So, we were happy again, but..

Just to make sure, we wanted to run the numbers!

Our target was to be able to run 6 000 schools on the platform in the beginning, so we had to see if our plans lined up with our limitations!

We shouldn’t have more than 150 000 total pods!

10 (pod/tenant) times 6000 tenants is 60 000 Pods. We’re good!

We shouldn’t have more than 300 000 total containers!

It’s one container per pod, so we’re still good.

We shouldn’t have more than 100 pods per node and no more than 5 000 nodes.

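Well, what we have is 60 000 pods. At the maximum of 100 pods per node that would be 600 nodes, but at roughly 10 pods per node, the density implied by the 3 500 pods on 350 nodes estimate further below, that’s already 6 000 nodes, and that’s just the initial rollout, so we’re already over our 5 000 nodes limit.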
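The same check can be written down as a few lines of arithmetic. This is only a sketch: the limits, the 10 pods per tenant and the one container per pod figures come from the text above, while the 10 pods per node density is an assumption taken from the per-cluster estimate later in the article (3 500 pods on 350 nodes). The checkPlan helper is a hypothetical name.

```javascript
// Limits quoted in the article
const LIMITS = {
  totalPods: 150000,
  totalContainers: 300000,
  podsPerNode: 100,
  nodes: 5000
};

// Hypothetical helper: derives pod, container and node counts from a plan
// and compares them against the limits.
function checkPlan ({ tenants, podsPerTenant, containersPerPod, podsPerNode }, limits = LIMITS) {
  const pods = tenants * podsPerTenant;
  const containers = pods * containersPerPod;
  const nodes = Math.ceil(pods / podsPerNode);
  return {
    pods,
    containers,
    nodes,
    podsOk: pods <= limits.totalPods,
    containersOk: containers <= limits.totalContainers,
    densityOk: podsPerNode <= limits.podsPerNode,
    nodesOk: nodes <= limits.nodes
  };
}

// podsPerNode: 10 is an assumption, see the note above
const plan = checkPlan({ tenants: 6000, podsPerTenant: 10, containersPerPod: 1, podsPerNode: 10 });
console.log(plan.pods, plan.containers, plan.nodes); // 60000 60000 6000
console.log(plan.nodesOk); // false
```

Only the node count fails the check, which is the limitation the rest of this section works around.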


Okay, well… Crap!

But, is there a solution to this?

Sure, it’s federation!

We could federate our Kubernetes clusters..

..and overcome these limitations.

We have worked with federated systems before, so Kubernetes surely provides something for that, riiight? Well yeah, it does… kind of.

It’s the stable Federation v1 API, which is sadly deprecated.


Then we saw that Kubernetes Federation v2 is on the way!

It was still in alpha at the time when we were dealing with this issue, but the GitHub page said it was rapidly moving towards beta release. By taking a look at the releases page we realized that it had been overdue by half a year by then.

Since we only had a short period of time to pull this off, we really didn’t want to live that much on the edge.

So what could we do? We could federate by hand! But what does that mean?

In other words: what could have been gained by using KubeFed?

Having a lot of services would have meant that we needed a federated Prometheus and Logging (be it Graylog or ELK) anyway. So the two remaining aspects of the system were rollout / tenant generation, and manual intervention.

Manual intervention is tricky. To make it easy, you need a unified control plane where you can eyeball and modify anything. We could have built a custom one that gathers all information from the clusters and proxies all requests to each of them. However, that would have meant a lot of work, which we just did not have the time for. And even if we had the time to do it, we would have needed to conduct a cost/benefit analysis on it.

The main factor in deciding whether you need a unified control plane for everything is scale, or in other words, the number of different control planes to handle.

The original approach would have meant 6000 different planes. That’s just way too much to handle for a small team. But if we could bring it down to 20 or so, that could be bearable. In that case, all we need is an easy mind map that leads from services to their underlying clusters. The actual route would be something like:

Service -> Tenant (K8s Namespace) -> Cluster.

The Service -> Namespace mapping is provided by Kubernetes, so we needed to figure out the Namespace -> Cluster mapping.

This mapping is also necessary to reduce the cognitive overhead and time of digging around when an outage may happen, so it needs to be easy to remember, while having to provide a more or less uniform distribution of tenants across Clusters. The most straightforward way seemed to be to base it on Geography. I’m the most familiar with Poland’s and Hungary’s Geography, so let’s take them as an example.

Poland comprises 16 voivodeships, while Hungary comprises 19 counties as main administrative divisions. Each country’s capital stands out in population, so they have enough schools to get a cluster on their own. Thus it only makes sense to create clusters for each division plus the capital. That gives us 17 or 20 clusters.

So if we get back to our original 60,000 pods and the 100 pod / tenant limitation, we can see that 2 clusters are enough to host them all, but that leaves us no room for either scaling or later expansion. If we spread them across 17 clusters (in the case of Poland, for example), that means we have around 3,500 pods and 350 nodes per cluster, which is still manageable.
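The per-cluster figures can be double-checked with a few lines of JavaScript. This is only a minimal sketch: the helper name clusterSizing is hypothetical, and the value of roughly 10 pods per node is an assumption inferred from the "around 3,500 pods and 350 nodes" estimate, not a number stated explicitly.

```javascript
// Hypothetical helper: spreads a total pod count evenly across clusters.
// podsPerNode = 10 is an assumption inferred from the article's estimate.
function clusterSizing (totalPods, clusterCount, podsPerNode = 10) {
  const podsPerCluster = Math.ceil(totalPods / clusterCount)
  const nodesPerCluster = Math.ceil(podsPerCluster / podsPerNode)
  return { podsPerCluster, nodesPerCluster }
}

// Poland: 16 voivodeships plus the capital
console.log(clusterSizing(60000, 17)) // { podsPerCluster: 3530, nodesPerCluster: 353 }
// Hungary: 19 counties plus the capital
console.log(clusterSizing(60000, 20)) // { podsPerCluster: 3000, nodesPerCluster: 300 }
```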

This could be done in a similar fashion for any European country, but still needs some architecting when setting up the actual infrastructure. And when KubeFed becomes available (and somewhat battle tested) we can easily join these clusters into one single federated cluster.

Great, we have solved the problem of control planes for manual intervention. The only thing left was handling rollouts.


As I mentioned before, several developer teams had been working on the services themselves, and each of them already had their own GitLab repos and CIs. They already built their own Docker images, so we simply needed a place to gather them all, and roll them out to Kubernetes. So we created a GitOps repo where we stored the Helm charts and set up a GitLab CI to build the actual releases, then deploy them.

From here on, it takes a simple loop over the clusters to update the services when necessary.

The other thing we needed to solve was tenant generation.

It was easy as well, because we just needed to create a CLI tool which could be set up by providing the school’s name, and its county or state.


That designates the school’s target cluster, then the tool pushes the new tenant to our GitOps repo, which basically triggers the same rollout as a new version does.
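The core of such a tool is the Namespace -> Cluster lookup. The sketch below shows one way it could look; it is not the actual CLI. The cluster names, the list of divisions and the function names designateTenant and toNamespace are all hypothetical, chosen only for illustration.

```javascript
// Hypothetical division -> cluster table; names are illustrative only.
const CLUSTER_BY_DIVISION = new Map([
  ['budapest', 'cluster-hu-budapest'],
  ['pest', 'cluster-hu-pest'],
  ['baranya', 'cluster-hu-baranya']
])

// Turns a school name into a namespace-friendly slug (lowercase, dashes only).
function toNamespace (schoolName) {
  return schoolName
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-')
    .replace(/^-+|-+$/g, '')
}

// Resolves the tenant's namespace and the cluster it should be rolled out to.
function designateTenant (schoolName, division) {
  const cluster = CLUSTER_BY_DIVISION.get(division.trim().toLowerCase())
  if (!cluster) throw new Error(`No cluster configured for division: ${division}`)
  return { namespace: toNamespace(schoolName), cluster }
}

console.log(designateTenant('Example High School', 'Pest'))
// { namespace: 'example-high-school', cluster: 'cluster-hu-pest' }
```

Keeping the table keyed by administrative division is what makes the mapping easy to remember during an outage: knowing where a school is tells you which cluster to look at.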

We were almost good to go, but there was still one problem: on-premises.

Although our hosting providers turned into some kind of public cloud (or something we can think of as public clouds), we were also targeting companies who want to educate their employees.

Huge corporations, like banks, are just as squeamish about sending their data out to the public internet as governments, if not more.

So we needed to figure out a way to host this on servers within vaults completely separated from the public internet.


In this case, we had two main modes of operation.

  • One is when a company just wanted a boxed product and they didn’t really care about scaling it.
  • And the other one was where they expected it to be scaled, but they were prepared to handle this.

In the second case, it was kind of a bring your own database scenario, so you could set up the system in a way that we were going to connect to your database.

And in the other case, what we could do is to package everything — including databases — in one VM, in one Kubernetes cluster. But! I just wrote above that you probably shouldn’t use disks and shouldn’t have databases within your cluster, right?

However, in that case, we already had a working infrastructure.

Kubernetes provided us with infrastructure as code already, so it only made sense to use that as a packaging tool as well, and use Kubespray to just spray it to our target servers.

It wasn’t a problem to have disks and DBs within our cluster because the target were companies that didn’t want to scale it anyway.

So it’s not about scaling. It is mostly about packaging!

Previously I told you that you probably don’t want to do this on-premises, and this is still right! If that’s your main target, then you probably shouldn’t go with Kubernetes.

However, as our main target was somewhat of a public cloud, it wouldn’t have made sense to just recreate the whole thing – basically create a new product in a sense – for these kinds of servers.

So as it is kind of a spin-off, it made sense here as well as a packaging solution.

Basically, I’ve just given you a bullet point list to help you determine whether Kubernetes is for you or not, and then I just tore it apart and threw it into a basket.

And the reason for this is – as I also mentioned:

Cloud ops is difficult!

There aren’t really one-size-fits-all solutions, so basing your decision on checklists you see on the internet is definitely not a good idea.

We’ve seen that a lot of times where companies adopt Kubernetes because it seems to fit, but when they actually start working with it, it turns out to be an overkill.

If you want to save yourself a year or two of headaches, it’s a lot better to first ask an expert and spend a couple of hours or days going through and discussing your use cases.

In case you’re thinking about adopting Kubernetes, or getting the most out of it, don’t hesitate to reach out to us at info@risingstack.com, or by using the contact form below!

Distributed Load Testing with Jmeter

Many of you have probably used Apache Jmeter for load testing before. Still, it is easy to run into the limits imposed by running it on just one machine when trying to make sure that our API will be able to serve hundreds of thousands or even millions of users.

We can get around this issue by deploying and running our tests to multiple machines in the cloud.

In this article, we will take a look at one way to distribute and run Jmeter tests across multiple droplets on DigitalOcean using Terraform, Ansible, and a little bit of bash scripting to automate the process as much as possible.

Background: During the lockdowns induced by the COVID-19 outbreak, we were tasked by a company (which builds an e-learning platform primarily for schools) to build out an infrastructure that:

  • is geo-redundant,
  • supports both single- and multi-tenant deployments,
  • can be easily scaled to serve at least 1.5 million users in huge bursts,
  • and runs on-premises.

To make sure the application is able to handle these requirements, we needed to set up the infrastructure, and model a reasonably high burst in requests to get an idea about the load the application and its underlying infrastructure are able to serve.

In this article, we’ll share practical advice and some of the scripts we used to automate the load-testing process using Jmeter, Terraform and Ansible.

Let’s Start!

Have these tools installed before you begin!

brew install ansible
brew install terraform
brew install jmeter

You can just run them from your own machine. The full codebase is available on GitHub at RisingStack/distributed-loadtests-jmeter for your convenience.

Why do we use Jmeter for distributed load testing?

Jmeter is not my favorite tool for load testing owing mostly to the fact that scripting it is just awkward. But looking at the other tools that support distribution, it seems to be the best free one for now. K6 looks good, but right now it does not support distribution outside the paid, hosted version. Locust is another interesting one, but it’s focusing too much on random test picking, and if that’s not what I’m looking for, it is quite awkward to use as well – just not flexible enough right now.

So, back to Jmeter!

Terraform is an infrastructure-as-code tool, which allows us to describe the resources we want to use in our deployment and configure the droplets so we have them ready for running some tests. The testing setup will, in turn, be deployed by Ansible to our cloud service provider of choice, DigitalOcean – though with some changes, you can make this work with any other provider, as well as your on-premises machines if you so wish.

Deploying the infrastructure

There will be two kinds of instances we’ll use:

  • primary, of which we’ll have one coordinating the testing,
  • and runners, that we can have any number of.

In the example, we’re going to go with two, but we’ll see that it is easy to change this when needed.

You can check the variables.tf file to see what we’ll use. You can use these to customise most aspects of the deployment to fit your needs. This file holds the vars that will be plugged into the other template files – main.tf and provider.tf.

The one variable you’ll need to provide to Terraform for the example setup to work is your DigitalOcean API token, which you can export like this from the terminal:

export TF_VAR_do_token=DO_TOKEN

Should you wish to change the number of test runner instances, you can do so by exporting this other environment variable:

export TF_VAR_instance_count=2

You will need to generate two ssh key pairs, one for the root user, and one for a non-privileged user. These will be used by Ansible, which uses ssh to deploy the testing infrastructure as it is agent-less. We will also use the non-privileged user when starting the tests for copying over files and executing commands on the primary node. The keys should be set up with correct permissions, otherwise, you’ll just get an error.

Set the permissions to 600 or 700 like this:

chmod 600 /path/to/folder/with/keys/*

To begin, we should open a terminal in the terraform folder, and call terraform init, which will prepare the working directory. This needs to be called again if the configuration changes.

You can use terraform plan to print a summary of the pending changes to the console and double-check that everything is right. On the first run, this will show what the whole deployment will look like.

Next, we call terraform apply which will actually apply the changes according to our configuration, meaning we’ll have our deployment ready when it finishes! It also generates a .tfstate file with all the information about said deployment.

If you wish to dismantle the deployment after the tests are done, you can use terraform destroy. You’ll need the .tfstate file for this to work though! Without the state file, you need to delete the created droplets by hand, and also remove the ssh key that has been added to DigitalOcean.

Running the Jmeter tests

The shell script we are going to use for running the tests is for convenience – it consists of copying the test file to our primary node, cleaning up files from previous runs, running the tests, and then fetching the results.


#!/bin/bash

set -e

# Argument parsing (name=value style; -p also has the long form --primary-ip)
for i in "$@"
do
case $i in
    # i#*= This removes the shortest substring ending with
    # '=' from the value of variable i - leaving us with just the
    # value of the argument (i is argument=value)
    -o=*) OUTDIR="${i#*=}" ;;
    -f=*) TESTFILE="${i#*=}" ;;
    -i=*) IDENTITYFILE="${i#*=}" ;;
    -p=*|--primary-ip=*) PRIMARY="${i#*=}" ;;
esac
done

# Check if we got all the arguments we'll need
if [ -z "$TESTFILE" ] || [ ! -f "$TESTFILE" ]; then
    echo "Please provide a test file"
    exit 1
fi

if [ -z "$OUTDIR" ]; then
    echo "Please provide a result destination directory"
    exit 1
fi

if [ -z "$IDENTITYFILE" ]; then
    echo "Please provide an identity file for ssh access"
    exit 1
fi

if [ -z "$PRIMARY" ]; then
  PRIMARY=$(terraform output primary_address)
fi

# Copy the test file to the primary node
scp -i "$IDENTITYFILE" -o IdentitiesOnly=yes -oStrictHostKeyChecking=no "$TESTFILE" "runner@$PRIMARY:/home/runner/jmeter/test.jmx"
# Remove files from previous runs if any, then run the current test
ssh -i "$IDENTITYFILE" -o IdentitiesOnly=yes -oStrictHostKeyChecking=no "runner@$PRIMARY" << "EOF"
 rm -rf /home/runner/jmeter/result
 rm -f /home/runner/jmeter/result.log
 cd jmeter/bin ; ./jmeter -n -r -t ../test.jmx -l ../result.log -e -o ../result -Djava.rmi.server.hostname=$(hostname -I | awk ' {print $1}')
EOF
# Get the results
scp -r -i "$IDENTITYFILE" -o IdentitiesOnly=yes -oStrictHostKeyChecking=no "runner@$PRIMARY":/home/runner/jmeter/result "$OUTDIR"

Running the script will require the path to the non-root ssh key. The call will look something like this:

bash run.sh -i=/path/to/non-root/ssh/key  -f=/path/to/test/file -o=/path/to/results/dir

You can also supply the IP of the primary node using -p= or --primary-ip= in case you don’t have access to the .tfstate file. Otherwise, the script will ask terraform for the IP.

Jmeter will then take care of distributing the tests across the runner nodes, and it will aggregate the data when they finish. The only thing we need to keep in mind is that the number of users we set for our test to use will not be split but will be multiplied. As an example, if you set the user count to 100, each runner node will then run the tests with 100 users.
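Because the user count is multiplied rather than split, it helps to work out the per-runner value before starting a test. The following is a minimal sketch of that arithmetic; the function names are hypothetical and not part of Jmeter or the repository.

```javascript
// Total simulated users when every runner executes the full user count.
function totalUsers (usersPerRunner, runnerCount) {
  return usersPerRunner * runnerCount
}

// User count to set in the test plan to reach (at least) a target total.
function usersPerRunnerFor (targetTotal, runnerCount) {
  return Math.ceil(targetTotal / runnerCount)
}

console.log(totalUsers(100, 2))         // 200
console.log(usersPerRunnerFor(1000, 3)) // 334
```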

And that’s how you can use Terraform and Ansible to run your distributed Jmeter tests on DigitalOcean!

Check this page for more on string manipulation in bash.

Looking for DevOps & Infra Experts?

In case you’re looking for expertise in infrastructure related matters, I’d recommend reading our articles and ebooks on the topic, and checking out our various service pages.

An early draft of this article was written by Mate Boer, and then subsequently rewritten by Janos Kubisch – both engineers at RisingStack.

Node.js Async Best Practices & Avoiding the Callback Hell

In this post, we cover what tools and techniques you have at your disposal when handling Node.js asynchronous operations: async.js, promises, and async functions.

After reading this article, you’ll know how to use the latest async tools at your disposal provided by Node.js!

Node.js at Scale is a collection of articles focusing on the needs of companies with bigger Node.js installations and advanced Node developers.

Asynchronous programming in Node.js

Previously we have gathered a strong knowledge about asynchronous programming in JavaScript and understood how the Node.js event loop works.

If you have not read these articles, I highly recommend them as introductions!

The Problem with Node.js Async

Node.js itself is single-threaded, but some tasks can run in parallel thanks to its asynchronous nature.

But what does running in parallel mean in practice?

Since we program a single-threaded VM, it is essential that we do not block execution by waiting for I/O, but handle operations concurrently with the help of Node.js’s event-driven APIs.

Let’s take a look at some fundamental patterns, and learn how we can write resource-efficient, non-blocking code, with the built-in solutions of Node.js.

The Classical Approach – Callbacks

Let’s take a look at these simple async operations. They do nothing special, just fire a timer and call a function once the timer has finished.

function fastFunction (done) {
  setTimeout(function () {
    done()
  }, 100)
}

function slowFunction (done) {
  setTimeout(function () {
    done()
  }, 300)
}

Seems easy, right?

Our higher-order functions can be executed sequentially or in parallel with the basic “pattern” by nesting callbacks – but using this method can lead to an untameable callback-hell.

function runSequentially (callback) {
  fastFunction((err, data) => {
    if (err) return callback(err)
    console.log(data)   // results of a
    slowFunction((err, data) => {
      if (err) return callback(err)
      console.log(data) // results of b
      // here you can continue running more tasks
    })
  })
}

Never use the nested callback approach for handling asynchronous Node.js operations!

Avoiding Callback Hell with Control Flow Managers

To become an efficient Node.js developer, you have to avoid the constantly growing indentation level, produce clean and readable code and be able to handle complex flows.

Let me show you some of the tools we can use to organize our code in a nice and maintainable way!

#1: Using Promises

There have been native promises in JavaScript since 2014, receiving an important boost in performance in Node.js 8. We will make use of them in our functions to make them non-blocking – without the traditional callbacks. The following example will call the modified version of both our previous functions in such a manner:

function fastFunction () {
  return new Promise((resolve, reject) => {
    setTimeout(function () {
      console.log('Fast function done')
      resolve()
    }, 100)
  })
}

function slowFunction () {
  return new Promise((resolve, reject) => {
    setTimeout(function () {
      console.log('Slow function done')
      resolve()
    }, 300)
  })
}

function asyncRunner () {
    return Promise.all([slowFunction(), fastFunction()])
}

Please note that Promise.all will fail as soon as any of the promises inside it fails.

The previous functions have been modified slightly to return promises. Our new function, asyncRunner, will also return a promise, that will resolve when all the contained functions resolve, and this also means that wherever we call our asyncRunner, we’ll be able to use the .then and .catch methods to deal with the possible outcomes:

asyncRunner()
  .then(([ slowResult, fastResult ]) => {
    console.log('All operations resolved successfully')
  })
  .catch((error) => {
    console.error('There has been an error:', error)
  })
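The fail-fast behavior of Promise.all can be seen in a small, self-contained sketch. The helpers resolveAfter and rejectAfter are hypothetical names introduced only for this illustration.

```javascript
// Resolves with `value` after `ms` milliseconds.
function resolveAfter (ms, value) {
  return new Promise((resolve) => setTimeout(() => resolve(value), ms))
}

// Rejects with an Error carrying `message` after `ms` milliseconds.
function rejectAfter (ms, message) {
  return new Promise((resolve, reject) => setTimeout(() => reject(new Error(message)), ms))
}

// The fast rejection settles Promise.all before the slow promise resolves.
const failFast = Promise.all([resolveAfter(300, 'slow'), rejectAfter(50, 'fast failure')])
  .then(() => 'resolved')
  .catch((error) => `rejected: ${error.message}`)

failFast.then(console.log) // rejected: fast failure
```

Note that the slow promise still runs to completion in the background; Promise.all only stops waiting for it.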

Since node@12.9.0, there is a method called Promise.allSettled that we can use to get the result of all the passed-in promises regardless of rejections. Much like Promise.all, this function expects an array of promises, and returns a promise that resolves to an array of objects, each with a status of “fulfilled” or “rejected”, and either the resolved value or the error that occurred.

function failingFunction() {
  return new Promise((resolve, reject) => {
    reject(new Error('This operation will surely fail!'))
  })
}

function asyncMixedRunner () {
    return Promise.allSettled([slowFunction(), failingFunction()])
}

asyncMixedRunner()
    .then(([slowResult, failedResult]) => {
        console.log(slowResult, failedResult)
    })

In previous node versions, where .allSettled is not available, we can implement our own version in just a few lines:

function homebrewAllSettled(promises) {
  return Promise.all(promises.map((promise) => {
    return promise
      .then((value) => {
        return { status: 'fulfilled', value }
      })
      .catch((error) => {
        return { status: 'rejected', error }
      })
  }))
}
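One detail worth knowing when switching between the two: the native Promise.allSettled reports rejections under a reason key, whereas the homebrew version above stores them under error. A short sketch of reading the native result shape (the variable name settledDemo is only illustrative):

```javascript
const settledDemo = Promise.allSettled([
  Promise.resolve('ok'),
  Promise.reject(new Error('boom'))
]).then((outcomes) => outcomes.map((outcome) =>
  // Fulfilled entries carry `value`, rejected entries carry `reason`.
  outcome.status === 'fulfilled'
    ? `fulfilled with ${outcome.value}`
    : `rejected with ${outcome.reason.message}`
))

settledDemo.then(console.log) // [ 'fulfilled with ok', 'rejected with boom' ]
```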

Serial task execution

To make sure your tasks run in a specific order (maybe successive functions need the return value of previous ones, or depend on the run of previous functions less directly), you can chain them one after the other, which is basically the same as _.flow for functions that return a Promise. As long as it’s missing from everyone’s favorite utility library, you can easily create a chain from an array of your async functions:

function serial(asyncFunctions) {
    return asyncFunctions.reduce(function(functionChain, nextFunction) {
        return functionChain.then(
            (previousResult) => nextFunction(previousResult)
        );
    }, Promise.resolve());
}

serial([parameterValidation, dbQuery, serviceCall ])
   .then((result) => console.log(`Operation result: ${result}`))
   .catch((error) => console.log(`There has been an error: ${error}`))
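To see the chain pass values along, here is a self-contained sketch with three stand-in steps. The step implementations and their return values are hypothetical; only the reduce-based chaining mirrors the function above.

```javascript
// Same reduce-based chaining as above, repeated so the sketch is self-contained.
function serial (asyncFunctions) {
  return asyncFunctions.reduce(
    (functionChain, nextFunction) => functionChain.then((previousResult) => nextFunction(previousResult)),
    Promise.resolve()
  )
}

// Hypothetical steps: each returns a promise and builds on the previous result.
const parameterValidation = () => Promise.resolve({ userId: 42 })
const dbQuery = (params) => Promise.resolve({ ...params, name: 'Ada' })
const serviceCall = (user) => Promise.resolve(`Hello ${user.name} (#${user.userId})`)

const serialDemo = serial([parameterValidation, dbQuery, serviceCall])

serialDemo.then((result) => console.log(`Operation result: ${result}`))
// Operation result: Hello Ada (#42)
```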

In case of a failure, this will skip all the remaining promises, and go straight to the error handling branch. You can tweak it some more in case you need the result of all of the promises regardless if they resolved or rejected.

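function serial(asyncFunctions) {
    // reduce (not map) builds the chain; the accumulator resolves to the list of outcomes so far
    return asyncFunctions.reduce(function(functionChain, nextFunction) {
        return functionChain.then((outcomes) =>
            Promise.resolve(outcomes[outcomes.length - 1])
                .then((previousOutcome) => nextFunction(previousOutcome))
                .then(result => ({ status: 'fulfilled', result }))
                .catch(error => ({ status: 'rejected', error }))
                .then(outcome => [...outcomes, outcome]));
    }, Promise.resolve([]));
}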

Converting callback functions to promises

Node also provides a handy utility function called “promisify”, that you can use to convert any old function expecting a callback that you just have to use into one that returns a promise. All you need to do is import it in your project:

const promisify = require('util').promisify;
function slowCallbackFunction (done) {
  setTimeout(function () {
    done()
  }, 300)
}
const slowPromise = promisify(slowCallbackFunction);

slowPromise()
  .then(() => {
    console.log('Slow function resolved')
  })
  .catch((error) => {
    console.error('There has been an error:', error)
  })

It’s actually not that hard to implement a promisify function of our own, to learn more about how it works. We can even handle additional arguments that our wrapped functions might need!

function homebrewPromisify(originalFunction, originalArgs = []) {
  return new Promise((resolve, reject) => {
    originalFunction(...originalArgs, (error, result) => {
      if (error) return reject(error)
      return resolve(result)
    })
  })
}

We just wrap the original callback-based function in a promise, and then reject or resolve based on the result of the operation.

Easy as that!
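One difference from util.promisify is that the homebrew version above calls the wrapped function right away and returns a promise, while util.promisify returns a new function that can be called later with its own arguments. A minimal sketch of that variant follows; promisifyLike and addLater are hypothetical names used only for illustration.

```javascript
// Hypothetical variant: returns a function (like util.promisify does) instead of a promise.
function promisifyLike (originalFunction) {
  return function (...args) {
    return new Promise((resolve, reject) => {
      // Append a Node-style (error, result) callback after the caller's arguments.
      originalFunction.call(this, ...args, (error, result) => {
        if (error) return reject(error)
        resolve(result)
      })
    })
  }
}

// Illustrative callback-based function.
function addLater (a, b, done) {
  setTimeout(() => done(null, a + b), 10)
}

const addLaterAsync = promisifyLike(addLater)
const sumDemo = addLaterAsync(2, 3)

sumDemo.then((sum) => console.log(sum)) // 5
```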

For better support of callback-based code (legacy code, ~50% of the npm modules), Node also includes a callbackify function, essentially the opposite of promisify: it takes an async function that returns a promise, and returns a function that expects a callback as its last argument.

const callbackify = require('util').callbackify
const callbackSlow = callbackify(slowFunction)

callbackSlow((error, result) => {
  if (error) return console.log('Callback function received an error')
  return console.log('Callback resolved without errors')
})

#2: Meet Async – aka how to write async code in 2020

We can use another JavaScript feature since node@7.6 to achieve the same thing: the async and await keywords. They allow you to structure your code in a way that is almost synchronous looking, saving us the .then chaining as well as callbacks:

const promisify = require('util').promisify;

async function asyncRunner () {
    try {
      const slowResult = await promisify(slowFunction)()
      const fastResult = await promisify(fastFunction)()
      console.log('all done')
      return [
        slowResult,
        fastResult
      ]
    } catch (error) {
      console.error(error)
    }
}

This is the same async runner we’ve created before, but it does not require us to wrap our code in .then calls to gain access to the results. For handling errors, we have the option to use try & catch blocks, as presented above, or use the same .catch calls that we’ve seen previously with promises. This is possible because async-await is an abstraction on top of promises – async functions always return a promise, even if you don’t explicitly declare them to do so.

The await keyword can only be used inside functions that have the async tag. This also means that we cannot utilize it in the global scope of a CommonJS module; ES modules, however, have supported top-level await since Node 14.8.
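In a CommonJS script, a common workaround is to wrap the entry point in an async function and call it immediately. The sketch below assumes a hypothetical loadConfig function; only the wrapping pattern matters.

```javascript
// Illustrative promise-returning function.
function loadConfig () {
  return new Promise((resolve) => setTimeout(() => resolve({ port: 3000 }), 10))
}

// await is not allowed at the top level of a CommonJS script,
// so the entry point is an async function that is called right away.
async function main () {
  const config = await loadConfig()
  return `listening on ${config.port}`
}

const mainDemo = main()
mainDemo
  .then(console.log) // listening on 3000
  .catch((error) => {
    console.error(error)
    process.exitCode = 1
  })
```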

Since Node 10, we also have access to the Promise.prototype.finally method, which allows us to run code regardless of whether the promise resolved or rejected. It can be used to run tasks that we had to call in both the .then and .catch paths previously, saving us some code duplication.
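A typical use is cleanup that has to happen on both paths. The sketch below is illustrative only: runTask and the recorded event strings are hypothetical stand-ins for something like opening and closing a connection.

```javascript
const events = []

// Illustrative task that may fail; cleanup must run either way.
function runTask (shouldFail) {
  events.push('open connection')
  return Promise.resolve()
    .then(() => {
      if (shouldFail) throw new Error('task failed')
      return 'task result'
    })
    .finally(() => {
      // Runs after both fulfilment and rejection, without changing the outcome.
      events.push('close connection')
    })
}

const finallyDemo = runTask(true)
  .catch((error) => error.message)
  .then((outcome) => {
    console.log(outcome, events) // task failed [ 'open connection', 'close connection' ]
    return outcome
  })
```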

Using all of this in Practice

As we have just learned several tools and tricks to handle async, it is time to do some practice with fundamental control flows to make our code more efficient and clean.

Let’s take an example and write a route handler for our web app, where the request can be resolved after 3 steps: validateParams, dbQuery and serviceCall.

If you’d like to write them without any helper, you’d most probably end up with something like this. Not so nice, right?

// validateParams, dbQuery, serviceCall are higher-order functions
function handler (done) {
  validateParams((err) => {
    if (err) return done(err)
    dbQuery((err, dbResults) => {
      if (err) return done(err)
      serviceCall((err, serviceResults) => {
        done(err, { dbResults, serviceResults })
      })
    })
  })
}

Instead of the callback-hell, we can use promises to refactor our code, as we have already learned:

// validateParams, dbQuery, serviceCall return promises
function handler () {
  return validateParams()
    .then(dbQuery)
    .then(serviceCall)
    .then((result) => {
      return result
    })
}

Let’s take it a step further! Rewrite it to use the async and await keywords:

// validateParams, dbQuery, serviceCall are thunks
async function handler () {
  try {
    await validateParams()
    const dbResults = await dbQuery()
    const serviceResults = await serviceCall()
    return { dbResults, serviceResults }
  } catch (error) {
    console.error(error)
  }
}

It reads like “synchronous” code, but it still performs the async operations one after the other.

Essentially, a new callback is injected into the functions, and this is how async knows when a function is finished.

Takeaway rules for Node.js & Async

Fortunately, Node.js eliminates the complexities of writing thread-safe code. You just have to stick to these rules to keep things smooth:

As a rule of thumb, prefer async, because using a non-blocking approach gives superior performance over the synchronous scenario, and the async – await keywords give you more flexibility in structuring your code. Luckily, most libraries now have promise-based APIs, so compatibility is rarely an issue, and it can be solved with util.promisify should the need arise.

If you have any questions or suggestions for the article, please let me know in the comments!

In case you’re looking for help with Node.js consulting or development, feel free to reach out to us! Our team of experienced engineers is ready to speed up your development process, or educate your team on JavaScript, Node, React, Microservices and Kubernetes.

In the next part of the Node.js at Scale series, we take a look at Event Sourcing with Examples.

This article was originally written by Tamas Hodi, and was released on 2017, January 17. The revised second edition was authored by Janos Kubisch and Tamas Hodi and it was released on 2020 February 10.

Mammogram Analysis with AI and User-Friendly Interfaces

We are excited to share a new project we have been working on in collaboration with Hungary’s leading medical research university – Semmelweis. This project focuses on using artificial intelligence and image recognition technologies to improve the accuracy and efficiency of breast cancer screenings.

The Power of AI in Detecting Breast Cancer

Early detection and prevention are crucial in the fight against breast cancer, and recent advancements in technology have made it possible for healthcare workers to receive computer assistance in examining mammograms and identifying problematic areas. The integration of machine learning and image recognition technologies in the medical field has the potential to revolutionize breast cancer screening, making it more accurate and efficient. 

However, the widespread adoption of these artificial intelligence-based solutions will not be possible without good products with great user experience. A user-friendly interface will make it easier for healthcare workers to use these technologies and improve patient outcomes, making it a crucial component in the fight against breast cancer.

As a recognized leader in medical research, Semmelweis University has a long-standing reputation for producing cutting-edge advancements in the field. We are proud to have the opportunity to partner with such an esteemed institution and contribute to their ongoing efforts to improve medical outcomes and advance the field of medicine.

Implementing AI with a User-Friendly Interface

The primary objective of this project was to make the workings of the algorithm more visually accessible to medical professionals. The goal was to design a platform that could be run on any device with network connectivity, either through a standalone application or using Docker technology. This would allow us to demonstrate the algorithm’s capabilities on-site, making it easier for those considering its use to get a hands-on understanding of its capabilities. 

During the image processing and annotation stage, the application provides real-time feedback in the form of a visual animation and progress bar. This helps users keep track of the analysis as it progresses and gives them a sense of the speed and efficiency of the algorithm. Once all the images have been processed, the application highlights those that show an increased risk, based on the annotations, in an interactive gallery. This gallery provides a clear and easy-to-understand representation of the algorithm’s results, making it a valuable tool for both users and potential adopters.

The software is intentionally slowed down for presentation purposes. It’s much, much faster, of course.

The Algorithm: Using Faster R-CNN and VGG16

The image detection algorithm at the core of this project uses a state-of-the-art region-based deep convolutional neural network called Faster R-CNN. This powerful model was specifically designed for object detection and proved to be an effective tool for identifying problematic regions in mammograms. The base network used in the model was VGG16, a highly regarded 16-layer deep convolutional neural network that can be easily obtained from the PyTorch website. To make the algorithm even more effective, we further trained it to detect two different types of objects in mammogram images: benign and malignant lesions.

The output of the algorithm is not just a simple diagnosis, but a comprehensive report that includes a score reflecting the confidence level in the diagnosis for each detected lesion. The algorithm also generates a modified image that clearly highlights the locations of the detected lesions by overlaying bounding boxes on the original mammogram. This makes it easy for healthcare workers to understand and interpret the results of the analysis.

The Backend – Frontend Integration

The backend is responsible for managing the running of the algorithm and ensuring that the results are promptly sent to the frontend once the analysis of an image is completed. The input images are first sent to the frontend, where they are overlaid with a scanline animation, providing a visual indication that the analysis is underway. As soon as the results are available, they are displayed on the frontend as a simple red/green overlay and a small animation before transitioning to the next image. 

To avoid any potential performance issues, the algorithm is run in serial for all images, as running it in parallel would quickly cause problems when using only the CPU and system memory. However, the CUDA Python SDK provides the ability to automatically use the CPU if a dedicated GPU cannot be found, making it possible to use the algorithm even on basic devices, albeit with reduced efficiency.

When a suitable NVIDIA GPU is available, the algorithm can be run in larger batches, providing much faster results. To get the results back to the frontend, we used Socket.io, as it allows for real-time communication between the backend and frontend, allowing us to directly push data from the backend to the frontend as soon as the algorithm finishes processing an image.
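The serial processing and push-on-completion flow can be sketched in a few lines. This is not the project's code: Node's built-in EventEmitter stands in for the Socket.io connection so the example runs without dependencies, analyzeImage is a placeholder for the real model call, and the event names are assumptions.

```javascript
const { EventEmitter } = require('events')

// Placeholder for the real model call; returns a made-up score per image.
async function analyzeImage (imageName) {
  return { imageName, score: imageName.length / 100 }
}

// Processes images strictly one after another and pushes each result as soon as it is ready.
async function processSerially (imageNames, emitter) {
  for (const imageName of imageNames) {
    const result = await analyzeImage(imageName)
    emitter.emit('result', result)
  }
  emitter.emit('done')
}

// EventEmitter is a stand-in for the Socket.io connection to the frontend.
const emitter = new EventEmitter()
const received = []
emitter.on('result', (result) => received.push(result.imageName))

const processingDemo = processSerially(['scan-a.png', 'scan-b.png'], emitter)
  .then(() => console.log(received)) // [ 'scan-a.png', 'scan-b.png' ]
```

Awaiting inside the for...of loop is what keeps the analysis serial; mapping the images to promises and using Promise.all would start them all at once instead.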

The images that have a confidence score below a certain threshold are considered “negative,” indicating that they are likely healthy. These images are presented in a distinctive way, with a small scale-out-scale-in animation. This animation is achieved through the use of CSS animation, utilizing the scale transform function. 
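The thresholding step might look like the sketch below. The 0.5 cut-off, the function name classifyImage and the 'flagged' label are assumptions made for illustration; the actual threshold is not given here.

```javascript
// Assumed cut-off; the real threshold is not stated in the article.
const NEGATIVE_THRESHOLD = 0.5

// Labels an image by the highest lesion confidence score found on it.
function classifyImage (lesionScores, threshold = NEGATIVE_THRESHOLD) {
  const topScore = lesionScores.length ? Math.max(...lesionScores) : 0
  return topScore < threshold ? 'negative' : 'flagged'
}

console.log(classifyImage([0.12, 0.31])) // negative
console.log(classifyImage([0.12, 0.87])) // flagged
console.log(classifyImage([]))           // negative
```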

Deploying the Application with Docker

The entire application is packaged in a docker image, making it more accessible and easier to run and distribute. This approach has a number of advantages, one of which is the ability to deploy the application to a cloud service, which opens up the possibility of accessing it from anywhere with an internet connection. 

However, it is important to consider the architecture of the system you will run the application on, as this can impact which base image you will choose. For example, on M1 Macs, arm64 images are required, and attempting to run other images may result in errors. By utilizing a docker image, the project benefits from the portability and compatibility provided by this technology, allowing for seamless deployment and usage across different platforms.

Node, Express and React under the hood

In the implementation of this project, we chose to utilize a combination of Node.js and Express for the backend and React for the frontend. This choice was made based on the strengths and capabilities of these technologies, which provided the ideal foundation for the application’s needs. However, it is worth noting that this design may not be the only possible solution, and the application could also be implemented as an Electron app, which is a popular framework for creating desktop applications. This versatility highlights the flexibility of the project and its ability to adapt to different environments and technologies. The key is to find the right tool for the job, and in this case, we found that Node.js, Express, and React provided the optimal solution for our needs.

How RisingStack can help with your AI project

As businesses and corporations look to harness the power of Artificial Intelligence, there’s a growing demand for software development companies that can help implement these AI models and create web-based user interfaces to accompany them. That’s where RisingStack can help you. 

In the past couple of months, we created several custom AI solutions for businesses and institutions of all sizes. To name a few:

  • Using AI to automatically generate product names and descriptions for webshop engines.
  • Creating easy-read text for children with disabilities.
  • Sentiment analysis and automatic answers in the hospitality industry.
  • Pinpointing breast cancer using neural networks.

Whether you’re looking to create a custom AI model to help streamline your business processes, or you’re looking to build a web-based UI that provides users with a more engaging and interactive experience, we have the skills and expertise to help you achieve your goals.

Get in touch with us to learn more about how we can help you implement AI models and create web-based UIs that drive results for your business.

RisingStack News (formerly Microservice Weekly) – a hand-curated newsletter

Microservice Weekly, our newsletter about microservices, had a great run. For years, we carefully curated the best sources we could find and sent out more than 160 issues to those who wanted to stay up-to-date in the complex world of microservices.

Here’s a sneak peek if you want to take a look.

It became more and more difficult for us to keep up the quality of the newsletter, so we knew we had to do something differently.

The RisingStack newsletter, our main one, had a much larger subscriber base, but it was mainly focused on our own content and experiences. While it was great to see that so many people were interested in what we publish, we also wanted to offer something to those who’d want to read a wide variety of sources.

So we decided to combine the two – introducing RisingStack News!

If you are here because you’re looking for the best content out there about microservices, from now on you’ll find it in RisingStack News, our main newsletter. There you can also see the best articles that our team hand-curates about Node.js, DevOps, and Kubernetes.

Make sure to subscribe if you’re interested – we hope to see you there!

Do your engineers do what you think they do?

As organizations grow and evolve, they need to adapt to the challenges of managing more and more people and respond to the changing landscape they operate in. However, growing the team without keeping developer satisfaction in mind can lead to long-time staff members becoming jaded and burned out. This leads to a high turnover rate as well as valuable knowledge becoming unavailable over time. 

Recently, we were approached by a leading European online travel agency who recognized these symptoms so common to scale-ups. Initially, they hired us to advise them on technical and architectural questions, but we ended up providing organizational and engineering process-related consultation too, which helped them improve their development processes and get rid of some of their unrecognized communication problems.

In the following case study we outline the work done by RisingStack, with the intention of helping IT decision makers through sharing our experiences.

Architectural Consulting

The first part of our assignment centered around the introduction of microservices as well as common modern practices such as automated testing improvements, CI/CD workflow, code reviews, and such. 

Whenever it comes to fundamentally overhauling an architecture, we need to look deeper to see what problems lurk behind these proposed changes and whether undertaking this shift is achievable. To uncover issues that might hinder our plans, we agreed with our client to thoroughly investigate the situation by: 

  1. Getting a full picture of the technical aspects of day-to-day development: the codebase, the development cycle, and the deployment process.
  2. Understanding the organization and the flow of information within it: 
    • How are the long-term development roadmaps created?
    • How are they translated into epics and stories?
    • What is the flow of information from where the idea of a new feature or general functionality is conceived to the actual development of it?
    • How do developed features get accepted?

First, we assessed the current state of the codebase and deployments. Both of them were generally well thought out, but two things stood out:

  1. The majority of lines changed in the past couple of years belonged to a handful of people despite the organization employing around 20 developers.
  2. A lot of code had not been touched for years, while other parts of the code essentially solved the same problems that had been worked on in the recent past.

The first point hinted at the heavy reliance on silos in the organization, while the second one showed symptoms of poor communication.

Understanding How the Organization Works

Next, we started conducting interviews with all related parties, beginning with the C-level executives and directors. We needed to get a sense of the organizational structures we were dealing with. According to them, the company was split up into 5 departments:

  • Finance
  • Sales / Business Development
  • Marketing
  • Development
  • Customer service

As all five departments could submit development needs, they maintained the product’s roadmap together. Based on the roadmap, Product Owners supervised the planning of new features and functionalities. The plans were mostly created by the UX team. These were chunked up into development projects carried out by the development teams using the agile methodology. Teams were also aided by a Project Manager whose job was to ensure the development was progressing according to the plan. Or at least that was the shared belief of how the organization works.

However, nicely drawn organizational diagrams seldom reflect the day-to-day reality of teamwork. To discover the actual, informal organizational structure, we conducted interviews with every team and every single person within the company who had anything to do with the product. Beyond C-level execs and directors, we also interviewed product and project managers, tech leads, developers, QA engineers, and manual testers.

Uncovering the Informal Flow of Knowledge

During these interviews we were able to understand the history of this company’s development, and how growth affected their way of conducting business.

Initially, the company functioned like a small startup. With only a couple of developers, people could just nudge each other in the office, ask for something to be done, and deliver features. If a request was overheard by another stakeholder, it might have been challenged or deprioritized. The company worked completely informally, yet very efficiently. No lengthy meetings, no unnecessary planning, just people deciding how to spend their time most effectively to achieve common goals. This worked perfectly while there were only a handful of stakeholders in the organization. However, as the company grew, it needed to get organized, and thus the 5 departments emerged. Still, the leaders of said departments could easily reach the developers if they needed something.

Our client’s product had key advantages compared to its Central European competitors, which made it the undisputed leader of the market in this geographical area. These specific features could only be developed by one person at a time, so the other developers were helping out where they could, to make themselves useful in the meantime.

The channels of favors were then formalized: requests that had previously been mere favors became mandatory tasks, and those who asked for these favors in turn became supervisors to developers.

By 2017 the product was basically done: no development happened on its core besides marketing campaigns or sales-deal-related updates. However, these sales-related requests became more and more ambitious, requiring more and more work, so our client started hiring more and more developers. This was necessary for another reason too: developers started leaving the company, leaving just a handful of old-time developers in the organization, only 2 of whom actively wrote code. The others had been appointed to leadership roles. However, more people did not result in faster delivery times, and people had to crunch more and more to meet their targets, which were missed by bigger and bigger margins.

On the business side, our client’s product quickly outgrew its initial national market, so the company needed to look for other areas in Central Europe. They could maintain growth by acquiring other businesses, thus innovation was not key to their success for a couple of years. However, through these acquisitions, the company’s application portfolio became so fragmented that it became unwieldy to manage: a large overhaul was necessary to unify these acquired products, which led to the need for more organized development.

At this point, in 2019, they hired a consultant to help them rectify their processes and update their organizational structure, but as usual, they only got to the formal level: leaders were consulted, a nice structural diagram was drawn, development teams were created, but the informal channels of favors-later-task-requests remained.

Aligning the formal and informal way of working

While in theory there was a roadmap to follow, and there were people appointed for the supervision of different aspects of the work, the day-to-day reality was pretty far from the formally defined organization. It became clear that we needed to find the necessary missing pieces that prevented the reorganized structure from taking hold.

During the interviews, different people raised seemingly unrelated issues, but in the end three overarching themes emerged:

  1. The lack of knowledge of the actual application within the company
  2. The lack of a company-wide definition of done
  3. The exclusion of developers from setting deadlines

The first was a tricky one: because everyone had some sense of what the application was capable of, it was never stated explicitly, except to newcomers. When we got a tour of the application, it seemed thorough and complete. However, during group interviews people kept correcting each other on specific edge cases they needed to handle. So while we got a static picture of the application, when we went back to ask about specific capabilities, it became clear that the knowledge had been lost, mostly because the person responsible for a given functionality had already left.

However, manual testers had a pretty good grasp of both the general functionalities and the edge cases. It also turned out during the interviews that they maintained a sort of “User Manual” of the product that was a couple of hundred pages long and that no one ever read. Thus, to make this knowledge available during the planning of given functionalities, we suggested that the relevant manual testers should be part of the planning and design process, so that no contradicting or superfluous features would be commissioned.

The second turned out to be the missing keystone of the organizational shift. While each team and each organizational unit had its own definition of done, there was no company-wide one. As a result, the development practices and expectations of different actors were completely out of sync with each other. This also meant that developers initially had no easy way of pushing back on those who used to ask for favors or later created tasks for them, as it would have required too much explaining. Not being able to push back on requests resulted in the old ways trumping the organizational structure. We advised all departments and business leaders to agree on a definition of done.

Lastly, while everyone thought that developers were included in the planning of projects, this usually happened after the commitment had been made to partners or clients. This resulted in unrealistic expectations towards the developers and did not give them enough room to make these one-off solutions scalable and maintainable. We advised updating the deal-making process to include consultation with Development before taking on any commitments.

Did it work?

We followed up on the progress 3 months later. While the involvement of testers in planning met some initial pushback from the department heads, they were included in the design process, which almost immediately resulted in better alignment between the teams. Testers finally became officially appointed as “keepers of truth” regarding the product as it is, creating a mental connection for questions that seemed unanswerable before.

To put it simply: no one had ever considered asking the question “how does this actually work now”, besides consulting the codebase, and even if they had, manual testers never came to mind. This led to better information sharing and collaboration between the teams, as they were freed from the toil of investigating and rediscovering modes of operation of the application.

The definition of done was a more pronounced success, as the head of development and the development teams had previously agreed on a fairly standard development flow:

development <-> unit testing <-> code review -> QA  -> done
⌃                                                       ⌃

This previously agreed upon flow was never implemented due to the shortcomings detailed above, but after our consultation, it finally came through.

Taking the time to explain the process to other departments, having them accept that a feature is not done upon the first completion of the code, and letting them understand the need for QA made it easier to push back on favors and allowed the organization to take a shape that was closer to the formal one.

However, this would not have been possible without reducing crunches. By bringing developers closer to the actual deals, consulting them on what can be done easily at the moment and what needs extensive development, and incorporating these investigations and their current development schedule into the deal-making process (with the added benefit that they could rely on the knowledge accumulated by manual testers), crunches became rarer, deadlines became more realistic and were missed less often, and the business side became more satisfied with the development team’s work too.

Do you face the same issues?

If you found yourself nodding along while reading our case study, then you hopefully know that you’re not alone with your problems, and that there are ways to address them. Let us know in the comments about your experiences. If you’re looking for consultants to help solve your issues, feel free to reach out to us.

History of JavaScript on a Timeline

In the mid-1990s, Brendan Eich was working at Netscape Communications Corporation. The browser needed a scripting language for web pages that would be easy to use, so he created one. It eventually became known as JavaScript. And the rest, as they say, is history.

In this blog post, we’ll take a look at the history of JavaScript on a timeline. We’ll see how it has evolved over the years and what new features have been added along the way. So sit back and enjoy learning about one of the most popular programming languages in the world!

1994-1998: The Netscape era

  • On December 15, 1994, Netscape Communications Corporation released the Netscape Navigator 1.0 web browser.
  • Brendan Eich created the very first version of JavaScript, codenamed “Mocha”, then later (still internally) renamed to LiveScript
  • “Netscape and Sun announce JavaScript, the open, cross-platform object scripting language for enterprise networks and the internet”
  • Microsoft introduced JScript in Internet Explorer to compete with Netscape.
  • Netscape 2 was released with JavaScript 1.0
  • Netscape submitted JavaScript to Ecma International, as the starting point for a standard specification.
  • Official release of the first ECMAScript language specification.

1999-2007: The showdown of Internet Explorer VS Mozilla Firefox

  • Microsoft releases Internet Explorer 5, which uses even more proprietary technology than before.
  • ECMAScript 2: Editorial changes to align ECMA-262 with the standard ISO/IEC 16262
  • ECMAScript 3: do-while, regular expressions, new string methods (concat, match, replace, slice, split with a regular expression, etc.), exception handling, and more
  • Firefox is released to compete with Internet Explorer.
  • Jesse James Garrett released a white paper in which he coined the term Ajax.

2008-2012: Netscape died, and Google Chrome was created

  • Netscape Navigator: end of life
  • ECMAScript 4 is officially abandoned.
  • Google releases the Chrome browser, the fastest web browser at the time.
  • Node.js was created by Ryan Dahl
  • ECMAScript 5 (formerly ECMAScript 3.1), which adds a strict mode, getters and setters, new array methods, support for JSON, and more.
  • TypeScript: a language for application-scale JavaScript development

2013-2014: from ASM.js to WebAssembly

  • ASM.js has been released
  • React, a JavaScript library for building user interfaces
  • “Disable Javascript” option removed in Firefox 23
  • Facebook Launches Flow, Static Type Checker for JavaScript

2015-2020: the rise of Node.js

  • Introduction of the Node.js Foundation
  • ECMAScript 6 (ES2015) is released.
  • WebAssembly
  • Object.observe withdrawn from TC39
  • Microsoft Edge’s JavaScript engine to go open-source
  • ECMAScript 2016 Language Specification
  • ECMAScript 2017 Language Specification
  • ECMA TC39: “SmooshGate” was officially resolved by renaming flatten to flat
  • ECMAScript 2018 Language Specification
  • JavaScript is now required to sign in to Google
  • ECMAScript modules in Node.js
  • ECMAScript 2019 Language Specification
  • QuickJS JavaScript Engine

2020-2022: Deno is created and Internet Explorer is officially retired

  • Deno: initial release
  • ECMAScript 2020 Language Specification
  • ECMAScript 2021 Language Specification
  • Deno joins TC39
  • Internet Explorer 11 has retired and is officially out of support