Integrating legacy and CQRS

The architecture pattern CQRS suggests an application structure that differs significantly from the approach commonly used in legacy applications. How can the two worlds still be integrated with each other?

The full name of the design pattern CQRS is Command Query Responsibility Segregation. The name describes the core of the pattern: separating the actions and queries of an application at the architectural level. While the actions, called commands, change the state of the application, queries are responsible for reading the state and transferring it to the caller.

As they complement each other well, CQRS is often combined with the concepts of DDD (domain-driven design) and event-sourcing. Events play an important role in this context, as they report the facts that have happened within the application. To learn about these concepts and how they interact, there’s a free brochure on DDD, event-sourcing and CQRS written by the native web that you might be interested in.

The consistent separation of commands as actions and events as reactions leads to asynchronous user interfaces, which confront the developer with special challenges. For example, the question of how to deal with (asynchronous) errors becomes interesting if you don’t want to make the user wait regularly in the user interface until the event matching the sent command has been received.

Legacy systems rarely work according to CQRS

On the other hand, there are countless legacy applications that are practically always based on architecture patterns other than CQRS. The classic three-layer architecture with CRUD as the method for accessing data is particularly common. However, this often leads to unnecessarily complex, monolithic applications in which CRUD keeps being cultivated, even though it stopped being sufficient after a short time.

Unfortunately, the integration possibilities of such applications are, as expected, poor. Even web applications have often been developed without APIs, since no value was attached to them and the technologies in use encouraged this limited field of vision. From today’s point of view this seems irresponsible, but over the years and decades it was an accepted procedure. The sad thing about it is that the development towards networked applications and services has been going on for many years, yet too many developers and companies have deliberately ignored it.

The price to be paid for this is today’s legacy applications, which have no APIs and whose integration possibilities are practically non-existent. It can therefore be stated that a modern service-based architecture based on CQRS differs fundamentally from what has been implemented in most cases in the past. On top of that comes the lack of scalability of applications based on a three-tier architecture.

Developing in the greenfield

Unfortunately, legacy applications don’t just disappear into thin air, which is why in many cases you have to live with them and come to an arrangement. The only exception is greenfield development, in which an application is redeveloped completely from scratch, without having to take legacy constraints into account. However, this strategy is dangerous, as the well-known entrepreneur Joel Spolsky describes in his blog entry Things You Should Never Do, Part I, which is very much worth reading.

In the actual case of a greenfield development, the question that arises is at most that of the suitability or necessity of CQRS. A guide to this can be found in When to use CQRS?!. It also needs to be clarified whether CQRS can usefully be supplemented with domain-driven design and event-sourcing. At this point, however, the simple part already ends, because the scenario of a greenfield development is only ever simple precisely because there are no dependencies on the past.

Even the simple case of completely replacing an existing system with a new development raises complicated questions when the new application is based on CQRS. In practice, the separation of commands and queries in CQRS often leads to a physical separation of the write and read sides, which corresponds to the use of two databases. While one contains normalized data and serves the purpose of ensuring consistency and integrity when writing, the other contains denormalized data that is optimized for reading.
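
To illustrate this split, the following sketch uses in-memory stores in place of the two databases and a made-up nodePinged event; it only shows the principle of projecting events from the write side into a denormalized read model:

// Sketch only: in-memory stand-ins for the two databases and a made-up
// 'nodePinged' event, to show how the write side records facts while a
// projection keeps a denormalized, read-optimized view up to date.
const writeSide = [];       // normalized side: the recorded facts
const readSide = new Map(); // denormalized side: views prepared for reading

function project (event) {
  if (event.name === 'nodePinged') {
    const view = readSide.get(event.aggregateId) || { pings: 0 };
    view.pings += 1;
    view.lastPingedAt = event.timestamp;
    readSide.set(event.aggregateId, view);
  }
}

function record (event) {
  writeSide.push(event); // write with consistency and integrity in mind...
  project(event);        // ...then update the read-optimized view
}

record({ name: 'nodePinged', aggregateId: 'node-1', timestamp: Date.now() });

// Queries read the prepared view directly and never touch the write side.
console.log(readSide.get('node-1')); // { pings: 1, lastPingedAt: ... }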

If you want to replace an existing application, you have to think about how to migrate the legacy data. It is obvious that this is not easy when switching from a CRUD-based, classic, relational database to two databases, each fulfilling a specific task. It is therefore necessary to analyze the existing data in detail, structure it and then decide how it can be mapped to the new databases without having to compromise on CQRS.

The database as an integration point

However, it becomes really difficult when the old and the new application have to coexist and be integrated with each other because, for example, a replacement is only to take place gradually. Another reason for this scenario is adding a new application to an existing one without replacing the old one at all. How can CQRS be integrated with legacy applications in these cases?

One obvious option is integration via the database. This can work for applications based on the classic CRUD model, but it is inconvenient for CQRS, because the problem of differing data storage is relevant here as well. In this case, the mapping is even more difficult, since not only must the existing semantics be mapped to the new ones, but the new ones must also continue to work for the existing application.

In addition, there are general concerns that need to be mentioned independently of the applications’ architecture. These include, in particular, side effects regarding referential integrity, which can quickly trigger a boomerang effect. Moreover, the applications are only seemingly decoupled from each other, as the effects of future changes to the data schema are amplified. Another point that makes integration via the database more difficult is the frequently lacking documentation of extensive and complex schemas.

Moreover, since the database was rarely planned as an integration point, direct access to it usually feels wrong. After all, whoever accesses the database directly bypasses all the domain concepts, tests and procedures that are implemented in the application and exist in the database only as implicit knowledge. The procedure must therefore be regarded as extremely fragile, particularly from a domain point of view.

Another point of criticism about integration via the database is the lack of possibilities for the applications to actively inform each other about domain events. This could only be solved with a pull procedure, which can generally be regarded as a bad idea due to poor performance and high network load. In summary, it becomes clear that integrating a CQRS application with a legacy application via the database is not a viable approach.

APIs instead of databases

An alternative is integration via an API. As already explained, it can be assumed that very few legacy applications have a suitable interface. However, this does not apply to the new development. Here it is advisable to have an API from the beginning – anything else would be grossly negligent in the 21st century. Typically, such an API is provided as a REST interface based on HTTPS or HTTP/2. Plain, i.e. unencrypted, HTTP can be regarded as outdated for a new development.

If you add approaches such as OpenID Connect to such a web API, authentication is easy too. The result is an interface based on an open, standardized and platform-independent protocol. This simplifies the choice of technology, since a chosen technology only has to work for its respective context and no longer represents a system-wide constraint.

With the help of such an API, commands can easily be sent to the CQRS application, and executing queries is just as simple. The two operations correspond to HTTP requests based on the verbs POST and GET. The situation is much more difficult if events also need to be supported in addition to commands and queries. The HTTP API is then required to transmit push messages, but the HTTP protocol was never designed for this purpose. There are several workarounds, but none of them works completely satisfactorily.

How to model an API for CQRS?

There are countless ways to model the API of a CQRS application. For this reason, some best practices that can be used as a guide are helpful. In the simplest case, an API with three endpoints that are responsible for commands, events and queries is sufficient.

The npm module tailwind provides a basic framework for applications based on CQRS. The approach used there can easily be applied to technologies other than Node.js, so that a cross-technology, compatible standard can be created.

For commands there is the POST route /command, which is only intended for receiving a command. Therefore, it acknowledges receipt with the HTTP status code 200, but this does not indicate whether the command could be processed successfully or not. It just arrived. The format of a command is described by the npm module commands-events.

A command has a name and always refers to an aggregate in a given context. For example, to perform a ping, the command could be called ping and refer to the aggregate node in the context network. In addition, each command has an ID and the actual user data stored in the data block. The user property is used to append a JWT token to enable authentication at command level. Metadata such as a timestamp, a correlation ID and a causation ID complete the format:

{
  "context": {
    "name": "network"
  },
  "aggregate": {
    "name": "node",
    "id": "85932442-bf87-472d-8b5a-b0eac3aa8be9"
  },
  "name": "ping",
  "id": "4784bce1-4b7b-45a0-87e4-3058303194e6",
  "data": {
    "ttl": 10000
  },
  "custom": {},
  "user": null,
  "metadata": {
    "timestamp": 1421260133331,
    "correlationId": "4784bce1-4b7b-45a0-87e4-3058303194e6",
    "causationId": "4784bce1-4b7b-45a0-87e4-3058303194e6"
  }
}
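
As a sketch, sending the command shown above could look like this from a client (assuming Node.js 18+ for the global fetch in an ESM module; the host is a placeholder, and the actual tailwind API may differ in detail):

// Hypothetical client sketch: send the command above to the /v1/command
// route. Assumes Node.js 18+ (global fetch) in an ESM module; the host
// is a placeholder.
const command = {
  context: { name: 'network' },
  aggregate: { name: 'node', id: '85932442-bf87-472d-8b5a-b0eac3aa8be9' },
  name: 'ping',
  id: '4784bce1-4b7b-45a0-87e4-3058303194e6',
  data: { ttl: 10000 },
  user: null // a JWT token could be appended here for command-level auth
};

const res = await fetch('https://api.example.com/v1/command', {
  method: 'POST',
  headers: { 'content-type': 'application/json' },
  body: JSON.stringify(command)
});

// A 200 only acknowledges receipt; whether the command is processed
// successfully is reported later, asynchronously, by an event.
console.log(res.status);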

The route /read/:modelType/:modelName is used to execute queries, and it is also addressed via POST. The name of the resource to be queried and its type must be specified as parameters. For example, to get a list of all nodes from the previous example, the type would be list and the name would be nodes. The response is obtained as a stream in ndjson format. This is a text format in which each line represents an independent JSON object, which is why it can be parsed easily even while still streaming.
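
A sketch of how such a streamed ndjson response could be consumed line by line (again assuming Node.js 18+; the host is a placeholder):

// Hypothetical sketch: run the query /v1/read/list/nodes and parse the
// ndjson response while it is still streaming.
const res = await fetch('https://api.example.com/v1/read/list/nodes', {
  method: 'POST'
});

let buffered = '';

for await (const chunk of res.body) {
  buffered += Buffer.from(chunk).toString('utf8');

  const lines = buffered.split('\n');
  buffered = lines.pop(); // keep a possibly incomplete trailing line

  for (const line of lines) {
    if (line.trim().length > 0) {
      console.log(JSON.parse(line)); // one independent JSON object per line
    }
  }
}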

Finally, the route /events is available for events, which must also be called via POST. The call can be given a filter, so that the server does not send all events. The ndjson format is also used here – in contrast to executing queries, the connection remains permanently open so that the server can transfer new events to the client at any time. The format of the events is similar to that of the commands and is also described by the module commands-events.
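
Subscribing to events works in the same way, except that the connection stays open and a filter can be passed along (the filter shape shown here is an assumption):

// Hypothetical sketch: subscribe to /v1/events with a filter, so the
// server only delivers matching events. The connection stays open, and
// parsing the ndjson stream works exactly as in the query sketch above.
const events = await fetch('https://api.example.com/v1/events', {
  method: 'POST',
  headers: { 'content-type': 'application/json' },
  body: JSON.stringify({ name: 'pinged' }) // filter; exact shape may differ
});

// ...iterate over events.body as shown above; each line is one event.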

All these routes are bundled under the endpoint /v1 to have some versioning for the API. If you want to use websockets instead of HTTPS, the procedure works in a very similar way. In this case, too, the module tailwind describes how the websocket messages should be structured.
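
On the server side, the three routes can be outlined roughly as follows (an Express-based sketch for illustration, not the actual tailwind implementation):

// Rough Express-based outline of the three endpoints, bundled under /v1;
// the actual tailwind implementation differs in detail.
const express = require('express');

const app = express();
app.use(express.json());

app.post('/v1/command', (req, res) => {
  // hand the command over for asynchronous processing...
  res.status(200).end(); // ...and acknowledge no more than its receipt
});

app.post('/v1/read/:modelType/:modelName', (req, res) => {
  res.type('application/x-ndjson');
  // stream the requested view, one JSON object per line, then close
  res.write(JSON.stringify({ id: 'node-1' }) + '\n');
  res.end();
});

app.post('/v1/events', (req, res) => {
  res.type('application/x-ndjson');
  // keep the connection open and push each new event as one ndjson line
});

app.listen(3000);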

Selecting a transport channel

To transfer push data, the most sustainable approach is still long polling, although it is admittedly quite dusty. The concept of server-sent events (SSE) introduced with HTML5 solves the problem elegantly at first glance, but unfortunately there is no way to transfer certain HTTP headers, which makes token-based authentication difficult, if not impossible. JSON streaming, in turn, works fine in theory and solves the problems mentioned above, but fails because today’s browsers do not handle real streaming, which, depending on the number of events, gradually leads to a shortage of available memory. The streams API promised for this purpose has been under development for years, and there is no end in sight.
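
For illustration, a minimal long-polling loop could look like this (the endpoint, and the assumption that the server holds each request open until it has events or times out, are made up):

// Minimal long-polling sketch: the client re-requests immediately after
// each response, and the server holds every request open until it has
// events to deliver or a timeout expires. Endpoint and response shape
// are assumptions.
async function pollForever () {
  for (;;) {
    const res = await fetch('https://api.example.com/v1/poll-events', {
      method: 'POST'
    });

    if (res.status === 200) {
      for (const event of await res.json()) {
        console.log(event);
      }
    }
    // on a timeout without events (e.g. 204), simply loop and ask again
  }
}

pollForever();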

Often, websockets are mentioned as an alternative, but they are only supported by newer platforms. Since this case is explicitly about integration with legacy applications, it is questionable to what extent they support the technology. Provided that the retrieval is carried out exclusively on the server side and a platform with good streaming options is available, JSON streaming is probably the best choice at present.

Irrespective of the type of transport chosen, the basic problem remains that access to the CQRS-based application can only be granted from the legacy application, since no API is available for the other way around. But even if you ignore this disadvantage, there are other factors that make the approach questionable: fragile connections that can only be established and maintained temporarily may cause data to be lost during offline phases. To prevent this, applications need a concept for handling offline situations gracefully. This, in turn, is unlikely to be expected in legacy applications.

A message queue as a solution?

Another option is to use a message queue, which is a common procedure for integrating different services and applications. Usually, it is mentioned as a disadvantage that the message queue would increase the complexity of the infrastructure by adding an additional component. In the present context, however, this argument only applies in exceptional cases, since CQRS-based applications are usually developed as scalable distributed systems that use a message queue anyway.

There are different protocols for message queues. For the integration of applications, AMQP (Advanced Message Queueing Protocol) is probably the most common solution, supported by RabbitMQ and others. As this is an open standard, there is a high probability of finding an appropriate implementation for almost any desired platform.
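
As a sketch, exchanging messages with RabbitMQ via the amqplib npm module could look like this (the queue name, URL and message shape are assumptions):

// Sketch of AMQP-based integration using the amqplib npm module; the
// queue name, URL and message shape are assumptions for illustration.
const amqp = require('amqplib');

async function main () {
  const connection = await amqp.connect('amqp://localhost');
  const channel = await connection.createChannel();
  await channel.assertQueue('integration', { durable: true });

  // one side publishes...
  channel.sendToQueue(
    'integration',
    Buffer.from(JSON.stringify({ name: 'ping', nodeId: 4711 })),
    { persistent: true }
  );

  // ...the other side consumes and acknowledges successful handling, so
  // that the queue can redeliver in case of failure.
  await channel.consume('integration', message => {
    console.log(JSON.parse(message.content.toString()));
    channel.ack(message);
  });
}

main();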

A big advantage of message queues is that the exchange of messages works bidirectionally. If an application can establish a connection, it can use the message queue as both sender and receiver, so that not only can the legacy application send messages to the new application, but also vice versa. Another advantage is that message queues are usually designed for high availability and unstable connections. They therefore take care of retrying failed deliveries and guarantee delivery to a certain extent.

From a purely technical point of view, message queues can therefore be regarded as the optimal procedure that solves all problems. However, this does not apply from a domain point of view, because this is where the real problems begin, which are completely independent of the underlying transport mechanism. Since two applications are to be integrated with each other, it is also necessary to integrate different data formats and, above all, different domain languages. For example, the legacy application can work with numeric IDs, while the CQRS application can work with UUIDs, which requires bidirectional mapping at the border between the systems.
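
Such a bidirectional mapping could be sketched as follows (held in memory here for brevity; in practice it would have to be persisted at the border between the systems):

// Sketch of a bidirectional ID mapping at the system border: the legacy
// application works with numeric IDs, the CQRS application with UUIDs.
// In practice this table would have to be persisted, not held in memory.
const { randomUUID } = require('crypto');

const numericToUuid = new Map();
const uuidToNumeric = new Map();

function toUuid (numericId) {
  if (!numericToUuid.has(numericId)) {
    const uuid = randomUUID();
    numericToUuid.set(numericId, uuid);
    uuidToNumeric.set(uuid, numericId);
  }
  return numericToUuid.get(numericId);
}

function toNumericId (uuid) {
  return uuidToNumeric.get(uuid);
}

console.log(toUuid(4711));              // a freshly generated UUID
console.log(toNumericId(toUuid(4711))); // 4711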

Mapping contexts between applications

In the linguistic field, this can be particularly difficult if domain concepts are not only given different names, but are even delineated differently. Finding a common language is difficult enough in a small interdisciplinary team – how much more difficult is it when the modeling of the two languages took place independently, in different teams, separated by years or decades? The real challenge is to reconcile the semantics of the two applications and to develop semantically suitable adapters.

This is done using context mapping, i.e. mapping one language to another at the border between two systems. Since the two systems are separate applications in this case, it makes sense to implement the context mapping in adapters that run as independent processes between the applications. This is where the use of a message queue pays off, since neither the two applications nor the adapter need to know each other. It is sufficient if each of the three components involved has access to the message queue and can send and receive messages.

In simple cases, an adapter is nothing more than a process that reacts to incoming messages by translating the attached data into the target domain language and sending a new message, in accordance with the if-this-then-that concept. For long-lasting, stateful workflows, however, this procedure is not enough, since the decision about which message to send can no longer be made on the basis of the incoming message alone. The history is also required, for example, in order to be able to place the received message in a context.
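
In code, such a stateless adapter boils down to a single translation step per message, as in the following sketch (all message and field names are made up; toUuid refers to the ID-mapping sketch above):

// Sketch of a stateless if-this-then-that adapter: each incoming legacy
// message is translated into the CQRS application's domain language and
// forwarded. All message and field names are made up; toUuid refers to
// the ID-mapping sketch above.
function translate (legacyMessage) {
  switch (legacyMessage.type) {
    case 'CUSTOMER_CREATED':
      return {
        context: { name: 'crm' },
        aggregate: { name: 'customer', id: toUuid(legacyMessage.customerNo) },
        name: 'register',
        data: { fullName: legacyMessage.name }
      };
    default:
      return undefined; // unknown messages are not forwarded
  }
}

// Wiring (see the AMQP sketch above): consume from the legacy queue,
// translate, and publish the result to the CQRS application's queue.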

In this case, it is advisable to implement an adapter as a state machine, whereby the incoming messages are the triggers for different state transitions. However, this means that the adapter also needs persistence and must be designed for high availability. When modeling states and transitions, complexity increases rapidly if all potential variants are considered.
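
A state machine adapter can be sketched as a transition table; the workflow, states and message names below are invented for illustration, and the state is kept in memory only for brevity:

// Sketch of an adapter as a state machine: incoming messages trigger
// state transitions, and the current state decides how to react. The
// workflow, states and message names are invented; a real adapter would
// have to persist its state.
const transitions = {
  awaitingOrder:   { orderPlaced:   'awaitingPayment' },
  awaitingPayment: { paymentBooked: 'done' }
};

let state = 'awaitingOrder';

function handle (message) {
  const next = (transitions[state] || {})[message.name];

  if (!next) {
    // only *detect* the error state and hand it over to an expert,
    // instead of trying to resolve it automatically
    console.error(`unexpected ${message.name} in state ${state}`);
    return;
  }

  state = next;
  console.log(`transitioned to ${state}`);
}

handle({ name: 'orderPlaced' });   // -> awaitingPayment
handle({ name: 'paymentBooked' }); // -> done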

In order to keep the complexity of the adapters manageable, it is advisable to initially consider only the regular case in which the workflow is processed successfully, and merely to detect error states – without processing them automatically. In the simplest case, it may be sufficient to send a message to an expert, who can then take care of the workflow’s state by hand. It always helps to keep in mind that context mapping is, in large part, a domain problem and not a technical one, and should therefore be solved on the domain level.

Who knows the truth?

Finally, there is the fundamental question of who knows the ultimate truth and has the last word in case of doubt. Do the data and processes of the existing application take priority, or is the CQRS application granted sovereignty over the truth? If the CQRS application works with event-sourcing, it is advisable to give it preference, since event-sourcing enables extremely flexible handling of the data, which is far superior to the existing CRUD approach.

However, it is not possible to answer this question in general terms, since it ultimately depends on the individual situation. In any case, it is important to consider the question of conflict resolution and to clarify how to deal with contradictions in data and processes. But this, too, is a domain problem and not a technical one.

In summary, message queues and APIs are the only way to integrate legacy and CQRS applications in a clean way. The major challenges are not so much technical as domain-related in nature, and can hardly be solved sustainably without the advice of the respective experts. The long time that has passed since the legacy application was developed can make this harder. There is hope in the fact that the domain tends to be less subject to change than the technology used, although this depends very much on the domain in question.

This article was written by Golo Roden. The author’s bio:
“Founder and CTO of the native web. Prefers JS & Node.js, and has written the first German book on this topic, Node.js & Co. He writes for various IT magazines, and manages several conferences.”
