Saturday, April 18, 2020

Reactive Architecture: Distributed Messaging Patterns

According to the Reactive Manifesto, a critical requirement for any Reactive system is it to be message driven. Message driven systems communicate through asynchronous, non-blocking messages. Such messages help us to build systems that are both resilient, and elastic, and therefore responsive under a variety of situations. But when we choose to build systems using asynchronous, non-blocking messages, there are consequences.

Message Driven Architecture

In a message driven architecture, components communicate using asynchronous non-blocking messages. It means components send messages to other components, but don't wait for response and continue doing other tasks, and final response arrives asynchronously. This ensures that different parts of the system are not waiting on each other. This way of communication provides several key advantages:

  • Resources like threads, CPU cycles etc are not blocked
  • Reduces the contention, and hence improves scalability
  • If a component is down, messages to it can be queued and delivered later on, thus improving reliability of the system
Asynchronous messages must form the backbone of reactive systems. But it doesn't mean that synchronous messaging should never be used. We may need to use synchronous messages, like for acknowledging receipt of a message, even though message is processed asynchronously.

Asynchronous messaging is technically more challenging compared to synchronous messaging. In context of distributed systems, managing a transaction across multiple databases and microservices is difficult as keeping transaction open for a long period of time increases probability of failures and hence makes system brittle.

Saga

A saga is a way of representing a long running transaction. It encapsulates multiples requests, which can either be run in parallel or sequentially. When all requests complete successfully, saga is marked as success. Every request is paired with a compensating action. This is needed for failure scenario. If any of the request fails, compensating action is executed for all requests executed successfully so far, and saga is marked as failure. If a compensating action fails, we keep trying until it is a success.
If a request times out, it may be either that request failed or request succeeded but reply failed or request is queued. Hence we need to make sure that its compensating action doesn't have any negative impact if request itself failed.

Message Delivery Guarantee

In distribute systems, message delivery poses an interesting problem. We always strive for Exactly Once delivery.When component A sends message to component B, it can't be sure whether message was delivered or not. B can send back an acknowledgement, but this itself can be lost in network. So we are never sure of message deliveries. A can retry, but then it may lead to message duplication. This means that Exactly Once delivery guarantee is not possible.
We thus have to settle with following delivery guarantees:

At Most Once Delivery
This approach means that we deliver message just once, and we don't retry in case of failure. As a consequence, message is never duplicated. But as message may be lost, so there is no guarantee of its delivery. It means that receiver would get the message once or never. As we don't retry, so we don't need to store the message. Also it is easy to implement.

At Least Once Delivery
This approach guarantees that every message is eventually delivered. If there is a failure, we retry to send the message. Now failure can be either because message was not delivered or because acknowledgement from receiver failed. So retry may lead to duplication of message, but message is eventually delivered. This also means that we need a durable storage, like file system or database, for the message so that it's not lost.

Messaging Patterns

There are 2 messaging patterns when we build reactive systems.

Point To Point
Point to Point messaging pattern involves components calling each other directly. This means services know each other's API, and if any API is updated, dependent services need to be updated too. This means level of coupling is high, and complexities are more easily observable. Dependencies too are easily observable.

Publish/Subscribe
In this pattern, services are not aware of each other and hence they are not directly coupled. There is a message bus/broker. Services publish their messages to this bus, and other services subscribe to these messages. This means their is high level of decoupling between services, and services are only aware of the message format. We can easily add/replace or remove services without impacting others. Complexities involved are difficult to see. It is bit more difficult to find dependencies as they are not talking directly.

No comments:

Post a Comment