Microservice Patterns: Saga Choreography
Let’s learn the Saga pattern practically
Hi all… After a long time, I’m bringing you an interesting article on a common topic in microservice patterns. You may already have heard of the microservice Saga pattern, right? If not, let me brief you…
Saga is one of the most popular patterns in this context…
Context of Saga
As you know, a microservice based application ecosystem is distributed by nature. But have you thought about managing a transaction that spans multiple services? Simply put, a flow where one service calls another and the chain keeps going… Have you thought of a way to achieve data consistency across services? How do we update resources across the services?
Usually, microservices follow the Database per Service pattern, which is another microservice pattern. In this pattern, each service independently manages its own database.
When we have a distributed transaction flow like the one shown above, we have to continuously update the order status based on what is happening in the order flow. The services involved work with multiple data sources. We must make sure that we update our data precisely to complete an order successfully.
Let’s dig into the flow for placing an order from an e-commerce platform.
Let’s start by understanding the steps in the flow. I have only considered the happy path.
1 — Place the order by a customer
2 — Create order by order service and proceed for payment
3 — Process payment by payment service and send the acknowledgement back to order service + proceed for order preparation
4 — Prepare order by restaurant service and send the acknowledgement back to order service + proceed for delivery
5 — Arrange delivery for the order by delivery service and send the acknowledgement back to order service when delivery is completed
As you can see here, every time we move forward, we have to update order service about what we are doing right now. So, this is the nature of distributed transactions/flows.
This is a challenging use case and there are a few concerns… 😮
- Maintain ACID: To guarantee the accuracy of a transaction, it needs to adhere to the principles of Atomicity, Consistency, Isolation, and Durability (ACID). Atomicity ensures that either all steps of a transaction are completed or none at all. Consistency ensures the transition of data from one valid state to another. Isolation ensures that concurrent transactions yield the same results as sequential transactions would. Finally, Durability ensures that committed transactions remain unaffected by system failures. In distributed transactions involving multiple services, maintaining the principles of ACID remains crucial.
- Manage transaction isolation level: It defines the extent of data visibility in a transaction when concurrent access by other services occurs. Put differently, when one microservice persists an object in the database while another request is reading the data, the question arises: should the service provide the old or the updated data?
So, Saga came as a solution to address the issues in these kinds of distributed scenarios. 😎
Introduction to Saga
The Saga architecture pattern provides transaction management using a sequence of local transactions. Every individual local transaction in the flow uses ACID guarantees to update its local database, and then publishes an event that triggers the next local transaction in the Saga. If a local transaction fails, the Saga performs a sequence of compensating transactions designed to revert the changes made by the preceding successful local transactions.
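To make the idea of compensating transactions concrete, here is a minimal sketch in Python. It runs everything in a single process, so it only illustrates the "undo in reverse order" logic, not the distributed messaging. The names `run_saga`, `Step` and the status strings are my own assumptions for illustration.

```python
from typing import Callable, Dict, List, Tuple

# (step name, local transaction, compensating transaction)
Step = Tuple[str, Callable[[], None], Callable[[], None]]


def run_saga(steps: List[Step]) -> List[str]:
    """Run local transactions in order; on failure, compensate in reverse."""
    log: List[str] = []
    completed: List[Step] = []
    for step in steps:
        name, action, _ = step
        try:
            action()
        except Exception:
            log.append(f"failed:{name}")
            # Undo every step that already succeeded, most recent first.
            for done_name, _, compensate in reversed(completed):
                compensate()
                log.append(f"compensated:{done_name}")
            break
        log.append(f"done:{name}")
        completed.append(step)
    return log


def fail_preparation() -> None:
    # Simulated failure of the third local transaction.
    raise RuntimeError("item out of stock")


state: Dict[str, str] = {"order": "NONE", "payment": "NONE"}
steps: List[Step] = [
    ("create_order",
     lambda: state.update(order="CREATED"),
     lambda: state.update(order="CANCELLED")),
    ("process_payment",
     lambda: state.update(payment="PAID"),
     lambda: state.update(payment="REFUNDED")),
    ("prepare_order", fail_preparation, lambda: None),
]

saga_log = run_saga(steps)
print(saga_log)
print(state)
```

Note that a compensation does not roll back the database like a classic transaction would. It is a new local transaction (for example a refund) that semantically reverses the earlier one.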
This approach is asynchronous and follows an eventually consistent transactional model, resembling the structure of a typical microservices application architecture. In such a setup, a distributed transaction is achieved through a series of asynchronous transactions across interconnected microservices.
Asynchronous behavior is the major advantage of the Saga pattern, and it is what made it popular compared to the well-known two-phase commit (2PC) pattern.
Approaches of Saga Implementation
There are 2 main ways to apply Saga in a microservice context. I will briefly introduce them in this section.
1️⃣ Choreography based Saga
In a choreography-based saga, individual local transactions publish events that serve as triggers for other participants to carry out their respective local transactions.
2️⃣ Orchestration based Saga
In an orchestration-based saga, a centralized saga orchestrator communicates with saga participants by sending command messages, instructing them to execute their local transactions.
Choreography based Saga
This type of Saga is event based. It follows the Publisher / Subscriber messaging pattern, and there is a message broker to manage the events shared between microservices. In the industry, Kafka, which is known as a distributed event store and stream-processing platform, is commonly used to implement this type of Saga. Each microservice is connected to one or more topics to share data. See the example below. Let me explain the flow.
Happy Path for order flow
1 — After an order is created and persisted into the database, order service sends an event with type ORDER_CREATED to the payment updates topic.
2 — Payment service consumes from the payment updates topic. When it receives the ORDER_CREATED event, it processes the payment. After processing is done, it sends an event with type ORDER_PAID to the restaurant updates topic to initiate order preparation. At the same time, payment service also sends the ORDER_PAID event to the order updates topic to update the status of the order. Order service consumes from order updates and does the rest of the steps.
3 — Restaurant service consumes from the restaurant updates topic. When it receives the ORDER_PAID event, it processes the order preparation. In this case, the event may contain the order ID, and restaurant service may perform a GET request to fetch the order details before processing. After processing is done, it sends an event with type ORDER_PREPARED to the delivery updates topic to invoke the delivery process. At the same time, restaurant service also sends the same ORDER_PREPARED event to the order updates topic to update the status of the order.
4 — Delivery service consumes from the delivery updates topic. When it receives the ORDER_PREPARED event, it processes the order delivery. After the delivery is done, delivery service sends an ORDER_DELIVERED event to the order updates topic to update the status of the order.
5 — Order service consumes the ORDER_DELIVERED event and executes the relevant logic to notify the customer.
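The five steps above can be sketched in a few lines of Python. This is not Kafka: `InMemoryBroker` is a hypothetical in-process stand-in for the message broker, and the topic names (`payment-updates`, `order-updates` and so on), the event fields and the status strings are assumptions I derived from the flow. The point is only to show that no service calls another directly; each one reacts to an event and publishes the next one.

```python
from collections import defaultdict, deque
from typing import Callable, DefaultDict, Deque, Dict, List, Tuple

Event = Dict[str, str]
Handler = Callable[[Event], None]


class InMemoryBroker:
    """Hypothetical stand-in for a message broker such as Kafka."""

    def __init__(self) -> None:
        self.subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)
        self.pending: Deque[Tuple[str, Event]] = deque()

    def subscribe(self, topic: str, handler: Handler) -> None:
        self.subscribers[topic].append(handler)

    def publish(self, topic: str, event: Event) -> None:
        self.pending.append((topic, event))

    def run(self) -> None:
        # Deliver events in publish order until no service emits a new one.
        while self.pending:
            topic, event = self.pending.popleft()
            for handler in self.subscribers[topic]:
                handler(event)


broker = InMemoryBroker()
order_status: Dict[str, str] = {}
status_history: List[str] = []
STATUS_BY_EVENT = {
    "ORDER_PAID": "PAID",
    "ORDER_PREPARED": "PREPARED",
    "ORDER_DELIVERED": "DELIVERED",
}


def create_order(order_id: str) -> None:
    # Step 1: persist the order, then announce it.
    order_status[order_id] = "CREATED"
    status_history.append("CREATED")
    broker.publish("payment-updates", {"type": "ORDER_CREATED", "order_id": order_id})


def payment_service(event: Event) -> None:
    # Step 2: payment succeeds (happy path only) and two topics are notified.
    if event["type"] == "ORDER_CREATED":
        paid = {"type": "ORDER_PAID", "order_id": event["order_id"]}
        broker.publish("restaurant-updates", paid)
        broker.publish("order-updates", paid)


def restaurant_service(event: Event) -> None:
    # Step 3: prepare the order, then trigger delivery.
    if event["type"] == "ORDER_PAID":
        prepared = {"type": "ORDER_PREPARED", "order_id": event["order_id"]}
        broker.publish("delivery-updates", prepared)
        broker.publish("order-updates", prepared)


def delivery_service(event: Event) -> None:
    # Step 4: deliver, then report back to order service.
    if event["type"] == "ORDER_PREPARED":
        delivered = {"type": "ORDER_DELIVERED", "order_id": event["order_id"]}
        broker.publish("order-updates", delivered)


def order_service_updates(event: Event) -> None:
    # Step 5: order service tracks every status change.
    status = STATUS_BY_EVENT.get(event["type"])
    if status is not None:
        order_status[event["order_id"]] = status
        status_history.append(status)


broker.subscribe("payment-updates", payment_service)
broker.subscribe("restaurant-updates", restaurant_service)
broker.subscribe("delivery-updates", delivery_service)
broker.subscribe("order-updates", order_service_updates)

create_order("order-1")
broker.run()
print(status_history)
```

In a real setup each handler would live in its own service with its own database, and the broker would deliver events asynchronously.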
I hope you understood the flow. See how simple that is 😎. But we must follow best practices when we work with message queues/brokers like Kafka.
📔 Bonus Note:
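Let’s say we have used Kafka. Consumers read from the partitions of a Kafka topic, each at its own offset. Usually, we configure a consumer to start from the earliest or the latest offset, but this setting (auto.offset.reset) only applies when the consumer group has no committed offset yet. We have to think about fault tolerance when we deal with Kafka. Let’s say we have configured the consumer to read only the latest events. What happens if our consumer is down while messages are still arriving on the topic??? 😮 If no offset was committed for the group, or the committed offset has expired, the consumer starts from the latest offset when it comes back online, and we may lose the messages that arrived in between, right? We have to have a contingency plan for this. Kafka itself stores committed offsets per consumer group, so committing an offset only after the message is processed lets the consumer resume from where it stopped. Alternatively, we can implement a Redis based solution that stores the last processed offset, and start reading from that stored offset after a failure.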
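Here is a small simulation of that restart scenario in plain Python. It does not use any Kafka client API: `PartitionLog`, `start_offset` and `consume` are hypothetical helpers, and the `offset_store` dictionary stands in for wherever committed offsets are kept (Kafka’s own offset storage or the Redis idea above). It only shows why a stored offset matters when the reset policy is "latest".

```python
from typing import Dict, List, Optional


class PartitionLog:
    """Append-only list standing in for a single topic partition."""

    def __init__(self) -> None:
        self.messages: List[str] = []

    def append(self, message: str) -> None:
        self.messages.append(message)

    def end_offset(self) -> int:
        return len(self.messages)


def start_offset(log: PartitionLog, committed: Optional[int], reset_policy: str) -> int:
    """Decide where a consumer begins reading after a (re)start."""
    if committed is not None:
        # A stored offset always wins over the reset policy.
        return committed
    return 0 if reset_policy == "earliest" else log.end_offset()


def consume(log: PartitionLog, offset_store: Dict[str, int],
            group: str, reset_policy: str = "latest") -> List[str]:
    position = start_offset(log, offset_store.get(group), reset_policy)
    processed: List[str] = []
    while position < log.end_offset():
        processed.append(log.messages[position])  # "process" the message
        position += 1
        offset_store[group] = position  # commit only after processing
    return processed


log = PartitionLog()
offsets: Dict[str, int] = {}

log.append("order-1")
log.append("order-2")
first_run = consume(log, offsets, "payment-service", reset_policy="earliest")

# The consumer is down while two more messages arrive.
log.append("order-3")
log.append("order-4")

# Restart with a stored offset: nothing is lost, even with "latest".
after_restart = consume(log, offsets, "payment-service", reset_policy="latest")

# A group without any stored offset and "latest" skips the backlog.
fresh_group = consume(log, {}, "new-group", reset_policy="latest")

print(first_run, after_restart, fresh_group)
```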
Let’s get back to topic! 😃
Handle failures in Choreography
I have already explained how we manage the happy path for order processing. But that is not the end! We must handle failures too, right? So, let’s imagine our payment fails due to insufficient balance. How do we proceed here in Sagas? See the flow given below.
Here, payment service sends an ORDER_PAYMENT_FAILED event to the order updates topic. Then order service can consume it and update the order status to a cancelled/rejected state by executing some logic. So, the rest of the flow is not executed. Restaurant service does not even know what has happened, because we send the update to the restaurant topic only if the payment is successful.
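The payment failure branch can be sketched as a single handler. This is a minimal illustration, not a real payment integration: `handle_order_created`, the `balances` dictionary, the event fields and the `reason` value are all assumptions of mine. The handler returns the (topic, event) pairs that payment service would publish, so it is easy to see that the restaurant topic receives nothing when the payment fails.

```python
from typing import Any, Dict, List, Tuple

Event = Dict[str, Any]


def handle_order_created(event: Event, balances: Dict[str, int]) -> List[Tuple[str, Event]]:
    """Return the (topic, event) pairs payment service would publish."""
    order_id = event["order_id"]
    customer = event["customer"]
    amount = event["amount"]

    if balances.get(customer, 0) < amount:
        # Failure: only order service is told, so the flow stops here.
        failed = {
            "type": "ORDER_PAYMENT_FAILED",
            "order_id": order_id,
            "reason": "INSUFFICIENT_BALANCE",
        }
        return [("order-updates", failed)]

    # Success: charge the customer, then notify both topics.
    balances[customer] -= amount
    paid = {"type": "ORDER_PAID", "order_id": order_id}
    return [("restaurant-updates", paid), ("order-updates", paid)]


balances = {"alice": 20}

failed_result = handle_order_created(
    {"type": "ORDER_CREATED", "order_id": "order-1", "customer": "alice", "amount": 50},
    balances,
)
paid_result = handle_order_created(
    {"type": "ORDER_CREATED", "order_id": "order-2", "customer": "alice", "amount": 15},
    balances,
)

print([topic for topic, _ in failed_result])
print([topic for topic, _ in paid_result])
```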
Now let’s imagine order preparation fails because some items in the order are out of stock. How do we proceed here in Sagas? See the flow given below.
When the order preparation fails, restaurant service sends an ORDER_PREPARATION_FAILED event to the order updates topic. Then order service updates the order status. Next, order service sends an event with type ORDER_REFUND to the payment updates topic. When payment service consumes this event, it refunds the payment, since the order was already paid before the failure, and sends an ORDER_REFUND event to the order updates topic to update the order status again. Then order service does the rest of the job. Here, delivery service does not even know what has happened, because we send the update to the delivery topic only if the order preparation is successful.
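The refund round trip can be sketched like this. Again, it is a simplified in-process illustration: the handler functions, the `run` loop, the topic names and the status strings (`PREPARATION_FAILED`, `REFUNDED`) are my assumptions. I kept the ORDER_REFUND event type for both directions, as described in the flow.

```python
from collections import deque
from typing import Deque, Dict, List, Tuple

Event = Dict[str, str]
Message = Tuple[str, Event]


def order_service(event: Event, orders: Dict[str, str]) -> List[Message]:
    order_id = event["order_id"]
    if event["type"] == "ORDER_PREPARATION_FAILED":
        orders[order_id] = "PREPARATION_FAILED"
        # Ask payment service to compensate the earlier payment.
        return [("payment-updates", {"type": "ORDER_REFUND", "order_id": order_id})]
    if event["type"] == "ORDER_REFUND":
        orders[order_id] = "REFUNDED"
    return []


def payment_service(event: Event, payments: Dict[str, str]) -> List[Message]:
    order_id = event["order_id"]
    if event["type"] == "ORDER_REFUND":
        payments[order_id] = "REFUNDED"  # compensating transaction
        return [("order-updates", {"type": "ORDER_REFUND", "order_id": order_id})]
    return []


def run(initial: Message, orders: Dict[str, str], payments: Dict[str, str]) -> List[str]:
    """Deliver messages until none are left; return the topics that were used."""
    queue: Deque[Message] = deque([initial])
    topics_used: List[str] = []
    while queue:
        topic, event = queue.popleft()
        topics_used.append(topic)
        if topic == "order-updates":
            queue.extend(order_service(event, orders))
        elif topic == "payment-updates":
            queue.extend(payment_service(event, payments))
    return topics_used


orders = {"order-1": "PAID"}
payments = {"order-1": "PAID"}

# Restaurant service reports the failure to order service.
topics_used = run(
    ("order-updates", {"type": "ORDER_PREPARATION_FAILED", "order_id": "order-1"}),
    orders,
    payments,
)

print(topics_used)
print(orders, payments)
```

The delivery topic never appears in `topics_used`, which matches the point above: delivery service is not involved in the compensation at all.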
Challenges 💥
Each solution comes with both pros and cons. So, these are some drawbacks of the pattern we discussed so far.
Difficulty in understanding the flow
- In choreography-based sagas, the flow of the saga is distributed among services, making it challenging to have a centralized and clear definition of the entire saga’s progression.
- This lack of a centralized view can lead to difficulties in understanding and debugging the overall business process.
Cyclic dependencies between services
- Cyclic dependencies, where one service depends on another in a circular manner (e.g., microservice 01 depends on microservice 02, and vice versa), can introduce complexities and potential issues.
- Managing dependencies becomes crucial to avoid circular references and ensure proper communication between services.
Risk of tight coupling
- Subscribing to all events that affect a service may lead to tight coupling, where services become highly dependent on each other’s internal implementation details.
- This can hinder the flexibility and independence of microservices, making it harder to modify or replace one service without affecting others.
In summary, while choreography-based sagas offer advantages in terms of simplicity and loose coupling, careful consideration is required when implementing them in complex microservices architectures. It’s crucial to weigh the benefits against the drawbacks and evaluate whether an alternative approach, such as orchestration-based sagas, may be more suitable for certain scenarios. Each architectural choice comes with trade-offs, and the decision should align with the specific requirements and characteristics of the system in question.
So, this is how we manage a distributed transaction using a choreography based Saga. I have explained the flow using many assumptions, and it can change based on real scenarios. For example, for someone else the topics can be different, the message queue provider can be different, the order of services in the flow can be different, and so on. Just get the idea! 😃