I'm developing my insight about distributed systems, and how to maintain data consistency across such systems, where business transactions covers multiple services, bounded contexts and network boundaries.
Here are two approaches which I know are used to implement distributed transactions:
- 2-phase commit (2PC)
- Sagas
2PC is a protocol for applications to transparently utilize global ACID transactions by the support of the platform. Being embedded in the platform, it is transparent to the business logic and the application code as far as I know.
Sagas, on the other hand, are series of local transactions, where each local transaction mutates and persist the entities along with some flag indicating the phase of the global transaction and commits the change. In the other words, state of the transaction is part of the domain model. Rollback is the matter of committing a series of "inverted" transactions. Events emitted by the services triggers these local transactions in either case.
Now, when and why would one use sagas over 2PC and vice versa? What are the use cases and pros/cons of both? Especially, the brittleness of sagas makes me nervous, as the inverted distributed transaction could fail as well.