CQRS EventStore Dispatcher Error Handling

Question

I was looking at two scenarios: A is OK, but I am not sure about B.

Scenario A: Simulate application restart after commit, before dispatch

1. Start EventStore
2. Commit change
3. Event not dispatched
4. Stop EventStore
5. Start EventStore

The committed event is sent again in step 5. This works fine, and I can also see it in the dispatcher code.

Scenario B: Simulate bus error

1. Start EventStore
2. Commit change 1
3. Exception in dispatcher
4. Commit change 2
5. Dispatch ok

In this case I cannot find how the failure is handled, and I also wonder whether it is a valid case: it could only happen if there were a bug in the bus code.

Is there a trigger that will retry the dispatch, do I need to write code to handle this, or is my reasoning faulty?

I am curious as to where you are getting the patterns you are using. I haven't heard some of these concepts as they relate to CQRS. A pointer would be welcome. Thanks! — Erick T

Jonathan Oliver · Accepted Answer · 2011-04-11T11:42:17

Your assessment of Scenario A is correct. After a failure such as an application restart, a machine restart, or a process termination, the process discovers the undispatched commits when it starts up again and pushes them to the dispatcher.
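To make the restart behavior concrete, here is a minimal Python sketch of the idea. It is not the EventStore API: `InMemoryStore`, `start`, and the commit id are hypothetical names chosen for illustration, and a real store would persist its state across restarts.

```python
class InMemoryStore:
    """Hypothetical stand-in for a persistent commit store (not the EventStore API)."""

    def __init__(self):
        self.commits = []         # commit ids, in commit order
        self.dispatched = set()   # ids already marked as dispatched

    def commit(self, commit_id):
        self.commits.append(commit_id)

    def undispatched(self):
        return [c for c in self.commits if c not in self.dispatched]

    def mark_dispatched(self, commit_id):
        self.dispatched.add(commit_id)


def start(store, publish):
    """On startup, publish every commit that was never marked as dispatched."""
    for commit_id in store.undispatched():
        publish(commit_id)
        store.mark_dispatched(commit_id)


store = InMemoryStore()
store.commit("commit-1")     # committed, but the process stops before dispatch

sent = []
start(store, sent.append)    # restart: the pending commit is found and sent
```

After the simulated restart, `sent` holds the commit that was pending and the store reports nothing left to dispatch.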

Scenario B is somewhat more tricky. The issue is that the EventStore is not a bus, so the question of how to handle errors in the bus isn't something that can be handled in the EventStore itself. Furthermore, because there are a number of bus implementations, I don't want to couple the EventStore to any particular implementation. Some users may not even use a message bus; they may decide to use RPC calls instead.

The question that you really have, then, is how bus failures (and, by extension, queue failures) should be handled. The EventStore has an interface, IPublishCommits. When an event is committed, it is then pushed to a dispatcher. The dispatchers are simply responsible for marking an event as dispatched once it has been properly and successfully handled by the implementation of IPublishCommits.
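The marking rule can be sketched roughly as follows, in Python for brevity. Everything here is hypothetical: `Dispatcher`, `flaky_publish`, and the commit ids are illustrative names, and the `publish` callable only plays the role of an IPublishCommits implementation. How the real dispatcher reacts to an exception is not shown; the sketch only illustrates that a failed publish leaves the commit unmarked.

```python
class Dispatcher:
    """Hypothetical dispatcher: marks a commit as dispatched only after publish succeeds."""

    def __init__(self, publish):
        self.publish = publish    # plays the role of the IPublishCommits implementation
        self.dispatched = []      # commits that were published successfully
        self.pending = []         # commits whose publish attempt failed

    def dispatch(self, commit_id):
        try:
            self.publish(commit_id)
        except Exception:
            # Publishing failed, so the commit is NOT marked as dispatched.
            self.pending.append(commit_id)
            return False
        self.dispatched.append(commit_id)
        return True


def flaky_publish(commit_id):
    # Simulates Scenario B: the bus fails for the first commit only.
    if commit_id == "change-1":
        raise RuntimeError("bus unavailable")


d = Dispatcher(flaky_publish)
first = d.dispatch("change-1")    # fails, stays pending
second = d.dispatch("change-2")   # succeeds, marked as dispatched
```

In this sketch the first change stays pending until something retries it, which is exactly the gap Scenario B asks about.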

The best way to handle transient bus and queue failures would be to implement the circuit breaker pattern in your IPublishCommits implementation that retries until things start working again. For bigger issues, such as serialization failures, you may want to log some kind of critical failure that will notify an administrator immediately. Again, the sticky problem in all of this is that the EventStore cannot know about all of the specifics of your situation.
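As a rough illustration of that suggestion, the following Python sketch wraps a publish function in a simple circuit breaker and retries until the publish succeeds. All names (`CircuitBreakerPublisher`, `publish_with_retry`, `TransientBusError`, the thresholds) are assumptions made for the example and not part of the EventStore; the clock and sleep functions are injectable only so the demo runs instantly.

```python
import time


class TransientBusError(Exception):
    """Hypothetical error type for failures that are worth retrying."""


class CircuitOpenError(Exception):
    """Raised when the breaker is open and the call is rejected without trying."""


class CircuitBreakerPublisher:
    """Hypothetical wrapper around a publish function (not part of EventStore)."""

    def __init__(self, publish, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.publish = publish
        self.max_failures = max_failures   # consecutive failures before opening
        self.reset_after = reset_after     # seconds to wait before a trial call
        self.clock = clock
        self.failures = 0
        self.opened_at = None              # None means the circuit is closed

    def __call__(self, commit):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("bus still considered down")
            # Otherwise half-open: fall through and allow one trial call.
        try:
            self.publish(commit)
        except Exception:
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.max_failures:
                self.opened_at = self.clock()   # open (or re-open) the circuit
            raise
        self.failures = 0
        self.opened_at = None                   # success closes the circuit


def publish_with_retry(publisher, commit, sleep=time.sleep, delay=1.0):
    """Retry transient failures until the publish succeeds."""
    while True:
        try:
            publisher(commit)
            return
        except (TransientBusError, CircuitOpenError):
            sleep(delay)   # wait, then try again; the breaker limits real bus calls


# Demo with a fake clock: the bus fails three times, then recovers.
now = [0.0]
attempts = []


def bus(commit):
    attempts.append(commit)
    if len(attempts) < 4:
        raise TransientBusError("bus down")


def fake_sleep(seconds):
    now[0] += seconds


breaker = CircuitBreakerPublisher(bus, max_failures=2, reset_after=10.0, clock=lambda: now[0])
publish_with_retry(breaker, "change-1", sleep=fake_sleep, delay=5.0)
```

Only `TransientBusError` and `CircuitOpenError` are retried here. Anything else, such as a serialization failure, propagates to the caller, where it can be logged as a critical failure instead of being retried forever.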
