1 vote

In Event Sourcing, you store all the individual Domain Events that have happened for one Aggregate instance, known as an Event Stream. Along with the Event Stream, you also store a Stream Version.

Should the version be associated with each Domain Event, or should it be associated with transactional changes (aka commands)?


Example:

The current state of our Event Store is:

aggregate_id | version | event
-------------|---------|------
1            | 1       | E1
1            | 2       | E2

A new command is executed on aggregate 1. This command produces two new events, E3 and E4.

Approach 1:

aggregate_id | version | event
-------------|---------|------
1            | 1       | E1
1            | 2       | E2
1            | 3       | E3
1            | 4       | E4

With this approach, optimistic concurrency can be handled by the storage mechanism using a unique index, but replaying the events up to version 3 could leave the aggregate/system in an inconsistent state.
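A minimal sketch of how the unique index provides that concurrency check, using SQLite in Python; the table and column names mirror the example above, everything else is illustrative:

```python
import sqlite3

# Approach 1: one version per event. The UNIQUE constraint on
# (aggregate_id, version) is what enforces optimistic concurrency:
# two writers racing to append version 3 cannot both succeed.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE events (
        aggregate_id INTEGER NOT NULL,
        version      INTEGER NOT NULL,
        event        TEXT    NOT NULL,
        UNIQUE (aggregate_id, version)
    )
""")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [(1, 1, "E1"), (1, 2, "E2")],
)

# A command produced E3 and E4; each gets its own version.
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [(1, 3, "E3"), (1, 4, "E4")],
)

# A concurrent writer that also tries to append version 3 fails:
try:
    conn.execute("INSERT INTO events VALUES (?, ?, ?)", (1, 3, "E3'"))
except sqlite3.IntegrityError as e:
    print("optimistic concurrency conflict:", e)
```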

Approach 2:

aggregate_id | version | event
-------------|---------|------
1            | 1       | E1
1            | 2       | E2
1            | 3       | E3
1            | 3       | E4

Replaying the events up to version 3 leaves the aggregate/system in a consistent state.
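A small illustration of why: when the version marks the commit rather than the event, replaying up to any version boundary always yields whole commits (plain Python, names illustrative):

```python
# Approach 2: the version marks the commit, so events produced by one
# command share it. Replaying up to a version boundary never splits
# a commit in half.
event_stream = [
    (1, "E1"),
    (2, "E2"),
    (3, "E3"),
    (3, "E4"),  # E3 and E4 came from the same command
]

def replay_up_to(stream, max_version):
    return [event for version, event in stream if version <= max_version]

print(replay_up_to(event_stream, 3))  # ['E1', 'E2', 'E3', 'E4']
```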

Thanks!

I'm still researching the pros/cons of the two approaches, and also checking which approach is used in IDDD and other DDD books. - martinezdelariva

4 Answers

3 votes

Short answer: #1.

The write of events E3 and E4 should be part of the same transaction.

Notice that the two approaches don't really differ in the case you are concerned about. If your read in the first case can miss E4, then so can your read in the second case. In the use case where you are loading the aggregate to do a write, loading the first three events will tell you that the next version should be #4.

In the case of approach #1, attempting to write version 4 produces a unique constraint conflict; the command handler won't be able to tell whether the problem was a bad load of the data, or simply an optimistic concurrency failure, but in either case the result is no write, and the book of record is still in a consistent state.

In the case of approach #2, attempting to write version 4 doesn't conflict with anything. The write succeeds, and now you have E5 that is not consistent with E4. Bleah.
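A sketch of that failure mode under approach #2, assuming a SQLite store with no uniqueness constraint on (aggregate_id, version):

```python
import sqlite3

# Approach 2: versions are not unique per row, so a writer that loaded
# a stale stream can append without any conflict being detected.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE events (
        aggregate_id INTEGER NOT NULL,
        version      INTEGER NOT NULL,
        event        TEXT    NOT NULL
        -- no UNIQUE (aggregate_id, version): E3 and E4 share version 3
    )
""")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [(1, 1, "E1"), (1, 2, "E2"), (1, 3, "E3"), (1, 3, "E4")],
)

# A reader that saw only E1..E3 believes the next version is 4.
# Nothing stops this write, even though it was computed without E4:
conn.execute("INSERT INTO events VALUES (?, ?, ?)", (1, 4, "E5"))
print(conn.execute("SELECT * FROM events ORDER BY version").fetchall())
```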


My preferred schema, assuming that you are compelled to roll your own, separates the stream from the events.

stream_id    | sequence | event_id
-------------|----------|---------
1            | 1        | E1
1            | 2        | E2

The stream gives you a filter (stream id) to identify the events you want, and an order (sequence) to ensure the events you read are in the same order as the events you write. But beyond that, it's kind of an artificial thing, a side effect of the way that we happened to choose our aggregate boundaries. So its role should be pretty limited.

The actual event data lives somewhere else:

event_id | data | meta_data | ...
---------|------|-----------|----
E1       | ...  | ...       | ...
E2       | ...  | ...       | ...

If you need to be able to identify the events associated with a particular command, that's part of the event meta-data, not part of the stream history (see: correlationId, causationId).
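A sketch of this two-table layout, again using SQLite; the JSON metadata payload and its keys are illustrative assumptions, not a prescribed format:

```python
import sqlite3

# The stream table only filters and orders event ids; the event table
# carries the data and metadata (e.g. correlationId, causationId).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE streams (
        stream_id INTEGER NOT NULL,
        sequence  INTEGER NOT NULL,
        event_id  TEXT    NOT NULL,
        UNIQUE (stream_id, sequence)
    );
    CREATE TABLE events (
        event_id  TEXT PRIMARY KEY,
        data      TEXT,
        meta_data TEXT  -- e.g. JSON holding correlationId, causationId
    );
""")
conn.execute("INSERT INTO events VALUES (?, ?, ?)",
             ("E1", '{"amount": 10}',
              '{"correlationId": "cmd-42", "causationId": "cmd-42"}'))
conn.execute("INSERT INTO streams VALUES (?, ?, ?)", (1, 1, "E1"))

# Reading a stream: filter by stream_id, order by sequence, join for data.
rows = conn.execute("""
    SELECT s.sequence, e.event_id, e.data, e.meta_data
    FROM streams s JOIN events e ON e.event_id = s.event_id
    WHERE s.stream_id = ? ORDER BY s.sequence
""", (1,)).fetchall()
print(rows)
```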

2 votes

Nothing prevents you from introducing a commit_sequence along with a version.

For instance, in NEventStore you can see that a commit has a StreamRevision (the version, which increases with every event) and a CommitSequence.
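A rough sketch of how the two counters can coexist on the same rows (hypothetical column names; NEventStore's actual storage schema varies by persistence engine):

```python
from itertools import groupby

# Hypothetical rows carrying both counters: stream_revision increases
# with every event, commit_sequence increases with every command.
rows = [
    # (aggregate_id, stream_revision, commit_sequence, event)
    (1, 1, 1, "E1"),
    (1, 2, 2, "E2"),
    (1, 3, 3, "E3"),  # E3 and E4 came from the same command,
    (1, 4, 3, "E4"),  # so they share commit_sequence 3
]

# Replaying whole commits at a time keeps the aggregate consistent:
for commit, events in groupby(rows, key=lambda r: r[2]):
    print(f"commit {commit}:", [e[3] for e in events])
```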

0 votes

Approach 1 is what I've used and seen others use: just an incrementing number for the event, often called EventNumber.

The optimistic concurrency part is just so that, when you load your aggregate, you know what the latest event is. You then process the command and save any resulting events: if you see anything above the number you loaded, it means you're already out of date and can act accordingly; if not, you can save the events.
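A minimal in-memory sketch of that load/process/save cycle (all names are illustrative):

```python
class ConcurrencyError(Exception):
    pass

store = {"agg-1": ["E1", "E2"]}  # stream keyed by aggregate id

def load(aggregate_id):
    events = store[aggregate_id]
    return events, len(events)  # events plus the latest event number

def save(aggregate_id, expected, new_events):
    current = store[aggregate_id]
    if len(current) != expected:
        # Someone appended since we loaded: we're already out of date.
        raise ConcurrencyError(f"expected {expected}, found {len(current)}")
    current.extend(new_events)

events, version = load("agg-1")
save("agg-1", version, ["E3", "E4"])  # succeeds
try:
    save("agg-1", version, ["E3'"])   # stale: the stream moved on
except ConcurrencyError as e:
    print(e)
```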

0 votes

In Domain Driven Design with Event Sourcing, an aggregate represents a consistency boundary, and its invariants must be true at the beginning and end of each command (or function call). You can violate an invariant in the middle of a member function, so long as it is not violated by the end of it.

What you have pointed out in your post is very insightful. That is, if a single command (or call to a member function) on an aggregate produces multiple events, then storing only some of these events may lead to a violation of your invariant when another process reloads the aggregate from disk. When using an SQL database as an event store, there are a number of related scenarios that can produce this issue.

The first (and easiest) way to avoid this is to wrap all your event INSERT statements in a transaction, so that either all the events are persisted or none of them are (e.g., due to a concurrency conflict). That way, the "on disk" representation of your invariant is maintained. You also have to make sure that your transaction isolation level is not READ UNCOMMITTED, so that other processes don't see half of your commit.

Finally, you have to ensure that the database will not "interleave" event sequence numbers between processes. E.g., the database allocates sequence number 1 for an event in process A, sequence number 2 for an event in process B, then sequence number 3 for a second event in process A again. All events can be committed to the database because there are no conflicts on the concurrency constraint (of aggregate ID + event sequence number), but the sequence of events was written by two processes, and so your invariant may still be violated.
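A sketch of this first option, assuming SQLite: the writer assigns the version numbers itself (no database sequence, so no interleaving) and appends all events from one command in a single transaction:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE events (
        aggregate_id INTEGER NOT NULL,
        version      INTEGER NOT NULL,
        event        TEXT    NOT NULL,
        UNIQUE (aggregate_id, version)
    )
""")

def append(conn, aggregate_id, expected_version, events):
    """Append all events from one command atomically.

    Versions are computed by the writer from the version it loaded,
    so two processes cannot interleave within a commit.
    """
    try:
        with conn:  # one transaction: all events persist, or none do
            for offset, event in enumerate(events, start=1):
                conn.execute(
                    "INSERT INTO events VALUES (?, ?, ?)",
                    (aggregate_id, expected_version + offset, event),
                )
    except sqlite3.IntegrityError:
        raise RuntimeError("concurrent write detected; reload and retry")

append(conn, 1, 0, ["E1", "E2"])
append(conn, 1, 2, ["E3", "E4"])  # both rows committed together
```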

A second option is to wrap all of your events into an array that is persisted with a single INSERT statement. This essentially gives you a version number per commit, rather than a version number per event. To me, this is more logical, but it requires you to have a procedure to "unflatten" the event array before sending it to the various event handlers and process managers.

I personally use this second mechanism in a project that stores events in raw binary format on disk. The events themselves contain only the minimal amount of information necessary for the aggregate to change state - the events do not even include the aggregate identifier. The commit, on the other hand, does contain the aggregate identifier, the commit sequence number, and various other metadata. This essentially separates functionality between the aggregate as a handler for uncommitted events and event handlers for committed events. This distinction also makes sense, because if an event is a "fact" - something happened - then there is a difference between what the aggregate did and whether what the aggregate did was actually saved to disk.
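A sketch of this second option, using JSON rather than raw binary for readability; the table and column names are illustrative:

```python
import json
import sqlite3

# One row per commit: the events are "flattened" into a serialized
# array, so a single INSERT is atomic by nature.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE commits (
        aggregate_id    INTEGER NOT NULL,
        commit_sequence INTEGER NOT NULL,
        events          TEXT    NOT NULL,  -- serialized event array
        UNIQUE (aggregate_id, commit_sequence)
    )
""")
conn.execute(
    "INSERT INTO commits VALUES (?, ?, ?)",
    (1, 1, json.dumps(["E3", "E4"])),  # whole commit in one statement
)

# "Unflattening" before dispatching to event handlers:
for agg, seq, payload in conn.execute("SELECT * FROM commits"):
    for event in json.loads(payload):
        print(f"dispatch {event} (aggregate {agg}, commit {seq})")
```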

On a theoretical note, a good example of your issue is that of linked lists - think about an in-memory representation only: no persistence on disk. One of the reasons you'd use a linked list over a vector or an array is that it allows efficient inserts of nodes (well, more efficient than arrays). The insertion operation typically requires the current node's "next" pointer to be set to the new node's memory address and the new node's "next" pointer to be set to the current node's previous "next" pointer. If another process were reading the same linked list in memory after the first operation completed but before the second operation completed, it would not see all the nodes in the linked list. If each "operation" is like an "event", then only seeing the first event causes the reader to see a broken linked list.
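The same point in a few lines of Python, following the insertion order described above:

```python
# The two pointer writes of a linked-list insert, with the broken
# intermediate state a concurrent reader would observe.
class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

head = Node("A", Node("C"))   # list: A -> C
new = Node("B")

old_next = head.next
head.next = new               # operation 1: A -> B, but B.next is None
# A reader traversing the list here sees A -> B and then nothing;
# node C has effectively vanished - the "inconsistent state".
new.next = old_next           # operation 2: B -> C; list is A -> B -> C
```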