34
votes

I want to setup a small event sourcing lib. I read a few tutorials online, everything understood so far.

The only problem is, in these different tutorials, there are two different database strategies, but without any comments why they use the one they use.

So, I want to ask for your opinion. And important, why do you prefer the solution you choose.

  1. Solution is the db structure where you create one table for each event.

  2. Solution is the db structure where you create only one generic table, and save the events as serialized string to one column.

In both cases I'm not sure how they handle event changes, maybe they create a whole new one.

Kind regards

6

6 Answers

45
votes

I built my own event sourcing lib and I opted for option 2 and here's why.

  • You query the event stream by aggregate id not event type.
  • Reproducing the events in order would be a pain if they are all in different tables
  • It would make upgrading events a bit of pain

There is an argument to say you can store events on a per aggregate but that depends of the requirements of the project.

I do have some posts about how event streams are used that you may find helpful.

14
votes

Solution is the db structure where you create only one generic table, and save the events as serialized string to one column

This is by far the best approach as replaying events is simpler. Now my two cents on event sourcing: It is a great pattern, but you should be careful because not everything is as simple as it seems. In a system I was working on we saved the stream of events per aggregate but we still had a set of normalized tables, because we just could not accept that in order to get the latest state of an object we would have to run all the events (snapshots help but are not a perfect solution). So yes event sourcing is a fine pattern, it gives you a complete versioning of your entities and a full auditing log, and it should be used just for that, not as a replacement of a set of normalized tables, but this is just my two cents.

6
votes

I think best solution will be to go with #2. And even you can save your current state together with the related event at the same time if you use a transactional db like mysql.

I realy dont like and recommend the solution #1.

If your concern for #1 is about event versioning/upgrading; then declare a new class for each new change. Dont be too lazy; or be obsess with reusing. Let the subscribers know about changes; give them the event version.

If your concers for #1 is about something like querying/interpreting events; then later you can easily push your events to an nosqldb or eventstore at any time (from original db).

Also; the pattern I use for eventsourcing lib is something like that:

public interface IUserCreated : IEventModel
{

}

public class UserCreatedV1 : IUserCreated
{
    public string Email { get; set; }
    public string Password { get; set; }
}

public class UserCreatedV2 : IUserCreated
{
    // Fullname added to user creation. Wrt issue: OA-143

    public string Email { get; set; }
    public string Password { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

public class EventRecord<T> where T : IEventModel
{
    public string SessionId { get; set; } // Can be set in emitter.
    public string RequestId { get; set; } // Can be set in emitter.
    public DateTime CreatedDate { get; set; } // Can be set in emitter.
    public string EventName { get; set; } // Extract from class or interface name.
    public string EventVersion { get; set; } // Extract from class name
    public T EventModel { get; set; } // Can be set in emitter.
}

public interface IEventModel { }

So; make event versioning and upgrading explicit; both in domain and codebase. Implement handling of new events in subscribers before deploying origin of new events. And; if not required, dont allow direct consuming of domain events from external subscribers; put an integration layer or something like that.

I wish my thoughts will be useful for you.

2
votes

I read about an event-sourcing approach that consists in:

  1. having two tables: aggregate and event;
  2. base on you use cases either:

    a. creates and registry on aggregate table, generating an ID, version = 0 and a event type and create an event on event table;

    b. retrieve from aggregate table, events by ID or event type, apply business cases and then update aggregate table (version and event type) and then create an event on event table.

although I this approach updates some fields on aggregate table, it leaves event table as append only and improves performace as you have the latest version of an aggregate in aggregate table.

1
votes

I would go with #2, and if you really want to have an efficient way of search via event type, I would just add an index on that column.

0
votes

Here are the two strategies to access the data about a subject involved in this case. 1) current state and 2) event sequencing. With current state we process the events but keep only the last state of the subject. With event sequencing we keep the events and rebuild the current state by processing the events every time we need the state. Event sequencing is more reliable as we can track everything that happened causing the current state but it's definitely not efficient. It's a common sense to keep also intermediate states (snapshots) not only the last one to avoid reprocessing all the events all the time. Now we have reliability and performance.

In crypto currencies there are the event sequencing and local snapshots - the local in the name is because blockchains are distributed and data are replicated.