11
votes

I have looked at a lot of event sourcing tutorials, and all of them use simple demos to focus on the tutorial's topic (event sourcing).

That's fine until you hit something in a real-world application that is not covered in one of these tutorials :)

I hit something like this. I have two databases, one event store and one projection store (read models). All aggregates have a GUID Id, which was 100% fine until now.

Now I have created a new JobAggregate and a Job Projection, and my company requires every job to have a unique, incremental int64 Job Id.

Now I'm looking stupid :) An additional issue is that jobs are created multiple times per second! That means the method to get the next number has to be really safe.

In the past (without ES) I had a table, defined the PK as auto-increment int64, saved the Job, and the DB did the job of giving me the next number. Done.

But how can I do this within my aggregate or command handler? Normally the Job projection is created by the event handler, but that's too late in the process, because the aggregate should already have the int64 (so that replaying the aggregate onto an empty DB gives the same Aggregate Id -> Job Id relation).

How should I solve this issue?

Kind regards

Here's a suggestion: use the GUID as the true Id and treat the integer Id as just another datum on the aggregate. Then create a chaser that picks up aggregate-creation events and generates a new "assign ID" event for each of them. The chaser can own the sequence generator, and there is no bottleneck for anyone who doesn't care about the numeric ID. – Fyodor Soikin
And how do I get the next one (incremented by 1)? The number only exists inside the events. I would have to replay all aggregates and then take the biggest one, which is obviously not a viable solution. And like I said, there are multiple creations per second. – SharpNoiZy
You really need a single source for your ids. The aggregate itself could do it, but then it wouldn't be able to distribute them to other servers. Of course, a database could be that source, but I think that kind of defeats the purpose of using CQRS. You could write your own service, but it is very complicated because you're trying to write something that won't slow down your aggregate(s). I have found it's much better to address the need for an incremental ID itself -- that need is simply at odds with distributed systems. – Peter Ritchie
@SharpNoiZy you have a service that provides a sequence, of course. The chaser owns exclusive access to that sequence, takes IDs from it, and publishes them as events. – Fyodor Soikin
What is the business requirement that supposedly needs this? Most of the time it comes from people with a little understanding of databases but no architectural vision, who are somewhat "dangerous" people leading IT in the wrong direction by mixing their business requirements with their wish to manage the project. Seriously, take the time to discuss this with your colleague/manager/customer/whatever. – Boris Guéry

3 Answers

4
votes

In the past (without ES) I had a table, defined the PK as auto-increment int64, saved the Job, and the DB did the job of giving me the next number. Done.

There's one important thing to notice in this sequence, which is that the generation of the unique identifier and the persistence of the data into the book of record both share a single transaction.

When you separate those ideas, you are fundamentally looking at two transactions -- one that consumes the id, so that no other aggregate tries to share it, and another to write that id into the store.

The best answer is to arrange that both parts are part of the same transaction -- for example, if you were using a relational database as your event store, then you could create an entry in your "aggregate_id to long" table in the same transaction as the events are saved.
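
A rough sketch of that arrangement, in PHP-like pseudocode with PDO; the "aggregate_id_to_long" table, its auto-increment key, and the surrounding variables are assumptions for illustration, and a real event store append would also check the expected version:

// Sketch only: one transaction covers both the event append and the claim of
// the next long, so either both happen or neither does.
// $pdo, $aggregateId and $serializedEvent come from the surrounding code.
$pdo->beginTransaction();
try {
    // Append the JobCreated event to the event store table.
    $stmt = $pdo->prepare( 'INSERT INTO events ( aggregate_id, payload ) VALUES ( :id, :payload )' );
    $stmt->execute( [ 'id' => $aggregateId, 'payload' => $serializedEvent ] );

    // Claim the next long in the same transaction; the auto-increment key of
    // the "aggregate_id_to_long" table hands out the number.
    $stmt = $pdo->prepare( 'INSERT INTO aggregate_id_to_long ( aggregate_id ) VALUES ( :id )' );
    $stmt->execute( [ 'id' => $aggregateId ] );
    $jobNumber = (int) $pdo->lastInsertId();

    $pdo->commit();
} catch ( \Throwable $e ) {
    $pdo->rollBack();
    throw $e;
}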

Another possibility is to treat the "create" of the aggregate as a Prepare followed by a Created; with an event handler that responds to the prepare event by reserving the long identifier post facto, and then sends a new command to the aggregate to assign the long identifier to it. So all of the consumers of Created see the aggregate with the long assigned to it.
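
A sketch of how that handler could look, again in PHP-like pseudocode; the names JobPrepared, AssignJobNumber, and JobNumberSequenceService are invented for illustration:

// Sketch only: reacts to the "prepare" event, reserves the long, and sends a
// command back to the aggregate; only then does the aggregate emit Created.
class JobNumberAssignmentHandler
{
    private $sequence;      // single owner of the number source
    private $commandBus;

    public function __construct( JobNumberSequenceService $sequence, CommandBus $commandBus )
    {
        $this->sequence = $sequence;
        $this->commandBus = $commandBus;
    }

    public function onJobPrepared( JobPrepared $event )
    {
        $number = $this->sequence->next();
        $this->commandBus->dispatch( new AssignJobNumber( $event->aggregateId(), $number ) );
    }
}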

It's worth noting that you are assigning what is effectively a random long to each aggregate you are creating, so you better dig in to understand what benefit the company thinks it is getting from this -- if they have expectations that the identifiers are going to provide ordering guarantees, or completeness guarantees, then you had best understand that going in.

There's nothing particularly wrong with reserving the long first; depending on how frequently the save of the aggregate fails, you may end up with gaps. For the most part, you should expect to be able to maintain a small failure rate (ie - you check to ensure that you expect the command to succeed before you actually run it).

In a real sense, the generation of unique identifiers falls under the umbrella of set validation; we usually "cheat" with UUIDs by abandoning any pretense of ordering and pretending that the risk of collision is zero. Relational databases are great for set validation; event stores maybe not so much. If you need unique sequential identifiers controlled by the model, then your "set of assigned identifiers" needs to be within an aggregate.
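
If you do put the sequence inside the model, a minimal sketch of such an aggregate (names invented) could look like the following; the point is that the counter state lives in one aggregate, so the event store's usual optimistic-concurrency check guards the sequence:

// Sketch only: a single aggregate owning the set of assigned job numbers.
// Two concurrent assignments collide on the aggregate's version; one retries.
class JobNumberSequence
{
    private $lastAssigned = 0;

    public function assignNextTo( string $jobAggregateId ) : JobNumberAssigned
    {
        return new JobNumberAssigned( $jobAggregateId, $this->lastAssigned + 1 );
    }

    public function apply( JobNumberAssigned $event )
    {
        $this->lastAssigned = $event->number();
    }
}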

The key phrase to follow is "cost to the business" -- make sure you understand why the long identifiers are valuable.

1
votes

Here's how I'd approach it.

I agree with the idea of an Id generator that provides the "business Id" but not the "technical Id".

The core here is to have an application-level JobService that deals with all the infrastructure services and orchestrates what is to be done.

Controllers (like web controllers or command-line commands) directly consume the application-level JobService to control/command the state change.

It's PHP-like pseudocode, but here we are talking about the architecture and the process, not the syntax. Adapt it to C# syntax and the idea is the same.

Application level

class MyNiceWebController
{
    public function createNewJob( string $jobDescription, xxxx $otherData, ApplicationJobService $jobService )
    {
        $projectedJob = $jobService->createNewJobAndProject( $jobDescription, $otherData );

        $this->doWhateverYouWantWithYourAlreadyExistingJobLikeForExample301RedirectToDisplayIt( $projectedJob );
    }
}

class MyNiceCommandLineCommand
{
    private $jobService;

    public function __construct( ApplicationJobService $jobService )
    {
        $this->jobService = $jobService;
    }

    public function createNewJob()
    {
        $jobDescription = // Get it from the command line parameters
        $otherData = // Get it from the command line parameters

        $projectedJob = $this->jobService->createNewJobAndProject( $jobDescription, $otherData );

        // print, echo, console->output... confirmation with Id or print the full object.... whatever with ( $projectedJob );
    }
}

class ApplicationJobService
{
    // In the application level because it just serves the first-level requests
    // from controllers, commands, etc. but does not add "domain" logic.

    private $application;
    private $jobIdGenerator;
    private $jobEventFactory;
    private $jobEventStore;
    private $jobProjector;

    public function __construct( Application $application, JobBusinessIdGeneratorService $jobIdGenerator, JobEventFactory $jobEventFactory, JobEventStoreService $jobEventStore, JobProjectorService $jobProjector )
    {
        $this->application = $application;  // I like to log which "application execution run" is responsible for each domain effect, so I can later trace IPs, cookies, etc. by crossing data with another data lake.
        $this->jobIdGenerator = $jobIdGenerator;
        $this->jobEventFactory = $jobEventFactory;
        $this->jobEventStore = $jobEventStore;
        $this->jobProjector = $jobProjector;
    }

    public function createNewJobAndProject( string $jobDescription, xxxx $otherData ) : Job
    {
        $applicationExecutionId = $this->application->getExecutionId();

        $businessId = $this->jobIdGenerator->getNextJobId();

        $jobCreatedEvent = $this->jobEventFactory->createNewJobCreatedEvent( $applicationExecutionId, $businessId, $jobDescription, $otherData );

        $this->jobEventStore->storeEvent( $jobCreatedEvent );       // Throws an exception if it fails, so no projector will be invoked if the event was not stored.

        $entityId = $jobCreatedEvent->getEntityId();
        $projectedJob = $this->jobProjector->project( $entityId );

        return $projectedJob;
    }
}

Note: if projecting synchronously is too expensive, just enqueue the projection and return the Id:

        // ...
        $entityId = $jobCreatedEvent->getEntityId();
        $this->jobProjector->enqueueProjection( $entityId );

        return $entityId;
    }
}

Infrastructure level (common to various applications)

class JobBusinessIdGeneratorService implements DomainLevelJobBusinessIdGeneratorInterface
{
    // In infrastructure because it accesses persistence layers.

    // In the constructor, get the persistence objects and so on... database, files, whatever.

    public function getNextJobId() : int
    {
        $this->lockGlobalCounterMaybeAtDatabaseLevel();

        $current = $this->persistence->getCurrentJobCounter();
        $next = $current + 1;
        $this->persistence->setCurrentJobCounter( $next );

        $this->unlockGlobalCounterMaybeAtDatabaseLevel();

        return $next;
    }
}
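
As a concrete (hypothetical) example of the lock / increment / persist steps above, on a relational database the generator could use a one-row counter table and a row lock; the job_counter table and its column are made-up names:

// Sketch only: a database-backed implementation of the generator above.
class PdoJobBusinessIdGenerator implements DomainLevelJobBusinessIdGeneratorInterface
{
    private $pdo;

    public function __construct( \PDO $pdo )
    {
        $this->pdo = $pdo;
    }

    public function getNextJobId() : int
    {
        // The FOR UPDATE row lock plays the role of lockGlobalCounterMaybeAtDatabaseLevel().
        $this->pdo->beginTransaction();
        try {
            $current = (int) $this->pdo
                ->query( 'SELECT current_value FROM job_counter FOR UPDATE' )
                ->fetchColumn();

            $next = $current + 1;
            $this->pdo->exec( "UPDATE job_counter SET current_value = {$next}" );

            // Committing persists the counter before the value is returned.
            $this->pdo->commit();
            return $next;
        } catch ( \Throwable $e ) {
            $this->pdo->rollBack();
            throw $e;
        }
    }
}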

Domain Level

class JobEventFactory
{
    // It's in this factory that we create the entity Id.

    private $idGenerator;

    public function __construct( EntityIdGenerator $idGenerator )
    {
        $this->idGenerator = $idGenerator;
    }
    
    public function createNewJobCreatedEvent( Id $applicationExecutionId, int $businessId, string $jobDescription, xxxx $otherData ) : JobCreatedEvent
    {
        $eventId = $this->idGenerator->createNewId();
        $entityId = $this->idGenerator->createNewId();
        
        // The only place where we allow "new" is in the factories. No other places should do a "new" ever.
        $event = new JobCreatedEvent( $eventId, $entityId, $applicationExecutionId, $businessId, $jobDescription, $otherData );

        return $event; 
    }
}

If you do not like the factory creating the entity Id (it can look ugly to some eyes), just pass it in as a parameter with a specific type, and push the responsibility of creating a fresh new one (never reusing an existing one) to some other intermediate service (never the application service).

Nevertheless, if you do so, take care: what if a "silly" service creates two JobCreatedEvents with the same entity Id? That would be really ugly. In the end, creation only occurs once, and the Id is created at the very core of the creation of the JobCreatedEvent. Your choice anyway.

Other classes...

class JobCreatedEvent;
class JobEventStoreService;
class JobProjectorService;

Things that do not matter in this post

We could discuss at length whether the projectors should be at the infrastructure level, global to the multiple applications calling them... or even in the domain (as I need "at least" one way to read the model), or whether they belong more to the application (maybe the same model is read in 4 different ways by 4 different applications, each with its own projectors)...

We could discuss at length where the side-effects are triggered, whether implicitly in the event store or at the application level (I have not invoked any side-effects processor, i.e. event listener, here). I think of side-effects as living in the application layer, as they depend on infrastructure...

But all this... is not the topic of this question.

I don't care about all those things for this "post". Of course they are not negligible topics and you will have your own strategy for them, and you have to design all this very carefully. But here the question was where to create the auto-incremental Id coming from a business requirement, and doing all those projectors (sometimes called calculators) and side-effects (sometimes called reactors) in a "clean-code" way would blur the focus of this answer. You get the idea.

Things I care about in this post

What I care about is that:

  • If the experts want an "autonumeric" then it's a domain requirement, and therefore it's a property at the same level of definition as "description" or "other data".
  • The fact that they want this property does not conflict with the fact that all entities have an "internal Id" in whatever format the coder chooses, be it a UUID, a SHA-1 or whatever.
  • If you need sequential Ids for that property, you need a "supplier of values", AKA the JobBusinessIdGeneratorService, which has nothing to do with the "entity Id" itself.
  • That Id generator is responsible for ensuring that once the number has been incremented, it is synchronously persisted before being returned to the caller, so it is impossible to return the same Id twice after a failure.

Drawbacks

There's a sequence-leak you'll have to deal with:

If the Id generator points to 4007, the next call to getNextJobId() will increment it to 4008, persist the pointer as "current = 4008" and then return 4008.

If for some reason the job creation or persistence then fails, the next call will give 4009. We will end up with a sequence of [ 4006, 4007, 4009, 4010 ], with 4008 missing.

That is because, from the generator's point of view, 4008 was "actually consumed"; the generator does not know what you did with it, just as it wouldn't if you ran a dummy loop that extracted 100 numbers.

Never compensate with a ->rollback() in the catch of a try / catch block, because that can create concurrency problems: if you get 4008, another process gets 4009, and then the first process fails, the rollback would corrupt the sequence. Just assume that on failure the Id was simply "consumed", and do not blame the generator. Blame whatever failed.

I hope it helps!

0
votes

@SharpNoizy, very simple.

Create your own Id generator. Say an alphanumeric string, for example "DB3U8DD12X", that gives you billions of possibilities. Now, what you want to do is generate these Ids in sequential order by giving each character an ordered value...

0 - 0
1 - 1
2 - 2
.....
10 - A
11 - B

Get the idea? So, what you do next is create a function that will increment each character of your "D74ERT3E4" string using that mapping.

So, "R43E4D", "R43E4E", "R43E4F", "R43E4G"... get the idea?

Then, when your application loads, you look at the database and find the latest Id generated. Then you load into memory the next 50,000 combinations (in case you want super speed) and create a static class/method that will give you that value back.

Aggregate.Id = IdentityGenerator.Next();

This way you have control over the generation of your Ids, because that's the only class that has that power.
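
A rough sketch of such a generator, kept in the same PHP-like pseudocode as the previous answer; the id_block table, the block size, and the method names are assumptions:

// Sketch only: reserves a block of sequential values up front and hands them
// out from memory, converting each to the 0-9 / A-Z alphabet described above.
class IdentityGenerator
{
    private static $nextValue;
    private static $lastPreloaded;

    public static function initialize( \PDO $pdo, int $blockSize = 50000 )
    {
        // Find the latest value already handed out and reserve the next block.
        $pdo->beginTransaction();
        $latest = (int) $pdo->query( 'SELECT last_value FROM id_block FOR UPDATE' )->fetchColumn();
        $pdo->exec( 'UPDATE id_block SET last_value = ' . ( $latest + $blockSize ) );
        $pdo->commit();

        self::$nextValue = $latest + 1;
        self::$lastPreloaded = $latest + $blockSize;
    }

    public static function next() : string
    {
        if ( self::$nextValue > self::$lastPreloaded ) {
            throw new \RuntimeException( 'Block exhausted; reserve a new one.' );
        }

        // base_convert() maps 10 => A, 11 => B, ... just like the table above.
        return strtoupper( base_convert( (string) self::$nextValue++, 10, 36 ) );
    }
}

A crashed process loses whatever is left of its reserved block, so this trades gaps in the sequence for speed.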

I like this approach because it is more "readable" when you use it in your web API, for example. GUIDs are hard (and tedious) to read, remember, etc.

GET api/job/DF73 is way better to remember than api/job/XXXX-XXXX-XXXXX-XXXX-XXXX

Does that make sense?