0 votes

I've been reading a few articles about using Kafka and Kafka Streams (with its state store) as an Event Store implementation.

  1. https://www.confluent.io/blog/event-sourcing-using-apache-kafka/
  2. https://www.confluent.io/blog/event-sourcing-cqrs-stream-processing-apache-kafka-whats-connection/

The implementation idea is the following:

  1. Store entity changes (events) in a Kafka topic
  2. Use Kafka Streams with a state store (which uses RocksDB by default) to update and cache the entity snapshot
  3. Whenever a new command is executed, get the entity from the store, execute the operation on it, and continue with step #1

The issue with this workflow is that the state store is updated asynchronously (step 2), so when a new command is processed the retrieved entity snapshot might be stale (it may not yet include the events from previous commands).
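For concreteness, here is a minimal sketch of step 3 in Java. The Account/WithdrawCommand/MoneyWithdrawn types and the "account-snapshots" / "account-events" names are made up for illustration; the interactive query against the state store is exactly where the stale read can happen:

    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.streams.KafkaStreams;
    import org.apache.kafka.streams.StoreQueryParameters;
    import org.apache.kafka.streams.state.QueryableStoreTypes;
    import org.apache.kafka.streams.state.ReadOnlyKeyValueStore;

    // Hypothetical domain types used only for this sketch.
    record Account(String id, long balance) {}
    record WithdrawCommand(long amount) {}
    record MoneyWithdrawn(String accountId, long amount) {}

    public class CommandHandler {
        private final KafkaStreams streams;                           // runs the snapshot-building topology
        private final KafkaProducer<String, MoneyWithdrawn> producer; // appends new events (step 1)

        public CommandHandler(KafkaStreams streams, KafkaProducer<String, MoneyWithdrawn> producer) {
            this.streams = streams;
            this.producer = producer;
        }

        public void handle(String accountId, WithdrawCommand command) {
            // Step 3: read the current snapshot from the state store via an interactive query.
            ReadOnlyKeyValueStore<String, Account> store = streams.store(
                    StoreQueryParameters.fromNameAndType("account-snapshots",
                            QueryableStoreTypes.<String, Account>keyValueStore()));
            Account snapshot = store.get(accountId);   // may lag behind events written moments ago

            // Validate against the (possibly stale) snapshot and derive the resulting event.
            if (snapshot == null || snapshot.balance() < command.amount()) {
                throw new IllegalStateException("insufficient funds according to this snapshot");
            }

            // Step 1: append the event to the events topic, keyed by entity id.
            producer.send(new ProducerRecord<>("account-events", accountId,
                    new MoneyWithdrawn(accountId, command.amount())));
        }
    }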

Is my understanding correct? Is there a simple way to handle such a case with Kafka?


3 Answers

1 vote

Is my understanding correct?

As far as I have been able to tell, yes -- which means that it is an unsatisfactory event store for many event-sourced domain models.

In short, there's no support for "first writer wins" when adding events to a topic, which means that Kafka doesn't help you ensure that the topic satisfies its invariants.

There have been proposals/tickets to address this, but I haven't found evidence of progress.

0 votes

Yes, there is a simple way: use a key for your Kafka messages.
Messages with the same key always go to the same partition (with the default partitioner, as long as the partition count doesn't change). One consumer can read from one or many partitions, but within a consumer group a single partition is never read by two consumers simultaneously.

The maximum number of actively working consumers in a group is therefore <= the number of partitions of the topic. You can create more consumers, but the extra ones will only act as backup nodes.

Here's an example:

    Assumptions:
    There is a Kafka topic abc with partitions p0 and p1.
    Consumer C1 consumes from p0, and consumer C2 consumes from p1. The consumers work asynchronously.

        km(key, command) - a Kafka message.

# Producing the messages:

        km(key1,add)   -> p0
        km(key2,add)   -> p1
        km(key1,edit)  -> p0
        km(key3,add)   -> p1
        km(key3,edit)  -> p1

# Consumer C1 will read messages km(key1,add), km(key1,edit) and their order will be preserved.
# Consumer C2 will read messages km(key2,add), km(key3,add), km(key3,edit).
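A short producer sketch of the same idea (topic abc and the keys from the example above; the broker address is just a placeholder). With the default partitioner, records with the same key hash to the same partition, so ordering per key is preserved:

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerConfig;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.serialization.StringSerializer;

    public class KeyedProducerExample {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
            props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                // Same key -> same partition -> same consumer -> ordering preserved per entity.
                producer.send(new ProducerRecord<>("abc", "key1", "add"));
                producer.send(new ProducerRecord<>("abc", "key2", "add"));
                producer.send(new ProducerRecord<>("abc", "key1", "edit"));
                producer.send(new ProducerRecord<>("abc", "key3", "add"));
                producer.send(new ProducerRecord<>("abc", "key3", "edit"));
            }
        }
    }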
0 votes

If you write commands to Kafka and then materialize a view in Kafka Streams, the materialized view will be updated asynchronously. This helps you separate writes from reads so that the read path can scale independently.
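For illustration, here is a minimal Kafka Streams sketch of that pattern, with made-up topic/store names and String-valued events; the aggregation just concatenates event payloads so the example stays self-contained:

    import java.util.Properties;
    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.common.utils.Bytes;
    import org.apache.kafka.streams.KafkaStreams;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.StreamsConfig;
    import org.apache.kafka.streams.kstream.Consumed;
    import org.apache.kafka.streams.kstream.Materialized;
    import org.apache.kafka.streams.state.KeyValueStore;

    public class ViewMaterializer {
        public static void main(String[] args) {
            StreamsBuilder builder = new StreamsBuilder();

            // Fold the per-key event stream into a queryable snapshot store.
            builder.stream("account-events", Consumed.with(Serdes.String(), Serdes.String()))
                   .groupByKey()
                   .aggregate(
                       () -> "",
                       (key, event, snapshot) -> snapshot + "|" + event,
                       Materialized.<String, String, KeyValueStore<Bytes, byte[]>>as("account-snapshots")
                                   .withKeySerde(Serdes.String())
                                   .withValueSerde(Serdes.String()));

            Properties props = new Properties();
            props.put(StreamsConfig.APPLICATION_ID_CONFIG, "view-materializer");
            props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

            KafkaStreams streams = new KafkaStreams(builder.build(), props);
            streams.start();   // the view catches up asynchronously as events arrive
        }
    }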

If you want consistent read-write semantics over your commands/events, you might be better off writing to a database. Events can either be extracted from the database into Kafka using a CDC connector (write-through), or you can write to the database and then to Kafka in a transaction (write-aside).

Another option is to implement long polling on the read path (so if you write trade1.version2 and then want to read it again, the read blocks until trade1.version2 is available). This isn't suitable for all use cases, but it can be useful.

Example here: https://github.com/confluentinc/kafka-streams-examples/blob/4eb3aa4cc9481562749984760de159b68c922a8f/src/main/java/io/confluent/examples/streams/microservices/OrdersService.java#L165
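A rough sketch of the long-polling idea: the OrdersService example above does this properly with async responses and listeners, while here a hypothetical version store ("trade-versions", mapping each key to the latest version the view has seen) and a simple sleep loop stand in for it:

    import org.apache.kafka.streams.KafkaStreams;
    import org.apache.kafka.streams.StoreQueryParameters;
    import org.apache.kafka.streams.state.QueryableStoreTypes;
    import org.apache.kafka.streams.state.ReadOnlyKeyValueStore;

    public class BlockingReader {

        /**
         * Blocks until the materialized view has seen at least the expected version of the key,
         * or the timeout expires. Assumes a store (e.g. "trade-versions") mapping key -> latest version.
         */
        public static void awaitVersion(KafkaStreams streams, String storeName, String key,
                                        long expectedVersion, long timeoutMs) throws InterruptedException {
            ReadOnlyKeyValueStore<String, Long> versions = streams.store(
                    StoreQueryParameters.fromNameAndType(storeName,
                            QueryableStoreTypes.<String, Long>keyValueStore()));

            long deadline = System.currentTimeMillis() + timeoutMs;
            while (System.currentTimeMillis() < deadline) {
                Long seen = versions.get(key);
                if (seen != null && seen >= expectedVersion) {
                    return;                   // e.g. trade1 has reached version 2, safe to read
                }
                Thread.sleep(50);             // crude poll; real code would register a listener instead
            }
            throw new IllegalStateException("timed out waiting for " + key + " v" + expectedVersion);
        }
    }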