I think Apache Kafka is the best solution to store events for Event Sourcing. Event Sourcing is closely related to, and usually used together with, a concept/practice named CQRS by Greg Young, which I suggest you research.
The term repository I am using in this answer is a repository in terms of Domain Driven Design, as in Eric Evans' book.
I think I know the reason for your confusion.
> But for event sourcing, you most likely want to read the events for a given entity/aggregate ID.
The quote above from your question is, in my opinion, true. But I think you wanted to express something different, something like this:

> In event sourcing, when a repository is asked to retrieve an object from its data source, it must retrieve all the events that constitute that particular entity on every request, and then replay these events to build up the object.

Is that what you really wanted to express? Because that sentence is, in my opinion, false.
You do not need to rebuild an object every time you retrieve it.
In other words, you don't need to replay all the events that constitute the object every time you retrieve it from the repository. You can apply events to the object as they happen and store the current version of the object somewhere else, e.g. in a cache, or even better, both in a cache and in Kafka.
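Here is a minimal sketch of that replay idea in Java; the `Truck` and `LoadEvent` classes are names I invented for illustration, not anything from a real library:

```java
import java.util.List;

// A minimal sketch of rebuilding an aggregate by replaying events.
class LoadEvent {
    final String truckId;
    final String cargo;      // null means the truck was unloaded
    LoadEvent(String truckId, String cargo) {
        this.truckId = truckId;
        this.cargo = cargo;
    }
}

class Truck {
    private final String id;
    private String cargo;    // current state, built up from events

    Truck(String id) { this.id = id; }

    // Apply a single event to mutate the current state.
    void apply(LoadEvent event) {
        this.cargo = event.cargo;
    }

    // Rebuild the aggregate from scratch by replaying all its events.
    // Usage: Truck t = Truck.replay("truck-1", events);
    static Truck replay(String id, List<LoadEvent> events) {
        Truck truck = new Truck(id);
        events.forEach(truck::apply);
        return truck;
    }
}
```

The point of the cache is exactly to avoid calling something like `replay` on every read.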
So let's walk through an example. Let's say we have a truck/lorry that is loaded and unloaded.
The main stream of events will be operations - this will be the 1st Kafka topic in our app. This will be our source of truth, as Jay Kreps often puts it in his writing.
These are the events:
- truck 1 is loaded with pigs
- truck 2 is loaded with pigs
- truck 2 is unloaded (pigs removed)
- truck 2 is loaded with sand
- truck 1 is unloaded (pigs removed)
- truck 1 is loaded with flowers
The final result is that truck 1 is filled with flowers and truck 2 is filled with sand.
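As a minimal sketch, producing those operation events to the operations topic could look like this; the broker address `localhost:9092`, the `truck-N` key format, and plain string values are all assumptions I'm making to keep the example short:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class OperationsProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Key every message by truck ID; the value is the operation.
            producer.send(new ProducerRecord<>("operations", "truck-1", "loaded with pigs"));
            producer.send(new ProducerRecord<>("operations", "truck-2", "loaded with pigs"));
            producer.send(new ProducerRecord<>("operations", "truck-2", "unloaded"));
            producer.send(new ProducerRecord<>("operations", "truck-2", "loaded with sand"));
            producer.send(new ProducerRecord<>("operations", "truck-1", "unloaded"));
            producer.send(new ProducerRecord<>("operations", "truck-1", "loaded with flowers"));
        }
    }
}
```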
What you do is read the events from this topic and populate a 2nd topic: truckUpdated. The events you stream into the truckUpdated topic are the following:
- truck 1: pigs
- truck 2: pigs
- truck 2: nothing
- truck 2: sand
- truck 1: nothing
- truck 1: flowers
At the same time, with every message consumed, you update the current version of the truck in a cache, e.g. memcached. So the cache will be the direct source the repository uses to retrieve a truck object.
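Here is a rough sketch of such a projection loop, assuming the topic names from this example; the in-memory map stands in for memcached, and the string parsing of event values is simplified for illustration:

```java
import java.time.Duration;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.concurrent.ConcurrentHashMap;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class TruckProjection {
    // Stand-in for memcached: in production you would write to a real cache.
    static final Map<String, String> cache = new ConcurrentHashMap<>();

    public static void main(String[] args) {
        Properties consumerProps = new Properties();
        consumerProps.put("bootstrap.servers", "localhost:9092");
        consumerProps.put("group.id", "truck-projection");
        consumerProps.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        consumerProps.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");

        Properties producerProps = new Properties();
        producerProps.put("bootstrap.servers", "localhost:9092");
        producerProps.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        producerProps.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(consumerProps);
             KafkaProducer<String, String> producer = new KafkaProducer<>(producerProps)) {
            consumer.subscribe(List.of("operations"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    // Derive the current cargo from the operation (simplified).
                    String cargo = record.value().startsWith("loaded with ")
                            ? record.value().substring("loaded with ".length())
                            : "nothing";
                    // 1) Publish the current state, keyed by truck ID, to truckUpdated.
                    producer.send(new ProducerRecord<>("truckUpdated", record.key(), cargo));
                    // 2) Update the cache that the repository reads from.
                    cache.put(record.key(), cargo);
                }
            }
        }
    }
}
```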
What is more, you make the truckUpdated topic a compacted topic.
Read about compacted topics in the official Apache Kafka docs. There is a lot of interesting material about them on the Confluent blog and the LinkedIn Engineering blog (from before the Confluent company was started).
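For example, a compacted truckUpdated topic could be created with Kafka's AdminClient like this (the broker address and single partition/replica are assumptions for the sketch):

```java
import java.util.List;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.NewTopic;
import org.apache.kafka.common.config.TopicConfig;

public class CreateCompactedTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            // cleanup.policy=compact tells Kafka to keep only the latest
            // message per key instead of deleting by retention time.
            NewTopic topic = new NewTopic("truckUpdated", 1, (short) 1)
                    .configs(Map.of(TopicConfig.CLEANUP_POLICY_CONFIG,
                                    TopicConfig.CLEANUP_POLICY_COMPACT));
            admin.createTopics(List.of(topic)).all().get();
        }
    }
}
```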
So because truckUpdated is compacted, Kafka will, after some time, make it look like this:
- truck 2: sand
- truck 1: flowers
Kafka will do this if you use the truck ID as the key for all messages - read in the docs what message "keys" are. So you end up with one message per truck. If you find a bug in your app, you can replay the operations topic to repopulate your cache and the truckUpdated topic. If your cache goes down, you can use the truckUpdated topic to repopulate it.
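A sketch of such a replay: a consumer with a fresh group ID and `auto.offset.reset=earliest` re-reads the whole operations topic from the beginning, so you can feed every historical event back through the same projection logic that populates the cache and the truckUpdated topic (broker address and topic name as assumed above):

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class ReplayOperations {
    public static void main(String[] args) {
        Properties props = new Properties();
        // A fresh group ID means no stored offsets, so we start from scratch.
        props.put("group.id", "replay-" + System.currentTimeMillis());
        props.put("bootstrap.servers", "localhost:9092");
        props.put("auto.offset.reset", "earliest");  // start from the first event
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("operations"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                // Feed every historical event back through the projection logic;
                // printing here stands in for the real cache/topic updates.
                records.forEach(r -> System.out.printf("%s -> %s%n", r.key(), r.value()));
            }
        }
    }
}
```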
What do you think? Votes and comments are highly welcome.
**Update:**
(1) After some thought, I have changed my mind about the quote from your question being true. I now find it false. So I do **not** think that for event sourcing you most likely want to read the events for a given entity/aggregate ID.
When you find a bug in your code, you want to replay all the events for all objects. It does not matter whether there are 2 entities, as in my simple example, or 10M entities.
Event sourcing is not about retrieving all the events for a particular entity. Event sourcing means that you have an audit log of all events and are able to replay them to rebuild your entities. You don't need to be able to rebuild one particular entity on its own.
(2) I strongly advise getting familiar with some posts from the Confluent and LinkedIn engineering blogs. These were very interesting for me:
https://www.confluent.io/blog/making-sense-of-stream-processing/
https://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should-know-about-real-time-datas-unifying
The official Kafka docs are a must-read too.