I'm developing a Beam pipeline for the Dataflow runner. I need the functionality below in my use case.
- Read input events from Kafka topic(s). Each Kafka message value yields a [userID, Event] pair.
- For each userID, I need to maintain a profile, and based on the current Event an update to that profile may occur. If the profile is updated:
  - The updated profile is written to an output stream.
  - The next Event for that userID in the pipeline should see the updated profile.
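To make the intended semantics concrete, here is a minimal Python sketch of the per-key behavior I want a stateful transform to provide. The `Event`/`Profile` types and the "keep the max score" update rule are hypothetical stand-ins for my real classes, and the dict plays the role of per-key state; this is a model of the desired behavior, not Beam API:

```python
from dataclasses import dataclass

# Hypothetical types standing in for my real Event/Profile classes.
@dataclass
class Event:
    user_id: str
    score: int

@dataclass
class Profile:
    user_id: str
    max_score: int = 0

class ProfileUpdater:
    """Models the per-userID state I'd like the pipeline to maintain."""

    def __init__(self):
        # Per-key state, analogous to what per-key state in Beam would hold.
        self._profiles = {}

    def process(self, event):
        """Return the updated Profile if the event changed it, else None."""
        profile = self._profiles.setdefault(event.user_id, Profile(event.user_id))
        if event.score > profile.max_score:  # example update rule (assumption)
            profile.max_score = event.score
            return profile  # updated profile goes to the output stream
        return None  # no update, nothing emitted
```

The key property is that a second event for the same userID observes the state left behind by the first, which is what I want the pipeline to guarantee without an external key-value store.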
I was thinking of using Beam's built-in state functionality, without depending on an external key-value store to maintain the user profile. Is this feasible with the current version of Beam (2.1.0) and the Dataflow runner? If I understand correctly, state is scoped to the elements in a single window firing (i.e., even for a GlobalWindow, state would be scoped to the elements of a single firing of the window caused by a trigger). Am I missing something here?