I'm developing a Beam pipeline for the Dataflow runner. I need the following functionality in my use case:

- Read input events from Kafka topic(s). Each Kafka message value derives a `[userID, Event]` pair.
- For each `userID`, I need to maintain a `profile`, and the current `Event` may trigger an update to that `profile`. If the `profile` is updated:
  - The updated `profile` is written to an output stream.
  - The next `Event` for that `userID` in the pipeline should refer to the updated `profile`.
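To make the intended per-key semantics concrete, here is a minimal, Beam-independent sketch of the update logic I have in mind. The `Event`, `Profile`, and the update rule are placeholders, and a plain `HashMap` stands in for whatever per-key state the runner would maintain:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ProfileUpdateSketch {
    // Placeholder event: a counter increment for one user.
    record Event(String userId, int delta) {}

    // Placeholder profile: a running total per user.
    record Profile(int total) {}

    // Stand-in for per-key state; in Beam this would be runner-managed.
    private final Map<String, Profile> stateByUser = new HashMap<>();

    // Emitted updates (stand-in for the output stream).
    final List<String> output = new ArrayList<>();

    // Apply one event: read the current profile, maybe update, emit if changed.
    void process(Event e) {
        Profile current = stateByUser.getOrDefault(e.userId(), new Profile(0));
        Profile updated = new Profile(current.total() + e.delta());
        if (!updated.equals(current)) {           // only emit on a real change
            stateByUser.put(e.userId(), updated); // next event sees the new profile
            output.add(e.userId() + ":" + updated.total());
        }
    }

    public static void main(String[] args) {
        ProfileUpdateSketch p = new ProfileUpdateSketch();
        p.process(new Event("u1", 2));
        p.process(new Event("u1", 3)); // refers to the updated profile (2 -> 5)
        p.process(new Event("u2", 1));
        System.out.println(p.output);  // [u1:2, u1:5, u2:1]
    }
}
```

The point of the sketch is the read-modify-write cycle per `userID` with no external key-value store: each event sees the profile as left by the previous event for the same user, and only genuine changes are emitted downstream.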
I was thinking of using Beam's built-in state functionality for maintaining the user profile, without depending on an external key-value store. Is this feasible with the current version of Beam (2.1.0) and the Dataflow runner? If I understand correctly, state is scoped to the elements in a single window firing (i.e., even for a `GlobalWindow`, the state would be scoped to the elements in a single firing of the window caused by a trigger). Am I missing something here?