The topic has 10 partitions, and various IoT devices produce messages to it every 3 to 4 seconds. The message key is the combination of LocationId and DeviceId. The value holds device-related details.
The stream topology is deployed on 4 EC2 instances. It must determine the latest update from each device and analyze it for criticality.
What I am seeing is that, because messages are distributed across multiple partitions, the stream consumer sees older messages, and they do not arrive in order.
How do I determine the latest message for a specific key?
I am seeing the following message behaviour on the Kafka cluster (the partition is shown in parentheses):
L1D1 at 1:00 am - critical=false (P1)
L2D2 at 1:00 am - critical=false (P1)
L1D1 at 1:02 am - critical=**true** (P2)
L2D2 at 1:05 am - critical=false (P1)
L1D1 at 1:03 am - critical=false (P2)
L2D2 at 1:03 am - critical=false (P1)
Notice that device D1 had a critical alert at 1:02, but not at 1:03. If the stream processes these messages in the order 1:03, 1:02 (or in any other order, depending on the partition), the older 1:02 message appears to be the latest one.
How do I efficiently determine the latest message for a specific device, given that the order is not guaranteed?
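
To make the goal concrete, here is a minimal plain-Java sketch of the desired behaviour, without any Kafka dependency. It assumes each message carries an event timestamp set by the device, which the description above does not confirm. The names `LatestByDevice`, `Reading`, `accept`, and `latestFor` are hypothetical and not part of any Kafka API.

```java
import java.time.LocalTime;
import java.util.HashMap;
import java.util.Map;

public class LatestByDevice {

    /** Hypothetical message shape: composite key plus a device-side event time. */
    public static final class Reading {
        public final String key;          // e.g. "L1D1" = LocationId + DeviceId
        public final LocalTime eventTime; // assumption: set by the device, not by arrival
        public final boolean critical;

        public Reading(String key, LocalTime eventTime, boolean critical) {
            this.key = key;
            this.eventTime = eventTime;
            this.critical = critical;
        }
    }

    // Latest reading seen so far for each key, judged by event time.
    private final Map<String, Reading> latest = new HashMap<>();

    /** Stores the reading only if it is strictly newer; returns true if it was stored. */
    public boolean accept(Reading r) {
        Reading current = latest.get(r.key);
        if (current != null && !r.eventTime.isAfter(current.eventTime)) {
            return false; // stale or duplicate message: ignore it
        }
        latest.put(r.key, r);
        return true;
    }

    public Reading latestFor(String key) {
        return latest.get(key);
    }

    public static void main(String[] args) {
        LatestByDevice tracker = new LatestByDevice();
        // Same L1D1 messages as in the listing, consumed out of order.
        tracker.accept(new Reading("L1D1", LocalTime.of(1, 0), false));
        tracker.accept(new Reading("L1D1", LocalTime.of(1, 3), false));
        tracker.accept(new Reading("L1D1", LocalTime.of(1, 2), true)); // arrives late

        Reading r = tracker.latestFor("L1D1");
        System.out.println(r.eventTime + " critical=" + r.critical); // prints "01:03 critical=false"
    }
}
```

`LocalTime` is used only to mirror the times in the listing; a real implementation would more likely compare epoch timestamps. Note that this approach discards the late 1:02 critical reading entirely, which may or may not be acceptable for the criticality analysis. In Kafka Streams, the same comparison could probably be expressed as a `reduce` over a stream grouped by key, but that is an assumption about the topology rather than something shown here.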