How do I compare a received record with the previous record for the same key in Spark Structured Streaming? Can this be done using groupByKey and mapGroupsWithState?
groupByKey(user)
mapGroupsWithState(GroupStateTimeout.NoTimeout)(updateAcrossEvents)
// Sample code from Spark: The Definitive Guide
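To make the question concrete, here is a minimal sketch of what I imagine updateAcrossEvents could look like. The `Event` and `Diff` case classes, the `diff` helper, and the `withPrevious` wrapper are hypothetical names I made up for illustration. The state is simply the last record seen for the key, and only the last record of each trigger is compared against it, since mapGroupsWithState returns one value per key per trigger.

```scala
import org.apache.spark.sql.Dataset
import org.apache.spark.sql.streaming.{GroupState, GroupStateTimeout}

// Hypothetical record and output types, for illustration only.
case class Event(user: String, value: Double)
case class Diff(user: String, previous: Option[Double], current: Double)

object PreviousRecordSketch {
  // Pure helper: describes how the new record relates to the stored one.
  def diff(previous: Option[Event], current: Event): Diff =
    Diff(current.user, previous.map(_.value), current.value)

  // Called once per key per trigger; the state holds the last Event kept for that user.
  def updateAcrossEvents(
      user: String,
      events: Iterator[Event],
      state: GroupState[Event]): Diff = {
    val previous = state.getOption // None the first time this key shows up
    // Last element of the iterator; its order within a trigger is not guaranteed.
    val latest = events.reduce((_, next) => next)
    state.update(latest)
    diff(previous, latest)
  }

  // Wires the update function into a typed streaming Dataset.
  def withPrevious(events: Dataset[Event]): Dataset[Diff] = {
    import events.sparkSession.implicits._ // encoders for the case classes
    events
      .groupByKey(_.user)
      .mapGroupsWithState(GroupStateTimeout.NoTimeout)(updateAcrossEvents)
  }
}
```

As far as I understand, a query using mapGroupsWithState has to be written with the update output mode, and flatMapGroupsWithState would be the variant to use if one output per record is needed instead of one per key.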
A related question arises with the operations above: I don't think the order of the records is preserved. As each record is received, it is partitioned and stored across the worker nodes. When groupByKey is applied, a shuffle happens and all records with the same key end up on the same worker node, but the shuffle does not maintain their original sequence.
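One workaround I am considering, assuming every record carries its own event timestamp, is to sort the records of a key inside the update function before comparing them. The sketch below is plain Scala with no Spark dependency. `TimedEvent`, `Change`, the `ts` field, and `compareInOrder` are all hypothetical names. The `last` argument stands in for what `state.getOption` would return in the real update function.

```scala
// Hypothetical record type; `ts` is an assumed event-time field in epoch millis.
case class TimedEvent(user: String, ts: Long, value: Double)
case class Change(ts: Long, previous: Option[Double], current: Double)

object OrderedCompare {
  // Takes the stored last record (if any) and one trigger's records for a key, in any order.
  // Returns the new last record plus one Change per record, in event-time order.
  def compareInOrder(
      last: Option[TimedEvent],
      batch: Seq[TimedEvent]): (Option[TimedEvent], Seq[Change]) = {
    val sorted = batch.sortBy(_.ts)
    // scanLeft threads the "previous record" through the sorted batch:
    // chain = [last, Some(e1), Some(e2), ...]
    val chain = sorted.scanLeft(last)((_, e) => Option(e))
    val changes = chain.zip(sorted).map { case (prev, cur) =>
      Change(cur.ts, prev.map(_.value), cur.value)
    }
    (chain.last, changes)
  }
}
```

This only restores the order within a single trigger. A record that arrives late, in a later trigger, would still be compared against a newer stored record, so I am not sure it fully solves the problem.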