1
votes

UpdateStateByKey is useful but what if I want to perform an operation to all existing keys (not only the ones in this RDD).

Word count for example - is there a way to decrease all words seen so far by 1?

I was thinking of keeping a static class per node with the count information and issuing a broadcast command to take a certain action, but could not find a broadcast-to-all-nodes functionality.

1

1 Answers

1
votes

Spark will perform an updateStateByKey to all existing keys anyway.

Good to also note that if the updateStateByKey function returns None (in Scala) then the key-value pair will be eliminated.