There has been some discussion of foreachRDD in Spark Streaming, for example in a good Stack Overflow post (36421619). Nonetheless, I feel the answers there are not clear enough when reading the prose. So here goes ...
My questions are:
- When will foreachRDD ... return more than 1 RDD? With a sliding window over N batches?
- If we simply process per batch, it is stated that there is only one RDD, so what does the "foreach" actually iterate over?
The most common use case appears to be persisting to external storage. That seems to be the guidance for most output operations. I am somehow missing something.
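To make the question concrete, here is a toy model of my current understanding, which may well be wrong: the function passed to foreachRDD is called once per batch interval with a single RDD, and a windowed stream still hands over one RDD per slide, just with more records in it. This is plain Python, not Spark. Lists stand in for RDDs, and `foreach_rdd` and `window` are hypothetical helpers I made up for illustration, not the real API.

```python
# Toy model only: plain Python lists stand in for RDDs. The helpers
# foreach_rdd and window are hypothetical, NOT the Spark Streaming API.
from typing import Callable, List

Batch = List[int]  # stand-in for one RDD


def foreach_rdd(stream: List[Batch], func: Callable[[Batch], None]) -> None:
    """Call func once per batch interval, each time with a single 'RDD'."""
    for rdd in stream:
        func(rdd)


def window(stream: List[Batch], length: int, slide: int) -> List[Batch]:
    """Model a windowed stream: every `slide` batches, emit ONE combined
    'RDD' holding the records of up to the last `length` batches."""
    windowed = []
    for end in range(slide, len(stream) + 1, slide):
        start = max(0, end - length)
        combined = [rec for batch in stream[start:end] for rec in batch]
        windowed.append(combined)
    return windowed


stream = [[1, 2], [3], [4, 5, 6], [7]]  # four batch intervals

# Plain per-batch processing: func sees one 'RDD' per interval.
seen_plain: List[Batch] = []
foreach_rdd(stream, seen_plain.append)

# Sliding window over 3 batches, sliding by 1: func still sees one
# 'RDD' per call, but each one combines several batches.
seen_windowed: List[Batch] = []
foreach_rdd(window(stream, length=3, slide=1), seen_windowed.append)
```

If this model is right, the "foreach" would refer to the sequence of RDDs over time, one per interval, rather than to several RDDs arriving in a single call. Is that the correct reading?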