In the Learning Spark book, the authors write:
For operations that act on a single RDD, such as reduceByKey(), running on a pre-partitioned RDD will cause all the values for each key to be computed locally on a single machine, requiring only the final, locally reduced value to be sent from each worker node back to the master.
However, in this answer the author says that no pre-partitioning is needed because:
For reduceByKey(), the first quality aggregates elements of the same key with the provided associative reduce function locally first on each executor and then eventually aggregated across executors.
So, why does the book suggest pre-partitioning if reduceByKey() aggregates elements on each executor first anyway, without shuffling the full data?
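To make the comparison concrete, here is a minimal sketch of the two setups I have in mind (the data, app name, and partition counts are just illustrative assumptions, not from the book):

```scala
import org.apache.spark.{HashPartitioner, SparkConf, SparkContext}

object ReduceByKeyPartitioning {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("reduceByKey-partitioning").setMaster("local[*]"))

    // Hypothetical sample data: (key, value) pairs spread over 4 partitions.
    val pairs = sc.parallelize(
      Seq(("a", 1), ("b", 2), ("a", 3), ("c", 4), ("b", 5)), numSlices = 4)

    // Case 1: no pre-partitioning. reduceByKey still combines values for the
    // same key locally (map-side combine) before shuffling the partial results.
    val counts = pairs.reduceByKey(_ + _)

    // Case 2: pre-partitioning. partitionBy shuffles once up front; a later
    // reduceByKey on the persisted, already-partitioned RDD can reuse that
    // partitioner instead of shuffling again.
    val prePartitioned = pairs.partitionBy(new HashPartitioner(4)).persist()
    val countsOnPartitioned = prePartitioned.reduceByKey(_ + _)

    println(counts.collect().toSeq)
    println(countsOnPartitioned.collect().toSeq)

    sc.stop()
  }
}
```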