I would like to spin up a large number of tasks while doing my calculation, but coalesce into a smaller set of partitions when writing the result to a table.
A simple example for demonstration is given below, where the repartition is NOT honored during execution.
My expectation is that the map operation runs in 100 partitions, and the final collect runs in only 10 partitions.
It seems Spark has optimized the execution by ignoring the repartition. It would be helpful if someone could explain how to achieve my expected behavior.
sc.parallelize(range(1,1000)).repartition(100).map(lambda x: x*x).coalesce(10).collect()
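
For context, the real job looks roughly like the sketch below; the column names and the table name are placeholders I made up, not part of the minimal example above.

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
sc = spark.sparkContext

# heavy computation spread across many tasks
rdd = (sc.parallelize(range(1, 1000))
         .repartition(100)             # want 100 tasks for the map below
         .map(lambda x: (x, x * x)))   # stand-in for the real calculation

# shrink to a small number of partitions only for the write
df = rdd.toDF(["x", "x_squared"]).coalesce(10)
df.write.mode("overwrite").saveAsTable("my_output_table")  # placeholder table name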