A Spark SQL aggregation shuffles data, and the number of shuffle partitions is controlled by spark.sql.shuffle.partitions (200 by default). What happens to performance when the shuffle partition count is set greater than 200?
I know that Spark uses a different data structure for shuffle bookkeeping when the number of partitions is greater than 2000, so the usual advice is: if the number of partitions is close to 2000, increase it to more than 2000.
But my question is: what will the behavior be when the shuffle partition count is greater than 200 (let's say 300)?
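For reference, here is a minimal sketch of the kind of job I mean (the app name, data, and column names are made up for illustration):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.sum

object ShufflePartitionsDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("shuffle-partitions-demo")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Raise the shuffle partition count above the 200 default, e.g. to 300.
    spark.conf.set("spark.sql.shuffle.partitions", "300")

    val df = (1 to 1000000).map(i => (i % 100, i)).toDF("key", "value")

    // groupBy triggers a shuffle; its output uses spark.sql.shuffle.partitions partitions.
    val agg = df.groupBy("key").agg(sum("value"))

    // Typically prints 300 (adaptive query execution in Spark 3.x may coalesce this).
    println(agg.rdd.getNumPartitions)

    spark.stop()
  }
}
```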