I have a Hive Parquet table (an external table on top of S3) that contains 6k partitions. During data exploration we want to view some sample data, say 1, 2, or 10 records, without performing any transformation or action.
Is there a way to restrict the read to a single partition and limit/show n records instead of going through all 6k partitions? (On a small cluster it takes a huge amount of time just to print 10 rows.) I thought about mapPartitionsWithIndex,
but it still goes through all partitions:
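import org.apache.spark.sql.Row

// Keep the rows of the partition at index 1; return an empty iterator for every other partition.
def mpwi(index: Int, iter: Iterator[Row]): Iterator[Row] = {
  if (index == 1) iter
  else Iterator.empty
}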