I have a Spark standalone cluster with one master and two worker nodes, each worker with 4 CPU cores, for a total of 8 cores across the workers.
When I run the following via spark-submit (spark.default.parallelism is not set):
val myRDD = sc.parallelize(1 to 100000)
println("Partition size - " + myRDD.partitions.size)
val total = myRDD.reduce((x, y) => x + y)
println("Sum - " + total)
it prints a partition size of 2.
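To narrow it down, one check I can add in the same job is printing the effective default parallelism (my assumption being that parallelize falls back to sc.defaultParallelism when numSlices is not given):

println("Default parallelism - " + sc.defaultParallelism)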
When I run the same code from spark-shell connected to the same standalone cluster, it prints the expected partition size of 8.
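Explicitly passing the slice count to parallelize should force the partition count regardless of how the job is launched; a minimal sketch (the 8 here is just the count I expect from the cluster's cores):

val myRDD = sc.parallelize(1 to 100000, 8) // numSlices set explicitly
println("Partition size - " + myRDD.partitions.size) // 8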
What could be the reason for the different defaults?
Thanks.