
I am working on Spark 1.6, and it is failing my job with the following error:

java.io.FileNotFoundException: /data/05/dfs/dn/yarn/nm/usercache/willir31/appcache/application_1413512480649_0108/spark-local-20141028214722-43f1/26/shuffle_0_312_0.index (No such file or directory)
    java.io.FileOutputStream.open(Native Method)
    java.io.FileOutputStream.<init>(FileOutputStream.java:221)
    org.apache.spark.storage.DiskBlockObjectWriter.open(BlockObjectWriter.scala:123)
    org.apache.spark.storage.DiskBlockObjectWriter.write(BlockObjectWriter.scala:192)
    org.apache.spark.util.collection.ExternalSorter$$anonfun$writePartitionedFile$4$$anonfun$apply$2.apply(ExternalSorter.scala:733)
    org.apache.spark.util.collection.ExternalSorter$$anonfun$writePartitionedFile$4$$anonfun$apply$2.apply(ExternalSorter.scala:732)
    scala.collection.Iterator$class.foreach(Iterator.scala:727)
    org.apache.spark.util.collection.ExternalSorter$IteratorForPartition.foreach(ExternalSorter.scala:790)
    org.apache.spark.util.collection.ExternalSorter$$anonfun$writePartitionedFile$4.apply(ExternalSorter.scala:732)
    org.apache.spark.util.collection.ExternalSorter$$anonfun$writePartitionedFile$4.apply(ExternalSorter.scala:728)
    scala.collection.Iterator$class.foreach(Iterator.scala:727)
    scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
    org.apache.spark.util.collection.ExternalSorter.writePartitionedFile(ExternalSorter.scala:728)
    org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:70)
    org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
    org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)

I am performing join operations. When I carefully looked into the error and checked my code, I found that it is failing while writing the DataFrame back to CSV, but I am not able to get rid of it. I am not using HDP; I have a separate installation for each component.
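
Roughly, the failing part of the job looks like the sketch below. The paths and the join column "id" are placeholders for my real inputs, and the CSV read/write here assumes the databricks spark-csv package is on the classpath:

// Simplified sketch: join two DataFrames, then write the result as CSV.
// Paths and the join column "id" are placeholders.
val left = sqlContext.read
  .format("com.databricks.spark.csv")
  .option("header", "true")
  .load("/path/to/left.csv")

val right = sqlContext.read
  .format("com.databricks.spark.csv")
  .option("header", "true")
  .load("/path/to/right.csv")

val joined = left.join(right, left("id") === right("id"))

// The job dies in the shuffle while this write is running.
joined.write
  .format("com.databricks.spark.csv")
  .option("header", "true")
  .save("/path/to/output")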

It's funny that you have the exact same stacktrace as this guy back in 2014 apache-spark-user-list.1001560.n3.nabble.com/… - Kien Truong

1 Answer


These types of errors typically occur when there are deeper problems with some tasks, like significant data skew. Since you don't provide enough details (please be sure to read How to Ask and How to create a Minimal, Complete, and Verifiable example) or job statistics, the only approach I can think of is to significantly increase the number of shuffle partitions:

sqlContext.setConf("spark.sql.shuffle.partitions", "2048")
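
The same property should also be settable at submit time via --conf spark.sql.shuffle.partitions=2048. If nothing in the job statistics jumps out, a quick way to look for skew is to count rows per join key before the shuffle. In the sketch below, df and the column "id" stand in for your own DataFrame and join key:

// Rough skew check: count rows per join key and look for a few very large keys.
// "df" and "id" are placeholders for your DataFrame and join column.
import org.apache.spark.sql.functions.desc

df.groupBy("id")
  .count()
  .orderBy(desc("count"))
  .show(20)  // a handful of keys with huge counts usually indicates skew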