4
votes

I am trying to run a simple Map/Reduce java program using spark over yarn (Cloudera Hadoop 5.2 on CentOS). I have tried this 2 different ways. The first way is the following:

YARN_CONF_DIR=/usr/lib/hadoop-yarn/etc/hadoop/; 
/var/tmp/spark/spark-1.4.0-bin-hadoop2.4/bin/spark-submit --class MRContainer --master yarn-cluster --jars /var/tmp/spark/spark-1.4.0-bin-hadoop2.4/lib/spark-assembly-1.4.0-hadoop2.4.0.jar  simplemr.jar

This method gives the following error:

diagnostics: Application application_1434177111261_0007 failed 2 times due to AM Container for appattempt_1434177111261_0007_000002 exited with exitCode: -1000 due to: Resource hdfs://kc1ltcld29:9000/user/myuser/.sparkStaging/application_1434177111261_0007/spark-assembly-1.4.0-hadoop2.4.0.jar changed on src filesystem (expected 1434549639128, was 1434549642191

Then I tried without the --jars:

YARN_CONF_DIR=/usr/lib/hadoop-yarn/etc/hadoop/; 
/var/tmp/spark/spark-1.4.0-bin-hadoop2.4/bin/spark-submit --class MRContainer --master yarn-cluster simplemr.jar

diagnostics: Application application_1434177111261_0008 failed 2 times due to AM Container for appattempt_1434177111261_0008_000002 exited with exitCode: -1000 due to: File does not exist: hdfs://kc1ltcld29:9000/user/myuser/.sparkStaging/application_1434177111261_0008/spark-assembly-1.4.0-hadoop2.4.0.jar .Failing this attempt.. Failing the application. ApplicationMaster host: N/A ApplicationMaster RPC port: -1 queue: root.myuser start time: 1434549879649 final status: FAILED tracking URL: http://kc1ltcld29:8088/cluster/app/application_1434177111261_0008 user: myuser Exception in thread "main" org.apache.spark.SparkException: Application application_1434177111261_0008 finished with failed status at org.apache.spark.deploy.yarn.Client.run(Client.scala:841) at org.apache.spark.deploy.yarn.Client$.main(Client.scala:867) at org.apache.spark.deploy.yarn.Client.main(Client.scala) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:601) at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:664) at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:169) at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:192) at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:111) at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) 15/06/17 10:04:57 INFO util.Utils: Shutdown hook called 15/06/17 10:04:57 INFO util.Utils: Deleting directory /tmp/spark-2aca3f35-abf1-4e21-a10e-4778a039d0f4

I tried deleting all the .jars from hdfs://users//.sparkStaging and resubmitting but that didn't help.

2

2 Answers

3
votes

The problem was solved by copying spark-assembly.jar into a directory on the hdfs for each node and then passing it to spark-submit --conf spark.yarn.jar as a parameter. Commands are listed below:

hdfs dfs -copyFromLocal /var/tmp/spark/spark-1.4.0-bin-hadoop2.4/lib/spark-assembly-1.4.0-hadoop2.4.0.jar /user/spark/spark-assembly.jar 

/var/tmp/spark/spark-1.4.0-bin-hadoop2.4/bin/spark-submit --class MRContainer --master yarn-cluster  --conf spark.yarn.jar=hdfs:///user/spark/spark-assembly.jar simplemr.jar
3
votes

if you are getting this error it means you are uploading assembly jars using --jars option or manually copying to hdfs in each node. i have followed this approach and it works for me .

In yarn-cluster mode, Spark submit automatically uploads the assembly jar to a distributed cache that all executor containers read from, so there is no need to manually copy the assembly jar to all nodes (or pass it through --jars). It seems there are two versions of the same jar in your HDFS.

Try removing all old jars from your .sparkStaging directory and try again ,it should work .