2 votes

I have this job running fine in YARN client mode; however, in cluster mode I get the following error.

 Log Contents:
 Error: Could not find or load main class org.apache.spark.deploy.yarn.ApplicationMaster
 End of LogType:stderr

I have not set spark.yarn.jars or spark.yarn.archive. However, in the trace I do see the spark-yarn jar getting uploaded. Is there any additional setting needed here?

 16/11/01 10:49:49 INFO yarn.Client: Uploading resource file:/etc/security/keytabs/spark.keytab -> hdfs://beixvz579:8020/user/sifsuser/.sparkStaging/application_1477668405073_0026/spark.keytab
 16/11/01 10:49:50 INFO yarn.Client: Uploading resource file:/home/sifsuser/spark200/jars/spark-yarn_2.11-2.0.0.jar -> hdfs://beixvz579:8020/user/sifsuser/.sparkStaging/application_1477668405073_0026/spark-yarn_2.11-2.0.0.jar
 16/11/01 10:49:50 INFO yarn.Client: Uploading resource file:/home/sifsuser/lib/sparkprogs.jar -> hdfs://beixvz579:8020/user/sifsuser/.sparkStaging/application_1477668405073_0026/sparkprogs.jar
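
For reference, this is roughly the kind of cluster-mode submission being made; the principal and main class below are placeholders, only the keytab and application jar paths come from the trace above:

 spark-submit \
   --master yarn \
   --deploy-mode cluster \
   --principal sifsuser --keytab /etc/security/keytabs/spark.keytab \
   --class com.example.MyJob \
   /home/sifsuser/lib/sparkprogs.jar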


2 Answers

3 votes

The jar containing the missing class is spark-yarn_2.11-2.4.0.jar (my version), which is located in $SPARK_HOME/jars/.

First step (add this to spark-defaults.conf):

 spark.yarn.jars hdfs://hadoop-node1:9000/spark/jars/*

Second step (upload the Spark jars to HDFS):

 hadoop fs -put $SPARK_HOME/jars/*  hdfs://hadoop-node1:9000/spark/jars/
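
As an alternative to listing individual jars, spark.yarn.archive can point at a single archive of the same directory. A rough sketch, assuming the same HDFS layout as above (the archive name spark-libs.zip is arbitrary):

 # package the Spark jars into one uncompressed zip
 jar cv0f spark-libs.zip -C $SPARK_HOME/jars/ .
 # upload it next to the jars directory created above
 hadoop fs -put spark-libs.zip hdfs://hadoop-node1:9000/spark/
 # then, in spark-defaults.conf, use this instead of spark.yarn.jars:
 # spark.yarn.archive hdfs://hadoop-node1:9000/spark/spark-libs.zip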
-5 votes

After a lot of debugging, I found that this error was thrown because of a missing class that the ApplicationMaster depends on. In my case it was one of the logging jars. After adding the missing jars to the submission, I can now submit jobs.
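
For illustration, a sketch of shipping such extra dependency jars with a cluster-mode submission via --jars; the logging jar names and main class below are placeholders, only the application jar path comes from the question:

 spark-submit \
   --master yarn \
   --deploy-mode cluster \
   --jars /home/sifsuser/lib/slf4j-api-1.7.16.jar,/home/sifsuser/lib/log4j-1.2.17.jar \
   --class com.example.MyJob \
   /home/sifsuser/lib/sparkprogs.jar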