
I'm trying to submit a Spark app from my local machine's terminal to my cluster. I'm using --master yarn-cluster because I need the driver program to run on the cluster as well, not on the machine I submit from (i.e. my local machine).

I'm using

    bin/spark-submit \
      --class com.my.application.XApp \
      --master yarn-cluster \
      --executor-memory 100m \
      --num-executors 50 \
      hdfs://name.node.server:8020/user/root/x-service-1.0.0-201512141101-assembly.jar 1000

and I'm getting this error:

Diagnostics: java.io.FileNotFoundException: File file:/Users/nish1013/Dev/spark-1.4.1-bin-hadoop2.6/lib/spark-assembly-1.4.1-hadoop2.6.0.jar does not exist

In my service list I can see:

  • YARN + MapReduce2 2.7.1.2.3 Apache Hadoop NextGen MapReduce (YARN)
  • Spark 1.4.1.2.3 Apache Spark is a fast and general engine for
    large-scale data processing.

already installed.

My spark-env.sh on the local machine has:

export HADOOP_CONF_DIR=/Users/nish1013/Dev/hadoop-2.7.1/etc/hadoop

Has anyone encountered something similar before?

If you're running it on the cluster, then your local settings are not relevant. You should check the settings and the filesystem of the cluster's nodes. - mgaido
Thank you. I'm not sure why it is then complaining about a local file? - nish1013
Spark needs that jar to run. According to the configuration of your installation, that jar is assumed to be located in the folder you've mentioned. You have two options: you can put the jar locally on all your cluster machines and configure each of them properly, or you can put it into HDFS. - mgaido
I added that jar to HDFS. Where should I configure the location of that jar? - nish1013
On the worker nodes of your cluster - mgaido
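
Putting the suggestion from the comments into concrete commands, uploading the assembly jar to HDFS could look like the sketch below. This is an illustration, not a tested recipe: the local path is taken from the error message, while the HDFS target directory is an assumption. Note that in Spark 1.x the property that points YARN at the assembly jar is spark.yarn.jar (singular), set in conf/spark-defaults.conf on the submitting machine.

    # Upload the Spark assembly jar to HDFS so YARN containers can fetch it.
    # /user/spark/lib is an assumed target directory; any HDFS path works
    # as long as spark.yarn.jar points at it.
    hdfs dfs -mkdir -p /user/spark/lib
    hdfs dfs -put \
        /Users/nish1013/Dev/spark-1.4.1-bin-hadoop2.6/lib/spark-assembly-1.4.1-hadoop2.6.0.jar \
        /user/spark/lib/

    # Then, in conf/spark-defaults.conf on the machine you submit from:
    # spark.yarn.jar hdfs://name.node.server:8020/user/spark/lib/spark-assembly-1.4.1-hadoop2.6.0.jar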

1 Answer


I think the right command to call is the following:

    bin/spark-submit \
      --class com.my.application.XApp \
      --master yarn-cluster \
      --executor-memory 100m \
      --num-executors 50 \
      --conf spark.yarn.jars=hdfs://name.node.server:8020/user/root/x-service-1.0.0-201512141101-assembly.jar \
      1000

or you can add spark.yarn.jars hdfs://name.node.server:8020/user/root/x-service-1.0.0-201512141101-assembly.jar to your spark-defaults.conf file.
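
For reference, the entry in conf/spark-defaults.conf would be a whitespace-separated key/value pair, using the HDFS path from the question (a sketch; the property spark.yarn.jars was introduced in Spark 2.0, so on the 1.4.1 installation shown above the equivalent property is spark.yarn.jar):

    # conf/spark-defaults.conf on the submitting machine
    spark.yarn.jars    hdfs://name.node.server:8020/user/root/x-service-1.0.0-201512141101-assembly.jar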