33
votes

I'm trying to run a Spark application using bin/spark-submit. When I reference my application jar on my local filesystem, it works. However, when I copy my application jar to a directory in HDFS, I get the following exception:

Warning: Skip remote jar hdfs://localhost:9000/user/hdfs/jars/simple-project-1.0-SNAPSHOT.jar. java.lang.ClassNotFoundException: com.example.SimpleApp

Here's the command:

$ ./bin/spark-submit --class com.example.SimpleApp --master local hdfs://localhost:9000/user/hdfs/jars/simple-project-1.0-SNAPSHOT.jar

I'm using Hadoop 2.6.0 and Spark 1.2.1.
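For comparison, the same submit works when the jar path is local (the local path below is just an example of where the jar might sit):

$ ./bin/spark-submit --class com.example.SimpleApp --master local /home/hdfs/jars/simple-project-1.0-SNAPSHOT.jar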

4
What did you finally decide here? Did you switch to YARN or find another workaround? Sanjiv, below, was pointing at a bug that seems peripherally relevant. Did you try --deploy-mode cluster? Thanks, interesting bug if it's really a bug, and it doesn't seem to have been directly submitted to JIRA. Perhaps check this. – JimLohse

4 Answers

24
votes

The only way it worked for me was when I used

--master yarn-cluster
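For the command in the question, that looks something like the following (a sketch that reuses the asker's HDFS path; it assumes YARN is set up and HADOOP_CONF_DIR points at your cluster configuration):

$ ./bin/spark-submit --class com.example.SimpleApp --master yarn-cluster hdfs://localhost:9000/user/hdfs/jars/simple-project-1.0-SNAPSHOT.jar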

10
votes

To make a jar stored in HDFS accessible to your Spark job, you have to run the job in cluster mode.

$SPARK_HOME/bin/spark-submit \
--deploy-mode cluster \
--class <main_class> \
--master yarn-cluster \
hdfs://myhost:8020/user/root/myjar.jar
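Before submitting, you can sanity-check that the jar is actually at that HDFS path, uploading it first if needed (the paths just mirror the example above):

hdfs dfs -put myjar.jar hdfs://myhost:8020/user/root/myjar.jar
hdfs dfs -ls hdfs://myhost:8020/user/root/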

Also, there is a Spark JIRA open for client mode, which is not supported yet.

SPARK-10643: Support HDFS application download in client mode spark submit

1
votes

There is a workaround. You could mount the HDFS directory that contains your application jar as a local directory.

I did the same (with Azure storage, but it should be similar for HDFS).

Example mount command for an Azure file share:

sudo mount -t cifs //{storageAccountName}.file.core.windows.net/{directoryName} {local directory path} -o vers=3.0,username={storageAccountName},password={storageAccountKey},dir_mode=0777,file_mode=0777

Now, in your spark-submit command, provide the local path from the mount above:

$ ./bin/spark-submit --class com.example.SimpleApp --master local {local directory path}/simple-project-1.0-SNAPSHOT.jar
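For HDFS itself, one way to get such a local mount is the HDFS NFS gateway. A rough sketch, assuming the gateway is running on the NameNode host and /mnt/hdfs exists:

sudo mount -t nfs -o vers=3,proto=tcp,nolock,noacl,sync localhost:/ /mnt/hdfs
$ ./bin/spark-submit --class com.example.SimpleApp --master local /mnt/hdfs/user/hdfs/jars/simple-project-1.0-SNAPSHOT.jar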

-2
votes

Yes, it has to be a local file. I think that's simply the answer.