I have a cluster with two workers and one master.
To start the master and the workers I run sbin/start-master.sh and sbin/start-slaves.sh on the master's machine. Then the master UI shows me that the slaves are ALIVE (so everything is OK so far). The issue comes when I want to use spark-submit.
I execute this command from my local machine:
spark-submit --master spark://<master-ip>:7077 --deploy-mode cluster /home/user/example.jar
But the following error pops up:
ERROR ClientEndpoint: Exception from cluster was: java.nio.file.NoSuchFileException: /home/user/example.jar
I have been doing some research on Stack Overflow and in Spark's documentation, and it seems the application-jar argument of the spark-submit command should be a "Path to a bundled jar including your application and all dependencies. The URL must be globally visible inside of your cluster, for instance, an hdfs:// path or a file:// path that is present on all nodes." (as indicated in https://spark.apache.org/docs/latest/submitting-applications.html).
My question is: how can I make my .jar globally visible inside the cluster? There is a similar question here, Spark Standalone cluster cannot read the files in local filesystem, but the solutions there do not work for me.
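If I understand the documentation correctly, one possibility would be to put the jar somewhere every node can reach, e.g. HDFS, and submit with the hdfs:// URL. The namenode host/port and the HDFS paths below are just placeholders, I have not actually set this up:

# copy the jar into HDFS so every node can download it
hdfs dfs -mkdir -p /jars
hdfs dfs -put /home/user/example.jar /jars/example.jar
# submit with a URL that is visible from all workers
spark-submit --master spark://<master-ip>:7077 --deploy-mode cluster hdfs://<namenode-host>:8020/jars/example.jar

Alternatively, a directory mounted at the same path on every node (e.g. over NFS) and passed as file:///shared/example.jar should also count as "globally visible", if I read the docs right. Is one of these the intended approach?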
Also, am I doing something wrong by initialising the cluster on my master's machine using sbin/start-master.sh but then running spark-submit from my local machine? I initialise the master from the master's own terminal because I read to do so in Spark's documentation, but maybe this has something to do with the issue. From Spark's documentation:
Once you’ve set up this file, you can launch or stop your cluster with the following shell scripts, based on Hadoop’s deploy scripts, and available in SPARK_HOME/sbin: [...] Note that these scripts must be executed on the machine you want to run the Spark master on, not your local machine.
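For completeness, this is the exact sequence I follow (the two worker hostnames are listed in conf/slaves on the master, and the hosts below are placeholders):

# on the master machine (conf/slaves lists the two workers)
sbin/start-master.sh
sbin/start-slaves.sh
# then, from my local machine
spark-submit --master spark://<master-ip>:7077 --deploy-mode cluster /home/user/example.jar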
Thank you very much
EDIT: I have copied the .jar file to every worker and it works. But my point is to find out whether there is a better way, since this method forces me to copy the .jar to each worker every time I build a new jar. (This was one of the answers to the question at the link already posted, Spark Standalone cluster cannot read the files in local filesystem.)
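Concretely, after every rebuild I currently end up doing something like the following before submitting (worker hostnames are placeholders), which is exactly the repetition I would like to avoid:

# copy the new jar to the same path on every worker
for host in <worker1-ip> <worker2-ip>; do
  scp /home/user/example.jar user@"$host":/home/user/example.jar
done
# only then does the cluster-mode submit find the file
spark-submit --master spark://<master-ip>:7077 --deploy-mode cluster /home/user/example.jar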
Comments:

[...] --jars example.jar when running spark-submit? – Oli

If I add --jars example.jar after the whole command I wrote above, it still gives me the same error (NoSuchFileException). Whereas if I do not give the above path and instead write --jars example.jar or --jars /home/user/example.jar, it gives me the error Missing application resource. – meisan

[...] spark-submit gives me the error Missing application resource (and offers me the options available with spark-submit). – meisan
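To clarify my comments above: as far as I can tell from the spark-submit usage message, --jars only adds extra dependency jars, and the application jar itself still has to be passed as the last argument; leaving it out is what produces Missing application resource. So the variant I tried looks roughly like this (dependency.jar is just a placeholder), and in cluster mode it still fails with NoSuchFileException because the paths only exist on my local machine:

spark-submit --master spark://<master-ip>:7077 --deploy-mode cluster --jars /home/user/dependency.jar /home/user/example.jar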