I was trying to follow the Spark standalone application example described here https://spark.apache.org/docs/latest/quick-start.html#standalone-applications
The example ran fine with the following invocation:
spark-submit --class "SimpleApp" --master local[4] target/scala-2.10/simple-project_2.10-1.0.jar
However, when I tried to introduce some third-party libraries via --jars
, it throws ClassNotFoundException
.
$ spark-submit --jars /home/linpengt/workspace/scala-learn/spark-analysis/target/pack/lib/* \
--class "SimpleApp" --master local[4] target/scala-2.10/simple-project_2.10-1.0.jar
Spark assembly has been built with Hive, including Datanucleus jars on classpath
Exception in thread "main" java.lang.ClassNotFoundException: SimpleApp
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:247)
at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:300)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:55)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Removing the --jars
option and the program runs again (I didn't actually start using those libraries yet). What's the problem here? How should I add the external jars?