I am trying to submit the following job to my cluster, with Spark 3.0.0 and Mesos 1.9.
./bin/spark-submit \
--name test2 \
--master mesos://master:7077 \
--deploy-mode cluster \
--class org.apache.spark.examples.SparkPi \
--conf spark.master.rest.enabled=true \
./examples/jars/spark-examples_2.12-3.0.0.jar 100
However, I received the following error message:
I0916 21:26:23.155861 8587 fetcher.cpp:562] Fetcher Info: {"cache_directory":"/tmp/mesos/fetch/root","items":[{"action":"BYPASS_CACHE","uri":{"cache":false,"extract":true,"value":"/spark-3.0.0-bin-SparkFHE/examples/jars/spark-examples_2.12-3.0.0.jar"}}],"sandbox_directory":"/var/lib/mesos/slaves/b61fd963-8537-48f0-9eb6-e26f3aa97265-S0/frameworks/92ca9c69-72c9-43d1-828e-ecc8bac62eff-0000/executors/driver-20200916212624-0041/runs/46a1e00e-0c01-47b5-82f5-a46ba5237321","stall_timeout":{"nanoseconds":60000000000},"user":"root"}
I0916 21:26:23.165118 8587 fetcher.cpp:459] Fetching URI '/spark-3.0.0-bin-SparkFHE/examples/jars/spark-examples_2.12-3.0.0.jar'
I0916 21:26:23.165141 8587 fetcher.cpp:290] Fetching '/spark-3.0.0-bin-SparkFHE/examples/jars/spark-examples_2.12-3.0.0.jar' directly into the sandbox directory
W0916 21:26:23.168915 8587 fetcher.cpp:332] Copying instead of extracting resource from URI with 'extract' flag, because it does not seem to be an archive: /spark-3.0.0-bin-SparkFHE/examples/jars/spark-examples_2.12-3.0.0.jar
I0916 21:26:23.168941 8587 fetcher.cpp:618] Fetched '/spark-3.0.0-bin-SparkFHE/examples/jars/spark-examples_2.12-3.0.0.jar' to '/var/lib/mesos/slaves/b61fd963-8537-48f0-9eb6-e26f3aa97265-S0/frameworks/92ca9c69-72c9-43d1-828e-ecc8bac62eff-0000/executors/driver-20200916212624-0041/runs/46a1e00e-0c01-47b5-82f5-a46ba5237321/spark-examples_2.12-3.0.0.jar'
I0916 21:26:23.168957 8587 fetcher.cpp:623] Successfully fetched all URIs into '/var/lib/mesos/slaves/b61fd963-8537-48f0-9eb6-e26f3aa97265-S0/frameworks/92ca9c69-72c9-43d1-828e-ecc8bac62eff-0000/executors/driver-20200916212624-0041/runs/46a1e00e-0c01-47b5-82f5-a46ba5237321'
I0916 21:26:23.374958 8598 exec.cpp:164] Version: 1.9.0
I0916 21:26:23.387948 8614 exec.cpp:237] Executor registered on agent b61fd963-8537-48f0-9eb6-e26f3aa97265-S0
I0916 21:26:23.390528 8604 executor.cpp:190] Received SUBSCRIBED event
I0916 21:26:23.391326 8604 executor.cpp:194] Subscribed executor on worker4
I0916 21:26:23.391512 8604 executor.cpp:190] Received LAUNCH event
I0916 21:26:23.392763 8604 executor.cpp:722] Starting task driver-20200916212624-0041
I0916 21:26:23.409191 8604 executor.cpp:738] Forked command at 8622
20/09/16 21:26:25 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
20/09/16 21:26:25 WARN DependencyUtils: Local jar /var/lib/mesos/slaves/b61fd963-8537-48f0-9eb6-e26f3aa97265-S0/frameworks/92ca9c69-72c9-43d1-828e-ecc8bac62eff-0000/executors/driver-20200916212624-0041/runs/46a1e00e-0c01-47b5-82f5-a46ba5237321/spark.driver.supervise=false does not exist, skipping.
Error: Failed to load class org.apache.spark.examples.SparkPi.
20/09/16 21:26:25 INFO ShutdownHookManager: Shutdown hook called
20/09/16 21:26:25 INFO ShutdownHookManager: Deleting directory /tmp/spark-0c04f617-9daf-4a4b-8efe-e7d48e1eb06f
I0916 21:26:25.802945 8601 executor.cpp:1039] Command exited with status 101 (pid: 8622)
I0916 21:26:26.809671 8619 process.cpp:935] Stopped the socket accept loop
Within the error message above, I noticed that spark.driver.supervise=false is appended to the executor's sandbox path and treated as a local jar:
20/09/16 21:26:25 WARN DependencyUtils: Local jar /var/lib/mesos/slaves/b61fd963-8537-48f0-9eb6-e26f3aa97265-S0/frameworks/92ca9c69-72c9-43d1-828e-ecc8bac62eff-0000/executors/driver-20200916212624-0041/runs/46a1e00e-0c01-47b5-82f5-a46ba5237321/spark.driver.supervise=false does not exist, skipping.
I think the failure to load the class is caused by this incorrect reference. Any suggestions?
Looking into the debug output of spark-submit, I found the following.
Spark config:
(spark.jars,file:/spark-3.0.0-bin-SparkFHE/examples/jars/spark-examples_2.12-3.0.0.jar)
(spark.driver.supervise,false)
(spark.app.name,test2)
(spark.submit.pyFiles,)
(spark.master.rest.enabled,true)
(spark.submit.deployMode,cluster)
(spark.master,mesos://master:7077)
Classpath elements:
I noticed that (spark.submit.pyFiles,) is empty. I never planned to use Python, so I am not sure why this option is set at all.
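My guess (just a sketch, not Spark's actual code) is that an option the user never set can still be stored as an empty string and then rendered back into launch flags. The names below (FlagRenderSketch, render) are hypothetical; the point is that joining flag and value with a space and re-splitting the command line on whitespace silently drops the empty value, leaving a bare flag:

```scala
// Hypothetical sketch: rendering a captured config back into driver
// launch flags. If an unset option was stored as an empty string, a
// shell-style "join then split on whitespace" step drops the empty
// value and leaves the flag dangling with nothing after it.
object FlagRenderSketch {
  // Config captured at submit time; pyFiles was never set by the user.
  val conf: Seq[(String, String)] = Seq(
    "spark.app.name"       -> "test2",
    "spark.submit.pyFiles" -> ""    // unset option stored as ""
  )

  // Mapping from config keys to command-line flags (illustrative only).
  val flagNames: Map[String, String] = Map(
    "spark.app.name"       -> "--name",
    "spark.submit.pyFiles" -> "--py-files"
  )

  // Join each flag and value with a space, concatenate, then re-split
  // on whitespace: the empty pyFiles value vanishes in the split.
  def render(conf: Seq[(String, String)]): Seq[String] = {
    val line = conf
      .flatMap { case (k, v) => flagNames.get(k).map(f => s"$f $v") }
      .mkString(" ")
    line.split("\\s+").toSeq.filter(_.nonEmpty)
  }
}
```

Here render(conf) yields Seq("--name", "test2", "--py-files"): the --py-files flag survives, but its (empty) value does not, which would match the bare --py-files I observed below.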
Furthermore, I tried debugging inside the doSubmit(args: Array[String]) function in SparkSubmit.scala by printing the args array:
for (arg <- args) { logWarning(s"doSubmit: $arg") }
Somehow --py-files is included in the arguments without any value.
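If such a bare --py-files flag reaches a parser that always takes the next token as the flag's value, the jar path gets swallowed as that value and every later positional argument shifts by one. The following is a minimal sketch of that failure mode with a naive parser I wrote for illustration (it is not Spark's actual SparkSubmitArguments):

```scala
// Minimal sketch of a naive option parser: flags of the form
// "--flag value" are consumed in pairs, and the first non-flag token
// is treated as the primary resource (the application jar). A flag
// that arrives with no value swallows the next token, shifting
// everything after it by one position.
object ArgShiftSketch {
  def parse(args: Seq[String]): (Map[String, String], Option[String]) = {
    def loop(rest: List[String],
             opts: Map[String, String]): (Map[String, String], Option[String]) =
      rest match {
        // A "--flag" always consumes the following token as its value.
        case flag :: value :: tail if flag.startsWith("--") =>
          loop(tail, opts + (flag -> value))
        // First non-flag token is taken as the primary resource.
        case resource :: _ => (opts, Some(resource))
        case Nil           => (opts, None)
      }
    loop(args.toList, Map.empty)
  }
}
```

With well-formed arguments, parse(Seq("--class", "org.apache.spark.examples.SparkPi", "app.jar")) picks app.jar as the resource. But with a bare --py-files in front, as in parse(Seq("--py-files", "app.jar", "spark.driver.supervise=false")), the jar is consumed as the value of --py-files and spark.driver.supervise=false lands in the primary-resource slot, which would explain the "Local jar .../spark.driver.supervise=false does not exist" warning and the subsequent failure to load SparkPi.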