3
votes

I read an answer to the question "What conditions should cluster deploy mode be used instead of client?":

(In client mode) You could run spark-submit on your laptop, and the Driver Program would run on your laptop.

Also, the Spark documentation says:

In client mode, the driver is launched in the same process as the client that submits the application.

Does this mean that I can submit Spark applications from any machine, as long as it is reachable from the master and has a Spark environment installed?

In other words, can the driver process run outside of the Spark cluster?


2 Answers

3
votes

Yes, the driver can run on your laptop. Keep in mind though:

  • The Spark driver needs the Hadoop configuration to be able to talk to YARN and HDFS. You can copy it from the cluster and point to it via HADOOP_CONF_DIR.
  • The Spark driver listens on several ports and expects the executors to be able to connect to it. It advertises the hostname of your laptop, so make sure that hostname can be resolved and that all of those ports are reachable from the cluster environment.
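
To make the two points above concrete, here is a rough sketch of how a client-mode submission from a laptop could be assembled. It only builds the environment and the command line; it does not start Spark. The helper name `build_client_mode_submit`, the hostname, the port numbers and the `/path/to/hadoop-conf` directory are all placeholders I made up for illustration. The `spark.driver.host`, `spark.driver.port` and `spark.driver.blockManager.port` properties are standard Spark settings, but whether pinning them is enough depends on your network and firewall setup.

```python
import os
import shlex


def build_client_mode_submit(app, driver_host, driver_port=40000,
                             block_manager_port=40001,
                             hadoop_conf_dir="/path/to/hadoop-conf"):
    """Return (env, argv) for a client-mode spark-submit against YARN.

    All default values here are illustrative placeholders, not recommendations.
    """
    # Point Spark at a copy of the cluster's Hadoop/YARN configuration.
    env = dict(os.environ)
    env["HADOOP_CONF_DIR"] = hadoop_conf_dir

    argv = [
        "spark-submit",
        "--master", "yarn",
        "--deploy-mode", "client",
        # Hostname the driver advertises; executors must be able to resolve it.
        "--conf", f"spark.driver.host={driver_host}",
        # Pin the driver ports so they can be opened in a firewall.
        "--conf", f"spark.driver.port={driver_port}",
        "--conf", f"spark.driver.blockManager.port={block_manager_port}",
        app,
    ]
    return env, argv


if __name__ == "__main__":
    env, argv = build_client_mode_submit("my_app.py", "laptop.example.com")
    print(f"HADOOP_CONF_DIR={env['HADOOP_CONF_DIR']}", shlex.join(argv))
    # To actually launch it: subprocess.run(argv, env=env, check=True)
```

Fixing the ports is optional. By default Spark picks random ones, which is fine on an open LAN but awkward if only specific ports are allowed between the cluster and your machine.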
0
votes

Yes, I'm running spark-submit jobs over the LAN using the option --deploy-mode cluster. However, I'm currently running into an issue: the server response (a JSON object) isn't very descriptive.