
We have 6 machines with the HDFS and YARN services running on every node: 1 master and 6 slaves. Spark is installed on 3 machines: 1 master and 3 workers (one node is both master and worker). We know that with --master spark://[host]:[port] the job runs only on those 3 nodes in standalone mode. When we submit a jar with spark-submit --master yarn, will it use the CPU and memory of all 6 servers, or only of the 3 Spark worker machines? And if it can run on all 6 nodes, how do the remaining 3 servers know it is a Spark job?

Spark: 2.3.1, Hadoop: 2.7.3


1 Answer


In YARN mode, spark-submit sends a resource allocation request to YARN, and the containers are launched on different NodeManagers based on resource availability, so the job is not limited to the nodes running Spark standalone workers.
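
For illustration, a minimal sketch of a YARN-mode submission; the application jar path, main class, and resource sizes below are hypothetical:

    # Run from any node that has the Hadoop client configuration (HADOOP_CONF_DIR).
    # YARN decides which NodeManagers host the executors; Spark standalone workers are not involved.
    spark-submit \
      --master yarn \
      --deploy-mode cluster \
      --num-executors 6 \
      --executor-cores 2 \
      --executor-memory 4g \
      --class com.example.MyApp \
      /path/to/my-app.jar

In this setup the NodeManagers do not need a local Spark installation: YARN distributes the Spark jars to the allocated containers (see spark.yarn.jars / spark.yarn.archive), so any of the 6 nodes can host executors.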