
I am trying to understand how a Spark job works in a YARN cluster.

I am using the below command to submit the job:

  1. spark-submit --master yarn --deploy-mode cluster sparksessionexample.py
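For reference, a minimal PySpark script of this kind typically looks like the following sketch (this is only an illustration, not necessarily the exact contents of sparksessionexample.py):

    # minimal PySpark job sketch -- illustrative, not necessarily the exact script
    from pyspark.sql import SparkSession

    # In cluster mode, --master and --deploy-mode come from spark-submit,
    # so the script only needs to create (or get) a SparkSession.
    spark = SparkSession.builder.appName("sparksessionexample").getOrCreate()

    # Some trivial work so the executors have something to do.
    df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "value"])
    df.show()

    spark.stop()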

After submitting the job, the console shows the log below:

2020-05-29 20:52:48,668 INFO yarn.Client: Uploading resource file:/tmp/spark-bcd415f0-a22e-46b2-951c-5b6e4385a0c6/__spark_libs__2908230569257238890.zip -> hdfs://localhost:9000/user/hadoop/.sparkStaging/application_1590759398715_0003/__spark_libs__2908230569257238890.zip
2020-05-29 20:53:14,164 INFO yarn.Client: Uploading resource file:/home/hadoop/pythonprojects/Python/src/spark_jobs/sparksessionexample.py -> hdfs://localhost:9000/user/hadoop/.sparkStaging/application_1590759398715_0003/sparksessionexample.py
2020-05-29 20:53:14,610 INFO yarn.Client: Uploading resource file:/home/hadoop/clouderaapp/apache-spark/python/lib/pyspark.zip -> hdfs://localhost:9000/user/hadoop/.sparkStaging/application_1590759398715_0003/pyspark.zip
2020-05-29 20:53:15,984 INFO yarn.Client: Uploading resource file:/home/hadoop/clouderaapp/apache-spark/python/lib/py4j-0.10.7-src.zip -> hdfs://localhost:9000/user/hadoop/.sparkStaging/application_1590759398715_0003/py4j-0.10.7-src.zip
2020-05-29 20:53:18,362 INFO yarn.Client: Uploading resource file:/tmp/spark-bcd415f0-a22e-46b2-951c-5b6e4385a0c6/__spark_conf__7123551182035223076.zip -> hdfs://localhost:9000/user/hadoop/.sparkStaging/application_1590759398715_0003/__spark_conf__.zip

I just want to understand how YARN executes the sparksessionexample.py file. I mean, does it create a Python virtual environment on the node? The log above only shows the libraries and configs being uploaded, but what about the Python interpreter that actually executes sparksessionexample.py?

Can anyone help me understand this?


1 Answer


The "Spark client" is used to bootstrap the Spark job execution.

In your case it is the only thing that runs on your local machine, because you requested cluster deploy mode. The sequence is roughly as follows (a spark-submit comparison of the two deploy modes is sketched right after the list):

  • the "client" contacts the cluster manager (here YARN Resource Manager, could be Kubernetes Master, etc.) to start the Spark driver inside an AppMaster container
  • then the driver contacts again the cluster manager to request some containers for the executors
  • then the driver runs your Python code and distributes the work to the executors
  • finally the driver de-allocates its executors and itself
  • at this point the "client" notices that the YARN job has reached success or failure status, and can terminate

In short, the "client" never gets any kind of useful information from the driver running inside the cluster. You must inspect the YARN logs for the container running the driver (it's the AppMaster, typically number 00001).


In client execution mode, by contrast, the driver runs inside the spark-submit process on your local machine, so the job's output and driver logs appear directly in your console.