I'm a bit new to Spark and trying to understand a few terms (I couldn't work them out from online resources).
Please validate my understanding of the terms below:
Executor: a container or JVM process that runs on a worker node (data node). We can have multiple executors per node.
Core: a thread within an executor's container or JVM process on a worker node (data node). We can have multiple cores (threads) per executor.
Please correct me if I'm wrong about the two concepts above.
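If that understanding is right, then each core runs one task at a time, so the maximum number of concurrent tasks is executors × cores per executor. A minimal sketch of that arithmetic (the numbers are hypothetical):

```shell
# Hypothetical settings: 4 executors, each with 3 cores (threads).
NUM_EXECUTORS=4
EXECUTOR_CORES=3

# Each core runs one task at a time, so the maximum number of
# tasks running in parallel across the cluster is:
TOTAL_TASKS=$((NUM_EXECUTORS * EXECUTOR_CORES))
echo "$TOTAL_TASKS"   # prints 12
```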
Questions:
- When we submit a Spark job, what does that mean exactly? Are we handing the job over to YARN (the resource manager), which then assigns resources to my application in the cluster and executes it? Is that the correct understanding? The command used to submit a job to a Spark cluster has options to set the number of executors:
```
spark-submit --class <CLASS_NAME> --num-executors ? --executor-cores ? --executor-memory ? ....
```
So will this number of executors and cores be set up per node? If not, how can we set a specific number of cores per node?
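To make the question concrete, here is a hedged example of what such a submission might look like (the class name, jar path, and all the numbers below are hypothetical):

```shell
# Hypothetical submit: as I understand it, --num-executors is a TOTAL
# across the whole cluster, not a per-node setting; the cluster manager
# (YARN here) decides which nodes actually host the executors.
spark-submit \
  --class com.example.MyApp \
  --master yarn \
  --num-executors 6 \
  --executor-cores 4 \
  --executor-memory 8g \
  my-app.jar
```

With settings like these, is my reading correct that YARN could place, say, 2 executors on one node and 1 on another, so there is no direct flag that pins a specific number of cores to a specific node?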