2 votes

I would like to know how to set the number of cores to be used in a PySpark program.

I have been doing a bit of searching and have been unable to find a definitive answer.


2 Answers

2 votes

You can set it with --executor-cores when launching the application with spark-submit, or set the spark.executor.cores property on a SparkConf object in the code itself (it has to be set before the SparkContext is created).
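
For the in-code route, here is a minimal sketch; the app name and core counts are placeholder values you would adjust for your own cluster:

from pyspark import SparkConf, SparkContext

# Placeholder settings: tune the core counts for your cluster.
conf = (SparkConf()
        .setAppName("core-config-example")
        .set("spark.executor.cores", "2")   # cores per executor
        .set("spark.cores.max", "6"))       # cap on total cores for the app (standalone/Mesos)

sc = SparkContext(conf=conf)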

1 vote

You can use the --executor-cores option to specify the number of cores per executor when submitting an application with spark-submit.

Below is an example:

./bin/spark-submit --class org.apache.spark.examples.SparkPi \
    --master yarn-cluster \
    --num-executors 3 \
    --driver-memory 4g \
    --executor-memory 2g \
    --executor-cores 1 \
    lib/spark-examples*.jar \
    10
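
With the values above, the application asks for 3 executors with 1 core each, so tasks run on 3 cores in total (the driver is separate).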