I am running a PySpark job on an AWS EMR cluster. The cluster details are as follows: one master instance (m5.2xlarge) and five slave instances (m5.2xlarge: 8 vCores, 32 GiB memory, EBS-only storage, 200 GiB EBS).
After I submit the PySpark job, it fails with the error below.
ExecutorLostFailure (executor 1 exited caused by one of the running tasks) Reason: Container killed by YARN for exceeding memory limits. 24.1 GB of 24 GB physical memory used. Consider boosting spark.yarn.executor.memoryOverhead or disabling yarn.nodemanager.vmem-check-enabled because of YARN-4714.
Below is the spark-submit command:
spark-submit \
  --deploy-mode cluster \
  --master yarn \
  --num-executors 2 \
  --executor-cores 5 \
  --executor-memory 21g \
  --driver-memory 10g \
  --conf spark.yarn.executor.memoryOverhead=3g \
  --conf spark.yarn.submit.waitAppCompletion=false \
  --conf spark.yarn.maxAppAttempts=100 \
  --conf spark.executor.extraJavaOptions=-Xss3m \
  --conf spark.driver.maxResultSize=3g \
  --conf spark.dynamicAllocation.enabled=false
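As I understand it (please correct me if this is wrong), the 24 GB limit in the error is simply the executor container size that YARN derives from the settings above, i.e. the executor heap plus the YARN memory overhead, and the Python workers and off-heap allocations also have to fit inside it. A minimal sketch of that arithmetic, using the values from the command:

# Container size YARN enforces per executor, derived from the submit command above.
executor_memory_gb = 21      # --executor-memory 21g (JVM heap)
memory_overhead_gb = 3       # --conf spark.yarn.executor.memoryOverhead=3g

container_limit_gb = executor_memory_gb + memory_overhead_gb
print(container_limit_gb)    # 24 -> "24.1 GB of 24 GB physical memory used"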
Please suggest better values for the number of executors, executor memory, and number of cores per executor.
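For reference, here is a rough sketch of the per-node arithmetic I would like sanity-checked; it assumes EMR gives YARN roughly 24 GiB of the 32 GiB on each m5.2xlarge core node (which would be consistent with the 24 GB limit in the error) and that one of the 8 vCores is reserved for the OS and Hadoop daemons:

# Assumed per-node resources on an m5.2xlarge core node (assumptions, not confirmed):
yarn_memory_per_node_gb = 24          # assumed YARN allocation out of 32 GiB
usable_cores_per_node = 8 - 1         # assume 1 vCore reserved for OS/daemons

cores_per_executor = 5                # current --executor-cores
executors_per_node = usable_cores_per_node // cores_per_executor   # -> 1

# With one executor per node, the whole YARN allocation must cover
# the executor heap plus the memory overhead.
max_heap_plus_overhead_gb = yarn_memory_per_node_gb // executors_per_node  # -> 24
print(executors_per_node, max_heap_plus_overhead_gb)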