I'm using Spark in a YARN cluster (HDP 2.4) with the following settings:
- 1 master node
  - 64 GB RAM (48 GB usable)
  - 12 cores (8 cores usable)
- 5 slave nodes
  - 64 GB RAM (48 GB usable) each
  - 12 cores (8 cores usable) each
- YARN settings
  - memory of all containers (per host): 48 GB
  - minimum container size = maximum container size = 6 GB
  - vcores in cluster = 40 (5 workers x 8 cores)
  - minimum vcores/container = maximum vcores/container = 1
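For reference, I believe the YARN settings above correspond roughly to these standard properties in `yarn-site.xml` (the values reflect my cluster; the mapping is my understanding, not copied from my config):

```xml
<!-- memory of all containers on one host: 48 GB -->
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>49152</value>
</property>
<!-- minimum container size = maximum container size = 6 GB -->
<property>
  <name>yarn.scheduler.minimum-allocation-mb</name>
  <value>6144</value>
</property>
<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>6144</value>
</property>
<!-- minimum vcores/container = maximum vcores/container = 1 -->
<property>
  <name>yarn.scheduler.minimum-allocation-vcores</name>
  <value>1</value>
</property>
<property>
  <name>yarn.scheduler.maximum-allocation-vcores</name>
  <value>1</value>
</property>
```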
When I run my Spark application with

    spark-submit --num-executors 10 --executor-cores 1 --executor-memory 5g ...

shouldn't each executor get 5 GB of RAM? (I set `--executor-memory` to only 5g, rather than the full 6 GB container size, to leave room for the ~10% memory overhead.)
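Here is the sizing math behind my choice of 5g, as a quick sketch. The overhead formula `max(384 MB, 10% of executor memory)` is my understanding of the Spark-on-YARN default (`spark.yarn.executor.memoryOverhead`), not something I have verified in my config:

```python
# Sketch of the expected YARN container sizing for my spark-submit flags.
# Assumption: overhead = max(384 MB, 10% of executor memory), the
# Spark-on-YARN default as I understand it.
executor_memory_mb = 5 * 1024                            # --executor-memory 5g
overhead_mb = max(384, int(0.10 * executor_memory_mb))   # 10% of 5120 = 512
container_mb = executor_memory_mb + overhead_mb          # total YARN request

print(overhead_mb)   # 512
print(container_mb)  # 5632 -> fits in my 6 GB (6144 MB) YARN container
```

So by my reckoning the full container request (5632 MB) fits comfortably inside the fixed 6 GB container size.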
But when I look at the Spark UI, each executor shows only 3.4 GB of memory (see screenshot):

Can someone explain why so little memory is allocated?