4
votes

If I have the option of configuring Spark with a very large amount of memory, how much should I use?

Some people say that anything more than 32 GB of memory per executor will not be helpful, because above that threshold the JVM can no longer use compressed object pointers (compressed oops).

Assuming I could have about 200 GB of memory for Spark per node, should I create an executor for each 32 GB of RAM, i.e. have multiple executors per worker? Or is it better to give a single executor all of the node's RAM?


1 Answer

2
votes

Ideally you should go with multiple executors, each with around 32 GB of memory or less (e.g. 16, 17, 18 GB...), instead of one executor with 200 GB of memory.
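
For example, on a 200 GB node that could look something like the following minimal PySpark sketch. The exact sizes, the memoryOverhead value, and the JVM flag are illustrative assumptions you would tune and verify for your own cluster, not definitive settings:

from pyspark.sql import SparkSession

# Illustrative sizes only: ~30 GB of heap keeps each executor under the
# ~32 GB compressed-oops threshold, and memoryOverhead leaves room for
# off-heap allocations.
spark = (
    SparkSession.builder
    .appName("small-executors-example")
    .config("spark.executor.memory", "30g")
    .config("spark.executor.cores", "5")
    .config("spark.executor.memoryOverhead", "3g")
    # Print the final JVM flag values in the executor logs so you can
    # check that UseCompressedOops is still enabled at this heap size.
    .config("spark.executor.extraJavaOptions", "-XX:+PrintFlagsFinal")
    .getOrCreate()
)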

For better throughput, it is generally suggested to use 3 to 5 cores per executor rather than 10 or 15, since too many concurrent tasks per executor leads to I/O contention. With that in mind, it's better to stay at 32 GB or less per executor so that each core works with around 5 to 6 GB of memory instead of 10 to 20 GB.
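
As a rough back-of-the-envelope check (all numbers here are assumptions for illustration, not measurements):

# Rough sizing arithmetic for one 200 GB worker node.
node_memory_gb = 200
reserved_for_os_gb = 20      # leave memory for the OS and other daemons (assumed)
executor_memory_gb = 30      # stay under the ~32 GB compressed-oops threshold
cores_per_executor = 5       # 3-5 cores per executor for good throughput

usable_gb = node_memory_gb - reserved_for_os_gb
executors_per_node = usable_gb // executor_memory_gb
memory_per_core_gb = executor_memory_gb / cores_per_executor

print(executors_per_node)    # 6 executors per node
print(memory_per_core_gb)    # 6.0 GB per core, in the suggested 5-6 GB range

With those assumed numbers you end up with roughly six executors of 30 GB and 5 cores each per node, which matches the "several smaller executors" recommendation above.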

Ref.

http://site.clairvoyantsoft.com/understanding-resource-allocation-configurations-spark-application/

https://github.com/vaquarkhan/vaquarkhan/wiki/How-to-calculate-node-and-executors-memory-in-Apache-Spark