I'm working on a Spark project using a Hadoop cluster of 3 nodes with the following configuration:
- 8 cores and 16 GB of RAM (NameNode, Application Master, NodeManager, Spark master and a worker).
- 4 cores and 8 GB of RAM (DataNode, NodeManager and a worker).
- 4 cores and 4 GB of RAM (DataNode, NodeManager and a worker).

That's 16 cores and 28 GB of RAM in total. I'm currently launching PySpark with:
pyspark --master yarn-client --driver-memory 3g --executor-memory 1g --num-executors 3 --executor-cores 1
What's the best number of executors, executor memory and executor cores to use so that the whole cluster's resources are actually put to work?
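For context, here is the rough sizing arithmetic I've been trying to apply; am I on the right track? It's only a sketch based on advice I've read: reserve roughly 1 core and 1 GB of RAM per node for the OS and Hadoop daemons, and account for YARN's executor memory overhead, which I believe defaults to max(384 MB, 10% of the executor memory). The executors_per_node helper and the reserved amounts below are just my own guesses, not anything official.

# Rough executor sizing sketch (my assumptions, not authoritative):
#  - leave ~1 core and ~1 GB of RAM per node for the OS and Hadoop daemons
#  - YARN adds an off-heap overhead of about max(384 MB, 10% of executor memory)
#    on top of --executor-memory, so each container is bigger than the flag says

def executors_per_node(cores, ram_gb, exec_cores=1, exec_mem_gb=1):
    """Estimate how many executors of a given size fit on one node."""
    usable_cores = cores - 1                                      # reserved for OS/daemons
    usable_ram_gb = ram_gb - 1                                    # reserved for OS/daemons
    container_gb = exec_mem_gb + max(0.384, 0.10 * exec_mem_gb)   # executor memory + overhead
    return min(usable_cores // exec_cores, int(usable_ram_gb // container_gb))

# My three nodes; the first one also hosts the 3 GB driver, so I subtract that there.
nodes = [(8, 16 - 3), (4, 8), (4, 4)]
total = sum(executors_per_node(cores, ram) for cores, ram in nodes)
print("executors that should fit:", total)

If that math is right, far more than 3 executors of this size would fit, which makes me suspect my current settings leave most of the cluster idle.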