
I'm trying to squeeze every last bit out of my cluster when configuring the Spark application, but it seems I'm not understanding everything completely right. I'm running the application on an AWS EMR cluster with 1 master and 2 core nodes of type m3.xlarge (15 GB RAM and 4 vCPUs per node). This means that by default 11.25 GB are reserved on every node for applications scheduled by YARN. The master node is used only by the resource manager (YARN), which means the remaining 2 core nodes are used to schedule applications (so we have 22.5 GB for that purpose). So far so good. But here comes the part I don't get. I'm starting the Spark application with the following parameters:

--driver-memory 4G --num-executors 4 --executor-cores 7 --executor-memory 4G
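(For reference, the 11.25 GB-per-node figure above comes from yarn.nodemanager.resource.memory-mb. If I remember the EMR defaults for m3.xlarge correctly, you can confirm it on a core node with something like:

grep -A1 yarn.nodemanager.resource.memory-mb /etc/hadoop/conf/yarn-site.xml

which should show a value of 11520 MB, i.e. 11.25 GB per core node and 22.5 GB across the two of them.)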

As I understand it (from the information I found), these parameters mean that 4 GB will be allocated for the driver and 4 executors will be launched with 4 GB each. A rough estimate makes it 5*4 = 20 GB (let's call it 21 GB with the expected memory reserves), which should be fine as we have 22.5 GB for applications. Here's a screenshot from the Hadoop YARN UI after the launch:

[screenshot: Hadoop YARN ResourceManager UI after the launch]

What we can see is that 17.63 GB are used by the application, but this is a little less than the expected ~21 GB, and that triggers the first question: what happened here?
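(For the memory reserves I mentioned above I'm assuming the default spark.yarn.driver.memoryOverhead / spark.yarn.executor.memoryOverhead of max(384 MB, 10% of the requested memory), if I'm reading the defaults right. That gives roughly:

4096 MB + max(384 MB, 0.10 * 4096 MB) ≈ 4506 MB per container
5 containers (1 driver + 4 executors) * ~4.4 GB ≈ 22 GB

so the total request should actually be a bit above my rough 21 GB estimate, which makes the 17.63 GB even more puzzling.)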

Then I go to the Spark UI's Executors page, and here comes the bigger question:

[screenshot: Spark UI Executors page]

There are 3 executors (not 4), and the memory allocated for each of them and for the driver is 2.1 GB (not the specified 4 GB). So Hadoop YARN says 17.63 GB are used, but Spark says 8.4 GB are allocated. What is happening here? Is this related to the Capacity Scheduler (from the documentation I couldn't come to that conclusion)?

1 Answer


Can you check whether spark.dynamicAllocation.enabled is turned on? If it is, your application may give resources back to the cluster when they are no longer used, and the number of executors launched at startup is decided by spark.executor.instances.
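To check this you can look at the Environment tab of the Spark UI, or at the cluster's spark-defaults.conf (on EMR that is typically /etc/spark/conf/spark-defaults.conf). If it turns out to be enabled and you want a fixed number of executors, you can switch it off for the job, e.g. (your_app.jar below is just a placeholder for your actual application):

grep dynamicAllocation /etc/spark/conf/spark-defaults.conf

spark-submit --conf spark.dynamicAllocation.enabled=false \
  --driver-memory 4G --num-executors 4 --executor-cores 7 --executor-memory 4G \
  your_app.jar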

If that is not the case: what is the source for your Spark application, and how many partitions does it produce? Spark maps partitions to cores, so if your source has only 10 partitions and you try to allocate 15 cores, it will only use 10 cores because that is all that is needed. I guess this might be why Spark launched 3 executors instead of 4. Regarding memory, I would recommend revisiting your settings: you are asking for 4 executors and 1 driver with 4 GB each, which is roughly 5*4 GB + 5*384 MB ≈ 22 GB. You are trying to use up everything, and not much is left for the OS and the NodeManager to run, which is not ideal.
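As a rough back-of-the-envelope (assuming yarn-cluster mode, where the driver also gets a container on a core node, and the default overhead of max(384 MB, 10%)):

one 4 GB request  -> a container of about 4096 MB + 410 MB ≈ 4.4 GB
per core node     -> 11.25 GB available, so only 2 such containers fit
2 core nodes      -> at most 4 containers, i.e. 1 driver + 3 executors

That would also line up with the ~17.63 GB YARN reports (4 containers * ~4.4 GB), but I can't be sure without looking at the ResourceManager logs, so treat this as a sketch rather than a definitive explanation.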