2 votes

I'm setting up a cluster with Hortonworks HDP 2.4. It has 4 nodes, each with 16 GB RAM and 8 CPUs. I also have Spark installed, with the Zeppelin Notebook, in order to use Python (pyspark).

My problem is: I started with a 3-node configuration and later added a fourth node, but the number of Spark executors remains 3.

I've seen on the web that the number of executors can be set via SPARK_EXECUTOR_INSTANCES, but this parameter appears only in the spark-env template on Spark's configuration page in the Ambari UI. It seems the decision about executors is delegated to YARN, but I haven't found anything about this in the YARN configuration.


In short: how can I increase the number of executors in my Hortonworks Hadoop cluster using Ambari?
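(For context: when submitting a job by hand rather than through Ambari or Zeppelin, the executor count can also be requested directly at submit time. The flags below are standard spark-submit options; the script name and the values are illustrative, not tuned for this cluster.)

```shell
# Sketch: explicitly requesting 4 executors on YARN at submit time.
# my_job.py is a placeholder for your pyspark script.
spark-submit \
  --master yarn \
  --num-executors 4 \
  --executor-cores 2 \
  --executor-memory 4g \
  my_job.py
```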

Are you using Spark with the YARN cluster manager? – Sachin Janani

I think yes... how can I check this configuration? – Pietro Fragnito

You can do it in 2 ways: set spark.dynamicAllocation.enabled to true, or set the number of executor instances, spark.executor.instances, to whatever number you want. – Sachin Janani

Where can I find these options? As said, I see them only in the spark-env template, and that template says these options are read in YARN client mode. – Pietro Fragnito
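The two approaches mentioned in the comments correspond to standard Spark properties. A sketch of what spark-defaults.conf (or a custom spark-defaults section in Ambari) could contain — the values here are illustrative, not recommendations for this specific cluster:

```
# Option 1: a fixed number of executors
spark.executor.instances             4

# Option 2: let YARN scale executors up and down
spark.dynamicAllocation.enabled      true
spark.shuffle.service.enabled        true
spark.dynamicAllocation.minExecutors 1
spark.dynamicAllocation.maxExecutors 4
```

Note that the two options are alternatives: setting spark.executor.instances explicitly disables dynamic allocation for that application.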

2 Answers

9 votes

Pietro, you can change that on Zeppelin itself.

In the top right corner, open the menu and go to the "Interpreter" configuration.

There is a section called "Interpreters". The last subsection is called "spark", and you should find this setting there.

If it is not there, just add it by editing the subsection.

Hope that helps.
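For instance, the properties to add in the "spark" interpreter group might look like the following (a sketch — the names are real Spark properties, but the values are illustrative and should be sized to fit the 16 GB / 8-CPU nodes):

```
spark.executor.instances   4
spark.executor.memory      4g
spark.executor.cores       2
```

After saving, restart the interpreter so the new SparkContext picks up the settings.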

4 votes

From the top right corner, click the down arrow -> click Interpreter -> find the spark2 interpreter -> edit it -> add the following two properties:

  • spark.shuffle.service.enabled -> true
  • spark.dynamicAllocation.enabled -> true
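Note that on YARN, dynamic allocation also requires the external shuffle service to be running on each NodeManager. On HDP this is often already configured, but if not, the yarn-site.xml entries look roughly like this (a sketch; the aux-service name and class are the ones documented for Spark on YARN):

```xml
<!-- Register Spark's external shuffle service alongside the MapReduce one -->
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle,spark_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.spark_shuffle.class</name>
  <value>org.apache.spark.network.yarn.YarnShuffleService</value>
</property>
```

The NodeManagers need a restart after this change, which can be done from the YARN service page in Ambari.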