I have a hadoop cluster of 15 nodes ( 1 master & 14 slaves) with HDFS having a replication factor of 3. I have ran TeraSort in YARN for 10GB with the following command:
yarn jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar terasort /terasort-input /terasort-output
After I have done it with 14 functional nodes I have started to decomission one node at a time and run TeraSort again to see how the time of execution changes. I have noticed that the time of execution does not actually change that much while I scale down and so I have similar execution times even at 7 slave nodes.
JobHistory recalls this values:
14 slaves: Elapsed: 32mins, 12sec ; Average Map Time: 4mins, 4sec; Average Shuffle Time: 14mins, 56sec; Average Merge Time: 3mins, 50sec; Average Reduce Time: 11mins, 35sec ;
11 slaves: Elapsed: 30mins, 6sec; Average Map Time*: 5mins, 2sec; Average Shuffle Time: 6mins, 9sec; Average Merge Time: 8mins, 52sec; Average Reduce Time: 11mins, 39sec;
8 slaves: Elapsed: 32mins, 15sec; Average Map Time: 4mins, 29sec; Average Shuffle Time: 13mins, 48sec; Average Merge Time: 4mins, 20sec; Average Reduce Time: 11mins, 11sec;
7 slaves: Elapsed: 30mins, 6sec; Average Map Time: 4mins, 28sec; Average Shuffle Time: 7mins, 26sec; Average Merge Time: 8mins, 26sec; Average Reduce Time: 11mins, 24sec;
Questions:
- Why do I almost have the same execution time for different number of worker nodes ?
- How can I take full advantage of the Hadoop cluster so that jobs run faster with 14 worker nodes than with 7 nodes?