
I am running the WordCount example on an AWS server and want to test and analyze its output. I would like to increase the number of mappers, the number of reducers, and the number of input chunks.

How can I achieve this?

Do I have to set the number of mappers and reducers when creating the job, or do I have to add some code? I am using Java.


1 Answer


You can set the number of mappers and reducers in the main method of the Java program that launches the MapReduce job, using JobConf's setNumMapTasks(int n) and setNumReduceTasks(int n), respectively.
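As a rough sketch of where those two calls go, here is a minimal driver written against the old org.apache.hadoop.mapred API. The class name WordCountDriver and the values 8 and 4 are made up for illustration, and the library classes TokenCountMapper and LongSumReducer only stand in for your own mapper and reducer. It needs the Hadoop jars on the classpath to compile.

```java
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.lib.LongSumReducer;
import org.apache.hadoop.mapred.lib.TokenCountMapper;

// Hypothetical driver class; replace the mapper/reducer with your own.
class WordCountDriver {
    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(WordCountDriver.class);
        conf.setJobName("wordcount");

        // Output types produced by the stand-in mapper and reducer.
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(LongWritable.class);

        // Stand-ins for your own WordCount mapper and reducer classes.
        conf.setMapperClass(TokenCountMapper.class);
        conf.setCombinerClass(LongSumReducer.class);
        conf.setReducerClass(LongSumReducer.class);

        // Only a hint: the InputFormat's splits decide the real map count.
        conf.setNumMapTasks(8);
        // Sets the number of reduce tasks for the job.
        conf.setNumReduceTasks(4);

        // args[0] = input path, args[1] = output path (must not exist yet).
        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));

        JobClient.runJob(conf);
    }
}
```

With this setup the job should write one output file per reduce task, so the reducer count is easy to confirm from the output directory. The map count is best checked in the job counters or the job tracker UI.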

For the mappers, note the following from the API documentation for setNumMapTasks:

"This is only a hint to the framework. The actual number of spawned map tasks depends on the number of InputSplits generated by the job's InputFormat.getSplits(JobConf, int). A custom InputFormat is typically used to accurately control the number of map tasks for the job."

Explicitly setting the number of input chunks is harder. How the input is split is determined by the InputFormat you use and the InputSplits it produces, so to control the splitting directly you have to write your own InputFormat (and, if needed, your own InputSplit).
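To see why the requested map count is only a hint, it can help to model how a file-based InputFormat turns one file into splits. The sketch below is plain Java with no Hadoop dependency. It is my simplified reading of the old-API FileInputFormat logic, not the Hadoop source: it assumes a single non-empty, splittable file, and the class and method names are invented for illustration.

```java
// Simplified, assumption-laden model of file-based split counting.
class SplitCountSketch {
    // The last split may be up to 10% larger than splitSize.
    static final double SPLIT_SLOP = 1.1;

    // Split size is capped by the block size and floored by minSize.
    public static long computeSplitSize(long goalSize, long minSize, long blockSize) {
        return Math.max(minSize, Math.min(goalSize, blockSize));
    }

    // Number of splits (and so map tasks) for one file in this model.
    public static int countSplits(long fileLength, long blockSize,
                                  long minSize, int requestedMaps) {
        // The hint only sets a target size per split.
        long goalSize = fileLength / Math.max(1, requestedMaps);
        long splitSize = computeSplitSize(goalSize, Math.max(1L, minSize), blockSize);

        int splits = 0;
        long remaining = fileLength;
        while ((double) remaining / splitSize > SPLIT_SLOP) {
            splits++;
            remaining -= splitSize;
        }
        if (remaining != 0) {
            splits++; // final, possibly short, split
        }
        return splits;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long file = 1024 * mb;   // 1 GiB input file
        long block = 128 * mb;   // 128 MiB block size

        // Asking for fewer maps than blocks has no effect: prints 8
        System.out.println(countSplits(file, block, 1, 2));
        // Asking for more maps than blocks shrinks the splits: prints 16
        System.out.println(countSplits(file, block, 1, 16));
        // Raising the minimum split size reduces the count: prints 4
        System.out.println(countSplits(file, block, 256 * mb, 2));
    }
}
```

In this model the hint can raise the map count above the number of blocks but cannot push it below, while a larger minimum split size can. Anything beyond that, such as a fixed number of records per split, is where a custom InputFormat comes in.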