You can set the number of mappers and reducers in the main function of the Java program that starts the MapReduce job, using JobConf's conf.setNumMapTasks(int num) and conf.setNumReduceTasks(int num), respectively.
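As a rough illustration of why the map-task count behaves as a hint for file-based input, the sketch below models the split-size arithmetic of the classic FileInputFormat in plain Java. It is not Hadoop code: the class name, method names, and the file and block sizes are assumptions made for the example. It also ignores details such as the slack Hadoop allows on the last split and inputs made up of several files.

```java
// Simplified model of how a file-based InputFormat turns the requested
// map-task count into a split size. A sketch only, not Hadoop's own code.
class SplitHintSketch {

    // Split size: the per-task goal, capped by the block size, floored by minSize.
    static long computeSplitSize(long goalSize, long minSize, long blockSize) {
        return Math.max(minSize, Math.min(goalSize, blockSize));
    }

    // Approximate number of splits for a single file of fileSize bytes.
    static long estimateSplits(long fileSize, int requestedMaps, long minSize, long blockSize) {
        long goalSize = fileSize / Math.max(1, requestedMaps);
        long splitSize = computeSplitSize(goalSize, minSize, blockSize);
        return (fileSize + splitSize - 1) / splitSize; // ceiling division
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long fileSize = 1024 * mb;  // hypothetical 1 GiB input file
        long blockSize = 128 * mb;  // hypothetical block size

        // Requesting 2 maps: the 512 MiB goal is capped at the block size.
        System.out.println(estimateSplits(fileSize, 2, 1, blockSize));
        // Requesting 16 maps: the 64 MiB goal is below the block size, so it is used.
        System.out.println(estimateSplits(fileSize, 16, 1, blockSize));
    }
}
```

Under these assumptions, asking for fewer maps than there are blocks has no effect, while asking for more can increase the number of splits.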
For the mappers, note the following from the API documentation:
"This is only a hint to the framework. The actual number of spawned map tasks depends on the number of InputSplits generated by the job's InputFormat.getSplits(JobConf, int). A custom InputFormat is typically used to accurately control the number of map tasks for the job."
Explicitly setting the number of input chunks is a bit more difficult. The way the input is split up is determined by the InputFormat you use and the InputSplits that it generates. You will have to write your own custom InputFormat (and, if needed, your own InputSplit) if you wish to control how the input is split.
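To give a feel for what such a custom InputFormat would have to compute, here is a minimal plain-Java sketch that divides a file into an exact number of byte ranges. FixedSplitPlanner and plan are hypothetical names, not part of Hadoop. In a real implementation, each range would be wrapped in an InputSplit and returned from getSplits, and the record reader would still need to handle records that straddle a range boundary.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: the byte ranges a custom InputFormat could hand out as splits.
class FixedSplitPlanner {

    // Returns exactly numSplits {start, length} pairs that together cover [0, totalBytes).
    static List<long[]> plan(long totalBytes, int numSplits) {
        if (numSplits <= 0) {
            throw new IllegalArgumentException("numSplits must be positive");
        }
        List<long[]> ranges = new ArrayList<>();
        long base = totalBytes / numSplits;
        long remainder = totalBytes % numSplits;
        long start = 0;
        for (int i = 0; i < numSplits; i++) {
            // Spread the leftover bytes over the first few ranges.
            long length = base + (i < remainder ? 1 : 0);
            ranges.add(new long[] {start, length});
            start += length;
        }
        return ranges;
    }
}
```

Calling FixedSplitPlanner.plan(fileLength, 10) would always yield ten ranges, whatever the block size, which is the kind of control the framework's hint does not give you.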