In Hadoop MapReduce, the intermediate (map) output is saved to local disk. I would like to know whether it is possible to start a job with only the reduce phase, which reads the map output from local disk, partitions the data, and executes the reduce tasks.
3 Answers
4 votes
There is a basic implementation of Mapper called IdentityMapper, which simply passes all key-value pairs through to the Reducer.
- The Reducer reads the outputs generated by the different mappers as key-value pairs and emits key-value pairs of its own.
- The Reducer's job is to process the data that comes from the mapper.
- If the MapReduce programmer does not set the mapper class using JobConf.setMapperClass, then IdentityMapper.class is used as the default.
You can't run just reducers without any mappers.
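A minimal sketch of what the answer describes, using the old `mapred` API that `JobConf.setMapperClass` belongs to. The class name `ReduceOnlyJob`, the reducer `MyReducer`, and the key/value types are illustrative assumptions, not from the original answer; the point is that leaving the mapper as `IdentityMapper` makes the map phase a pass-through, so effectively only the reduce logic does work:

```java
// Sketch (assumes a Hadoop cluster and the old org.apache.hadoop.mapred API).
// Input is assumed to be a SequenceFile of <Text, IntWritable> pairs.
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.SequenceFileInputFormat;
import org.apache.hadoop.mapred.lib.IdentityMapper;

public class ReduceOnlyJob {
    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(ReduceOnlyJob.class);
        conf.setJobName("reduce-only");

        // Omitting setMapperClass would have the same effect:
        // IdentityMapper is the default and simply forwards each pair.
        conf.setMapperClass(IdentityMapper.class);
        conf.setReducerClass(MyReducer.class); // hypothetical reducer class

        conf.setInputFormat(SequenceFileInputFormat.class);
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);

        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));

        JobClient.runJob(conf);
    }
}
```

Note that a map phase still runs here; it just does no transformation. The framework still shuffles and partitions the identity mapper's output before the reducers receive it.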
0 votes