Because of the nature of Map/Reduce applications, the reduce function may be called more than once for the same key, so the input and output key/value types must be the same, as in MongoDB's Map/Reduce implementation. I wonder why it is different in Hadoop's implementation (or rather, why it is allowed to be different):
org.apache.hadoop.mapreduce.Reducer<KEYIN,VALUEIN,KEYOUT,VALUEOUT>
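For instance, the sketch below (my own illustration; AverageReducer and the chosen types are assumptions, not taken from any real code) compiles even though the output value type differs from the input value type, which MongoDB's reduce would not allow:

public class AverageReducer extends Reducer<Text, IntWritable, Text, DoubleWritable> {
    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0, count = 0;
        for (IntWritable value : values) {
            sum += value.get();
            count++;
        }
        // the output value is a DoubleWritable although every input value is an IntWritable
        context.write(key, new DoubleWritable((double) sum / count));
    }
}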
Second question: how does Hadoop know whether the output of the reduce function should be fed back into reduce again in a later pass or written to HDFS? For example:
public class MyReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        for (IntWritable value : values) {
            // will this key/value be returned to reduce in a next run, or written to HDFS?
            context.write(key, value);
        }
    }
}
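For reference, this is roughly how I would wire up the job (a minimal sketch; MyJobDriver, MyMapper and the argument paths are placeholders I'm assuming, not code from a real project):

public class MyJobDriver {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "my job");
        job.setJarByClass(MyJobDriver.class);
        job.setMapperClass(MyMapper.class);   // hypothetical mapper emitting <Text, IntWritable>
        job.setReducerClass(MyReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}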