Suppose two mappers running on two different DataNodes emit the same key/value pair, and the job uses a single reducer. How can I eliminate the duplicate pair so that it never reaches the reducer?
Should I use a combiner that checks for duplicate values under the same key and drops them? As far as I understand, though, a combiner only receives the key/value pairs emitted by a single mapper, right?
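
To make the concern concrete, here is a minimal plain-Python simulation. It is not Hadoop code: the `combine` function and the sample pairs are made up for illustration. It assumes a combiner deduplicates only within one mapper's output, which is how I understand combiners to be scoped.

```python
# Plain-Python simulation, not the Hadoop API; all names here are hypothetical.

def combine(mapper_output):
    """Drop duplicate (key, value) pairs within ONE mapper's output, keeping order."""
    seen = set()
    result = []
    for pair in mapper_output:
        if pair not in seen:
            seen.add(pair)
            result.append(pair)
    return result

# Two mappers on different nodes emit overlapping pairs.
mapper_a = [("k1", "v1"), ("k1", "v1"), ("k2", "v2")]
mapper_b = [("k1", "v1"), ("k3", "v3")]

# Each combiner call sees only its own mapper's pairs.
reducer_input = combine(mapper_a) + combine(mapper_b)

# Collect the pairs that still reach the reducer more than once.
duplicates = [p for p in set(reducer_input) if reducer_input.count(p) > 1]
print(duplicates)
```

In this simulation the pair emitted by both mappers survives the per-mapper combine step, which is why I am unsure a combiner alone is enough.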