After experimenting with 2 reducers and reading HowManyMapsAndReduces on the Hadoop Wiki, "hadoop: number of reducers remains a constant 4", "Hadoop: Number of mappers and reducers" and "Setting the number of map tasks and reduce tasks", I have arrived at the following question:
If I have 1 map task (I understand that the number is actually decided by Hadoop) and 2 reducers (where I provided only one reducer program, e.g. -reducer /bin/wc), which of the following will happen?
- Hadoop will split the mapper's output between the two reducers (e.g. given 1000 lines of text, it will give ~500 to the 1st reducer and ~500 to the 2nd reducer)?
- Hadoop will send a full copy of the mapper's output to each reducer (e.g. given 1000 lines of text, it will give 1000 to the 1st reducer and 1000 to the 2nd reducer)?
I think it is the 1st option, but I could not find any evidence for this while searching the net.
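
To make the two options concrete, here is a minimal toy model of what I mean. It is only a sketch and not Hadoop code: the function names `partition` and `broadcast` are made up for illustration, and using a CRC32 hash of the whole line to pick a reducer is my own assumption about how a split could work, so the exact per-reducer counts will not match a real job.

```python
import zlib


def partition(lines, num_reducers):
    """Option 1 (hypothetical model): each line goes to exactly one reducer."""
    buckets = [[] for _ in range(num_reducers)]
    for line in lines:
        # Assumption: a stable hash of the line stands in for a key hash.
        idx = zlib.crc32(line.encode("utf-8")) % num_reducers
        buckets[idx].append(line)
    return buckets


def broadcast(lines, num_reducers):
    """Option 2 (hypothetical model): every reducer gets a full copy."""
    return [list(lines) for _ in range(num_reducers)]


lines = [f"line {i}" for i in range(1000)]
split = partition(lines, 2)
copied = broadcast(lines, 2)

# Under option 1 the per-reducer counts add up to the input size;
# under option 2 each reducer sees the whole input.
print("option 1:", [len(b) for b in split])
print("option 2:", [len(b) for b in copied])
```

With -reducer /bin/wc, option 1 would mean the line counts reported by the two reducers sum to 1000, while option 2 would mean each reducer reports 1000 on its own.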