I'm teaching myself Hadoop and the MapReduce programming model, and I'm trying to understand its core elements. Specifically, I'm trying to match each of the components below to one of the features listed after them:
- Reducer
- Combiner
- Shuffle and Sort
- Mapper
- Partitioner
- Replication
The features I have to map to the above Hadoop components are:
- Grouping, move computation to data, help with copy-phase bottleneck, load balancing, data filtering, global computation
My understanding: Reducer --> global computation; Combiner --> grouping; Shuffle and Sort (the process of moving data from the mappers to the reducers) --> move computation to data; Mapper --> data filtering; Partitioner --> load balancing; and lastly Replication --> help with copy-phase bottleneck.
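To make the question more concrete, here is a minimal sketch of how I currently picture the stages fitting together for a word count. It is plain single-process Python, not the Hadoop API: every function name (`mapper`, `combiner`, `partitioner`, `shuffle_and_sort`, `reducer`, `run_job`) is my own invention for illustration, and the byte-sum partitioner is just a stand-in assumption for whatever hashing Hadoop really uses. Replication is not modeled at all, since I don't see how to express it in a sketch like this.

```python
from collections import defaultdict


def mapper(line):
    # Emit (word, 1) for every whitespace-separated token in one input line.
    for word in line.lower().split():
        yield word, 1


def combiner(pairs):
    # Pre-aggregate one mapper's output locally, before anything is
    # sent on to the reducers.
    local = defaultdict(int)
    for key, value in pairs:
        local[key] += value
    return list(local.items())


def partitioner(key, num_reducers):
    # Decide which reducer gets this key. The byte-sum is a deterministic
    # stand-in (assumption), not Hadoop's actual hash function.
    return sum(key.encode("utf-8")) % num_reducers


def shuffle_and_sort(mapper_outputs, num_reducers):
    # Route every (key, value) pair to its reducer's partition, collect
    # the values per key, then sort each partition by key.
    partitions = [defaultdict(list) for _ in range(num_reducers)]
    for pairs in mapper_outputs:
        for key, value in pairs:
            partitions[partitioner(key, num_reducers)][key].append(value)
    return [sorted(p.items()) for p in partitions]


def reducer(key, values):
    # Sees every value for a key, across all mappers.
    return key, sum(values)


def run_job(splits, num_reducers=2):
    # Each split (a list of lines) is handled by its own "mapper task".
    mapper_outputs = []
    for split in splits:
        pairs = [kv for line in split for kv in mapper(line)]
        mapper_outputs.append(combiner(pairs))
    result = {}
    for partition in shuffle_and_sort(mapper_outputs, num_reducers):
        for key, values in partition:
            word, total = reducer(key, values)
            result[word] = total
    return result


if __name__ == "__main__":
    print(run_job([["a b a", "c"], ["b a"]]))
```

If the order of the stages or the role I gave any of them in this sketch is wrong, that is exactly the kind of correction I'm looking for.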
I would really appreciate it if somebody could check my understanding of the basic Hadoop components and correct me where necessary.