
I'm teaching myself Hadoop and the MapReduce programming model. To understand its core elements, I'm trying to match each of the components below to one of the features that follow:

  • Reducer
  • Combiner
  • Shuffle and Sort
  • Mapper
  • Partitioner
  • Replication

The features I have to match to the above Hadoop components are:

Grouping, move computation to data, help with copy-phase bottleneck, load balancing, data filtering, global computation

My understanding: Reducer → global computation; Combiner → grouping; Shuffle and Sort (the process of moving data from the mappers to the reducers) → move computation to data; Mapper → data filtering; Partitioner → load balancing; and lastly Replication → helps with copy-phase bottleneck.

I would really appreciate it if somebody could check my understanding of the basic Hadoop components and correct me where necessary.

Replication isn't part of MapReduce; it belongs to HDFS. - OneCricketeer

1 Answer


  • Replication → Move computation to data
  • Combiner → Helps with copy-phase bottleneck
  • Mapper → Data filtering
  • Reducer → Global computation
  • Partitioner → Load balancing
  • Shuffle and Sort → Grouping
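
To make these roles concrete, here is a toy, single-process word-count simulation in Python. It is only a sketch of the programming model, not Hadoop's API: the function names (`mapper`, `combiner`, `partitioner`, `shuffle_and_sort`, `reducer`, `run_job`) and the CRC32-based partitioning are assumptions chosen for illustration. Replication is not modeled, since it is an HDFS feature rather than a MapReduce stage.

```python
import zlib
from collections import defaultdict
from itertools import groupby


def mapper(line):
    # Data filtering: emit (word, 1) only for purely alphabetic tokens.
    for word in line.lower().split():
        if word.isalpha():
            yield word, 1


def combiner(pairs):
    # Local pre-aggregation of ONE mapper's output, so fewer records
    # have to be copied to the reducers (the copy-phase bottleneck).
    local = defaultdict(int)
    for key, value in pairs:
        local[key] += value
    return list(local.items())


def partitioner(key, num_reducers):
    # Load balancing: spread keys across reducers with a stable hash.
    # (Illustrative choice; Python's built-in hash() is randomized for str.)
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def shuffle_and_sort(mapper_outputs, num_reducers):
    # Route every record to its reducer's partition, then sort each
    # partition and group the values that share a key.
    partitions = [[] for _ in range(num_reducers)]
    for output in mapper_outputs:
        for key, value in output:
            partitions[partitioner(key, num_reducers)].append((key, value))
    grouped = []
    for part in partitions:
        part.sort(key=lambda kv: kv[0])
        grouped.append(
            [(k, [v for _, v in g]) for k, g in groupby(part, key=lambda kv: kv[0])]
        )
    return grouped


def reducer(key, values):
    # Global computation: sees every value for a key, from all mappers.
    return key, sum(values)


def run_job(splits, num_reducers=2):
    # Each split (a list of lines) is handled by its own mapper + combiner.
    mapper_outputs = [
        combiner(pair for line in split for pair in mapper(line))
        for split in splits
    ]
    totals = {}
    for partition in shuffle_and_sort(mapper_outputs, num_reducers):
        for key, values in partition:
            word, count = reducer(key, values)
            totals[word] = count
    return totals
```

Calling `run_job([["the cat sat", "the dog 42"], ["the cat ran"]])` runs two simulated mappers. The token `42` is dropped by the mapper's filter, the first mapper's two `the` records are merged by its combiner before the shuffle, and the reducers produce the global counts per word.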