2
votes

MapReduce basics: passing and emitting key-value pairs. I need a bit of clarity on what we pass in and what gets emitted. Here are my questions about MapReduce input and output:

1. Does the map() method take a single key-value pair or a list of them, and what does it emit?
2. For each input key-value pair, what do mappers emit? The same type or a different type?
3. For each intermediate key, what does the reducer emit? Is there any restriction on the type?
4. The reducer receives all values associated with the same key. How are the values ordered: sorted or arbitrarily? Does that order vary from run to run?
5. During the shuffle and sort phase, in what order are keys and values presented?


2 Answers

5
votes
  • For each input (k1, v1), map emits zero or more (k2, v2) pairs.
  • For each k2, the reducer receives (k2, list(v2)), i.e. all values emitted for that key.
  • For each input (k2, list(v2)), the reducer can emit zero or more (k3, v3) pairs.

Values are arbitrarily ordered in step 2 (the list the reducer receives). Within the mapper output, all keys must be of the same type and all values must be of the same type; the same holds for the reducer output.
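To make the type flow concrete, here is a minimal in-memory sketch in plain Java. It is not Hadoop code: the class `TypeFlowSketch` and its `map`, `reduce`, and `run` methods are hypothetical names made up for illustration, and a `TreeMap` stands in for the shuffle. The point is only that one (k1, v1) can yield zero or more (k2, v2), and that (k3, v3) may use yet another pair of types.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class TypeFlowSketch {
    // map: (K1 = Integer line number, V1 = String line)
    //   -> zero or more (K2 = Integer word length, V2 = String word)
    static List<Map.Entry<Integer, String>> map(Integer lineNumber, String line) {
        List<Map.Entry<Integer, String>> out = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                out.add(new SimpleEntry<>(word.length(), word));
            }
        }
        return out; // a blank line emits nothing
    }

    // reduce: (K2, list(V2)) -> (K3 = String label, V3 = Integer count)
    // Counting does not depend on the order of the values in the list.
    static Map.Entry<String, Integer> reduce(Integer wordLength, List<String> words) {
        return new SimpleEntry<>("length-" + wordLength, words.size());
    }

    static List<Map.Entry<String, Integer>> run(List<String> lines) {
        // Stand-in for the shuffle: group every V2 under its K2.
        Map<Integer, List<String>> grouped = new TreeMap<>();
        for (int i = 0; i < lines.size(); i++) {
            for (Map.Entry<Integer, String> kv : map(i, lines.get(i))) {
                grouped.computeIfAbsent(kv.getKey(), k -> new ArrayList<>())
                       .add(kv.getValue());
            }
        }
        // One reduce call per distinct K2.
        List<Map.Entry<String, Integer>> results = new ArrayList<>();
        for (Map.Entry<Integer, List<String>> e : grouped.entrySet()) {
            results.add(reduce(e.getKey(), e.getValue()));
        }
        return results;
    }

    public static void main(String[] args) {
        // prints [length-2=5, length-3=1]
        System.out.println(run(List.of("to be or not", "", "to be")));
    }
}
```

Here the input types (Integer, String), the intermediate types (Integer, String) and the output types (String, Integer) are chosen independently; only consistency within each stage matters.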

0
votes

Map method: receives (K1, V1) as input and returns (K2, V2). That is, the output key and value types can differ from the input key and value types.

Reduce method: after the mappers' output has been shuffled (so that all pairs with the same key go to the same reducer), the reducer input is (K2, LIST(V2)) and its output is (K3, V3). As a result of the shuffling process, the keys arrive at the reducer sorted by K2.

If you want to order the keys in your own way, you can implement the compareTo method of the intermediate key K2, since that is the key the shuffle sorts on.
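As a rough illustration of how compareTo drives the key order, here is a small sketch in plain Java. In Hadoop the key class would typically implement WritableComparable; this sketch uses plain `Comparable` and a `TreeMap` as a stand-in for the sort phase, and `KeyOrderSketch`, `WordKey`, and `sortedKeys` are hypothetical names, not part of any library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

public class KeyOrderSketch {
    // Hypothetical intermediate key: shorter words first, ties broken alphabetically.
    static final class WordKey implements Comparable<WordKey> {
        final String word;

        WordKey(String word) {
            this.word = word;
        }

        @Override
        public int compareTo(WordKey other) {
            int byLength = Integer.compare(word.length(), other.word.length());
            return byLength != 0 ? byLength : word.compareTo(other.word);
        }
    }

    // Returns the distinct keys in the order compareTo defines,
    // which is the order a reducer would see them in this sketch.
    static List<String> sortedKeys(List<String> words) {
        TreeMap<WordKey, Integer> grouped = new TreeMap<>();
        for (String w : words) {
            grouped.merge(new WordKey(w), 1, Integer::sum);
        }
        List<String> order = new ArrayList<>();
        for (WordKey k : grouped.keySet()) {
            order.add(k.word);
        }
        return order;
    }

    public static void main(String[] args) {
        // prints [fig, kiwi, apple, banana]
        System.out.println(sortedKeys(List.of("banana", "fig", "apple", "kiwi", "fig")));
    }
}
```

With the default String ordering the same keys would come out as apple, banana, fig, kiwi; only the compareTo implementation changed the order.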

Regarding your questions:

1. Answered above.
2. You can emit whatever you want as long as it consists of a key and a value. 
   For example, in WordCount you emit the word as the key and 1 as the value.
3. In the WordCount example, the reducer receives a word and a list of numbers. 
   It then sums the numbers and emits the word and its sum.
4. Answered above.
5. Answered above.
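The WordCount flow described in points 2 and 3 can be sketched without a cluster. This is a plain-Java simulation under my own assumptions, not the Hadoop WordCount program: `WordCountSketch`, `map`, `reduce`, and `wordCount` are hypothetical names, and splitting on non-word characters after lowercasing is just one possible tokenization.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class WordCountSketch {
    // Mapper: emits (word, 1) for every word in the line.
    static void map(String line, List<Map.Entry<String, Integer>> emitted) {
        for (String word : line.toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) {
                emitted.add(Map.entry(word, 1));
            }
        }
    }

    // Reducer: receives a word and its list of 1s, returns their sum.
    static int reduce(String word, List<Integer> ones) {
        int sum = 0;
        for (int one : ones) {
            sum += one;
        }
        return sum;
    }

    static Map<String, Integer> wordCount(List<String> lines) {
        List<Map.Entry<String, Integer>> emitted = new ArrayList<>();
        for (String line : lines) {
            map(line, emitted);
        }
        // Stand-in for the shuffle: collect all values for the same word.
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> kv : emitted) {
            grouped.computeIfAbsent(kv.getKey(), k -> new ArrayList<>())
                   .add(kv.getValue());
        }
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            counts.put(e.getKey(), reduce(e.getKey(), e.getValue()));
        }
        return counts;
    }

    public static void main(String[] args) {
        // prints {dog=1, end=1, fox=1, lazy=1, quick=1, the=3}
        System.out.println(wordCount(List.of("the quick fox", "the lazy dog the end")));
    }
}
```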