0
votes

I am processing large text files. Each record is one line of an input file, and I am searching these records for certain keywords.
I want to know which of the following two approaches is more efficient (in terms of time complexity) in Hadoop MapReduce:

  1. Searching in the map function of the Mapper
  2. Searching in the reduce function of the Reducer

Please Help!

1
You don't even need a reducer for this. - Mike Park
Suppose I also want to count the number of times a keyword is found. - Aayush Rathore

1 Answer

3
votes

Both should work, but based on what you describe I would do the search in the map function, because:

A record is emitted to the group (shuffle) and reduce phases only if a keyword is found in it. If only a few records match your keywords, the overhead of grouping and reducing drops significantly.
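
To illustrate the idea, here is a minimal sketch in plain Python that simulates the two phases. It does not use the Hadoop API. The names `KEYWORDS`, `map_record`, `reduce_counts`, and `run_job` are hypothetical, and the whitespace-based matching is an assumption (it ignores punctuation). The map step filters and emits `(keyword, 1)` only on a match, and the reduce step sums the counts, which also covers the counting case raised in the comments.

```python
from collections import defaultdict

# Hypothetical keywords, chosen only for illustration.
KEYWORDS = {"error", "timeout"}


def map_record(line):
    """Map step: emit (keyword, 1) for each keyword occurrence in one record.

    Records without any keyword emit nothing, so they never reach
    the group and reduce phases.
    """
    for word in line.lower().split():
        if word in KEYWORDS:
            yield word, 1


def reduce_counts(keyword, values):
    """Reduce step: sum the counts emitted for one keyword."""
    return keyword, sum(values)


def run_job(lines):
    """Simulate map -> group by key -> reduce in a single process."""
    grouped = defaultdict(list)
    for line in lines:
        for key, value in map_record(line):
            grouped[key].append(value)
    return dict(reduce_counts(k, v) for k, v in grouped.items())
```

In this sketch, a record with no keyword produces no output from `map_record`, so only the matching pairs are grouped and reduced. In a real job the same filtering would live in the Mapper's map method, with the summing done in the Reducer.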