Hadoop MapReduce read and write Sequence File

Question

I'm trying to write a MapReduce job that reads two sequence files in the Mapper. I've managed to read and write a sequence file in main(), but I don't know how to do it in the Mapper. I think I'm not really familiar with how MapReduce works. Thanks for helping me out.

Andrea Iacono · Accepted Answer · 2015-07-13T09:40:42

If everything is set up correctly, your main() method contains something like this:

 FileInputFormat.addInputPath(job, inputPath);
 FileOutputFormat.setOutputPath(job, outputPath);

to tell Hadoop the directory where to find your two input files and where to write the results of the computation.
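For sequence files, the job also has to be told which input and output formats to use. Here is a minimal driver sketch, assuming the newer org.apache.hadoop.mapreduce API and sequence files that hold Text keys and IntWritable values. The class name SequenceFileJobDriver, the key/value types, and the use of command-line arguments for the paths are my assumptions, not something taken from your code:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;

// Hypothetical driver class; adjust names and types to your own job.
public class SequenceFileJobDriver {
    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("Usage: SequenceFileJobDriver <input dir> <output dir>");
            System.exit(2);
        }

        Job job = Job.getInstance(new Configuration(), "sequence file example");
        job.setJarByClass(SequenceFileJobDriver.class);

        // Read and write sequence files instead of the default plain text.
        job.setInputFormatClass(SequenceFileInputFormat.class);
        job.setOutputFormatClass(SequenceFileOutputFormat.class);

        // The base Mapper class passes every record through unchanged;
        // replace it with your own mapper class.
        job.setMapperClass(Mapper.class);

        // Assumed record types: these must match what the sequence files contain.
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        // Every file found in the input directory is fed to the mappers.
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

If both of your sequence files sit in the same input directory, a single addInputPath() call covers them. Otherwise you can call addInputPath() once per path.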

When the job starts, Hadoop reads the files it finds in the input directory and calls the mapper's map() method once per record, passing each record as an argument: one line at a time with the default text input format, or one key/value pair at a time when a sequence file is read through SequenceFileInputFormat. At the end of the computation, when the reducer emits its data, Hadoop writes the results to one or more files in the specified output directory.

So, the mapper and the reducer don't need to know anything about input/output files.
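As an illustration, a mapper for sequence-file input might look like the sketch below. It assumes the driver selects SequenceFileInputFormat via job.setInputFormatClass() and that the files hold Text keys and IntWritable values. The class name SequenceRecordMapper and those types are hypothetical. Note that it contains no file handling at all:

```java
import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Hypothetical mapper. The first two type parameters must match the
// key/value classes stored in the input sequence files (assumed here).
public class SequenceRecordMapper
        extends Mapper<Text, IntWritable, Text, IntWritable> {

    @Override
    protected void map(Text key, IntWritable value, Context context)
            throws IOException, InterruptedException {
        // Hadoop calls this once per record, whichever input file it came from.
        // Replace the pass-through below with your own logic.
        context.write(key, value);
    }
}
```

The framework opens the files, deserializes each record, and hands it to map(). What the mapper emits through context.write() is then routed to the reducers, or straight to the output format if the job has no reduce phase.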
