I have an MR streaming job. My code is in C++. It's a mapper-only job, with no reducer. The input to the job is a directory containing three files. The job creates three mappers. Each mapper processes one input file and produces one output file in a different format.
The input files look like this:
MyDir/file1
MyDir/file2
MyDir/file3
The output files look like this:
MyDir/Output/part-00000
MyDir/Output/part-00001
MyDir/Output/part-00002
I want to correlate input files to output files. For example, input file MyDir/file1 may correspond to output file MyDir/Output/part-00002, i.e., the mapper that processed input file MyDir/file1 may have produced output file MyDir/Output/part-00002.
I want to know this relationship, i.e., which input file corresponds to which output file. Is there a simple way to know this?
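To make the goal concrete, here is a rough sketch of the kind of thing I imagine each mapper could do. It rests on assumptions I have not verified on my cluster: that Hadoop streaming exports job configuration to the mapper's environment with dots replaced by underscores, so that the input path shows up as mapreduce_map_input_file (or map_input_file on older versions) and the task number as mapreduce_task_partition (or mapred_task_partition), and that the output file name is "part-" followed by that number padded to five digits, as in the listing above. The helper names firstSetEnv, partFileName, and runMapper are made up for illustration, and the pass-through loop stands in for my real format conversion.

```cpp
#include <cstdio>
#include <cstdlib>
#include <initializer_list>
#include <iostream>
#include <string>

// Returns the value of the first environment variable in `names` that is set,
// or an empty string if none of them is set.
std::string firstSetEnv(std::initializer_list<const char*> names) {
    for (const char* name : names) {
        if (const char* value = std::getenv(name)) {
            return value;
        }
    }
    return "";
}

// Builds an output file name from a task partition number.
// ASSUMPTION: outputs are named part-NNNNN, as in the listing above.
std::string partFileName(int partition) {
    char buf[32];
    std::snprintf(buf, sizeof buf, "part-%05d", partition);
    return buf;
}

// Writes one "input=... output=..." line to `log`, then copies `in` to `out`.
// The copy loop is a placeholder for the real per-record conversion.
int runMapper(std::istream& in, std::ostream& out, std::ostream& log) {
    // ASSUMPTION: these variable names are provided by Hadoop streaming.
    const std::string inputFile =
        firstSetEnv({"mapreduce_map_input_file", "map_input_file"});
    const std::string partition =
        firstSetEnv({"mapreduce_task_partition", "mapred_task_partition"});

    if (!inputFile.empty() && !partition.empty()) {
        log << "input=" << inputFile
            << " output=" << partFileName(std::atoi(partition.c_str())) << '\n';
    }

    std::string line;
    while (std::getline(in, line)) {
        out << line << '\n';
    }
    return 0;
}
```

The mapper's main would simply call runMapper(std::cin, std::cout, std::cerr). Since the mapping line goes to stderr, I would expect it to land in each task's log rather than in the output file, so the output format stays untouched.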