I have an iterative MapReduce job in which, when a chunk (say Chunk i) is read by a mapper, some information about the records in that chunk is stored in an auxiliary file F_i. In the next iteration (job), a different mapper might read Chunk i, and that mapper must update some of the information in the auxiliary file F_i. Is there any mechanism to do this?
I believe the problem is solved if we can find a way to distinguish between chunks. For example, if each chunk has a unique name, then a mapper can simply read the auxiliary file for the chunk it was fed.
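One way to get such a unique name: in Hadoop, a mapper can inspect its own input split (via `context.getInputSplit()`, cast to `FileSplit`, which exposes `getPath()` and `getStart()`), and the pair (file path, byte offset) identifies the chunk stably across jobs as long as the input and split size don't change. Below is a minimal, Hadoop-free sketch of that idea in Python; `split_path`, `split_offset`, and `aux_dir` are hypothetical stand-ins for what the mapper would obtain from its split and job configuration.

```python
import hashlib
import os

def chunk_id(split_path, split_offset):
    # A stable, filesystem-safe ID derived from the split's
    # (path, start offset) pair. The same split yields the same
    # ID in every iteration, so every mapper that reads Chunk i
    # resolves to the same auxiliary file F_i.
    key = f"{split_path}:{split_offset}".encode("utf-8")
    return hashlib.md5(key).hexdigest()

def aux_file_path(aux_dir, split_path, split_offset):
    # Location of the auxiliary file F_i for this chunk. In a real
    # job this would be a path on HDFS (or another shared store)
    # visible to mappers of later iterations.
    return os.path.join(aux_dir, "F_" + chunk_id(split_path, split_offset))
```

A mapper in iteration k+1 would compute `aux_file_path(...)` from its own split, read the information the iteration-k mapper stored there, and overwrite (or write a new version of) the file when done. Note the caveat: to overwrite safely on HDFS, which is append-only, you would typically write F_i to a per-iteration directory rather than update it in place.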