I am trying to implement a MapReduce job in which each mapper processes 150 lines of the text file and all the mappers run simultaneously. The job should also not fail, no matter how many map tasks fail.
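To make the expected behaviour concrete, here is a minimal plain-Java sketch of how I expect the input to be divided. It uses no Hadoop classes. The class name `ExpectedSplits`, the helper `expectedSplits`, and the 400-line input are hypothetical and only illustrate the arithmetic, assuming one map task per chunk of 150 lines.

```java
import java.util.ArrayList;
import java.util.List;

public class ExpectedSplits {
    // Hypothetical helper, not a Hadoop API: groups line indices into
    // consecutive chunks of at most linesPerMap lines each.
    static List<int[]> expectedSplits(int totalLines, int linesPerMap) {
        List<int[]> splits = new ArrayList<>();
        for (int start = 0; start < totalLines; start += linesPerMap) {
            int end = Math.min(start + linesPerMap, totalLines);
            splits.add(new int[] {start, end}); // half-open range [start, end)
        }
        return splits;
    }

    public static void main(String[] args) {
        // Assumed example input: a 400-line file with 150 lines per mapper.
        List<int[]> splits = expectedSplits(400, 150);
        System.out.println(splits.size() + " map tasks expected");
        for (int[] s : splits) {
            System.out.println("lines [" + s[0] + ", " + s[1] + ")");
        }
    }
}
```

So for a file of N lines I would expect roughly N / 150 map tasks (rounded up), not N of them.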
Here's the configuration part:
JobConf conf = new JobConf(Main.class);
conf.setJobName("My mapreduce");
conf.set("mapreduce.input.lineinputformat.linespermap", "150");
FileInputFormat.addInputPath(conf, new Path(args[0]));
FileOutputFormat.setOutputPath(conf, new Path(args[1]));
The problem is that Hadoop creates a mapper for every single line of text, the mappers seem to run sequentially, and if a single one fails, the whole job fails.
From this I deduce that the settings I've applied have no effect.
What did I do wrong?