I'm running a MapReduce job with the number of reducers left at the default (one reducer). In theory, there should be one output file per reducer, but when I run my job I get two files:
part-r-00000
and
part-r-00001
Why is this happening?
There's only one node in my cluster.
My driver class:
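import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class DriverDate extends Configured implements Tool {

    @Override
    public int run(String[] args) throws Exception {
        if (args.length != 2) {
            System.out.printf("Usage: DriverDate inputDir outputDir\n");
            System.exit(-1);
        }

        Job job = new Job(getConf());
        job.setJobName("Job transformacio dates");
        job.setJarByClass(DriverDate.class);
        job.setMapperClass(MapDate.class);
        job.setReducerClass(ReduceDate.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(NullWritable.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(NullWritable.class);
        FileInputFormat.setInputPaths(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        // The number of reduce tasks is never set here, so it comes from the configuration.
        // Report failure through the exit code instead of always returning 0.
        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        System.exit(ToolRunner.run(conf, new DriverDate(), args));
    }
}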
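To make my expectation concrete, here is a minimal sketch of how I understand the output naming. It assumes the default hash partitioning, whose arithmetic is `(hashCode & Integer.MAX_VALUE) % numReduceTasks`, and that each reduce task normally writes its own `part-r-NNNNN` file even when it receives no keys. The class `PartFileSketch` and its helper methods are hypothetical names for illustration, and plain `String` keys stand in for `Text` (whose hash values differ, though the arithmetic is the same).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only; this does not use or depend on Hadoop.
class PartFileSketch {

    // Same arithmetic as the default hash partitioner:
    // clear the sign bit, then take the remainder by the reducer count.
    static int partitionFor(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Reducer output file name for a partition number, e.g. part-r-00000.
    static String partFileName(int partition) {
        return String.format(Locale.ROOT, "part-r-%05d", partition);
    }

    // Assumption: one output file per reduce task, even for empty partitions.
    static List<String> expectedOutputFiles(int numReduceTasks) {
        List<String> files = new ArrayList<>();
        for (int p = 0; p < numReduceTasks; p++) {
            files.add(partFileName(p));
        }
        return files;
    }

    public static void main(String[] args) {
        System.out.println(expectedOutputFiles(1)); // prints [part-r-00000]
        System.out.println(expectedOutputFiles(2)); // prints [part-r-00000, part-r-00001]
    }
}
```

If this reading is right, every key maps to partition 0 when there is a single reducer, so seeing `part-r-00001` would suggest the job actually ran with at least two reduce tasks, presumably because the effective configuration differs from the default I expected.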