I was trying to run a variation of the WordCount example: the Mapper emits Text as key and Text as value, and the Reducer emits Text as key and NullWritable as value.
Besides the map and reduce signatures, I set up the main method like this:
//start a conf
Configuration conf = new Configuration();
conf.set("str",str);
//initialize a job based on the conf
Job job = new Job(conf, "wordcount");
job.setJarByClass(org.myorg.WordCount.class);
//the reduce output
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(NullWritable.class);
//the map output
job.setMapOutputKeyClass(Text.class);
job.setMapOutputValueClass(Text.class);
//Map and Reduce
job.setMapperClass(Map.class);
job.setReducerClass(Reduce.class);
//take hdfs locations as input and output
job.setInputFormatClass(TextInputFormat.class);
job.setOutputFormatClass(TextOutputFormat.class);
FileInputFormat.addInputPath(job, new Path(args[0]));
FileOutputFormat.setOutputPath(job, new Path(args[1]));
//run the job
job.waitForCompletion(true);
To debug, I reduced the map function to
public void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
    .........
    context.write(new Text("1000000"), new Text("2"));
}
and the reduce code to
public void reduce(Text key, Iterable<Text> values, Context context)
        throws IOException, InterruptedException {
    .......
    context.write(new Text("v"), NullWritable.get());
}
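For reference, a minimal sketch of how such Mapper and Reducer classes are typically declared with the new org.apache.hadoop.mapreduce API (class and package names here are placeholders matching the driver above, not code from my project). Note the @Override annotations: with them, a reduce method whose signature does not exactly match Reducer.reduce fails at compile time instead of silently never being called:

```java
package org.myorg;

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

public class WordCount {

    // Mapper: input <LongWritable, Text>, output <Text, Text>,
    // matching setMapOutputKeyClass/setMapOutputValueClass in the driver.
    public static class Map extends Mapper<LongWritable, Text, Text, Text> {
        @Override
        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // Hadoop Writable types must be used, not raw Strings.
            context.write(new Text("1000000"), new Text("2"));
        }
    }

    // Reducer: input <Text, Text>, output <Text, NullWritable>,
    // matching setOutputKeyClass/setOutputValueClass in the driver.
    public static class Reduce extends Reducer<Text, Text, Text, NullWritable> {
        @Override
        public void reduce(Text key, Iterable<Text> values, Context context)
                throws IOException, InterruptedException {
            context.write(new Text("v"), NullWritable.get());
        }
    }
}
```

This sketch assumes the classes are nested static classes of the WordCount driver, as in the stock WordCount example; compiling it requires the Hadoop client libraries on the classpath.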
However, all I see in the output is the map output. The reducer compiles, but it is never called! I suspect I am missing something in the main() method shown above, but what is left? I don't see what further information the Job configuration needs.
thanks,