I'm writing a MapReduce job in NetBeans and generating (also in NetBeans) a jar file. When I try to execute this job in Hadoop (version 1.2.1) I use this command:
$ hadoop jar job.jar org.job.mainClass /home/user/in.txt /home/user/outdir
This command shows no errors, but it does not create outdir or any output files.
This is my job code:
Mapper
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class Mapper extends MapReduceBase implements org.apache.hadoop.mapred.Mapper<LongWritable, Text, Text, IntWritable> {
    private final IntWritable one = new IntWritable(1);
    private Text company = new Text("");

    @Override
    public void map(LongWritable key, Text value, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {
        // company is set from the input line but never emitted; the raw value is collected directly
        company.set(value.toString());
        output.collect(value, one);
    }
}
Reducer
import java.io.IOException;
import java.util.Iterator;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class Reducer extends MapReduceBase implements org.apache.hadoop.mapred.Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    public void reduce(Text key, Iterator<IntWritable> values, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {
        // Count how many times the mappers emitted this key
        int sum = 0;
        while (values.hasNext()) {
            sum++;
            values.next();
        }
        output.collect(key, new IntWritable(sum));
    }
}
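Side note: the loop above counts the values without reading them. Since the mapper always emits 1 this gives the same result, but the conventional word-count pattern sums the values instead:

    while (values.hasNext()) {
        sum += values.next().get();
    }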
Main
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.TextInputFormat;
import org.apache.hadoop.mapred.TextOutputFormat;

public static void main(String[] args) {
    JobConf configuration = new JobConf(CdrMR.class);
    configuration.setJobName("Dedupe companies");
    configuration.setOutputKeyClass(Text.class);
    configuration.setOutputValueClass(IntWritable.class);
    configuration.setMapperClass(Mapper.class);
    configuration.setReducerClass(Reducer.class);
    configuration.setInputFormat(TextInputFormat.class);
    configuration.setOutputFormat(TextOutputFormat.class);
    FileInputFormat.setInputPaths(configuration, new Path(args[0]));
    FileOutputFormat.setOutputPath(configuration, new Path(args[1]));
}
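Worth noting: main() configures the JobConf but never submits the job, which by itself would explain why no output directory appears. With the old org.apache.hadoop.mapred API used here, submission normally goes through JobClient; a minimal sketch, assuming org.apache.hadoop.mapred.JobClient is imported and the call is added at the end of main (which would then need to declare throws IOException):

    // Submit the configured job and block until it completes
    JobClient.runJob(configuration);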
The format of the input file is as follows:
name1
name2
name3
...
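With the mapper and reducer above, each distinct name should come out once with its occurrence count, so (assuming each name appears exactly once in the input) the expected output would look like:

name1	1
name2	1
name3	1

(TextOutputFormat separates key and value with a tab.)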
I should also mention that I'm running Hadoop in a virtual machine (Ubuntu 12.04) without root privileges. Could Hadoop be executing the job and storing the output files in a different directory?
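(A note on that last question: paths without a scheme, such as /home/user/outdir, are resolved against the default filesystem set by fs.default.name in Hadoop 1.x. If the VM is configured with HDFS as the default filesystem, the output would be created in HDFS rather than on the local disk; hadoop fs -ls /home/user/outdir would show it if so.)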
Comment: try System.exit(configuration.waitForCompletion(true) ? 0 : 1); – Y.Prithvi
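(One caveat on that suggestion: waitForCompletion is a method of the new-API org.apache.hadoop.mapreduce.Job class and does not exist on JobConf; with the old mapred API used in this code, the equivalent is JobClient.runJob(configuration), as sketched after main above.)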