1
vote

I wrote a MapReduce job in NetBeans and generated a jar file (also in NetBeans). When I try to execute this job in Hadoop (version 1.2.1) I run this command:

$ hadoop jar job.jar org.job.mainClass /home/user/in.txt /home/user/outdir

This command doesn't show any errors, but it doesn't create outdir, output files, etc.

This is my job code:

Mapper

public class Mapper extends MapReduceBase implements org.apache.hadoop.mapred.Mapper<LongWritable, Text, Text, IntWritable> {

    private final IntWritable one = new IntWritable(1);
    private Text company = new Text("");

    @Override
    public void map(LongWritable key, Text value, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {
        // emit (name, 1) for each input line
        company.set(value.toString());
        output.collect(company, one);
    }
}

Reducer

public class Reducer extends MapReduceBase implements org.apache.hadoop.mapred.Reducer<Text, IntWritable, Text, IntWritable> {

    @Override
    public void reduce(Text key, Iterator<IntWritable> values, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {

        int sum = 0;
        while (values.hasNext()){
            sum++;
            values.next();
        }

        output.collect(key, new IntWritable(sum));
    }
}
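Taken together, the mapper/reducer pair behaves like a word count over the names in the input file. As a minimal sketch (no Hadoop required, class and method names are illustrative), the same grouping-and-counting logic in plain Java shows what the job should produce:

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java simulation of the mapper/reducer pair above:
// the mapper emits (name, 1) per line, the reducer sums the 1s per name.
public class NameCountSim {

    public static Map<String, Integer> countNames(List<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {
            counts.merge(line, 1, Integer::sum); // same as summing the IntWritable(1)s
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> out = countNames(Arrays.asList("name1", "name2", "name1"));
        System.out.println(out); // prints {name1=2, name2=1}
    }
}
```

For the sample input above, the job's output file (e.g. `part-00000`) would contain one tab-separated `name<TAB>count` line per distinct name.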

Main

 public static void main(String[] args) {

    JobConf configuration = new JobConf(CdrMR.class);
    configuration.setJobName("Dedupe companies");
    configuration.setOutputKeyClass(Text.class);
    configuration.setOutputValueClass(IntWritable.class);
    configuration.setMapperClass(Mapper.class);
    configuration.setReducerClass(Reducer.class);
    configuration.setInputFormat(TextInputFormat.class);
    configuration.setOutputFormat(TextOutputFormat.class);
    FileInputFormat.setInputPaths(configuration, new Path(args[0]));
    FileOutputFormat.setOutputPath(configuration, new Path(args[1]));

}

The format of input file is as follows:

name1
name2
name3
...

I should also say that I'm running Hadoop in a virtual machine (Ubuntu 12.04) without root privileges. Could Hadoop be executing the job and storing the output files in a different directory?

From which user are you running Hadoop, and where are you storing the output? Are they both the same user? – Y.Prithvi
Yes, both are the same: the user and that user's home dir. – Juan Garcia
Add this as the last line of the main method: System.exit(configuration.waitForCompletion(true) ? 0 : 1); – Y.Prithvi
The JobConf object doesn't have a waitForCompletion member. – Juan Garcia
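The confusion in the last two comments is that `waitForCompletion` belongs to the new-API `org.apache.hadoop.mapreduce.Job` class, not to the old-API `JobConf`. A minimal sketch of what a new-API driver would look like, assuming the Mapper and Reducer were rewritten to extend the new-API base classes (`NameMapper`/`NameReducer` and `DedupeDriver` are hypothetical names; the `mapred` classes in the question would not work here):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Sketch only: the new (mapreduce) API, where waitForCompletion exists.
// The Mapper/Reducer must extend org.apache.hadoop.mapreduce.Mapper/Reducer,
// not implement the org.apache.hadoop.mapred interfaces used in the question.
public class DedupeDriver {
    public static void main(String[] args) throws Exception {
        Job job = new Job(new Configuration(), "Dedupe companies"); // Job.getInstance(conf, name) in later versions
        job.setJarByClass(DedupeDriver.class);
        job.setMapperClass(NameMapper.class);   // hypothetical new-API mapper
        job.setReducerClass(NameReducer.class); // hypothetical new-API reducer
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        // This is the call the comment suggested; it submits the job and blocks.
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

With the old `mapred` API used in the question, the equivalent submission call is `JobClient.runJob(configuration)`, as the answers below note.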

3 Answers

0
votes

The correct hadoop command is

hadoop jar myjar packagename.DriverClass input output

CASE 1

MapReduceProject
    |
    |__ src
         |
         |__ package1
            - Driver
            - Mapper
            - Reducer

Then you can just use

hadoop jar myjar input output

CASE 2

MapReduceProject
    |
    |__ src
         |
         |__ package1
         |  - Driver1
         |  - Mapper1
         |  - Reducer1
         |
         |__ package2
            - Driver2
            - Mapper2
            - Reducer2

For this case, you must specify the driver class along with the hadoop command.

hadoop jar myjar packagename.DriverClass input output

2
votes

According to this article you need to submit your JobConf with this method:

JobClient.runJob(configuration);

0
votes

The correct hadoop command is

$ hadoop jar job.jar /home/user/in.txt /home/user/outdir

not

$ hadoop jar job.jar org.job.mainClass /home/user/in.txt /home/user/outdir

Hadoop treats org.job.mainClass as the input file and in.txt as the output file, so the execution fails with something like File Already Exists: in.txt (presumably because the jar's manifest already names a main class, so the class-name argument is passed through as the first path argument). This main method works fine:

public static void main(String[] args) throws FileNotFoundException, IOException {

    JobConf configuration = new JobConf(CdrMR.class);
    configuration.setJobName("Dedupe companies");
    configuration.setOutputKeyClass(Text.class);
    configuration.setOutputValueClass(IntWritable.class);
    configuration.setMapperClass(NameMapper.class);
    configuration.setReducerClass(NameReducer.class);
    configuration.setInputFormat(TextInputFormat.class);
    configuration.setOutputFormat(TextOutputFormat.class);
    FileInputFormat.setInputPaths(configuration, new Path(args[0]));
    FileOutputFormat.setOutputPath(configuration, new Path(args[1]));
    System.out.println("Hello Hadoop");
    System.exit(JobClient.runJob(configuration).isSuccessful() ? 0 : 1);
}

Thanks @AlexeyShestakov and @Y.Prithvi