1
votes

I wrote a MapReduce job in NetBeans and generated (also in NetBeans) a JAR file. When I try to execute this job in Hadoop (version 1.2.1) I use this command:

$ hadoop jar job.jar org.job.mainClass /home/user/in.txt /home/user/outdir

This command shows no errors, but it does not create outdir or any output files.

This is my job code:

Mapper

public class Mapper extends MapReduceBase implements org.apache.hadoop.mapred.Mapper<LongWritable, Text, Text, IntWritable> {

    private final IntWritable one = new IntWritable(1);
    private final Text company = new Text("");

    @Override
    public void map(LongWritable key, Text value, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {
        // Each input line holds one company name; emit it with a count of 1.
        company.set(value.toString());
        output.collect(company, one);
    }

}

Reducer

public class Reducer extends MapReduceBase implements org.apache.hadoop.mapred.Reducer<Text, IntWritable, Text, IntWritable> {

    @Override
    public void reduce(Text key, Iterator<IntWritable> values, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {

        int sum = 0;
        while (values.hasNext()){
            sum++;
            values.next();
        }

        output.collect(key, new IntWritable(sum));
    }
}

Main

 public static void main(String[] args) {

    JobConf configuration = new JobConf(CdrMR.class);
    configuration.setJobName("Dedupe companies");
    configuration.setOutputKeyClass(Text.class);
    configuration.setOutputValueClass(IntWritable.class);
    configuration.setMapperClass(Mapper.class);
    configuration.setReducerClass(Reducer.class);
    configuration.setInputFormat(TextInputFormat.class);
    configuration.setOutputFormat(TextOutputFormat.class);
    FileInputFormat.setInputPaths(configuration, new Path(args[0]));
    FileOutputFormat.setOutputPath(configuration, new Path(args[1]));

}

The format of input file is as follows:

name1
name2
name3
...
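The job above is effectively a word count over these names: the mapper emits (name, 1) for each line and the reducer sums the ones per name. Its expected result can be sketched in plain Java (NameCount is a hypothetical helper for illustration, not part of the job):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class NameCount {
    // Count how often each name appears, mirroring the map/reduce logic:
    // the mapper emits (name, 1) and the reducer sums the ones per name.
    static Map<String, Integer> count(String[] lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {
            counts.merge(line, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        String[] input = {"name1", "name2", "name1", "name3"};
        System.out.println(count(input)); // {name1=2, name2=1, name3=1}
    }
}
```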

I should also say that I'm running Hadoop in a virtual machine (Ubuntu 12.04) without root privileges. Could Hadoop be executing the job and storing the output files in a different directory?

3
Which user are you running Hadoop as, and where are you storing the output? Are they the same user? - Y.Prithvi
Yes, both are the same: the user and that user's home dir. - Juan Garcia
Add this as the last line of the main method: System.exit(configuration.waitForCompletion(true) ? 0 : 1); - Y.Prithvi
The JobConf object doesn't have a waitForCompletion member. - Juan Garcia

3 Answers

0
votes

The correct hadoop command is

hadoop jar myjar packagename.DriverClass input output

CASE 1

MapReduceProject
    |
    |__ src
         |
         |__ package1
            - Driver
            - Mapper
            - Reducer

Then you can just use

hadoop jar myjar input output

CASE 2

MapReduceProject
    |
    |__ src
         |
         |__ package1
         |  - Driver1
         |  - Mapper1
         |  - Reducer1
         |
         |__ package2
            - Driver2
            - Mapper2
            - Reducer2

In this case you must specify the driver class in your hadoop command.

hadoop jar myjar packagename.DriverClass input output
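Note that CASE 1, where the driver class is omitted, only works if the jar's manifest declares a Main-Class entry (IDEs such as NetBeans usually add one when you configure a main class). A sketch of the relevant MANIFEST.MF line, assuming package1.Driver is the entry point:

```
Main-Class: package1.Driver
```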
2
votes

According to this article you need to submit your JobConf with this method:

JobClient.runJob(configuration);
0
votes

The correct hadoop command is

$ hadoop jar job.jar /home/user/in.txt /home/user/outdir

not

$ hadoop jar job.jar org.job.mainClass /home/user/in.txt /home/user/outdir

Hadoop thinks org.job.mainClass is the input file and in.txt is the output file. The result of execution is File Already Exists: in.txt. This code works fine for the main method:

public static void main(String[] args) throws FileNotFoundException, IOException {

    JobConf configuration = new JobConf(CdrMR.class);
    configuration.setJobName("Dedupe companies");
    configuration.setOutputKeyClass(Text.class);
    configuration.setOutputValueClass(IntWritable.class);
    configuration.setMapperClass(NameMapper.class);
    configuration.setReducerClass(NameReducer.class);
    configuration.setInputFormat(TextInputFormat.class);
    configuration.setOutputFormat(TextOutputFormat.class);
    FileInputFormat.setInputPaths(configuration, new Path(args[0]));
    FileOutputFormat.setOutputPath(configuration, new Path(args[1]));
    System.out.println("Hello Hadoop");
    System.exit(JobClient.runJob(configuration).isSuccessful() ? 0 : 1);
}

Thanks @AlexeyShestakov and @Y.Prithvi