4
votes

I am very new to Hadoop and am trying to run a simple program with Hadoop Streaming.

I have copied the local example data to HDFS, but when I run the following command for my MapReduce job, as per the official Apache documentation:

hadoop jar hadoop-streaming-2.7.3.jar \
-input /user/hduser/gutenberg/* \
-output /user/hduser/gutenberg-output \
-mapper /home/hduser/mapper.py \
-reducer /home/hduser/reducer.py

I get this error:

Not a valid JAR: /usr/lib/hadoop-streaming-2.7.3.jar

Please try to help me.


2 Answers

5
votes

This works with Hadoop 2.7.3 when you give the full path to the streaming JAR.

Here is the command you need to run:

[Linux]$ hadoop jar \
/usr/local/hadoop/share/hadoop/tools/lib/hadoop-streaming-2.7.3.jar \
-file /home/python/mapper.py \
-file /home/python/reducer.py \
-mapper "python mapper.py" \
-reducer "python reducer.py" \
-input /tmp/word_i \
-output /tmp/word_output
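
As far as I can tell, the "Not a valid JAR" message in the question usually means that no regular file exists at the path printed in the error. A minimal sketch of a check you could run before submitting the job is below. The function name jar_exists is hypothetical, it is not a Hadoop command, and it only tests that the file is there, not that its contents are a well-formed JAR.

```shell
# Hypothetical pre-flight check (not part of Hadoop): succeed only if the
# given path points to an existing regular file.
jar_exists() {
    # $1: path to the JAR you intend to pass to "hadoop jar"
    if [ -f "$1" ]; then
        echo "found: $1"
    else
        echo "missing: $1" >&2
        return 1
    fi
}
```

For example, jar_exists /usr/lib/hadoop-streaming-2.7.3.jar would be expected to fail on the asker's machine, which points to a wrong path rather than a damaged file.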
1
votes

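The Hadoop streaming JAR is located at:

$HADOOP_HOME/share/hadoop/tools/lib/hadoop-streaming-2.7.1.2.4.2.0-258.jar

Use $HADOOP_HOME rather than a hard-coded path, because the installation directory is not the same on every system. The version suffix in the file name also depends on your Hadoop distribution.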
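
Since both the installation directory and the version in the file name vary, one option is to search for the JAR instead of typing its name. This is only a sketch: the function name find_streaming_jar is hypothetical, and it assumes the share/hadoop/tools/lib layout shown above, which may differ on your installation.

```shell
# Hypothetical helper (not part of Hadoop): print the first streaming JAR
# found under the tools/lib directory of a Hadoop installation.
find_streaming_jar() {
    # $1: Hadoop installation directory, e.g. "$HADOOP_HOME"
    find "$1/share/hadoop/tools/lib" -name 'hadoop-streaming-*.jar' 2>/dev/null | head -n 1
}

# Usage (assumes HADOOP_HOME is set in your environment):
#   STREAMING_JAR=$(find_streaming_jar "$HADOOP_HOME")
#   echo "$STREAMING_JAR"
```

If the function prints nothing, the JAR is not under that directory and the path passed to hadoop jar needs to be adjusted.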