17
votes

In Ubuntu, when I am running the hadoop example :

$bin/hadoop jar hadoop-examples-1.0.4.jar grep input output 'dfs[a-z.]+' 

$echo $HADOOP_HEAPSIZE
2000

In log, I am getting the error as :

INFO mapred.JobClient: Task Id : attempt_201303251213_0012_m_000000_2, Status : FAILED Error: Java heap space 13/03/25 15:03:43 INFO mapred.JobClient: Task Id :attempt_201303251213_0012_m_000001_2, Status : FAILED Error: Java heap space13/03/25 15:04:28 INFO mapred.JobClient: Job Failed: # of failed Map Tasks exceeded allowed limit. FailedCount: 1. LastFailedTask: task_201303251213_0012_m_000000 java.io.IOException: Job failed! at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1265) at org.apache.hadoop.examples.Grep.run(Grep.java:69) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65) at org.apache.hadoop.examples.Grep.main(Grep.java:93)

Let us know what is the problem.

4

4 Answers

45
votes

Clearly you have run out of the heap size allotted to Java. So you shall try to increase that.

For that you may execute the following before executing hadoop command:

export HADOOP_OPTS="-Xmx4096m"

Alternatively, you can achieve the same thing by adding the following permanent setting in your mapred-site.xml file, this file lies in HADOOP_HOME/conf/ :

<property>
    <name>mapred.child.java.opts</name>
    <value>-Xmx4096m</value>
</property>

This would set your java heap space to 4096 MB (4GB), you may even try it with a lower value first if that works. If that too doesn't work out then increase it more if your machine supports it, if not then move to a machine having more memory and try there. As heap space simply means you don't have enough RAM available for Java.

UPDATE: For Hadoop 2+, make the changes in mapreduce.map.java.opts instead.

7
votes
<property>
   <name>mapred.child.java.opts</name>
  <value>-Xmx4096m</value>
</property>

Works for me.

export HADOOP_OPTS="-Xmx4096m"

doesn't work

3
votes

Using Hadoop 2.5.0-cdh5.2.0, this worked for me to change the heap size of the local (sequential) java process:

export HADOOP_HEAPSIZE=2900
hadoop jar analytics.jar .....

The reason it worked is that /usr/lib/hadoop/libexec/hadoop-config.sh has

# check envvars which might override default args
if [ "$HADOOP_HEAPSIZE" != "" ]; then
  #echo "run with heapsize $HADOOP_HEAPSIZE"
  JAVA_HEAP_MAX="-Xmx""$HADOOP_HEAPSIZE""m"
  #echo $JAVA_HEAP_MAX
fi
0
votes

If you add property on mapred-site.xml

<property>
   <name>mapred.child.java.opts</name>
  <value>-Xmx2048m</value>
</property>

Sometimes it happens another because it more than virtual memory limit In this situation, you must add

<property>
        <name>yarn.nodemanager.vmem-pmem-ratio</name>
        <value>4.2</value>
</property>

on yarn-site.xml

because its default 2.1G sometimes too small.