0 votes

I'm using Sqoop 1.4.5-cdh5.2.1 and Oracle.

I'm importing a small set of about 115k records from Oracle. The Sqoop command works fine with --num-mappers set to 5, but when I set it to more than 5, I get a Java heap space error.

Can anyone tell me why this is happening?
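For reference, my command has roughly the following shape (connection string, credentials, query, and paths below are placeholders, not my actual values):

sqoop import \
  --connect jdbc:oracle:thin:@//dbhost:1521/ORCL \
  --username scott \
  --password-file /user/etl/oracle.pwd \
  --query 'SELECT * FROM MY_TABLE WHERE $CONDITIONS' \
  --split-by ID \
  --target-dir /user/etl/my_table \
  --num-mappers 5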

LOG:

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at java.math.BigInteger.<init>(BigInteger.java:394)
    at java.math.BigDecimal.bigTenToThe(BigDecimal.java:3380)
    at java.math.BigDecimal.bigDigitLength(BigDecimal.java:3635)
    at java.math.BigDecimal.precision(BigDecimal.java:2189)
    at java.math.BigDecimal.compareMagnitude(BigDecimal.java:2585)
    at java.math.BigDecimal.compareTo(BigDecimal.java:2566)
    at org.apache.sqoop.mapreduce.db.BigDecimalSplitter.split(BigDecimalSplitter.java:138)
    at org.apache.sqoop.mapreduce.db.BigDecimalSplitter.split(BigDecimalSplitter.java:69)
    at org.apache.sqoop.mapreduce.db.DataDrivenDBInputFormat.getSplits(DataDrivenDBInputFormat.java:171)
    at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:498)
    at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:515)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:399)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1295)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1292)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1292)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1313)
    at org.apache.sqoop.mapreduce.ImportJobBase.doSubmitJob(ImportJobBase.java:198)
    at org.apache.sqoop.mapreduce.ImportJobBase.runJob(ImportJobBase.java:171)
    at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:268)
    at org.apache.sqoop.manager.SqlManager.importQuery(SqlManager.java:721)
    at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:499)
    at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605)
    at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
    at org.apache.sqoop.Sqoop.main(Sqoop.java:236)
2015-06-25 13:48:59 STATUS: 1
2015-06-25 13:48:59 ERROR Error (1) Sqoop failed.
2015-06-25 13:48:59 ERROR Error (1) run_sqoop

Are you using a distributed Hadoop cluster, a pseudo-distributed cluster, or a VM box? – vijay kumar
Please paste the error log in your question. – vijay kumar

2 Answers

0 votes

By default, each map and reduce task runs in its own JVM, so each mapper consumes a certain amount of physical memory. As you increase the number of mappers, the total memory requirement grows with it. If the Java process cannot allocate enough memory, it throws java.lang.OutOfMemoryError.

In your case, the system (or the VM, if you are running one) may only have enough memory for about 5 mappers.

You can run the top command while launching more than 5 mappers and monitor how much free memory remains.
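For example, on a Linux host you could watch memory usage in a second terminal while the import runs (the 5-second refresh interval here is just an illustrative choice):

# Refresh overall memory usage every 5 seconds
watch -n 5 free -m

# Or run top and press Shift+M to sort processes by memory
top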

0 votes

Try adding the following properties to $HADOOP_HOME/conf/mapred-site.xml:

<!-- for Sqoop config -->
<property>
  <name>mapreduce.map.memory.mb</name>
  <value>1024</value>
</property>

<property>
  <name>mapreduce.map.java.opts</name>
  <value>-Xmx512m</value>
</property>

Tune the values for your environment; you may need to increase or decrease them. Remember to make the change on every node.
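If you would rather not touch the cluster-wide config (an assumption on my part about your setup), the same properties can also be passed per job using Hadoop's generic -D options, which Sqoop accepts right after the tool name; the values below are just examples to adjust:

sqoop import \
  -Dmapreduce.map.memory.mb=1024 \
  -Dmapreduce.map.java.opts=-Xmx512m \
  --connect jdbc:oracle:thin:@//dbhost:1521/ORCL \
  --table MY_TABLE \
  --target-dir /user/etl/my_table \
  --num-mappers 10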

Or raise the virtual memory limit in yarn-site.xml:

<property>
  <name>yarn.nodemanager.vmem-pmem-ratio</name>
  <value>4.2</value>
</property>

Its default value is 2.1 (a ratio of virtual to physical memory, not an absolute amount).
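As a rough worked example, using the mapreduce.map.memory.mb value from above: with 1024 MB of physical memory per map task, the default ratio of 2.1 allows about 2150 MB of virtual memory per container, while 4.2 allows about 4300 MB before the NodeManager kills the container.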