I can't run a sample Oozie job that uses a Sqoop command to import data into Hive. I've placed hive-site.xml in an HDFS path, but I don't think it is being picked up. I'm getting a ClassNotFoundException. How can I fix this?

workflow.xml

<!-- This is a comment -->
<workflow-app xmlns="uri:oozie:workflow:0.4" name="oozie-wf">
   <start to="sqoop-node1"/>
   <!-- Step 1 -->
   <action name="sqoop-node1">
      <sqoop xmlns="uri:oozie:sqoop-action:0.2">
         <job-tracker></job-tracker>
         <name-node></name-node>
         <command> import command </command>
      </sqoop>
      <ok to="end"/>
      <error to="kill_job"/>
   </action>
   <kill name="kill_job">
      <message>Job failed</message>
   </kill>
   <end name="end"/>
</workflow-app>

nameNode=ip
jobTracker=ip
queueName=default
user.name=oozie
oozie.use.system.libpath=true
oozie.libpath=/user/hdfs/share/share/lib/sqoop
oozie.wf.application.path=workflow path
outputDir=/tmp/oozie.txt

java.io.IOException: java.lang.ClassNotFoundException: org.apache.hadoop.hive.conf.HiveConf

1 Answer

I guess your Sqoop action needs the HCatalog library to interact with the Hive Metastore. Oozie does not add that library by default; you have to request it explicitly.

Note that there is some literature about using HCatalog from Pig, but very little about using it from Sqoop. The trick is the same either way.
From the Oozie documentation:

Oozie share libraries are organized per action type...
Oozie provides a mechanism to override the action share library JARs ...
More than one share library directory name can be specified for an action ... For example: When using HCatLoader and HCatStorer in pig, oozie.action.sharelib.for.pig can be set to pig,hcatalog to include both pig and hcatalog jars.

In your case, you need to set a specific <property> in your Sqoop action, named oozie.action.sharelib.for.sqoop, to the value sqoop,hcatalog. Oozie will then provide the required JARs at run time.
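
As a rough sketch, the Sqoop action from the question could carry that property in a <configuration> block, as shown here. The ${jobTracker} and ${nameNode} references are an assumption that the values from the job properties are meant to be substituted there, and the command is left as the placeholder from the question.

```xml
<action name="sqoop-node1">
   <sqoop xmlns="uri:oozie:sqoop-action:0.2">
      <!-- Assumption: these resolve to the jobTracker / nameNode job properties -->
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <configuration>
         <!-- Ask Oozie to load both the sqoop and hcatalog share libraries -->
         <property>
            <name>oozie.action.sharelib.for.sqoop</name>
            <value>sqoop,hcatalog</value>
         </property>
      </configuration>
      <!-- Placeholder from the question; put the real Sqoop import command here -->
      <command> import command </command>
   </sqoop>
   <ok to="end"/>
   <error to="kill_job"/>
</action>
```

The same property can also be set once in the job properties instead of in the action, which may be simpler if several Sqoop actions need it. If the HiveConf class is still not found afterwards, adding hive to the list of share libraries may be worth trying, though that is a guess and not something the Oozie excerpt above states.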