I have a Sqoop action that pulls data from a PostgreSQL database and imports it into a Hive table. When I execute the Oozie workflow, Sqoop pulls the data from Postgres into HDFS, but it fails to import the data into the Hive table. The logs are of no use, as all I get from the Oozie web console UI is: Main class [org.apache.oozie.action.hadoop.SqoopMain], exit code [1]. Can we actually do a Hive import inside a Sqoop action, or do I have to perform a separate Hive action after Sqoop imports into HDFS?

<action name="ads-sqoop-import">
    <sqoop xmlns="uri:oozie:sqoop-action:0.2">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <configuration>
            <property>
                <name>dbIP</name>
                <value>${dbIP}</value>
            </property>
            <property>
                <name>dbPort</name>
                <value>${dbPort}</value>
            </property>
            <property>
                <name>dbUserName</name>
                <value>${dbUserName}</value>
            </property>
            <property>
                <name>dbPassword</name>
                <value>${dbPassword}</value>
            </property>
            <property>
                <name>hive_db_name</name>
                <value>${hive_db_name}</value>
            </property>
            <property>
                <name>scoop_target_dir</name>
                <value>${scoop_target_dir}</value>
            </property>
            <property>
                <name>dbName</name>
                <value>${dbName}</value>
            </property>
        </configuration>
        <command>import --connect jdbc:postgresql://${dbIP}:${dbPort}/${dbName} --username ${dbUserName} --password &quot;${dbPassword}&quot; --table ads --hive-table ${hive_db_name}.ads --create-hive-table --hive-import -m 1 --target-dir ${scoop_target_dir}/ads
        </command>
    </sqoop>
    <ok to="orders-sqoop-import"/>
    <error to="kill"/>
</action>
You should be able to find the stdout for the job if you look for the oozie:launcher job for Sqoop in your job tracker (or YARN). You will know it is the correct job because it will say it is successful, even though it failed (this is an Oozie mechanism). – Ryan Bedard

Hi Anand, I'm facing a similar issue, but I have already added hive-site.xml. My YARN logs say: sqoop.hive.HiveImport - Caused by: java.io.FileNotFoundException: File does not exist: hdfs:/user/yarn. Can you suggest something? – chhaya vishwakarma

@chhayavishwakarma Can you please post some logs? – Anand

1 Answer

I had to add the location of hive-site.xml to the Sqoop action to make the Hive import work. Oozie needs the Hive defaults (metastore URI, warehouse directory, etc.) to import data into Hive. Copy hive-site.xml to HDFS, then reference it with a job-xml element, either in the workflow's global section or inside each action that performs Hive operations:

<job-xml>hdfs://namenode/hive-site.xml</job-xml>
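
Applied to the workflow from the question, the Sqoop action would look roughly like this. Note that in the sqoop-action schema the job-xml element must come after name-node and before configuration; the HDFS path to hive-site.xml here is just a placeholder, so adjust it to wherever you uploaded the file:

<action name="ads-sqoop-import">
    <sqoop xmlns="uri:oozie:sqoop-action:0.2">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <!-- Provides the Hive defaults (metastore URI, warehouse dir, ...)
             that --hive-import needs; the path is an example, not a fixed location -->
        <job-xml>hdfs://namenode/hive-site.xml</job-xml>
        <configuration>
            <!-- same properties as before (dbIP, dbPort, dbUserName, ...) -->
        </configuration>
        <command>import --connect jdbc:postgresql://${dbIP}:${dbPort}/${dbName} --username ${dbUserName} --password &quot;${dbPassword}&quot; --table ads --hive-table ${hive_db_name}.ads --create-hive-table --hive-import -m 1 --target-dir ${scoop_target_dir}/ads</command>
    </sqoop>
    <ok to="orders-sqoop-import"/>
    <error to="kill"/>
</action>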