1
votes

I'm trying to do a Sqoop incremental import to a Hive table using "--incremental append".

I did an initial sqoop import and then create a job for the incremental imports. Both are executed successfully and new files have been added to the same original Hive table directory in HDFS, but when I check my Hive table, the imported observations are not there. The Hive table is equal before the sqoop incremental import.

How can I solve that? I have about 45 Hive tables and would like to update them daily automatically after the Sqoop incremental import.

First Sqoop Import:

sqoop import \
--connect jdbc:db2://... \
--username root \
-password 9999999 \
--class-name db2fcs_cust_atu \
--query "SELECT * FROM db2fcs.cust_atu WHERE \$CONDITIONS" \
--split-by PTC_NR  \
--fetch-size 10000 \
--delete-target-dir \
--target-dir /apps/hive/warehouse/fcs.db/db2fcs_cust_atu \
--hive-import \
--hive-table fcs.cust_atu \
-m 64;

Then I run Sqoop incremental import:

sqoop job \
-create cli_atu \
--import \
--connect jdbc:db2://... \
--username root \
--password 9999999 \
--table db2fcs.cust_atu \
--target-dir /apps/hive/warehouse/fcs.db/db2fcs_cust_atu \
--hive-table fcs.cust_atu \
--split-by PTC_NR \
--incremental append \
--check-column TS_CUST \
--last-value '2018-09-09'
1

1 Answers

0
votes

It might be difficult to understand/answer your question without looking at your full query because your outcome also depends on your choice of arguments and directories. Mind to share your query?