Hive table outdated after Sqoop incremental import

Question

I'm trying to do a Sqoop incremental import to a Hive table using "--incremental append".

I did an initial Sqoop import and then created a job for the incremental imports. Both ran successfully and new files were added to the same original Hive table directory in HDFS, but when I check my Hive table, the imported observations are not there. The Hive table is the same as it was before the Sqoop incremental import.

How can I solve that? I have about 45 Hive tables and would like to update them daily automatically after the Sqoop incremental import.

First Sqoop Import:

sqoop import \
--connect jdbc:db2://... \
--username root \
--password 9999999 \
--class-name db2fcs_cust_atu \
--query "SELECT * FROM db2fcs.cust_atu WHERE \$CONDITIONS" \
--split-by PTC_NR  \
--fetch-size 10000 \
--delete-target-dir \
--target-dir /apps/hive/warehouse/fcs.db/db2fcs_cust_atu \
--hive-import \
--hive-table fcs.cust_atu \
-m 64;

Then I run Sqoop incremental import:

sqoop job \
--create cli_atu \
-- import \
--connect jdbc:db2://... \
--username root \
--password 9999999 \
--table db2fcs.cust_atu \
--target-dir /apps/hive/warehouse/fcs.db/db2fcs_cust_atu \
--hive-table fcs.cust_atu \
--split-by PTC_NR \
--incremental append \
--check-column TS_CUST \
--last-value '2018-09-09'

Laenka-Oss · Accepted Answer · 2018-11-13T02:08:29

It is difficult to answer your question without seeing your full query, because the outcome depends on the arguments and directories you chose. Would you mind sharing it?
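On the point about directories, one thing that can be checked from the two commands in the question is whether `--target-dir` is the directory the Hive table actually reads from. The sketch below is only a rough check, and it assumes that `fcs.cust_atu` is a managed table in the default location under the warehouse root `/apps/hive/warehouse` (inferred from the paths in the question, not confirmed). The helper `default_table_dir` is hypothetical, written here for illustration. Note also that the saved job as posted has no `--hive-import`, so it may only be writing files to `--target-dir` without loading them into Hive.

```shell
#!/bin/sh
# Hypothetical helper: build the default warehouse directory for a
# "db.table" name, assuming the usual <root>/<db>.db/<table> layout.
default_table_dir() {
  dtd_root="${1%/}"      # warehouse root without a trailing slash
  dtd_db="${2%%.*}"      # text before the first dot
  dtd_table="${2#*.}"    # text after the first dot
  printf '%s/%s.db/%s\n' "$dtd_root" "$dtd_db" "$dtd_table"
}

# Values copied from the commands in the question.
target_dir=/apps/hive/warehouse/fcs.db/db2fcs_cust_atu
hive_table=fcs.cust_atu

# Assumption: managed table in the default location.
expected_dir=$(default_table_dir /apps/hive/warehouse "$hive_table")

if [ "$target_dir" = "$expected_dir" ]; then
  echo "target-dir matches the assumed table location: $expected_dir"
else
  echo "target-dir:             $target_dir"
  echo "assumed table location: $expected_dir"
  echo "These differ, so files appended to target-dir may not be visible in $hive_table."
fi

# On the cluster, the real location and the files can be confirmed with:
#   hive -e 'DESCRIBE FORMATTED fcs.cust_atu' | grep -i location
#   hdfs dfs -ls /apps/hive/warehouse/fcs.db/db2fcs_cust_atu
```

If the `Location` reported by `DESCRIBE FORMATTED` differs from the directory the incremental job writes to, that would explain why the new files exist in HDFS while the table looks unchanged.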
