
I have a table in Oracle with only 4 columns:

Memberid --- bigint

uuid --- String

insertdate --- date

updatedate --- date

I want to import that data into a Hive table using Sqoop. I created the corresponding Hive table with

CREATE EXTERNAL TABLE memberimport (memberid BIGINT, uuid VARCHAR(36), insertdate TIMESTAMP, updatedate TIMESTAMP) LOCATION '/user/import/memberimport';

and sqoop command

sqoop import --connect jdbc:oracle:thin:@dbURL:1521/dbName --username ** --password *** --hive-import --table MEMBER --columns 'MEMBERID,UUID,INSERTDATE,UPDATEDATE' --map-column-hive MEMBERID=BIGINT,UUID=STRING,INSERTDATE=TIMESTAMP,UPDATEDATE=TIMESTAMP --hive-table memberimport -m 1   

It works properly and I am able to import the data into the Hive table.
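
For a quick sanity check on the initial load (these verification commands are not in the original post and assume the hive and hdfs CLIs are available on the cluster):

# count the imported rows and confirm that files landed under the external table's LOCATION
hive -e "SELECT COUNT(*) FROM memberimport;"
hdfs dfs -ls /user/import/memberimport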

Now I want to update this table incrementally based on updatedate (last value = today's date), so that I can get the day-to-day updates from that OLTP table into my Hive table using Sqoop.

For the incremental import I am using the following Sqoop command:

sqoop import --hive-import --connect jdbc:oracle:thin:@dbURL:1521/dbName --username *** --password *** --table MEMBER --check-column UPDATEDATE --incremental append --columns 'MEMBERID,UUID,INSERTDATE,UPDATEDATE' --map-column-hive MEMBERID=BIGINT,UUID=STRING,INSERTDATE=TIMESTAMP,UPDATEDATE=TIMESTAMP --hive-table memberimport -m 1

But I am getting the exception:

"Append mode for hive imports is not yet supported. Please remove the parameter --append-mode"

When I remove --hive-import it runs properly, but I do not find the new updates in the Hive table that I have in the OLTP table.
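
For reference, the incremental command without --hive-import looked roughly like this (the --last-value shown here is only an illustration):

sqoop import \
  --connect jdbc:oracle:thin:@dbURL:1521/dbName \
  --username *** --password *** \
  --table MEMBER \
  --columns 'MEMBERID,UUID,INSERTDATE,UPDATEDATE' \
  --check-column UPDATEDATE \
  --incremental append \
  --last-value '2015-01-01 00:00:00' \
  -m 1

I am not sure where the new files end up in this case; without a --target-dir Sqoop would write under its default /user/<user>/MEMBER directory rather than under /user/import/memberimport.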

Am I doing anything wrong? Please suggest how I can run an incremental update from Oracle to Hive using Sqoop.

Any help will be appreciated.

Thanks in advance.


2 Answers


Although I don't have the resources to replicate your scenario exactly, you might want to try building a Sqoop job and testing your use case:

sqoop job --create sqoop_job \
  -- import \
  --connect "jdbc:oracle:thin:@server:port/dbname" \
  --username (XXXX) \
  --password (YYYY) \
  --table (TableName) \
  --target-dir (HDFS directory backing the Hive table) \
  --append \
  --fields-terminated-by '(character)' \
  --lines-terminated-by '\n' \
  --check-column "(column to monitor for changes)" \
  --incremental append \
  --last-value (last value of the column being monitored) \
  --outdir (directory for Sqoop's generated code)

When you create a Sqoop job, it takes care of --last-value for subsequent runs. Also, here I have used the Hive table's data directory as the target for the incremental update.
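
As a sketch only, here is the job with the values from your question filled in (the job name member_incremental and the starting --last-value are assumptions, and any --fields-terminated-by/--lines-terminated-by settings would have to match the files the initial load wrote into /user/import/memberimport):

sqoop job --create member_incremental \
  -- import \
  --connect jdbc:oracle:thin:@dbURL:1521/dbName \
  --username *** \
  --password *** \
  --table MEMBER \
  --columns 'MEMBERID,UUID,INSERTDATE,UPDATEDATE' \
  --target-dir /user/import/memberimport \
  --check-column UPDATEDATE \
  --incremental append \
  --last-value '2015-01-01 00:00:00' \
  -m 1

# run the job; Sqoop stores the new --last-value in its metastore after each run
sqoop job --exec member_incremental

# list saved jobs and inspect the stored parameters, including the saved last value
sqoop job --list
sqoop job --show member_incremental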

Hope this provides a helpful direction to proceed.


There is no direct way to achieve this in Sqoop. However, you can use a 4-step strategy.