
I am trying to create a Sqoop job for an incremental import into Hive.

sqoop job --create user_rating_import --meta-connect \
  jdbc:hsqldb:hsql://192.168.225.129:16000/sqoop \
  -- import --connect jdbc:postgresql://192.168.148.1:5432/movielens --username anm \
  --table ratings -m 8 --target-dir /user/hive/warehouse/user_rating --incremental lastmodified --fields-terminated-by ',' \
  --check-column rated_at --append --as-parquetfile --hive-import \
  --hive-table user_rating_fact

It fails with this output:

19/12/03 21:39:23 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
--incremental lastmodified option for hive imports is not supported. Please remove the parameter --incremental lastmodified.

The alternative is to use a query and parametrise it with your last-modified value (the bookmark), which you can store somewhere or select from the target table before the load. Believe it or not, taking control of the bookmarks is the more flexible approach: you can do reloads easily by changing the bookmarks (start and end date, etc.) passed to the query as parameters. - leftjoin
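A minimal sketch of that bookmark approach, assuming the two bookmark values are passed in by the caller (where they are stored is left open, as in the comment). The function names `build_query` and `run_import` are hypothetical, and `-m 1` is used so that no split column has to be assumed. The table, column, connection string, and target directory are taken from the question.

```shell
#!/usr/bin/env bash
# Sketch only: the bookmark values and how they are stored are assumptions.

# Build the free-form query for one load window.
# $1 = lower bookmark (exclusive), $2 = upper bookmark (inclusive).
# Sqoop requires the literal token $CONDITIONS in a --query import.
build_query() {
  printf "SELECT * FROM ratings WHERE rated_at > '%s' AND rated_at <= '%s' AND \$CONDITIONS" "$1" "$2"
}

# Run the import for one window. Not called here; it needs a Sqoop installation.
# A --query import must be given an explicit --target-dir.
run_import() {
  sqoop import \
    --connect jdbc:postgresql://192.168.148.1:5432/movielens \
    --username anm \
    --query "$(build_query "$1" "$2")" \
    -m 1 \
    --target-dir /user/hive/warehouse/user_rating \
    --append \
    --fields-terminated-by ','
}
```

A reload of a given window would then be something like `run_import '2019-12-01 00:00:00' '2019-12-03 00:00:00'`. This writes files to HDFS only, so the Hive table would need to be defined over that directory separately.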

1 Answer


Please try this, which removes the --incremental lastmodified option as the error message asks:

sqoop job --create user_rating_import --meta-connect \
  jdbc:hsqldb:hsql://192.168.225.129:16000/sqoop \
  -- import --connect jdbc:postgresql://192.168.148.1:5432/movielens \
  --driver org.postgresql.Driver --username anm --table ratings -m 8 \
  --hive-import --target-dir "/user/hive/warehouse/user_rating" \
  --fields-terminated-by ',' --check-column rated_at --append \
  --as-parquetfile --hive-table "user_rating_fact"