I would like to import data from Oracle into Hive as a Parquet file using Sqoop. I have been trying the following command:

sqoop import --as-parquetfile \
  --connect jdbc:oracle:thin:@10.222.14.11:1521/eservice \
  --username MOJETL \
  --password-file file:///home/$(whoami)/MOJ_Analytic/moj_analytic/conf/.djoppassword \
  --query 'SELECT * FROM CMST_OFFENSE_RECORD_FAMILY WHERE $CONDITIONS' \
  --fields-terminated-by ',' --escaped-by ',' \
  --hive-overwrite --hive-import \
  --hive-database default \
  --hive-table tmp3_cmst_offense_record_family \
  --hive-partition-key load_dt --hive-partition-value '20200213' \
  --split-by cmst_offense_record_family_ref \
  --target-dir hdfs://nameservice1:8020/landing/tmp3_cmst_offense_record_family/load_dt=20200213

I get the following error:

ERROR sqoop.Sqoop: Got exception running Sqoop: org.kitesdk.data.ValidationException: Dataset name default.tmp3_cmst_offense_record_family is not alphanumeric (plus '_')
org.kitesdk.data.ValidationException: Dataset name default.tmp3_cmst_offense_record_family is not alphanumeric (plus '_')

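I've tried removing the Hive-related options (--hive-overwrite, --hive-import, --hive-database, --hive-table, --hive-partition-key and --hive-partition-value):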

sqoop import --as-parquetfile \
  --connect jdbc:oracle:thin:@10.222.14.11:1521/eservice \
  --username MOJETL \
  --password-file file:///home/$(whoami)/MOJ_Analytic/moj_analytic/conf/.djoppassword \
  --query 'SELECT * FROM CMST_OFFENSE_RECORD_FAMILY WHERE $CONDITIONS' \
  --fields-terminated-by ',' --escaped-by ',' \
  --split-by cmst_offense_record_family_ref \
  --target-dir hdfs://nameservice1:8020/landing/tmp3_cmst_offense_record_family/load_dt=20200213

I still get the same kind of error, this time for the dataset name load_dt=20200213:

ERROR sqoop.Sqoop: Got exception running Sqoop: org.kitesdk.data.ValidationException: Dataset name load_dt=20200213 is not alphanumeric (plus '_')
org.kitesdk.data.ValidationException: Dataset name load_dt=20200213 is not alphanumeric (plus '_')

1 Answer

Please try rewriting this part:

--hive-table default.tmp3_cmst_offense_record_family

with this one:

--hive-table tmp3_cmst_offense_record_family

You have already specified the database name with the --hive-database option, so the table name should not carry the database prefix as well.
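
Both error messages in the question complain that a name is "not alphanumeric (plus '_')". As a rough illustration, here is a minimal shell sketch that approximates that rule. The helper name is_valid_dataset_name is hypothetical, and the check is only inferred from the wording of the error message; it is not Kite's actual validation code.

```shell
# Hypothetical helper: approximates the rule quoted in the error message
# ("alphanumeric (plus '_')"). This is an assumption based on the message
# text, not the real Kite implementation.
is_valid_dataset_name() {
    case "$1" in
        # Reject an empty name or any name containing a character
        # other than a letter, a digit or an underscore.
        ''|*[!A-Za-z0-9_]*) return 1 ;;
        *) return 0 ;;
    esac
}

# The two names from the error messages, plus the bare table name.
for name in \
    'default.tmp3_cmst_offense_record_family' \
    'load_dt=20200213' \
    'tmp3_cmst_offense_record_family'
do
    if is_valid_dataset_name "$name"; then
        echo "accepted: $name"
    else
        echo "rejected: $name"
    fi
done
```

Under this approximation the first name fails because of the '.', the second because of the '=', and only the bare table name passes. The second error message shows that the name being validated there is load_dt=20200213, which is the last component of the --target-dir path used in the question.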