0
votes

I need to do a sqoop import all tables from an existing mysql database to hive, the first table is categories.

The command is as below:

sqoop import-all-tables -m 1 \
--connect=jdbc:mysql://ms.itversity.com/retail_db \
--username=retail_user \
--password=itversity \
--hive-import \
--hive-overwrite \
--create-hive-table \
--compress \
--compression-codec org.apache.hadoop.io.compress.SnappyCodec \
--outdir java_output0322

It failed for the following reason:

Output directory hdfs://nn01.itversity.com:8020/user/paslechoix/categories already exists

I am wondering how can I import them into /apps/hive/warehouse/paslechoix.db/

paslechoix is the hive database name.

UPDATE1 on 20180323 to Bala who commented at the first place:

I've updated the script to:

sqoop import-all-tables -m 1 \
--connect=jdbc:mysql://ms.itversity.com/retail_db \
--username=retail_user \
--password=itversity \
--hive-import \
--hive-overwrite \
--create-hive-table \
--hive-database paslechoix_new \
--compress \
--compression-codec org.apache.hadoop.io.compress.SnappyCodec \
--outdir java_output0323

added what you suggested: --hive-database paslechoix_new paslechoix_new is a new hive database just created.

I still receive error of:

AlreadyExistsException: Output directory hdfs://nn01.itversity.com:8020/user/paslechoix/categories already exists

Now, it is really interesting, why it keeps referring to paslechoix? I already indicate in the script that the hive database is paslechoix_new, why it doesn't get recognized?

Update 2 on 20180323:

I took the other suggestion in Bala's comment:

sqoop import-all-tables -m 1 \
--connect=jdbc:mysql://ms.itversity.com/retail_db \
--username=retail_user \
--password=itversity \
--hive-import \
--hive-overwrite \
--create-hive-table \
--hive-database paslechoix_new \
--warehouse-dir /apps/hive/warehouse/paslechoix_new.db \
--compress \
--compression-codec org.apache.hadoop.io.compress.SnappyCodec \
--outdir java_output0323

So now the import doesn't throw error any more, however, I checked the hive database, all the tables are created, with no data

1

1 Answers

0
votes

add the option --warehouse-dir to import into a specific directory

--warehouse-dir /apps/hive/warehouse/paslechoix.db/

if you want to import to specific hive database, then use

--hive-database paslechoix