Sqoop to dynamically create hive partitioned table from oracle and import data

Question

I have a table in oracle (table name is TRCUS) with customer's details, partitioned based on year & month. Partitions name in Oracle: PERIOD_JAN_13, PERIOD_FEB_13, PERIOD_JAN_14, PERIOD_FEB_14 etc

Now I want to import this table's data into HIVE using SQOOP directly.

Sqoop job should create a hive table, dynamically create partitions based on the oracle table partition and then import data into hive; into the respective partitions.

How can this be achievable using SQOOP ?

Unknown Unknown · Accepted Answer · 2017-12-25T06:06:13

Unfortunately, it cannot be achieved using Sqoop. However, there is one method which I guess you might not know.

Create a table in Hive without any partitions.
Set dynamic partition modes

SET hive.exec.dynamic.partition = true; SET hive.exec.dynamic.partition.mode = nonstrict;
Import data into Hive table that is not partitioned using Sqoop

sqoop import --connect "jdbc:mysql://quickstart.cloudera:3306/database1" --username root --password cloudera --query 'SELECT DISTINCT id, count from test WHERE $CONDITIONS' --target-dir /user/hive/warehouse/ --hive-table pd_withoutpartition --hive-database database1 --hive-import --hive-overwrite -m 1 --direct
Create another table with partitions
Overwrite into partition table from previous table

INSERT OVERWRITE TABLE pd_partition partition(name) SELECT id, count, name from pd_withoutpartition;

Note: Make sure that column with which you want to partition is mentioned last during overwrite in select statement.

Hive Version : Hive 1.1.0-cdh5.13.1

Sqoop to dynamically create hive partitioned table from oracle and import data

1 Answers