1
votes

I have a table in oracle (table name is TRCUS) with customer's details, partitioned based on year & month. Partitions name in Oracle: PERIOD_JAN_13, PERIOD_FEB_13, PERIOD_JAN_14, PERIOD_FEB_14 etc

Now I want to import this table's data into HIVE using SQOOP directly.

Sqoop job should create a hive table, dynamically create partitions based on the oracle table partition and then import data into hive; into the respective partitions.

How can this be achievable using SQOOP ?

1

1 Answers

3
votes

Unfortunately, it cannot be achieved using Sqoop. However, there is one method which I guess you might not know.

  1. Create a table in Hive without any partitions.
  2. Set dynamic partition modes

    SET hive.exec.dynamic.partition = true; SET hive.exec.dynamic.partition.mode = nonstrict;

  3. Import data into Hive table that is not partitioned using Sqoop

    sqoop import --connect "jdbc:mysql://quickstart.cloudera:3306/database1" --username root --password cloudera --query 'SELECT DISTINCT id, count from test WHERE $CONDITIONS' --target-dir /user/hive/warehouse/ --hive-table pd_withoutpartition --hive-database database1 --hive-import --hive-overwrite -m 1 --direct

  4. Create another table with partitions

  5. Overwrite into partition table from previous table

    INSERT OVERWRITE TABLE pd_partition partition(name) SELECT id, count, name from pd_withoutpartition;

Note: Make sure that column with which you want to partition is mentioned last during overwrite in select statement.

Hive Version : Hive 1.1.0-cdh5.13.1