I have a very large, partitioned Oracle table. Does Sqoop support splitting an import by Oracle partition (e.g., one mapper importing one partition), and if so, how?
1 Answer
Sqoop supports importing from a partitioned Oracle table through the Data Connector for Oracle and Hadoop (OraOop), which is described in the Sqoop user guide.
The syntax is something like this:
sqoop import \
-Doraoop.disabled=false \
-Doraoop.import.partitions='"PARTITION-NAME","PARTITION-NAME1","PARTITION-NAME2"' \
--connect jdbc:oracle:thin:@XXX.XXX.XXX.XXX:15XX:SCHEMA_NAME \
--username user \
--password password \
--table SCHEMA.TABLE_NAME \
--target-dir /HDFS/PATH/ \
-m 1
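Note that -m 1 runs a single mapper in total, so the command as written imports the listed partitions one after another. To have each mapper work on whole partitions and write to HDFS in parallel, add -Doraoop.chunk.method=PARTITION (the default chunk method is ROWID) and raise -m, for example to the number of partitions being imported.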
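As a rough sketch of the one-mapper-per-partition idea, the script below builds the partition list and sets the mapper count to the number of partitions. The partition names (SALES_2019 and so on) and the helper join_partitions are made up for illustration, and --direct is included on the assumption that you are on Sqoop 1.4.5 or later, where it selects the Oracle connector. The script only prints the command so you can review it before running it.

```shell
#!/bin/sh
# Join partition names into the double-quoted, comma-separated list
# used for oraoop.import.partitions (no trailing comma).
# join_partitions is a hypothetical helper, not part of Sqoop.
join_partitions() {
    list=""
    for name in "$@"; do
        list="${list:+$list,}\"$name\""
    done
    printf '%s' "$list"
}

# Hypothetical partition names; replace with the real ones.
set -- SALES_2019 SALES_2020 SALES_2021

PARTITION_LIST=$(join_partitions "$@")
NUM_MAPPERS=$#   # one mapper per listed partition

# Dry run: print the command instead of executing it.
cat <<EOF
sqoop import \\
  -Doraoop.disabled=false \\
  -Doraoop.chunk.method=PARTITION \\
  -Doraoop.import.partitions='$PARTITION_LIST' \\
  --direct \\
  --connect jdbc:oracle:thin:@XXX.XXX.XXX.XXX:15XX:SCHEMA_NAME \\
  --username user \\
  --password password \\
  --table SCHEMA.TABLE_NAME \\
  --target-dir /HDFS/PATH/ \\
  -m $NUM_MAPPERS
EOF
```

If the printed command looks right, paste it into a shell to run it. Keeping the mapper count tied to the length of the partition list avoids idle mappers when the list changes.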
If you load the data into a partitioned Hive table, make sure dynamic partitioning is enabled (hive.exec.dynamic.partition=true) and that the maximum number of dynamic partitions (hive.exec.max.dynamic.partitions) is higher than the number of partitions that exist in Oracle.
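For the Hive side, something like the following could be used to generate the session settings. The partition count of 120 and the headroom of 100 are placeholder assumptions, and whether you also need nonstrict mode or the per-node limit depends on how you load the table. The script only prints the SET statements; it does not call Hive.

```shell
#!/bin/sh
# Hypothetical number of partitions in the Oracle table.
ORACLE_PARTITIONS=120

# Leave some headroom above the Oracle partition count (assumed margin).
MAX_DYNAMIC=$((ORACLE_PARTITIONS + 100))

# Hive session settings for dynamic partitioning.
HIVE_SETTINGS="SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;
SET hive.exec.max.dynamic.partitions=$MAX_DYNAMIC;
SET hive.exec.max.dynamic.partitions.pernode=$MAX_DYNAMIC;"

# Print the statements so they can be pasted at the top of a Hive script.
printf '%s\n' "$HIVE_SETTINGS"
```

Put the printed statements before the INSERT that populates the partitioned Hive table.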