
I am trying to import Oracle tables directly into Hive with Sqoop.

The Oracle tables use the data types NUMBER, VARCHAR2, and RAW.

When I tried:

sqoop import ... --hive-import --hive-overwrite --hive-database default --fields-terminated-by '|' --hive-drop-import-delims --null-string '\\N' --null-non-string '\\N' --warehouse-dir "/test"

All columns in the resulting Hive tables are typed as either DOUBLE or STRING, but I want INT, DATE, etc. for columns that are NUMBER(1), DATE, and so on in Oracle.

I have tried adding column mappings like:

--map-column-hive O_abc=INT,O_def=DATE,pqr=INT,O_uvw=INT,O_xyz=INT

Is there any way to automate this? I need to import 150 to 200 tables, and it's tedious to write out --map-column-hive mappings for every table.
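The only automation I can think of is generating the mapping per table from Oracle's data dictionary, along the lines of the rough sketch below (the connection string, the MYSCHEMA owner, and the NUMBER-precision rules are placeholders I would have to tune), but I'm hoping Sqoop has a built-in option:

#!/bin/bash
# Rough sketch: build a --map-column-hive argument for one table from
# Oracle's ALL_TAB_COLUMNS, then run the same import as above with it.
# Connection details and the type rules are placeholders.
TABLE="$1"
MAPPING=$(sqlplus -S scott/tiger@//dbhost:1521/ORCLPDB <<EOF
SET HEADING OFF FEEDBACK OFF PAGESIZE 0
SELECT LISTAGG(column_name || '=' ||
         CASE
           WHEN data_type = 'DATE' THEN 'DATE'
           WHEN data_type = 'NUMBER' AND NVL(data_scale, 0) = 0
                AND data_precision <= 9 THEN 'INT'
           WHEN data_type = 'NUMBER' THEN 'DOUBLE'
           ELSE 'STRING'
         END, ',') WITHIN GROUP (ORDER BY column_id)
FROM all_tab_columns
WHERE owner = 'MYSCHEMA' AND table_name = '$TABLE';
EOF
)
sqoop import --connect jdbc:oracle:thin:@//dbhost:1521/ORCLPDB \
  --username scott -P --table "MYSCHEMA.$TABLE" \
  --hive-import --hive-overwrite --hive-database default \
  --fields-terminated-by '|' --hive-drop-import-delims \
  --null-string '\\N' --null-non-string '\\N' \
  --warehouse-dir /test \
  --map-column-hive "$MAPPING"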

Environment:

  • Hadoop-2.6.0
  • Sqoop-1.4.6
  • Hive-2.3.0
  • Java-1.8
  • Two-node cluster

Thanks in advance!

First bring the RDBMS table contents into the Hadoop cluster using Sqoop; later, as per your requirement, you can create an external or internal table over it using Hive. To schedule imports from multiple RDBMS tables into the cluster, use an Oozie workflow. You can find an example of importing and creating a table here: rittmanmead.com/blog/2014/03/… – ArifMustafa
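A minimal sketch of what the comment suggests: loop Sqoop over a list of tables, landing each one under its own HDFS directory, with Hive tables defined over those directories afterwards. The table list, connection string, and schema owner are placeholders; an Oozie workflow would wrap calls like these for scheduling:

#!/bin/bash
# Land each Oracle table as pipe-delimited files under /test/<TABLE>.
# All names below are placeholders.
TABLES="TAB1 TAB2 TAB3"   # or read the 150-200 names from a file
for TABLE in $TABLES; do
  sqoop import \
    --connect jdbc:oracle:thin:@//dbhost:1521/ORCLPDB \
    --username scott -P \
    --table "SCOTT.$TABLE" \
    --fields-terminated-by '|' \
    --null-string '\\N' --null-non-string '\\N' \
    --target-dir "/test/$TABLE"
done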

1 Answer


You could import all tables from Oracle to HDFS (sqoop import-all-tables {generic-args} {import-args}) and then create external or internal Hive tables over the imported data, based on your requirement.
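A minimal sketch of that approach, assuming the same pipe-delimited layout as in the question; the connection string, schema owner, and table/column names are placeholders. Because the data lands as plain delimited files on HDFS, you choose the Hive column types yourself in the DDL instead of relying on Sqoop's automatic mapping:

# 1) Pull every table visible to the user into /test/<TABLE_NAME>:
sqoop import-all-tables \
  --connect jdbc:oracle:thin:@//dbhost:1521/ORCLPDB \
  --username scott -P \
  --fields-terminated-by '|' \
  --null-string '\\N' --null-non-string '\\N' \
  --warehouse-dir /test

# 2) Define the Hive schema with the types you want, e.g. for a
#    hypothetical table ABC (NUMBER(1) -> INT):
hive -e "
CREATE EXTERNAL TABLE default.abc (
  o_abc INT,
  o_def TIMESTAMP,
  o_name STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
LOCATION '/test/ABC';
"

Note that Sqoop writes Oracle DATE values as full timestamps in the text files, so TIMESTAMP is usually a safer Hive type than DATE for those columns.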