I want to import data from an Oracle database to Hive using Sqoop. I want Sqoop to create the table into the destination Hive database.
I put the Oracle JDBC (ojdbc6.jar) into the Sqoop lib directory.
I try those 2 methods but none of them work.
sqoop import \
--connect jdbc:oracle:thin:@${DB_HOST}:${DB_PORT}:${DB_NAME} \
--username ${DB_USER} \
--password ${DB_PWD} \
--table ${INPUT_TABLE} \
--hcatalog-home /usr/hdp/current/hive-webhcat \
--hcatalog-database ${OUTPUT_DB} \
--hcatalog-table ${OUTPUT_TABLE} \
--create-hcatalog-table \
--num-mappers 1
sqoop import \
--connect jdbc:oracle:thin:@${DB_HOST}:${DB_PORT}:${DB_NAME} \
--username ${DB_USER} \
--password ${DB_PWD} \
--hive-home /usr/hdp/current/hive \
--hive-import \
--create-hive-table \
--hive-table "${OUTPUT_DB}.${OUTPUT_TABLE}" \
--table ${INPUT_TABLE}
I have this error message:
ERROR tool.ImportTool: Imported Failed: There is no column found in the target table input_table. Please ensure that your table name is correct.
It seems Sqoop doesn't take into account the --create-hcatalog-table or --create-hive-table
However, when I import data from PostgreSQL with Sqoop, the table creation works well. Any ideas? Thanks
For information, Sqoop read well the Oracle table. I ran this command and got the good result:
sqoop eval \
--connect jdbc:oracle:thin:@${DB_HOST}:${DB_PORT}:${DB_NAME} \
--username ${DB_USER} \
--password ${DB_PWD} \
--query "select count(1) from input_table"
The full logs of my error:
16/07/21 18:08:29 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6.2.4.0.0-169
16/07/21 18:08:29 DEBUG tool.BaseSqoopTool: Enabled debug logging.
16/07/21 18:08:29 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
16/07/21 18:08:29 INFO tool.BaseSqoopTool: Using Hive-specific delimiters for output. You can override
16/07/21 18:08:29 INFO tool.BaseSqoopTool: delimiters with --fields-terminated-by, etc.
16/07/21 18:08:29 DEBUG sqoop.ConnFactory: Loaded manager factory: org.apache.sqoop.manager.oracle.OraOopManagerFactory
16/07/21 18:08:29 DEBUG sqoop.ConnFactory: Loaded manager factory: com.cloudera.sqoop.manager.DefaultManagerFactory
16/07/21 18:08:29 DEBUG sqoop.ConnFactory: Trying ManagerFactory: org.apache.sqoop.manager.oracle.OraOopManagerFactory
16/07/21 18:08:29 DEBUG oracle.OraOopManagerFactory: Data Connector for Oracle and Hadoop can be called by Sqoop!
16/07/21 18:08:29 INFO oracle.OraOopManagerFactory: Data Connector for Oracle and Hadoop is disabled.
16/07/21 18:08:29 DEBUG sqoop.ConnFactory: Trying ManagerFactory: com.cloudera.sqoop.manager.DefaultManagerFactory
16/07/21 18:08:29 DEBUG manager.DefaultManagerFactory: Trying with scheme: jdbc:oracle:thin:@host:1521:sid
16/07/21 18:08:29 DEBUG manager.OracleManager$ConnCache: Instantiated new connection cache.
16/07/21 18:08:29 INFO manager.SqlManager: Using default fetchSize of 1000
16/07/21 18:08:29 DEBUG sqoop.ConnFactory: Instantiated ConnManager org.apache.sqoop.manager.OracleManager@7d8704ef
16/07/21 18:08:29 INFO tool.CodeGenTool: Beginning code generation
16/07/21 18:08:29 DEBUG manager.OracleManager: Using column names query: SELECT t.* FROM input_table t WHERE 1=0
16/07/21 18:08:29 DEBUG manager.SqlManager: Execute getColumnInfoRawQuery : SELECT t.* FROM input_table t WHERE 1=0
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/2.4.0.0-169/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/2.4.0.0-169/zookeeper/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
16/07/21 18:08:30 DEBUG manager.OracleManager: Creating a new connection for jdbc:oracle:thin:@host:1521:sid/user, using username: user
16/07/21 18:08:30 DEBUG manager.OracleManager: No connection paramenters specified. Using regular API for making connection.
16/07/21 18:08:30 INFO manager.OracleManager: Time zone has been set to GMT
16/07/21 18:08:30 DEBUG manager.SqlManager: Using fetchSize for next query: 1000
16/07/21 18:08:30 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM input_table t WHERE 1=0
16/07/21 18:08:30 DEBUG manager.SqlManager: Found column c1 of type [2, 10, 0]
16/07/21 18:08:30 DEBUG manager.SqlManager: Found column c2 of type [2, 10, 0]
16/07/21 18:08:30 DEBUG manager.SqlManager: Found column c3 of type [12, 20, 0]
16/07/21 18:08:30 DEBUG manager.SqlManager: Found column c4 of type [2, 0, -127]
16/07/21 18:08:30 DEBUG manager.SqlManager: Found column c5 of type [2, 0, -127]
16/07/21 18:08:30 DEBUG manager.SqlManager: Found column c6 of type [12, 80, 0]
16/07/21 18:08:30 DEBUG manager.SqlManager: Found column c7 of type [93, 0, 0]
16/07/21 18:08:30 DEBUG manager.SqlManager: Found column c8 of type [12, 20, 0]
16/07/21 18:08:30 DEBUG manager.OracleManager$ConnCache: Caching released connection for jdbc:oracle:thin:@host:1521:sid/user
16/07/21 18:08:30 ERROR tool.ImportTool: Imported Failed: There is no column found in the target table intput_table. Please ensure that your table name is correct.
--hive-import
will automatically create table in hive mentioned in--hive-table
tag. Put-verbose
in the end of your import query (to get extended logs) and share complete logs – Devtarget table input_table
but you have quotedoutput_table
: make sure${OUTPUT_TABLE}
and${INPUT_TABLE}
values are correct. are all columnsint
type? - try to run without variables (using actual values) first. – Ronak Patel