2
votes

When I try to expose MongoDB data in Hive using the command below, I get an error.

CREATE EXTERNAL TABLE gok
(
id STRING,
name STRING,
state STRING,
email STRING) STORED BY 'com.mongodb.hadoop.hive.MongoStorageHandler'
WITH SERDEPROPERTIES('mongo.columns.mapping'='{"id":"_id","name":"name","state":"state"}') TBLPROPERTIES('mongo.uri'='mongodb://localhost:27017/gokul_test.play_test');

The versions of the tools used are:

  • Java JDK 8
  • Hadoop: 2.8.4
  • Hive: 2.3.3
  • MongoDB: 4.2

The following jars have been copied to both HADOOP_HOME/lib and HIVE_HOME/lib:

  • mongo-hadoop-core-2.0.2.jar
  • mongo-hadoop-hive-2.0.2.jar
  • mongo-java-driver-2.13.2.jar

The error is:

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. org/apache/hadoop/hive/serde2/SerDe

I then tried adding the jars manually in the Hive session, and the error changed to:

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. com/mongodb/hadoop/hive/BSONSerDe

The two errors name different classes.

Let me know if you know of a resolution or need more details.

2
The detailed error message should be available in the HiveServer2 logs. Please provide the full error messages. - davidemm

2 Answers

0
votes

You should add the jars to your hive session.

Which hive client are you using?

If you are using "beeline", start it and connect to HiveServer2 first:

beeline
!connect jdbc:hive2://localhost:10000 "" ""

As soon as your session is created, add the jars using "add jar" with the full path of each jar file (adjust the paths and versions below to your own setup):

add jar hdfs://sandbox.hortonworks.com:8020/tmp/udfs/mongo-hadoop-hive-1.5.0-SNAPSHOT.jar;
add jar hdfs://sandbox.hortonworks.com:8020/tmp/udfs/mongo-hadoop-core-1.5.0-SNAPSHOT.jar;
add jar hdfs://sandbox.hortonworks.com:8020/tmp/udfs/mongodb-driver-3.0.4.jar;

The next step is to drop and recreate the table:

DROP TABLE IF EXISTS bars;

CREATE EXTERNAL TABLE bars
(
    objectid STRING,
    Symbol STRING,
    TS STRING,
    Day INT,
    Open DOUBLE,
    High DOUBLE,
    Low DOUBLE,
    Close DOUBLE,
    Volume INT
)
STORED BY 'com.mongodb.hadoop.hive.MongoStorageHandler'
WITH SERDEPROPERTIES('mongo.columns.mapping'='{"objectid":"_id",
 "Symbol":"Symbol", "TS":"Timestamp", "Day":"Day", "Open":"Open", "High":"High", "Low":"Low", "Close":"Close", "Volume":"Volume"}')
TBLPROPERTIES('mongo.uri'='mongodb://localhost:27017/marketdata.minibars');

source: https://community.cloudera.com/t5/Support-Questions/Mongodb-with-hive-Error-return-code-1-from-org-apache-hadoop/td-p/138161
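The 'mongo.columns.mapping' value used above is a JSON object that maps Hive column names to MongoDB field names. As a rough sketch only (the helper name columns_mapping_property is made up for illustration and is not part of mongo-hadoop), the clause could be generated in Python so the quoting stays consistent. It assumes none of the names contain a single quote:

```python
import json


def columns_mapping_property(mapping):
    """Build a SERDEPROPERTIES clause for 'mongo.columns.mapping'.

    mapping: dict of Hive column name -> MongoDB field name.
    Hypothetical helper; assumes no name contains a single quote.
    """
    # Compact separators give JSON without spaces, like the DDL above.
    value = json.dumps(mapping, separators=(",", ":"))
    return "WITH SERDEPROPERTIES('mongo.columns.mapping'='" + value + "')"


# Two of the columns from the "bars" table, as an example.
print(columns_mapping_property({"objectid": "_id", "TS": "Timestamp"}))
# prints: WITH SERDEPROPERTIES('mongo.columns.mapping'='{"objectid":"_id","TS":"Timestamp"}')
```

The printed clause can be pasted between the STORED BY and TBLPROPERTIES lines of the CREATE EXTERNAL TABLE statement.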

-1
votes

It looks like mongo-hadoop-hive-<version>.jar has not been added to Hive correctly.

Try adding the jar with the command below:

ADD JAR /path-to/mongo-hadoop-hive-<version>.jar;

More info: https://github.com/mongodb/mongo-hadoop/wiki/Hive-Usage
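Before adding a jar, it may help to confirm that it really contains the class named in the error message. A minimal sketch, assuming Python 3 is available on the machine; the function name jar_contains_class is hypothetical, and it relies only on the fact that a jar is a zip archive:

```python
import zipfile


def jar_contains_class(jar_path, class_name):
    """Return True if the jar has a .class entry for class_name.

    class_name may use dots or slashes, e.g. the path printed in the
    Hive error ("com/mongodb/hadoop/hive/BSONSerDe").
    """
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()
```

For example, jar_contains_class("/path-to/mongo-hadoop-hive-<version>.jar", "com/mongodb/hadoop/hive/BSONSerDe") should return True for a usable connector jar. Note that the first error in the question names org/apache/hadoop/hive/serde2/SerDe, which belongs to Hive's own serde package rather than to mongo-hadoop. If that class is missing from the hive-serde jar in HIVE_HOME/lib, the problem may be a version mismatch between the connector and Hive rather than a missing connector jar.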

Alternatively, you could ingest the MongoDB BSON data into Hive in Avro format and then build tables in Hive. It's a long process, but it will get the job done. You will need to build a new connector that reads from MongoDB and converts the data to Avro.