0
votes

add jar /path to/hive-serdes-1.0-SNAPSHOT.jar;

CREATE EXTERNAL TABLE student

( id int, student_id INT, type STRING, score DOUBLE

)

STORED BY 'com.mongodb.hadoop.hive.MongoStorageHandler'

WITH SERDEPROPERTIES ( 'mongo.columns.mapping'='{ "id":"_id", "student_id":"student_id", "type":"type","score":"score" }' )

TBLPROPERTIES('mongo.uri'='mongodb://****---****.nam.nsroot.net:*****/admin.student');

I am able to successfully run the code and ingest data. But the "id" field gets populated as NULL. Should i change the data type ? I tried STRING as well. Got the same result.

1

1 Answers

0
votes

According to the mongo-hadoop Hive SerDe, ObjectId corresponds to a special instance of STRUCT.

A Hive field corresponding to an ObjectId must be a STRUCT with the fields oid, a STRING, and bsontype, an INT, and nothing else. The oid is the string of the ObjectId while the bsontype should always be 8. Per your example, it should be :

CREATE EXTERNAL TABLE student
(id STRUCT<oid:STRING, bsontype:INT>, student_id INT, type STRING, score DOUBLE)

Where the output would be something similar to:

{"oid":"56d6e0f6ff1f17f74ebbc16c","bsontype":8} 
{"oid":"56d6e0f8ff1f17f74ebbc16d","bsontype":8}
...

The above was tested with: MongoDB v3.2.x, mongo-java-driver-3.2.2.jar, mongo-hadoop-core-1.5.0-rc0.jar, mongo-hadoop-hive-1.5.0-rc0.jar.