2 votes

I imported the table from MySQL to Parquet and then created an external table in Hive. Somehow, when I query the external table in Hive, it shows all values as NULL, although parquet-tools cat xyz.parquet shows the file contents properly. Where am I making the mistake?

sqoop import --connect jdbc:mysql://quickstart.cloudera:3306/retail_db \
--username root --password cloudera \
--table order_items --split-by page_id \
--target-dir hdfs:/user/cloudera/proj/order_items \
--compress --compression-codec snappy \
--as-parquetfile \
--num-mappers 1

Create external table hiveorderitems_par
(ord_item_id int, 
 ord_item_ord_id int, 
 ord_item_prod_id int, 
 ord_item_quantity int, 
 ord_item_subtotal float,  
 order_item_prod_price float) 
 row format SERDE 'parquet.hive.serde.ParquetHiveSerDe' 
 STORED AS INPUTFORMAT "parquet.hive.DeprecatedParquetInputFormat" 
 OUTPUTFORMAT "parquet.hive.DeprecatedParquetOutputFormat" 
 location "/user/cloudera/proj/order_items";

hive> select * from hiveorderitems_par;

NULL    NULL    NULL    NULL    NULL    NULL
NULL    NULL    NULL    NULL    NULL    NULL
NULL    NULL    NULL    NULL    NULL    NULL
Time taken: 0.225 seconds, Fetched: 172198 row(s)
hive> 
Can you add the DESCRIBE of the table that you are importing? Do the column names match in case? There used to be a bug with case sensitivity in Hive/Parquet tables - hlagos
Wow! @hlagos You are right! While defining the external table, I was specifying the column names as ord_item_id, ord_item_ord_id and so on... but in the SQL table and the Parquet file, the column names were order_item_id, order_item_order_id, etc. So basically the column names have to match exactly, not just in case. This was new to me. Is this the bug you mentioned? Anyway, I changed the column names and it worked perfectly! Thanks - Gurpreet Singh
I just answered below. - hlagos
@GurpreetSingh, the bug was only specific to case sensitivity. - Andrew

2 Answers

4 votes

You need to make sure that the names used during the table creation match the column names in the table that you are importing. That should solve the problem.
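
For illustration, a corrected version of the DDL might look like the sketch below. Only order_item_id and order_item_order_id are confirmed in the comments above; the remaining column names are assumed from the standard retail_db order_items schema, so verify them against your own source table.

CREATE EXTERNAL TABLE hiveorderitems_par
-- column names copied verbatim from the source table (the names, not just the case, must match)
(order_item_id int,
 order_item_order_id int,
 order_item_product_id int,
 order_item_quantity int,
 order_item_subtotal float,
 order_item_product_price float)
 ROW FORMAT SERDE 'parquet.hive.serde.ParquetHiveSerDe'
 STORED AS INPUTFORMAT 'parquet.hive.DeprecatedParquetInputFormat'
 OUTPUTFORMAT 'parquet.hive.DeprecatedParquetOutputFormat'
 LOCATION '/user/cloudera/proj/order_items';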

Before Hive 0.14, Parquet column names in Hive were case sensitive. You can find the details here
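
If you want to double-check which names each side actually sees before recreating the table, something along these lines should do it (xyz.parquet is the same placeholder file name used in the question):

parquet-tools schema xyz.parquet          # column names as stored in the Parquet file
hive -e 'DESCRIBE hiveorderitems_par;'    # column names Hive tries to resolve against the file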

1 vote

I had a similar issue, and it was fixed by ensuring that the Hive columns and the Parquet columns are both in lower case.