
I have a few files stored in HDFS in Parquet format, and I am trying to create a new external table in Hive that is supposed to point to those data files.

So I created a table in Hive using:

CREATE EXTERNAL TABLE ORDERS_P (
ORDERID INT, 
ORDER_DATE BIGINT,
CUSTOMER_ID INT,
STATUS STRING)
STORED AS PARQUET
LOCATION 'hdfs:///user/cloudera/retail/parquet/orders';

The table is created, but when I run a query on it in Hive:

SELECT * FROM ORDERS_P LIMIT 10

it returns NULL for every column except ORDER_DATE:

NULL    1402729200000   NULL    NULL
NULL    1402729200000   NULL    NULL
NULL    1402729200000   NULL    NULL
NULL    1402729200000   NULL    NULL
NULL    1402729200000   NULL    NULL
NULL    1402729200000   NULL    NULL
NULL    1402729200000   NULL    NULL
NULL    1402729200000   NULL    NULL
NULL    1402729200000   NULL    NULL
NULL    1402729200000   NULL    NULL

I verified that the data does exist in those files using spark-shell. I am not sure what I am doing wrong. Any help is appreciated.

What is your Hive version? - Arunakiran Nulu
Hive 1.1.0-cdh5.8.0 - Pushkr

1 Answer


You may need to upgrade Hive to version 1.2 or later; in 1.1.0 and earlier, not all Parquet data types are supported.

Please check the link; the support was added as of 1.2.0.
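
Before upgrading, it may also be worth ruling out a column-name mismatch. As far as I know, Hive's Parquet reader resolves columns by name rather than by position by default, so a table column whose name does not appear in the file schema comes back as NULL; ORDER_DATE being the only populated column would be consistent with that. A minimal sketch, assuming (hypothetically) that the files use the names order_id, order_customer_id and order_status; check the real names first, for example with printSchema() on the DataFrame in spark-shell:

```sql
-- Dropping an EXTERNAL table removes only the metastore entry;
-- the Parquet files under LOCATION stay in place.
DROP TABLE IF EXISTS ORDERS_P;

-- Column names below are hypothetical: replace them with the names
-- actually stored in the Parquet file schema.
CREATE EXTERNAL TABLE ORDERS_P (
  order_id INT,
  order_date BIGINT,
  order_customer_id INT,
  order_status STRING)
STORED AS PARQUET
LOCATION 'hdfs:///user/cloudera/retail/parquet/orders';

-- Re-run the original check against the recreated table.
SELECT * FROM ORDERS_P LIMIT 10;
```

If the names already match the file schema, the type-support explanation above becomes the more likely one.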