
I have a map reduce job that already writes out records to HDFS using the Hive partition naming convention, e.g.

/user/test/generated/code=1/channel=A
/user/test/generated/code=1/channel=B

After I create an external table, it does not see the partitions.

    CREATE EXTERNAL TABLE test_1 (id string, name string)
    PARTITIONED BY (code string, channel string)
    STORED AS PARQUET
    LOCATION '/user/test/generated';

Even with the alter command

    ALTER TABLE test_1 ADD PARTITION (code = '1', channel = 'A');

it still does not see the partition or the records:

    SELECT * FROM test_1 LIMIT 1;

returns 0 rows.
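For what it's worth, I believe the same ADD PARTITION command can also take an explicit LOCATION (the path below just mirrors the example directory above), although in my case the directories already follow the naming convention:

    -- variant with an explicit partition directory (path shown is the example directory from above)
    ALTER TABLE test_1 ADD PARTITION (code = '1', channel = 'A')
    LOCATION '/user/test/generated/code=1/channel=A';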

If I use an empty location when I create the external table, and then use LOAD DATA INPATH ..., it works. But the issue is that there are too many partitions for LOAD DATA INPATH to be practical.
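For reference, this is roughly what the per-partition load looks like (the source path here is only illustrative; one such statement is needed for every single partition):

    -- move staged files into one partition (source path is illustrative)
    LOAD DATA INPATH '/user/test/staging/code=1/channel=A'
    INTO TABLE test_1 PARTITION (code = '1', channel = 'A');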

Is there a way to make Hive recognize the partitions automatically (without running an insert query)?

Comment: Could you please share some sample code of the map reduce job that writes output to HDFS using the Hive partition naming convention? Let me know so I can raise a question about it. Thanks, any help would be appreciated. :-) – vikrant rana

1 Answer


Using MSCK, it seems to be working, but I had to exit the Hive session and connect again.

    MSCK REPAIR TABLE test_1;
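After the repair, the partitions and the data should show up, e.g.:

    -- verify that the partitions were registered and that rows are now visible
    SHOW PARTITIONS test_1;
    SELECT * FROM test_1 LIMIT 1;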