2
votes

I encountered a strange HIVE behavior which i can not completely understand. I really hope someone could shad some light on that matter.

I am using the following script to create and add data to a table:

DROP TABLE myTbl;
CREATE EXTERNAL TABLE myTbl (
    Col1 string,
    Col2 string,  
    Col3 string,
    Col4  string,
)
PARTITIONED BY (partition STRING)
LOCATION '${hiveconf:Valid_location_on_aws_s3_with_the_proper_file_to_create_table}';

Hive runs the script returns OK but (Here is the problem) the table is empty. run: select * from myTbl; returns nothing

The solution for this problem was rather simple after the script i run the following command/script:

MSCK REPAIR TABLE myTbl

now when executing select on that table i can see all the data that was in the file.

What i can not understand is why I need to use the repair in order to see the data in the table.

P.S. using the LOAD INTO TABLE works perfectly without any need to repair.

1
Please don't use an ALL-CAPS title.user456814

1 Answers

1
votes

It might have to do with the case that DROP TABLE removes metadata for all table's partitions. However CREATE TABLE doesn't recreate this info in metastore. So, after issuing CREATE TABLE Hive doesn't know yet that there are partitions and there is data in them. MSCK REPAIR TABLE recreates this information in metastore (or you could do it manually with ALTER TABLE tbl ADD PARTITION).