I have created a hive table with dynamic partitioning on a column. Is there a way to directly load the data from files using "LOAD DATA" statement? Or do we have to only depend on creating a non-partitioned intermediate table and load file data to it and then inserting data from this intermediate table to partitioned table as mentioned in Hive loading in partitioned table?
1
votes
2 Answers
0
votes
No, the LOAD DATA command ONLY copies the files to the destination directory. It doesn't read the records of the input file, so it CANNOT do partitioning based on record values.
If your input data is already split into multiple files based on partitions, you could directly copy the files to table location in HDFS under their partition directory manually created by you (OR just point to their current location in case of EXTERNAL table) and use the following ALTER command to ADD the partition. This way you could skip the LOAD DATA statement altogether.
ALTER TABLE <table-name>
ADD PARTITION (<...>)