5
votes

I want to run an import into Hive from a cron job, and was hoping that simply running

"load data local inpath '/tmp/data/x' into table X"

each time would be sufficient.

Will subsequent runs of this command overwrite what's already in the table, or will they append to it?

2 Answers

7
votes

This site http://wiki.apache.org/hadoop/Hive/LanguageManual is your friend when dealing with Hive. :)

The page that addresses loading data into Hive is http://wiki.apache.org/hadoop/Hive/LanguageManual/DML. That page states:

if the OVERWRITE keyword is used then the contents of the target table (or partition) will be deleted and replaced with the files referred to by filepath. Otherwise the files referred by filepath will be added to the table. Note that if the target table (or partition) already has a file whose name collides with any of the filenames contained in filepath - then the existing file will be replaced with the new file.

In your case, you are not using the OVERWRITE keyword, so the files will be added to the table. The exception is a file whose name matches one already in the table: per the note above, the existing file is replaced by the new one.
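To make the difference concrete, here is a minimal sketch of the two forms, reusing the path and table name from the question. It only illustrates the behavior quoted above and has not been run against any particular Hive version.

```sql
-- Append: the files under /tmp/data/x are added to whatever X already holds.
-- Per the quoted note, an existing file with the same name is replaced.
LOAD DATA LOCAL INPATH '/tmp/data/x' INTO TABLE X;

-- Replace: the existing contents of X are deleted first, then the files are loaded.
LOAD DATA LOCAL INPATH '/tmp/data/x' OVERWRITE INTO TABLE X;
```

For a cron import that should keep accumulating data, the first form is the one to use. Given the filename collision note, it is probably worth giving each run's files a unique name (for example, one that includes a timestamp) so that a later run does not replace an earlier file.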

0
votes

If the OVERWRITE keyword is used then the contents of the target table (or partition) will be deleted and replaced by the files referred to by filepath; otherwise the files referred by filepath will be added to the table.