1
votes

I'm trying to delete data from external and partitioned table in hive. I can delete partitions with:

ALTER TABLE myTable DROP PARTITION(field > 'xxxx') or TRUNCATE TABLE myTable PARTITION(field)

But related files in Blob storage are not deleted. How do I delete those files?

In other hand, I'd like to delete data using any field as a filter (not only partition field). Can it be done in my case (in an external and partitioned table)? I've tried to achive this using:

INSERT OVERWRITE TABLE myTable PARTITION(field) SELECT * FROM myTable WHERE machine = 'xxxxx'

But data in SELECT doesn't replace data in myTable.

1

1 Answers

0
votes

Data in the external table will remain if you drop table or partition. Only if the table is managed, the data will be deleted automatically when the table or partition is deleted.

INSERT OVERWRITE TABLE myTable PARTITION(field) SELECT... 

statement can replace data with newly loaded data for partitions existing in the returned dataset. If returned dataset is empty, the data will remain untouched.

To delete data in external table you need to delete files on the filesystem.