I have a hive table(consumer_data) with partition column 'val_dt' which is a string column having values in the date format 'yyyy-MM'.
I have multiple partitions in the table, from '2015-01' to '2020-04'. Each month the data is added incrementally to the table so the next partition added would be '2020-05'.
I want to drop the partitions that are older than 24 months. This partition delete is done every month to retain only the last 24 months data.
How can the ALTER command be framed to drop the partitions with values older than 24 months.
I ran the below beeline query with sample data through linux shell but it deleted all the partitions.
beeline --hivevar var_drop_date="$(date -d "24 months ago" '+%Y-%m')" -e 'ALTER TABLE consumer_data DROP IF EXISTS PARTITION(val_dt <= date "${hivevar:var_drop_date}")'
Partition present before running the query:
val_dt=2016-01
val_dt=2017-01
val_dt=2019-01
val_dt=2020-01
Partitions present after running the query:
None
Required output:
val_dt=2019-01
val_dt=2020-01