I got a hive table partitioned by year, month and day
CREATE TABLE t1 (
...
)
PARTITIONED BY (
year INT,
month INT,
day INT'
)
STORED AS PARQUET;
and I need to take the data from it last 7 days/partitions. The first step towards finding how to get the last seven days of data like:
val formatter = DateTimeFormatter.ofPattern("yyyy-MM-dd")
val runDate = date
val runDay = LocalDate.parse(runDate.toString, formatter)
val runDayMinus7 = runDay.minusDays(7)
But then I face the troll under the bridge. Is there any way to select data between those two dates by multiple partitions? Otherwise, Is there some way I'd not need to change the multiple partitions to a single partition YYYYMMDD?
Either scala, either hive.
Thanks in advance