Im trying to use a window function,for each day of sales of a sku, to have the sum of the last 365 days of qty of the sku. If this was sold on every day then I could use ROWS and PRECEDING etc
ORDER BY
CalendarFullDate ROWS BETWEEN 364 PRECEDING AND CURRENT ROW
But in this case, the dates are not evenly distributed with many days not having a sale (i.e I cant just go back 364 rows and assume a sale is every day).
So with the test/sample below, is it possible to use windowing and some type of where clause so Im only summing back at most 364 days?
WITH samples AS (
SELECT "1" AS SKU, DATE("2018-10-27") AS CalendarFullDate, 86.0 AS DailySalesQty UNION ALL (
SELECT "1" AS SKU, DATE("2018-10-20"), 84.0 ) UNION ALL (
SELECT "1" AS SKU, DATE("2018-09-29"), 88.0 ) UNION ALL (
SELECT "1" AS SKU, DATE("2018-09-14"), 42.0 ) UNION ALL (
SELECT "1" AS SKU, DATE("2018-09-01"), 21.0 ) UNION ALL (
SELECT "1" AS SKU, DATE("2018-05-05"), 25.0 ) UNION ALL (
SELECT "1" AS SKU, DATE("2018-04-28"), 97.0 ) UNION ALL (
SELECT "1" AS SKU, DATE("2018-03-31"), 244.0 ) UNION ALL (
SELECT "1" AS SKU, DATE("2018-03-24"), 68.0 ) UNION ALL (
SELECT "1" AS SKU, DATE("2018-02-23"), 52.0 ) UNION ALL (
SELECT "1" AS SKU, DATE("2018-02-10"), 48.0 ) UNION ALL (
SELECT "1" AS SKU, DATE("2018-01-21"), 243.0 ) UNION ALL (
SELECT "1" AS SKU, DATE("2018-01-18"), 2.0 ) UNION ALL (
SELECT "1" AS SKU, DATE("2018-01-06"), 190.0 ) UNION ALL (
SELECT "1" AS SKU, DATE("2017-12-26"), 310.0 ) UNION ALL (
SELECT "1" AS SKU, DATE("2017-12-09"), 240.0 ) UNION ALL (
SELECT "1" AS SKU, DATE("2017-11-03"), 30.0 ) UNION ALL (
SELECT "1" AS SKU, DATE("2017-10-21"), 164.0 ) UNION ALL (
SELECT "1" AS SKU, DATE("2017-09-30"), 44.0 ) UNION ALL (
SELECT "1" AS SKU, DATE("2017-09-09"), 55.0 ) UNION ALL (
SELECT "1" AS SKU, DATE("2017-09-01"), 35.0 ) UNION ALL (
SELECT "1" AS SKU, DATE("2017-05-20"), 60.0 ) UNION ALL (
SELECT "1" AS SKU, DATE("2017-05-06"), 68.0 ) UNION ALL (
SELECT "1" AS SKU, DATE("2017-04-15"), 136.0) UNION ALL (
SELECT "2" AS SKU, DATE("2018-10-24"), 46.0 ) UNION ALL (
SELECT "2" AS SKU, DATE("2018-10-18"), 56.0 ) UNION ALL (
SELECT "2" AS SKU, DATE("2018-09-16"), 19.0 ) UNION ALL (
SELECT "2" AS SKU, DATE("2018-09-02"), 42.0 ) UNION ALL (
SELECT "2" AS SKU, DATE("2018-09-01"), 45.0 ) UNION ALL (
SELECT "2" AS SKU, DATE("2018-07-05"), 25.0 ) UNION ALL (
SELECT "2" AS SKU, DATE("2018-06-28"), 210.0 ) UNION ALL (
SELECT "2" AS SKU, DATE("2018-05-31"), 44.0 ) UNION ALL (
SELECT "2" AS SKU, DATE("2018-05-24"), 168.0 ) UNION ALL (
SELECT "2" AS SKU, DATE("2018-04-23"), 152.0 ) UNION ALL (
SELECT "2" AS SKU, DATE("2018-03-10"), 8.0 ) UNION ALL (
SELECT "2" AS SKU, DATE("2018-02-21"), 23.0 ) UNION ALL (
SELECT "2" AS SKU, DATE("2018-01-18"), 20.0 ) UNION ALL (
SELECT "2" AS SKU, DATE("2018-01-06"), 10.0 ) UNION ALL (
SELECT "2" AS SKU, DATE("2017-12-26"), 30.0 ) UNION ALL (
SELECT "2" AS SKU, DATE("2017-11-09"), 1240.0 ) UNION ALL (
SELECT "2" AS SKU, DATE("2017-11-03"), 323.0 ) UNION ALL (
SELECT "2" AS SKU, DATE("2017-10-21"), 123.0 ) UNION ALL (
SELECT "2" AS SKU, DATE("2017-09-30"), 444.0 ) UNION ALL (
SELECT "2" AS SKU, DATE("2017-09-09"), 555.0 ) UNION ALL (
SELECT "2" AS SKU, DATE("2017-08-01"), 35.0 ) UNION ALL (
SELECT "2" AS SKU, DATE("2017-06-20"), 6.0 ) UNION ALL (
SELECT "2" AS SKU, DATE("2017-05-06"), 68.0 ) UNION ALL (
SELECT "2" AS SKU, DATE("2017-04-15"), 136.0) UNION ALL (
SELECT "2" AS SKU, DATE("2017-04-09"), 136.0)
)
SELECT
SKU,
CalendarFullDate,
SUM(DailySalesQty) OVER(win)
FROM
samples WINDOW win AS (
PARTITION BY
SKU
ORDER BY
CalendarFullDate
RANGE BETWEEN DATE_TRUNC(CalendarFullDate,INTERVAL 364 DAY) AND CalendarFullDate)
I know above you cant do for RANGE, but its a sort of pseudo code for what I actually want to do. I tried a where clause but thats not allowed.
Is this even possible using windowing? Its a nice clean way to do it but not sure if I can express such a condition for the windowed aggregate?
Note: this is a cut down version of the real data which has 5 fields as the partition, and 20 odd measures to aggregate as well and is a huge dataset (1 TB) so wanting it to be efficient as well.
Thoughts?
Cheers!