This very simple SQL can calculate averages, medians etc. for a well defined periods, like year, month, quarter, week, day:
SELECT
date_trunc('year', t.time2), -- or hour, day, week, month, year
count(1),
percentile_cont(0.25) within group (order by t.price) as Q1,
percentile_cont(0.5) within group (order by t.price) as Q2,
percentile_cont(0.75) within group (order by t.price) as Q3,
avg(t.price) as A,
min(t.price) as Mi,
max(t.price) as Mx
FROM my_table AS t
GROUP BY 1
ORDER BY date_trunc
The table contains list of individual transactions with date (timestamp) and price (bigint).
However, I am struggling to adapt it to calculate running/ moving values (eg. 4 weeks, or 6 months, or 2 quarters, or 12 months). How can this be achieved?
EDIT This is how data looks like:
And this is expected result:
EDIT 2:
Another issue I encounter is that there should be a full set of data included in moving avg, median, etc. calculations.
What I mean is that if data series starts in Jan-2000, then the first '12 month moving avg' that is meaningful can only be calculated in Dec-2000 (ie. first month that contains full 12 months of data). In case of 3 months moving avg, the first meaningful value will be in Mar-2000, etc.
So, I am thinking, that the logic for this query should be to:
1) determine start and end date to use in calculating avg, median etc stats , then
2) Loop over avg, median, etc. calculations FOR EACH start-end date pair.
To illustrate, the first part could be:
WITH range_values AS ( -- get min and max values for the data series
SELECT date_trunc('month', min(time2)) as minval,
date_trunc('month', max(time2)) as maxval
FROM my_table),
period_range(d) AS ( -- generate complete list of periods eg. weeks, months, years for the data series
SELECT generate_series(minval, maxval, '1 month'::interval) as timeint
FROM range_values
),
lookup_range AS ( -- generate start-end date pairs based on the data series
select d as enddate, d- interval '11month' as startdate
from period_range
)
SELECT startdate, enddate
from lookup_range, range_values as p
where enddate >= p.minval + interval '11month'; -- clip date range to calculate 12 months avg using 12 months of data only
The second part could be (not a valid query but just to illustrate the logic):
SELECT
count(1),
percentile_cont(0.5) within group (order by t.price) as median_price,
avg(t.price) as avg_price
FROM my_table as t, lookup_range as l
WHERE t.time2>= 'startdate' AND t.time2 < 'enddate'
Now, the challenge is how to combine the two? And how make it working with the minimum lines of code?
range between '1 month' preceding and current row
but without some more sample data and expected output this is hard to tell – a_horse_with_no_name