Starting with the following:
library(tidyverse)
library(lubridate)
df <- tibble(
  date = seq.Date(ymd("2018-01-01"), by = "month", length.out = 6),
  y = c(20, 10, 15, 35, 40, 50)
)
df
#> # A tibble: 6 x 2
#> date y
#> <date> <dbl>
#> 1 2018-01-01 20
#> 2 2018-02-01 10
#> 3 2018-03-01 15
#> 4 2018-04-01 35
#> 5 2018-05-01 40
#> 6 2018-06-01 50
I would like to create a new column, z, that is a recursive rolling 6-period average. That is, for 2018-07-01 it is simply the average of the last six records, but from 2018-08-01 onward the window also includes the (relevant) previously computed rolling average(s).
2018-07-01 = mean(c(20, 10, 15, 35, 40, 50)) = 28.3333
2018-08-01 = mean(c(10, 15, 35, 40, 50, 28.3333)) = 29.7222
2018-09-01 = mean(c(15, 35, 40, 50, 28.3333, 29.7222)) = 33.0093
...etc...
I have tried a few things with tibbletime::rollify and zoo::rollmeanr, but neither allows me to recursively reference the last computed rolling average.
Desired Output:
desired_df <- tibble(
  date = seq.Date(ymd("2018-01-01"), by = "month", length.out = 22),
  y = c(20, 10, 15, 35, 40, 50, rep(NA, 16)),
  z = c(
    rep(NA, 6),
    28.3333, 29.7222, 33.0093, 36.0108, 36.1793, 35.5425, 33.1329,
    33.9328, 34.6346, 34.9055, 34.7213, 34.4783, 34.3009, 34.4955,
    34.5893, 34.5818
  )
)
desired_df
#> # A tibble: 22 x 3
#> date y z
#> <date> <dbl> <dbl>
#> 1 2018-01-01 20 NA
#> 2 2018-02-01 10 NA
#> 3 2018-03-01 15 NA
#> 4 2018-04-01 35 NA
#> 5 2018-05-01 40 NA
#> 6 2018-06-01 50 NA
#> 7 2018-07-01 NA 28.3
#> 8 2018-08-01 NA 29.7
#> 9 2018-09-01 NA 33.0
#> 10 2018-10-01 NA 36.0
#> # ... with 12 more rows
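
For reference, the recursion described above can be sketched with a plain loop in base R. This is only a minimal illustration, not a tibbletime or zoo feature: `recursive_roll_mean` is a hypothetical helper name, and the 16-step horizon is assumed purely to match the 22 rows of desired_df.

```r
# Hypothetical helper (not from any package): extend `x` by `n_ahead` values,
# where each new value is the mean of the `window` values immediately before
# it, including any previously computed means.
recursive_roll_mean <- function(x, n_ahead, window = 6) {
  stopifnot(length(x) >= window, n_ahead >= 0)
  out <- c(x, rep(NA_real_, n_ahead))
  for (i in seq_len(n_ahead) + length(x)) {
    out[i] <- mean(out[(i - window):(i - 1)])
  }
  out
}

y <- c(20, 10, 15, 35, 40, 50)

# Assumed horizon: 16 extra periods, giving 22 values in total
filled <- recursive_roll_mean(y, n_ahead = 16)

# z is NA where y is observed and holds the recursive averages afterwards
z <- c(rep(NA_real_, length(y)), filled[-seq_along(y)])

round(z[7:9], 4)
#> [1] 28.3333 29.7222 33.0093
```

The loop is used because each new value depends on earlier results, which a vectorised rolling function applied to the original y alone cannot see.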


df and end with a 22 row desired_df. Why 22, and not, say, 50? - duckmayr