2
votes

I am looking to compute a moving average by group in a data.table with an adaptive window so that there are no NAs at the beginning of the time series. I know how to do this with frollmean and setting adaptive = TRUE (see for instance jangorecki's response in this thread). I can get the same code to work when all groups in my data.table are of the same length but run into errors when the groups are of different size.

For example, if my data is

tmp = data.table(Gp = c(rep('A',6),rep('B',4)), Val = c(1,3,4,6,2,2,8,5,7,10))

and I am doing a moving average of length 3, then the desired response is

> desired_output
    Gp  Val
 1:  A 1.00
 2:  A 2.00
 3:  A 2.67
 4:  A 4.33
 5:  A 4.00
 6:  A 3.33
 7:  B 8.00
 8:  B 6.50
 9:  B 6.67
10:  B 7.33

I tried the following:

mov_window_len = vector("list",2)
mov_window_len[[1]] = c(1,2,rep(3,4))
mov_window_len[[2]] = c(1,2,rep(3,2))
tmp[,lapply(.SD, frollmean, n = mov_window_len, align = "right", adaptive = TRUE), by = Gp]

but I get an error saying length of integer vector(s) provided as list to 'n' argument must be equal to number of observations provided in 'x'

Any help in resolving this will be much appreciated. Thanks in advance.

1

1 Answers

3
votes

You can use the group index .GRP to subset mov_window_len. This will give you the right lengths for each group. You only want to take frollmean of Val, so no need for lapply.

tmp[, frollmean(Val, n = mov_window_len[.GRP], align = "right", adaptive = TRUE), by = Gp]

#     Gp       V1
#  1:  A 1.000000
#  2:  A 2.000000
#  3:  A 2.666667
#  4:  A 4.333333
#  5:  A 4.000000
#  6:  A 3.333333
#  7:  B 8.000000
#  8:  B 6.500000
#  9:  B 6.666667
# 10:  B 7.333333

Alternatively window length can be added to input data.table (Len field below), as it corresponds to each row.

tmp[Gp=="A", Len:=mov_window_len[[1]]
    ][Gp=="B", Len:=mov_window_len[[2]]
     ][, .(Val, Len, RollVal=frollmean(Val, Len, adaptive=TRUE)), by=Gp]
#    Gp Val Len  RollVal
# 1:  A   1   1 1.000000
# 2:  A   3   2 2.000000
# 3:  A   4   3 2.666667
# 4:  A   6   3 4.333333
# 5:  A   2   3 4.000000
# 6:  A   2   3 3.333333
# 7:  B   8   1 8.000000
# 8:  B   5   2 6.500000
# 9:  B   7   3 6.666667
#10:  B  10   3 7.333333