rolling regression of each window with groupby

Question

I have monthly data for each group, and I want to run a regression for each group over a rolling 2-year window and extract the slope.

I tried several methods:

1. I tried a for loop that filters two years of data at a time and then runs lm:

df$Year <- year(df$date)
df1 <- purrr::map_df(min(df$Year):(max(df$Year) - 2), function(i) {
  df %>%
    filter(Year %in% c(i, i + 1)) %>%
    group_by(group) %>%
    do(lm = lm(R ~ M, data = ., na.action = na.exclude)) %>%
    mutate(lm_b0 = summary(lm)$coeff[1],
           lm_b1 = summary(lm)$coeff[2]) %>%
    ungroup()
})

This gives the error:

Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) : 0 (non-NA) cases
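
This error is raised when every row handed to lm is dropped by the NA handling, for example when a group has no complete (R, M) pairs in a given window. A minimal base R sketch of one way to guard against that is below. The helper name safe_coef and the two toy windows are hypothetical, not taken from the post.

```r
# Hypothetical helper (not from the original post): returns NA coefficients
# when a window has fewer than two complete (R, M) pairs, instead of
# letting lm() stop with "0 (non-NA) cases".
safe_coef <- function(dat) {
  ok <- complete.cases(dat[, c("R", "M")])
  if (sum(ok) < 2) return(c(b0 = NA_real_, b1 = NA_real_))
  b <- unname(coef(lm(R ~ M, data = dat[ok, ])))
  c(b0 = b[1], b1 = b[2])
}

# Toy windows, made up for illustration
empty_window <- data.frame(R = c(NA, NA), M = c(0.1, 0.2))
full_window  <- data.frame(R = c(1, 2, 4), M = c(0, 1, 2))

safe_coef(empty_window)  # both coefficients are NA
safe_coef(full_window)   # slope b1 is 1.5
```

A helper like this could be called on each group-window subset in place of the bare lm call, so that empty windows yield NA rather than aborting the whole loop.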

2. Then I tried defining a function slope, but the "slope" column of the resulting df1 is all NA:

slope <- . %>% { cov(.[, 2], .[, 1]) / var(.[, 2])}
df1<-df %>%
       group_by(group,Year) %>%
       mutate(slope = rollapplyr(cbind(R, M), 2, slope, by.column = FALSE, fill = NA)) %>%ungroup()
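
Note that grouping by both group and Year with a width of 2 makes each window cover two rows inside a single year, not two years of data. Assuming one row per month, a 2-year window would be 24 rows within each group. The sketch below uses base R only and simulated data for a single group to show a 24-row rolling slope computed with the same cov/var formula. The names slope_fun, g and width are hypothetical.

```r
# OLS slope of r on m, same formula as the slope function in the question
slope_fun <- function(r, m) cov(m, r) / var(m)

# Simulated monthly data for ONE group (an assumption for illustration)
set.seed(1)
g <- data.frame(M = rnorm(36), R = rnorm(36))

width <- 24  # 24 monthly rows = 2 years, assuming one row per month

# Right-aligned rolling window: row i uses rows (i - width + 1) to i
g$slope <- sapply(seq_len(nrow(g)), function(i) {
  if (i < width) return(NA_real_)  # not enough history yet
  rows <- (i - width + 1):i
  slope_fun(g$R[rows], g$M[rows])
})

# The last rolling slope matches lm() fitted on the last 24 rows
last_lm <- unname(coef(lm(R ~ M, data = g[13:36, ]))[2])
all.equal(g$slope[36], last_lm)
```

To apply this per group, the same computation could be run on each piece of split(df, df$group), with rows sorted by date and without grouping by Year.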

3. I also found the roll_lm method:

library(roll)
df1<-df%>%
  group_by(group,Year)%>%
  roll_lm(M,R,2)

This gives the error:

Error in roll_lm(., M, R, 5) : object 'M' not found
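
The likely cause is that roll_lm takes numeric vectors or matrices x and y rather than a piped data frame with bare column names: the pipe passes the whole data frame as x, and M is then looked up as an ordinary object. A minimal sketch for a single group follows, assuming the roll package is installed and using simulated data. The names g, fit and slope are hypothetical.

```r
library(roll)

# Simulated monthly data for ONE group (an assumption for illustration)
set.seed(1)
g <- data.frame(M = rnorm(36), R = rnorm(36))

# x is the regressor, y the response; width = 24 rows = 2 years of months
fit <- roll_lm(x = as.matrix(g$M), y = as.matrix(g$R), width = 24)

# Column 1 of the coefficient matrix is the intercept, column 2 the slope
slope <- fit$coefficients[, 2]

# The last rolling slope should agree with lm() on the last 24 rows
last_lm <- unname(coef(lm(R ~ M, data = g[13:36, ]))[2])
all.equal(unname(slope[36]), last_lm)
```

With the default settings the first 23 slopes are NA, because a full 24-row window is not yet available.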

4. Then I tried this, but the result is all NA:

df1<-df%>%
  group_by(group,Year)%>%
  do(data.frame(., rolling_coef = rollapplyr(data = ., width = 2, FUN = function(df_) {
    mod = lm(R ~ M, data = .)
    return(coef(mod)[2])
  }, by.column = FALSE, fill = NA)))

5. I also tried using beta directly. It runs, but beta seems to be meant for nonnegative numeric variables, and my M and R contain negative values.

Could someone give me some ideas? Thank you!

My dataset is similar to this:

and I want the result to be: (e.g. for 2002, use the previous two years of monthly data to run the regression and find the slope)

Sorry, I have not used reprex before, and it always gives an error. I put sample code and a data file on GitHub: github.com/lingqi-w/testing.git — ling
@ling They mean a minimal reproducible example, see: stackoverflow.com/a/5963610/6574038 — jay.sf
Thank you @jay.sf, it is really helpful. I created a sample dataset:

set.seed(1)
Data <- data.frame(
  group = sample(letters[1:4], 500, replace = TRUE),
  date = sample(seq(as.Date('2000/01/01'), as.Date('2003/01/01'), by = "m"), 500, replace = TRUE),
  R = sample(runif(20), 500, replace = TRUE),
  M = sample(runif(10), 500, replace = TRUE)
)
— ling

jay.sf · Accepted Answer · 2020-10-08T16:53:05

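Perhaps you want this: for each target year, fit lm(R ~ M) per group on the two preceding calendar years and collect the intercept (b0) and slope (b1).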

res <- do.call(rbind, lapply(0:6, function(y)
  do.call(rbind, by(d, d$g, function(g) {
    # fit on the two calendar years preceding the target year 2002 + y
    b <- unname(coef(lm(R ~ M, g[substr(g$t, 1, 4) %in% (2000:2001 + y), ])))
    data.frame(group=g$g[[1]], date=2002 + y, b0=b[1], b1=b[2])
  }))
))
res
#    group date          b0          b1
# a      a 2002  0.11269711  0.50982041
# b      b 2002 -0.18383806  0.77395640
# a1     a 2003 -0.04830032  0.38158442
# b1     b 2003  0.05165555 -0.02668866
# a2     a 2004  0.04793637 -0.15872739
# b2     b 2004  0.27075037 -0.28167401
# a3     a 2005 -0.17276432 -0.41435303
# b3     b 2005  0.21656421 -0.22557376
# a4     a 2006 -0.29442007  0.45387752
# b4     b 2006 -0.02507721 -0.09527979
# a5     a 2007 -0.11417319  0.54403146
# b5     b 2007 -0.34265074 -0.33346084
# a6     a 2008 -0.13838241  0.33467995
# b6     b 2008 -0.02713155  0.25299279

Data:

set.seed(42)
d <- expand.grid(g=letters[1:2], t=seq(as.Date("2000-01-01"), length.out=96, by="m"),
                 stringsAsFactors=FALSE)
d <- transform(d, R=round(rnorm(nrow(d)), 2), M=round(rnorm(nrow(d))/2, 2))
