2
votes

I know tons of functions calculate the rolling median, but I could not find anything that calculates the weighted rolling median (I found ema, but that's average). Here is what I have tried

*** edited on Jan 31 2019: I have found the code works fine when I only group by V2. The error occurs only when I group by V2:V4

library(spatstat)
library(data.table)
library(zoo)


a <- data.table(V1 = c(rep(NA, 10), runif(90)), 
                V2 = c(rep('good', 50), rep('bad', 50)),
                V3 = c(rep('monday', 70), rep('friday', 30)),
                V4 = c(rep('male', 90), rep('female', 10)))
a <- a[,'lag1':=lag(V1, n = 1), by = .(V2)]
set.seed(55)
rn <- runif(45)
w <- sort(rn/sum(rn), decreasing = T)

weight_median_calc <- function(u){
  weighted.median(x = u,
                w = w)
}

a <- a[,'roll_weighted_median':= 1][,roll_weighted_median:=rollapply(data = lag1,
                                                                   width = 45,
                                                                   FUN = weight_median_calc,
                                                                   by.column = FALSE,
                                                                   align = 'right',
                                                                   fill = NA
),
by = .(V2, V3, V4)][]

Error in [.data.table(a[, :=("roll_weighted_median", 1)], , :=(roll_weighted_median, : Type of RHS ('logical') must match LHS ('double'). To check and coerce would impact performance too much for the fastest cases. Either change the type of the target column, or coerce the RHS of := yourself (e.g. by using 1L instead of 1)

1
I tried your code and it is giving me errors unrelated to your question (initially unused argument (na.rm = TRUE), then when I added that to your weight_median_calc, I get 'x' and 'w' must have the same length). Similarly, I had to add library(data.table);library(zoo) to the code, please include all non-base R packages in use. I suggest you start a fresh R session (no restore!) and use your code, fix the errors, then update your question.r2evans
@r2evans thanks for the suggestion! I have edited it.littleturtle

1 Answers

3
votes

There are quite a few problems with your code. The main issue which is giving your current error is caused by the fill = NA argument to rollapply. By default, NA is of type logical, which clashes when we try to assign it into a numeric vector using :=. So instead use fill = as.numeric(NA) - like this:

a[, roll_weighted_median := rollapply(
  data = lag1, width = 45, FUN = weight_median_calc,
    by.column = FALSE, align = 'right', fill = as.numeric(NA)),
  by = .(V2, V3, V4)][]

Another possible problem with your code is that weight_median_calc will throw an error if it is passed only NA values. We can rewrite it like this to avoid these errors

weight_median_calc <- function(u){
  if (!all(is.na(u))) 
    weighted.median(x = u, w = w[1:length(u)]) 
  else as.numeric(NA)
}

A third issue to fix is your use of lag. lag does not have an n= argument. In data.table, you should probably use shift instead

a[, lag1 := shift(V1, 1), by = .(V2)]

The final thing you should be aware of is that in data.table one should not use <- assignment in conjunction with := assignment. := has already made the assignment in place, so no need to copy the results again using <-. In other words, don't do a <- a[, foo := bar]. Just do a[, foo := bar]