I have a matrix. Here are the conditions:

(1) In each column, sum the values within each stretch that is not interrupted by four or more consecutive zeros.

(2) Take the maximum of these sums for each column and store the maxima in a vector.

Example:

v1 <- c(2,4,6,1,0)
v2 <- c(1,0,1,9,0)
v3 <- c(0,0,3,0,1)
v4 <- c(0,0,2,0,10)
v5 <- c(0,0,13,0,7)
v6 <- c(0,20,9,0,2)
mat1 <- rbind(v1, v2, v3, v4, v5, v6)
## Replace runs of four or more zeros with NA
fill_NA <- function(X, zero_val = 0, new_val = NA) {
  apply(X, 2, function(x) {
    r <- rle(x)
    r$values[r$lengths > 3 & r$values == zero_val] <- new_val
    inverse.rle(r)
  })
}
mat2 <- fill_NA(mat1)
> mat2
     [,1] [,2] [,3] [,4] [,5]
[1,]    2    4    6    1    0
[2,]    1   NA    1    9    0
[3,]   NA   NA    3   NA    1
[4,]   NA   NA    2   NA   10
[5,]   NA   NA   13   NA    7
[6,]   NA   20    9   NA    2

Now, for each column, I want to sum each run of values separated by NAs and take the largest of those sums. In the first column the only sum is 3, so the maximum is 3; in the second column the sums are 4 and 20, so the maximum is 20; in the third column the only sum is 34; in the fourth it is 10; and in the fifth it is 20.

The final output should be stored in a vector, here 3 20 34 10 20. Any help, or a better algorithm, would be appreciated.

1 Answer

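Since this is a matrix, we can loop over the columns with apply, specifying MARGIN as 2. Within each column, rle on is.na(x) gives the runs of NA and non-NA values; renumbering those runs and expanding them with inverse.rle creates a grouping variable. tapply then computes the sum for each group (with na.rm = TRUE, so an all-NA group sums to 0), and max picks the largest sum.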

apply(mat2, 2, function(x) {
  rl <- rle(is.na(x))
  rl$values <- seq_along(rl$values)
  max(tapply(x, inverse.rle(rl), FUN = sum, na.rm = TRUE))
})
#[1]  3 20 34 10 20
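
To see what the grouping does, here is a rough sketch that walks through the second column of mat2 step by step. It then shows a variant that is not part of the answer above: it works directly on mat1 and skips the NA step. The names max_run_sum and min_zeros are made up for illustration, and the variant assumes the zero runs are exact zeros.

```r
# Column 2 of mat2 from the question (NA marks a run of four or more zeros)
x <- c(4, NA, NA, NA, NA, 20)

rl <- rle(is.na(x))                 # three runs: non-NA, NA, non-NA
rl$values <- seq_along(rl$values)   # relabel the runs as 1, 2, 3
grp <- inverse.rle(rl)              # one group id per element of x

sums <- tapply(x, grp, FUN = sum, na.rm = TRUE)  # one sum per group
result <- max(sums)                              # largest group sum

# Hypothetical one-step variant (an assumption, not the answer's method):
# treat runs of at least `min_zeros` zeros as separators, without creating NAs.
max_run_sum <- function(x, min_zeros = 4) {
  r <- rle(x == 0)
  gap <- r$values & r$lengths >= min_zeros   # TRUE for long zero runs
  is_gap <- rep(gap, r$lengths)              # expand to one flag per element
  grp <- cumsum(c(TRUE, diff(is_gap) != 0))  # new group id at each switch
  max(tapply(x, grp, sum))                   # separator groups sum to 0
}

mat1 <- rbind(c(2, 4, 6, 1, 0),
              c(1, 0, 1, 9, 0),
              c(0, 0, 3, 0, 1),
              c(0, 0, 2, 0, 10),
              c(0, 0, 13, 0, 7),
              c(0, 20, 9, 0, 2))
res <- apply(mat1, 2, max_run_sum)
```

On the example data, res should match the vector produced by the two-step approach. Note that both versions return 0 for a column made up entirely of separator zeros, which may or may not be the behaviour you want.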