dplyr - summarise with condition

Question

I have this data frame:

library(dplyr)
library(tidyr)

data <- tribble(
  ~Date, ~A1, ~A2,~B1,~B2,
  as.Date("2019-01-01"), 20, 10,20, 10,
  as.Date("2019-01-01"), 20 ,5,20,5,
  as.Date("2019-01-01"), 10, 2,10,20,
  as.Date("2019-01-01"), 20, 60,0,0,
  as.Date("2019-01-01"), 30, 4,20,5,
  as.Date("2019-02-01"), 0, 0,16,8,
  as.Date("2019-02-01"), 0, 0,0,40,
  as.Date("2019-02-01"), 0, 0,4,2,
  as.Date("2019-02-01"), 4, 8,10,6,
  as.Date("2019-02-01"), 6, 3,0,0,
  as.Date("2019-03-01"), 20, 8,23,9,
  as.Date("2019-03-01"), 60, 4,0,0,
  as.Date("2019-03-01"), 4, 2,8,3,
  as.Date("2019-03-01"), 0, 6,10,0
)

For each date I want to calculate the mean of (A1 - B1) and the mean of (A2 - B2).
For A1 - B1 I want to use only rows where A1 > B1, A1 > 0 and B1 > 0.
For A2 - B2 I want to use only rows where A2 > B2, A2 > 0 and B2 > 0.

This is what I have tried:

data_mean = data %>%
    group_by(Date) %>%
    dplyr::summarise(
      mean_1 = mean(A1[A1>=B1 & A1>0 & B1>0] - B1[A1>=B1 & A1>0 & B1>0]),
      mean_2 = mean(A2[A2>=B2 & A2>0 & B2>0] - B2[A2>=B2 & A2>0 & B2>0]))

Is there a way to use the filter function inside the summarise function? Or is there a smarter way to write my code?

TarJae · Accepted Answer · 2021-04-04T19:12:30

Update: Thanks to akrun, this version now works:

data %>%  
  filter(if_all(where(is.numeric),  ~ . > 0)) %>% 
  mutate(i1 = A1 >= B1, i2 = A2 >= B2) %>% 
  group_by(Date) %>% 
  summarise(mean1 = mean(A1[i1] - B1[i1]), mean2 = mean(A2[i2] - B2[i2]))

Output:

  Date       mean1 mean2
  <date>     <dbl> <dbl>
1 2019-01-01   2.5     0
2 2019-02-01 NaN       2
3 2019-03-01 NaN     NaN
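
Note that the filter above keeps only rows where all four numeric columns are positive, whereas the question applies the positivity check per pair (A1 with B1, A2 with B2). On this data both give the same result, but a row with, say, positive A1 and B1 and a zero in A2 or B2 would be treated differently. A minimal sketch of a per-pair variant follows; the helper name mean_diff_if is hypothetical and not part of dplyr, and it uses >= as in the question's code:

```r
library(dplyr)

data <- tribble(
  ~Date, ~A1, ~A2, ~B1, ~B2,
  as.Date("2019-01-01"), 20, 10, 20, 10,
  as.Date("2019-01-01"), 20, 5, 20, 5,
  as.Date("2019-01-01"), 10, 2, 10, 20,
  as.Date("2019-01-01"), 20, 60, 0, 0,
  as.Date("2019-01-01"), 30, 4, 20, 5,
  as.Date("2019-02-01"), 0, 0, 16, 8,
  as.Date("2019-02-01"), 0, 0, 0, 40,
  as.Date("2019-02-01"), 0, 0, 4, 2,
  as.Date("2019-02-01"), 4, 8, 10, 6,
  as.Date("2019-02-01"), 6, 3, 0, 0,
  as.Date("2019-03-01"), 20, 8, 23, 9,
  as.Date("2019-03-01"), 60, 4, 0, 0,
  as.Date("2019-03-01"), 4, 2, 8, 3,
  as.Date("2019-03-01"), 0, 6, 10, 0
)

# Hypothetical helper (not a dplyr function): mean of (a - b) over the
# rows where a >= b and both values are positive.
mean_diff_if <- function(a, b) {
  keep <- a >= b & a > 0 & b > 0
  mean(a[keep] - b[keep])  # NaN when no row in the group qualifies
}

result <- data %>%
  group_by(Date) %>%
  summarise(
    mean_1 = mean_diff_if(A1, B1),
    mean_2 = mean_diff_if(A2, B2)
  )

print(result)
```

Writing the condition once in a helper avoids repeating the same logical index four times inside summarise, and each pair is filtered independently of the other.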

My first version came close to akrun's solution, but it cannot handle the negative differences:

data %>% 
  group_by(Date) %>% 
  filter_if(is.numeric, all_vars((.) != 0)) %>% 
  filter(A1>=B1 | A2>=B2) %>% 
  summarise(mean1 = mean(A1-B1),
            mean2 = mean(A2-B2))

Output:

  Date       mean1 mean2
  <date>     <dbl> <dbl>
1 2019-01-01   2.5 -4.75
2 2019-02-01  -6    2
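
The negative values appear to come from the row-level filter with `|`: a row is kept as soon as either comparison holds, and then both differences from that row enter the means. A small sketch using the one 2019-02-01 row that survives the non-zero filter (the variable names `one_row` and `out` are only for illustration):

```r
library(dplyr)

# The only 2019-02-01 row with no zero in any numeric column
one_row <- tibble(A1 = 4, A2 = 8, B1 = 10, B2 = 6)

out <- one_row %>%
  filter(A1 >= B1 | A2 >= B2) %>%   # kept because A2 >= B2, although A1 < B1
  summarise(mean1 = mean(A1 - B1),  # 4 - 10 = -6
            mean2 = mean(A2 - B2))  # 8 - 6  =  2

print(out)
```

The 2019-03-01 group is missing from the output for the same reason in reverse: none of its remaining rows satisfies either comparison, so the filter drops the whole group before summarise runs.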
