3
votes

This is my dataset:

set.seed(327)

ID <- seq(1:50)

mou <- sample(c(2000, 2500, 440, 4990, 23000, 450, 3412, 4958, 745, 1000),
  50, replace=TRUE)

calls <- sample(c(50, 51, 12, 60, 90, 16, 89, 59, 33, 23, 50, 555),
  50, replace=TRUE)

rev <- sample(c(100, 345, 758, 44, 58, 334, 888, 205, 940, 298, 754),
  50, replace=TRUE)

dt <- data.frame(mou, calls, rev)

My goal is to find the mean of mou where calls is greater than 34 and less than 200, and rev is greater than 100 and less than 400. I started approaching this with dplyr, but I am not sure how to write the condition inside the filter function.

dt %>% filter(???) %>% summarize(mean_mou=mean(mou))

Could you please show me how to write this expression inside filter correctly?


3 Answers

1
votes

You can put your conditionals in the filter function. You're almost there in your example :-)

########
# Setup
########
set.seed(327) # Setting a seed makes the example reproducible

ID <- seq(1:50)
mou <-
  sample(c(2000, 2500, 440, 4990, 23000, 450, 3412, 4958, 745, 1000),
         50,
         replace = TRUE)
calls <-
  sample(c(50, 51, 12, 60, 90, 16, 89, 59, 33, 23, 50, 555), 50, replace = TRUE)
rev <-
  sample(c(100, 345, 758, 44, 58, 334, 888, 205, 940, 298, 754), 50, replace = TRUE)

dt <- data.frame(mou, calls, rev)

library(tidyverse)

########
# Here's the direct answer to your question
########
dt %>%
  filter(calls > 34 & calls < 200) %>% 
  filter(rev > 100 & rev < 400) %>% # Using two filters makes things more readable
  summarise(mean_mou = mean(mou))

# 3349
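
As a quick sanity check of the approach above, here is a minimal sketch on a small hand-made data frame. The name toy and its values are made up for illustration and are not part of the question's data. It suggests that two chained filter() calls select the same rows as a single filter() with comma-separated conditions:

```r
library(dplyr)

# Hypothetical toy data (not the question's dt), small enough to check by hand
toy <- data.frame(
  mou   = c(1000, 2000, 3000, 4000, 5000),
  calls = c(50, 12, 60, 555, 90),
  rev   = c(205, 334, 758, 298, 345)
)

# Two chained filters, as in the answer above
chained <- toy %>%
  filter(calls > 34 & calls < 200) %>%
  filter(rev > 100 & rev < 400) %>%
  summarise(mean_mou = mean(mou))

# One filter; comma-separated conditions are combined with AND
single <- toy %>%
  filter(calls > 34, calls < 200, rev > 100, rev < 400) %>%
  summarise(mean_mou = mean(mou))

# Only rows 1 and 5 satisfy both ranges, so both means are (1000 + 5000) / 2
print(chained$mean_mou)  # 3000
print(single$mean_mou)   # 3000
```

Which form to use is mostly a readability choice; on this toy data both give the same result.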
5
votes

For completeness:

If the logic is AND, you can simply list multiple conditions separated by commas:

dt %>%
  filter(calls > 34, calls < 200, rev > 100, rev < 400)

If the logic is OR, you must use the logical OR operator: |

dt %>%
  filter(calls > 34 | rev > 100)

Combining commas and | works, but you must pay attention to how the conditions are grouped. For example:

dt %>%
  filter(calls > 34, calls < 200 | rev > 100, rev < 400)

means calls > 34 AND (calls < 200 OR rev > 100) AND rev < 400
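
To make the grouping concrete, here is a small sketch on made-up data (the toy data frame and its values are assumptions for illustration, not the question's dt). It compares the comma form with an explicitly parenthesised & form, and with a version that drops the commas and parentheses. In R, & binds more tightly than |, so that last version groups differently:

```r
library(dplyr)

# Hypothetical toy data, chosen so the groupings give different rows
toy <- data.frame(
  mou   = c(1000, 2000, 3000, 4000, 5000),
  calls = c(50, 12, 60, 555, 90),
  rev   = c(205, 334, 758, 298, 345)
)

# Each comma-separated argument is evaluated on its own, then all are ANDed
comma_form <- toy %>%
  filter(calls > 34, calls < 200 | rev > 100, rev < 400)

# The same logic written out with & and explicit parentheses
explicit <- toy %>%
  filter(calls > 34 & (calls < 200 | rev > 100) & rev < 400)

# Without parentheses, & is applied before |, so this reads as
# (calls > 34 & calls < 200) | (rev > 100 & rev < 400)
no_parens <- toy %>%
  filter(calls > 34 & calls < 200 | rev > 100 & rev < 400)

print(comma_form$mou)  # 1000 4000 5000
print(explicit$mou)    # 1000 4000 5000
print(nrow(no_parens)) # 5
```

When mixing AND and OR, adding parentheses around each OR group is a cheap way to avoid surprises.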

0
votes
dt %>%
  filter(calls > 34 & calls < 200 & rev > 100 & rev < 400) %>%
  summarise(mean(mou))

  mean(mou)
1  2403.333