15
votes

I feel like there should be an efficient way to mutate new columns with dplyr using case_when and contains, but cannot get it to work.

I understand using case_when within mutate is "somewhat experimental" (as in this post), but would be grateful for any suggestions.

Doesn't work:

library(tidyverse)

set.seed(1234)

x <- c("Black", "Blue", "Green", "Red")

df <- data.frame(a = 1:20, 
                 b = sample(x,20, replace=TRUE))

df <- df %>%
  mutate(group = case_when(.$b(contains("Bl")) ~ "Group1",
                 case_when(.$b(contains("re", ignore.case=TRUE)) ~ "Group2")
  )  
2
I believe contains is only to be used inside select. At least, that's what I gather from the documentation of ?contains. - Rich Scriven
Thanks - yes, I thought that might be true, but wasn't sure from the documentation. Seems like it might be useful within mutate too, although the grepl solution below is a good alternative. - Peter MacPherson
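
To illustrate the point made in the comments: `contains()` is a selection helper that matches column names, not the values inside a column. A minimal sketch, assuming a small made-up data frame (`toy` and its `b_len` column are hypothetical, not from the question):

```r
library(dplyr)

# Hypothetical toy data, only for illustration
toy <- data.frame(a = 1:3,
                  b = c("Black", "Green", "Red"),
                  b_len = c(5L, 5L, 3L))

# contains() looks at the column names, so this keeps b and b_len
picked <- toy %>% select(contains("b"))
names(picked)
```

To test the values of a column inside mutate, a function that returns a logical vector, such as grepl or stringr::str_detect, is needed instead.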

2 Answers

32
votes

We can use grepl, which returns a logical vector that case_when can test:

df %>%  
   mutate(group = case_when(grepl("Bl", b) ~ "Group1",
                            grepl("re", b, ignore.case = TRUE) ~"Group2"))
#    a     b  group
#1   1 Black Group1
#2   2 Green Group2
#3   3 Green Group2
#4   4 Green Group2
#5   5   Red Group2
#6   6 Green Group2
#7   7 Black Group1
#8   8 Black Group1
#9   9 Green Group2
#10 10 Green Group2
#11 11 Green Group2
#12 12 Green Group2
#13 13  Blue Group1
#14 14   Red Group2
#15 15  Blue Group1
#16 16   Red Group2
#17 17  Blue Group1
#18 18  Blue Group1
#19 19 Black Group1
#20 20 Black Group1
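
One thing to watch with this approach: a row that matches none of the conditions gets NA. A minimal sketch, assuming a small hand-made data frame (the name `colours`, the extra "Yellow" value, and the "Other" label are made up for illustration), adds a catch-all branch and uses stringr::str_detect, which does the same job as grepl here:

```r
library(dplyr)
library(stringr)

# Hypothetical toy data, not the sampled df from the question
colours <- data.frame(b = c("Black", "Blue", "Green", "Red", "Yellow"),
                      stringsAsFactors = FALSE)

colours <- colours %>%
  mutate(group = case_when(
    str_detect(b, "Bl") ~ "Group1",
    str_detect(b, regex("re", ignore_case = TRUE)) ~ "Group2",
    TRUE ~ "Other"   # catch-all; without it "Yellow" would be NA
  ))

colours
```

Conditions are evaluated in order, so the first one that matches wins and the `TRUE` branch only applies to rows that matched nothing above it.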
0
votes

I wanted to add an example using str_detect with paste0, which makes it a cinch to concatenate common groups into a single pattern. Say you're working with gapminder or another country df.

interest <- c("Austria", "Belgium", "Bulgaria", "Croatia", "Cyprus",
              "Czech Republic", "Denmark", "Estonia", "Finland",
              "France", "Germany", "Greece", "Hungary", "Ireland",
              "Italy", "Latvia", "Lithuania", "Luxembourg","Malta",
              "The Netherlands", "Poland","Portugal", "Romania",
              "Slovakia", "Slovenia","Spain", "Sweden","United Kingdom")
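# paste0() has no sep argument: sep = "|" would be pasted on as extra text and
# leave a trailing "|", which matches every string. collapse = "|" is enough.
EU <- paste0(countrycode::countryname(
  sourcevar = interest, destination = "iso2c"),
  collapse = "|")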

# The pattern is split with paste0() so that no newline or spaces end up inside it
df %<>% mutate(Region = case_when(
  str_detect(Country, paste0("AT|BE|BG|HR|CY|CZ|DK|EE|FI|FR|DE|GR|HU|IE|",
                             "IT|LV|LT|LU|MT|NL|PL|PT|RO|SK|SI|ES|SE|GB|UK|G8")) ~ "EU",
  TRUE ~ "Not EU"))

You'll need to load magrittr with `library(magrittr)` to get the compound assignment pipe `%<>%` to work; it's basically shorthand for `df <- df %>% fun()`.
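
As a rough sketch of that equivalence, assuming a tiny made-up data frame (`codes_df`, `eu_pattern`, and the three-code list are hypothetical and only for illustration), the two forms below should give the same result:

```r
library(dplyr)
library(stringr)
library(magrittr)  # provides %<>%

# Hypothetical toy data with ISO2-style codes
codes_df <- data.frame(Country = c("AT", "US", "FR"),
                       stringsAsFactors = FALSE)

# Build the alternation from a vector; collapse = "|" leaves no trailing "|"
eu_pattern <- paste(c("AT", "BE", "FR"), collapse = "|")

# Long form: pipe, then assign to a new name
long_form <- codes_df %>%
  mutate(Region = case_when(str_detect(Country, eu_pattern) ~ "EU",
                            TRUE ~ "Not EU"))

# Compound form: pipe and assign back to codes_df in one step
codes_df %<>%
  mutate(Region = case_when(str_detect(Country, eu_pattern) ~ "EU",
                            TRUE ~ "Not EU"))
```

The compound form saves typing but overwrites `codes_df` in place, so the long form can be easier to follow when reading a script later.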