4
votes

I'm not sure of the best way to ask this question.

I would like to use mutate with case_when (or if_else, if that works better) to check whether a value occurs in any of several columns.

E.g. in mtcars I would like to check whether any of the columns vs, am, gear or carb contains 1 or 2, and set a new variable newVar to 1 if so. I could do the following:

library(dplyr)

mtcars %>%
  mutate(newVar = case_when(vs %in% c(1, 2) | am %in% c(1, 2) | gear %in% c(1, 2) | carb %in% c(1, 2) ~ 1,
                            TRUE ~ 0))

Is there a prettier way to do this? I want to check across 10+ columns so it gets long. Something like:

mtcars %>%
  mutate(newVar = case_when(c(vs, am, gear, carb) %in% c(1, 2) ~ 1,
                            TRUE ~ 0))

2 Answers

3
votes

I think base R works well here. Select the columns you want to check, build a logical matrix that is TRUE wherever a cell equals 1 or 2, and use the row-wise sum of that matrix to calculate newVar.

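df <- mtcars
cols <- c("vs", "am", "gear", "carb")
# rowSums() counts the matching cells in each row;
# the unary + converts the logical result to 0/1
df$newVar <- +(rowSums(df[cols] == 1 | df[cols] == 2) > 0)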

df
#                     mpg cyl  disp  hp drat    wt  qsec vs am gear carb newVar
#Mazda RX4           21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4      1
#Mazda RX4 Wag       21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4      1
#Datsun 710          22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1      1
#Hornet 4 Drive      21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1      1
#Hornet Sportabout   18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2      1
#Valiant             18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1      1
#Duster 360          14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4      0
#Merc 240D           24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2      1
#Merc 230            22.8   4 140.8  95 3.92 3.150 22.90  1  0    4    2      1
#Merc 280            19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4      1
#....

We can also use apply to test each row with any:

df$newVar <- +(apply(df[cols] == 1 | df[cols] == 2, 1, any))
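
One thing to keep in mind if your real data has missing values (mtcars has none): == propagates NA, while %in% never returns NA. The sketch below illustrates the difference on a small made-up data frame; toy and its columns a and b are hypothetical and not part of the question.

```r
# Hypothetical toy data with missing values (not from the question)
toy <- data.frame(a = c(1, NA, 3), b = c(NA, 4, 5))
cols <- c("a", "b")

# == returns NA for missing cells, and a single NA makes the whole row sum NA
eq_flag <- +(rowSums(toy[cols] == 1 | toy[cols] == 2) > 0)

# na.rm = TRUE drops the NA cells, so they count as non-matches
eq_flag_narm <- +(rowSums(toy[cols] == 1 | toy[cols] == 2, na.rm = TRUE) > 0)

# %in% treats a missing cell as "no match" and returns FALSE
in_flag <- +(Reduce(`|`, lapply(toy[cols], `%in%`, 1:2)))

print(eq_flag)
print(eq_flag_narm)
print(in_flag)
```

If NA should count as "no match", na.rm = TRUE or a %in%-based test is probably what you want; if a row with missing data should stay NA, the plain == version does that.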
1
votes

We can use a tidyverse option to create the column:

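library(dplyr)
library(purrr)
mtcars %>%
  mutate(newVar = select(., vs:carb) %>%    # columns to check
                    map(~ .x %in% 1:2) %>%  # one logical vector per column
                    reduce(`|`) %>%         # element-wise OR across columns
                    as.integer())           # TRUE/FALSE -> 1/0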
#    mpg cyl  disp  hp drat    wt  qsec vs am gear carb newVar
#1  21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4      1
#2  21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4      1
#3  22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1      1
#4  21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1      1
#5  18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2      1
#6  18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1      1
#7  14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4      0
#8  24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2      1
# ...

Or with base R:

nm1 <- c("vs", "am", "gear", "carb")
mtcars$newVar <- +(Reduce(`|`, lapply(mtcars[nm1], `%in%`, 1:2)))
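
A further option, assuming dplyr 1.0.4 or later is installed, is if_any(), which applies a predicate to each selected column and combines the results with a row-wise OR. This is only a sketch and not something from the answers above; cols and res are names chosen here for illustration.

```r
library(dplyr)

# Columns to check, selected by name with all_of()
cols <- c("vs", "am", "gear", "carb")

res <- mtcars %>%
  # if_any() is TRUE for a row when the predicate holds in at least one column
  mutate(newVar = as.integer(if_any(all_of(cols), ~ .x %in% c(1, 2))))

head(res[c(cols, "newVar")])
```

With 10+ columns, only the cols vector needs to grow; any tidyselect helper such as starts_with() could be used in place of all_of(cols).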