I'm trying to create a function that will compare variables 1 and 2 and create a third variable based on whether they match. I need to do this >25 times (for different combinations of variables), which is why I want to create a function instead of just using mutate and case_when.
I'm pretty new to R, so this is mostly cobbled together from other helpful Stack Overflow posts and miscellaneous tutorials.
Here's what I tried:
    determine_match <- function(df, col_a, col_b){
      col_a <- enquo(col_a)
      col_b <- enquo(col_b)
      newvar <- paste0(quo_name(col_a), quo_name(col_b))
      df <- df %>% mutate(!!newvar := case_when(
        !!col_a == '1' & !!col_b == 'Yes' ~ 'Match',
        !!col_a == '0' & !!col_b == 'No' ~ 'Match',
        !!col_a == '1' & !!col_b == 'No' ~ 'No Match',
        !!col_a == '0' & !!col_b == 'Yes' ~ 'No Match',
        is.na(!!col_a) | is.na(!!col_b) ~ NA_character_,
        TRUE ~ 'Error'
      ))
    }
And I tested it on this data set:
    test1 <- c('1', '0', '1', '1', '0', NA)
    test2 <- c('Yes', 'No', 'No,', NA, 'Yes', NA)
    id <- c(1,2,3,4,5,6)
    testing.df <- data.frame(id, test1, test2)
I'm not getting errors, but when I run the function with a print statement, it only returns the string name for newvar and doesn't change the actual data frame.
I also tried testing.df %>% mutate(testing3 = funs(determine_match(testing.df, testing1, testing2))) and testing3 gives me ~determine_match(testing.df, testing1, testing2)
Not sure if the problem is the function, the attempt to apply, or both.
Hope some kind soul can help, thank you!!
result.df <- determine_match(testing.df, test1, test2) and result.df isn't what you expect? - Gregor Thomas
return(df) at the end of your function... though you could probably simplify the code - Gregor Thomas
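Following the comments above, here is a minimal sketch of what the function might look like with an explicit return(df) at the end. It assumes dplyr is installed and keeps the original enquo/!! approach. The name result.df is taken from the comment; everything else comes from the question. Since the last statement in the original function is an assignment, its value is returned invisibly, which is likely why nothing seemed to change unless the result was stored.

```r
library(dplyr)

determine_match <- function(df, col_a, col_b) {
  col_a <- enquo(col_a)
  col_b <- enquo(col_b)
  # Build the new column name from the two input names, e.g. "test1test2"
  newvar <- paste0(quo_name(col_a), quo_name(col_b))
  df <- df %>% mutate(!!newvar := case_when(
    !!col_a == '1' & !!col_b == 'Yes' ~ 'Match',
    !!col_a == '0' & !!col_b == 'No'  ~ 'Match',
    !!col_a == '1' & !!col_b == 'No'  ~ 'No Match',
    !!col_a == '0' & !!col_b == 'Yes' ~ 'No Match',
    is.na(!!col_a) | is.na(!!col_b)   ~ NA_character_,
    TRUE ~ 'Error'
  ))
  # Return the modified data frame visibly
  return(df)
}

# Test data from the question (note that 'No,' in row 3 contains a comma)
test1 <- c('1', '0', '1', '1', '0', NA)
test2 <- c('Yes', 'No', 'No,', NA, 'Yes', NA)
id <- c(1, 2, 3, 4, 5, 6)
testing.df <- data.frame(id, test1, test2)

# The function takes and returns a whole data frame, so assign the result
# rather than calling it inside mutate()
result.df <- determine_match(testing.df, test1, test2)
print(result.df)
```

With this test data, the new column test1test2 should come out as Match, Match, Error, NA, No Match, NA. Row 3 falls through to 'Error' because 'No,' matches none of the conditions. To run it for many pairs of variables, one option is to call it repeatedly and reassign, for example testing.df <- determine_match(testing.df, test1, test2).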