2
votes

I would like to use a switch statement within dplyr's mutate. I have a simple function that performs some operations and assigns alternative values via switch, for example:

convert_am <- function(x) {
    x <- as.character(x)
    switch(x,
           "0" = FALSE,
           "1" = TRUE,
           NA)
}

This works as desired when applied to scalars:

>> convert_am(1)
[1] TRUE
>> convert_am(2)
[1] NA
>> convert_am(0)
[1] FALSE

I would like to arrive at equivalent results via mutate call:

mtcars %>% mutate(am = convert_am(am))

This fails:

Error in mutate_impl(.data, dots) : Evaluation error: EXPR must be a length 1 vector.

I understand that this is because the value passed to switch is a whole column rather than a single element, as in this example:

>> convert_am(c(1,2,2))
Error in switch(x, 0 = FALSE, 1 = TRUE, NA) : EXPR must be a length 1 vector

Vectorization

An attempt to vectorize also fails to yield the desired results:

convert_am <- function(x) {
    x <- as.character(x)

    fun_switch <- function(x) {
        switch(x,
               "0" = FALSE,
               "1" = TRUE,
               NA)
    }

    vf <- Vectorize(fun_switch, "x")
}

>> mtcars %>% mutate(am = convert_am(am))
Error in mutate_impl(.data, dots) : 
  Column `am` is of unsupported type function

Notes

  • I'm aware of case_when in dplyr but I'm not interested in using it; I'm only interested in making switch work inside mutate
  • Ideal solution would allow for further expansion to use mutate_at with variables passed as .
2
I think you need to Vectorize convert_am instead of fun_switch? Try eg mtcars %>% mutate(am = Vectorize(convert_am)(am)). What you've done there returns a function vf (see ?Vectorize) - konvas
@konvas Fair point, feel free to post a working solution. It’s more fun than anything, I reckon that with all the wrappers a lot of efficiency derived from switching things will be lost but I want to have it done just to have a working example. - Konrad
Indeed, this is not efficient at all, you may as well use do and not vectorize at all. I would try to use case_when since that's what it's there for but suppose you have your reasons for not wanting to use it :) - konvas
@konvas Purely educational ones. I’m using case fairly frequently and wanted to make a more frequent use of switch. - Konrad
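
To illustrate the point raised in the comments, here is a minimal sketch that keeps the inner helper from the question but calls the function returned by Vectorize instead of returning the function itself. USE.NAMES = FALSE is an addition not present in the question; it is assumed here only to keep the result unnamed.

```r
library(dplyr)

convert_am <- function(x) {
    x <- as.character(x)

    fun_switch <- function(x) {
        switch(x,
               "0" = FALSE,
               "1" = TRUE,
               NA)
    }

    # Vectorize() returns a new function; it must be called on x,
    # otherwise convert_am() returns the function itself.
    # USE.NAMES = FALSE (an assumption, not in the question) keeps the
    # character input from being attached as names.
    vf <- Vectorize(fun_switch, "x", USE.NAMES = FALSE)
    vf(x)
}

convert_am(c(1, 2, 0))
# [1]  TRUE    NA FALSE

mtcars %>% mutate(am = convert_am(am)) %>% head()
```

This still loops over every element, so it is a working example rather than an efficient one.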

2 Answers

5
votes

switch is not vectorized so for efficiency you need to use ifelse or case_when - but as your question is specifically about switch, you can achieve what you want by vectorizing, e.g.

convert_am <- Vectorize(function(x) {
    x <- as.character(x)
    switch(x,
       "0" = FALSE,
       "1" = TRUE,
       NA)
})

or

convert_am <- function(x) {
    x <- as.character(x)
    sapply(x, function(xx) switch(xx,
       "0" = FALSE,
       "1" = TRUE,
       NA))
}

They are both inefficient as they involve a loop under the hood.
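
For the mutate_at expansion mentioned in the question, something along these lines may work. It is only a sketch: vapply is swapped in for sapply so that the result is always a logical vector, the vs column is used purely as a second 0/1 column for illustration, and it assumes a dplyr version that still provides mutate_at (newer releases favour across).

```r
library(dplyr)

convert_am <- function(x) {
    x <- as.character(x)
    # vapply() requires exactly one logical value per element, so the
    # result cannot silently become a list as it can with sapply().
    vapply(x, function(xx) switch(xx,
                                  "0" = FALSE,
                                  "1" = TRUE,
                                  NA),
           logical(1), USE.NAMES = FALSE)
}

# Apply the same switch-based conversion to several 0/1 columns.
res <- mtcars %>% mutate_at(vars(am, vs), convert_am)
head(res[, c("am", "vs")])
```

The loop is still there, so this does not address the efficiency concern; it only makes the return type predictable.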

1
votes
This is simple enough to handle with ifelse directly:

library(dplyr)

Test <- tibble::tibble(
  am = c(-1:5, NA, 1, 0)
)

Test %>%
  mutate(
    newam = ifelse(am == 1, TRUE,
                   ifelse(am == 0, FALSE, NA))
  )

With more categories, use a named vector:

Test %>%
  mutate(
    newam = ifelse(is.na(am) | !am %in% c(1, 3, 5), NA,
                   c("1" = "in1", "3" = "in3", "5" = "in5")[as.character(am)])
  )

In fact, a value that is not among the names of the vector indexes to NA by default, so the ifelse guard can be dropped. I think this will be pretty efficient:

Test %>%
  mutate(
    newam = c("1" = "in1", "3" = "in3", "5" = "in5")[as.character(am)]
  )
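
Applied to the original 0/1 case, the named-vector lookup could look roughly like the sketch below. The names am_lookup and convert_am are chosen here for illustration, and unname is an addition to drop the keys that indexing would otherwise carry into the result.

```r
# Lookup table keyed by the character form of the input values.
am_lookup <- c("0" = FALSE, "1" = TRUE)

convert_am <- function(x) {
    # Keys that are not in the table index to NA;
    # unname() drops the key names from the result.
    unname(am_lookup[as.character(x)])
}

convert_am(c(1, 2, 0))
# [1]  TRUE    NA FALSE
```

Because the indexing is vectorized, the function can be passed straight to mutate, e.g. mtcars %>% mutate(am = convert_am(am)), without Vectorize or sapply.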