Using switch statement within dplyr's mutate

Question

I would like to use a switch statement within dplyr's mutate. I have a simple function that performs some operations and assigns alternative values via switch, for example:

convert_am <- function(x) {
    x <- as.character(x)
    switch(x,
           "0" = FALSE,
           "1" = TRUE,
           NA)
}

This works as desired when applied to scalars:

>> convert_am(1)
[1] TRUE
>> convert_am(2)
[1] NA
>> convert_am(0)
[1] FALSE

I would like to arrive at equivalent results via a mutate call:

mtcars %>% mutate(am = convert_am(am))

This fails:

Error in mutate_impl(.data, dots) : Evaluation error: EXPR must be a length 1 vector.

I understand that this is because the value passed to switch is a whole vector rather than a single element, as in this example:

convert_am(c(1,2,2))
Error in switch(x, 0 = FALSE, 1 = TRUE, NA) : EXPR must be a length 1 vector

Vectorization

My attempt to vectorize also fails to yield the desired results:

convert_am <- function(x) {
    x <- as.character(x)

    fun_switch <- function(x) {
        switch(x,
               "0" = FALSE,
               "1" = TRUE,
               NA)
    }

    vf <- Vectorize(fun_switch, "x")
}

>> mtcars %>% mutate(am = convert_am(am))
Error in mutate_impl(.data, dots) : 
  Column `am` is of unsupported type function

Notes

I'm aware of case_when in dplyr and I'm not interested in using it; I'm only interested in making switch work inside mutate.
An ideal solution would allow for further expansion to use mutate_at with variables passed as .

I think you need to Vectorize convert_am instead of fun_switch? Try eg mtcars %>% mutate(am = Vectorize(convert_am)(am)). What you've done there returns a function vf (see ?Vectorize) — konvas
@konvas Fair point, feel free to post a working solution. It’s more fun than anything, I reckon that with all the wrappers a lot of efficiency derived from switching things will be lost but I want to have it done just to have a working example. — Konrad
Indeed, this is not efficient at all, you may as well use do and not vectorize at all. I would try to use case_when since that's what it's there for but suppose you have your reasons for not wanting to use it :) — konvas
@konvas Purely educational ones. I’m using case fairly frequently and wanted to make a more frequent use of switch. — Konrad
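
A minimal sketch of what konvas's first comment points at, not a posted answer: in the question's second attempt the last expression is the assignment to vf, so convert_am returns the vectorized function itself rather than values. Calling vf(x) as the final step is the only substantive change below. The unname() call is my own addition, on the assumption that an unnamed result is wanted, since the result is otherwise named after the character input.

```r
convert_am <- function(x) {
    x <- as.character(x)

    # Scalar helper: switch() only accepts a length-1 EXPR
    fun_switch <- function(x) {
        switch(x,
               "0" = FALSE,
               "1" = TRUE,
               NA)
    }

    # Vectorize() returns a function; it still has to be called
    vf <- Vectorize(fun_switch, "x")

    # Apply it to the input and drop the names taken from the character vector
    unname(vf(x))
}

print(convert_am(c(1, 2, 2)))
print(head(convert_am(mtcars$am)))
```

With this version, mtcars %>% mutate(am = convert_am(am)) should produce a logical am column instead of the "unsupported type function" error.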

konvas · Accepted Answer · 2017-10-18T15:57:11

switch is not vectorized, so for efficiency you should use ifelse or case_when. But as your question is specifically about switch, you can achieve what you want by vectorizing, e.g.

convert_am <- Vectorize(function(x) {
    x <- as.character(x)
    switch(x,
       "0" = FALSE,
       "1" = TRUE,
       NA)
})

or

convert_am <- function(x) {
    x <- as.character(x)
    sapply(x, function(xx) switch(xx,
       "0" = FALSE,
       "1" = TRUE,
       NA))
}

Both are inefficient, as they loop over the elements one at a time under the hood (Vectorize is a wrapper around mapply).
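
As a rough check, the sketch below applies both definitions to a small vector and to mtcars$am, and compares them with a nested ifelse, the vectorized route mentioned at the top of this answer. The names convert_am_vectorize, convert_am_sapply and convert_am_ifelse are mine, chosen only to keep the three variants side by side. The dplyr calls are guarded and assume the package is installed; the mutate_at line is a guess at what the question's note about mutate_at is after.

```r
# Variant 1: wrap the scalar function with Vectorize()
convert_am_vectorize <- Vectorize(function(x) {
    x <- as.character(x)
    switch(x,
           "0" = FALSE,
           "1" = TRUE,
           NA)
})

# Variant 2: loop explicitly with sapply()
convert_am_sapply <- function(x) {
    x <- as.character(x)
    sapply(x, function(xx) switch(xx,
           "0" = FALSE,
           "1" = TRUE,
           NA))
}

# Comparison without switch: nested ifelse() works on the whole vector
convert_am_ifelse <- function(x) {
    ifelse(x == 0, FALSE, ifelse(x == 1, TRUE, NA))
}

x <- c(1, 2, 0)
res_vectorize <- convert_am_vectorize(x)
# sapply() names its result after the character input, so drop the names
res_sapply <- unname(convert_am_sapply(x))
res_ifelse <- convert_am_ifelse(x)

print(res_vectorize)
print(res_sapply)
print(res_ifelse)

# Inside dplyr, only if the package is available (assumption)
if (requireNamespace("dplyr", quietly = TRUE)) {
    out <- dplyr::mutate(mtcars, am = convert_am_vectorize(am))
    print(head(out$am))

    # Same function applied to several 0/1 columns at once
    out_at <- dplyr::mutate_at(mtcars, dplyr::vars(am, vs), convert_am_vectorize)
    print(head(out_at[, c("am", "vs")]))
}
```

All three variants agree on this input; the difference is that the first two call switch once per element, while ifelse operates on the whole vector.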
