18
votes

plyr::mapvalues can be used like this:

mapvalues(mtcars$cyl, c(4, 6, 8), c("a", "b", "c"))

But this doesn't work:

mtcars %>%
  dplyr::select(cyl) %>%
  mapvalues(c(4, 6, 8), c("a", "b", "c")) %>%
  as.data.frame()

How can I use plyr::mapvalues with dplyr? Or even better, what is the dplyr equivalent?

3
Try mtcars %>% select(cyl) %>% .$cyl %>% plyr::mapvalues(c(4,6,8), c('a', 'b', 'c'))%>% as.data.frame() - akrun
Or mtcars %>% mutate(x = mapvalues(cyl, c(4, 6, 8), c("a", "b", "c"))) %>% select(x) - Rich Scriven
That works. What does .$cyl do? - luciano
You could use mtcars %>% transmute(cyl = factor(cyl, labels = c("a", "b", "c"))) similarly - talat
@luciano You could change the previous code to mtcars %>% .$cyl %>% plyr::mapvalues(c(4,6,8), c('a', 'b', 'c')) %>% data.frame(cyl=.) - akrun

3 Answers

20
votes

2020 Update: plyr is now a "retired" package and its official guidance suggests using the actively-improved and maintained dplyr package instead. So it's preferable to use only dplyr, in this case dplyr::recode() as in the other answer, and avoid plyr entirely.

To use plyr::mapvalues() with dplyr and return a one-column data.frame:

mtcars %>%
  transmute(cyl = plyr::mapvalues(cyl, c(4, 6, 8), c("a", "b", "c")))

Or if you want a single vector output, like in your working example, use pull:

mtcars %>%
  pull(cyl) %>%
  plyr::mapvalues(., c(4, 6, 8), c("a", "b", "c"))
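For readers wondering about the `.$cyl` idiom from the comments: as far as I understand it, the `.` is magrittr's placeholder for the left-hand side of the pipe, so `.$cyl` extracts the column as a plain vector, much like `pull(cyl)`. That is also why the original attempt fails, since `mapvalues()` expects a vector rather than a data frame. A minimal sketch, assuming dplyr is installed (the names `via_pull` and `via_dollar` are only illustrative):

```r
library(dplyr)

# Both forms hand mapvalues() a bare vector instead of a data frame.
via_pull   <- mtcars %>% pull(cyl)
via_dollar <- mtcars %>% .$cyl

# The two extractions should give the same numeric vector.
identical(via_pull, via_dollar)
```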

If you are using both dplyr and plyr simultaneously, see this note from the dplyr readme:

You'll need to be a little careful if you load both plyr and dplyr at the same time. I'd recommend loading plyr first, then dplyr, so that the faster dplyr functions come first in the search path. By and large, any function provided by both dplyr and plyr works in a similar way, although dplyr functions tend to be faster and more general.

Note, though, that you can call the function as plyr::mapvalues() with only dplyr loaded; plyr has to be installed, but it does not need to be attached with library().

11
votes

As the question also asks:

Or even better, what the dplyr equivalent?

The equivalent is recode.

http://www.cookbook-r.com/Manipulating_data/Renaming_levels_of_a_factor/

library(dplyr)

name <- c("John", "Clara", "Smith")
sex <- c(1, 2, 1)
age <- c(30, 32, 54)
df <- data.frame(name, sex, age)

# Backticks are needed because the old values are numbers used as names
df %>% mutate(sex = recode(sex, `1` = "Male", `2` = "Female"))

This maps 1 to "Male" and 2 to "Female", just as mapvalues would.
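Applied to the mtcars example from the question, the same approach might look like the following. This is only a sketch, assuming dplyr is installed; `recoded` is an illustrative name, not something from the question.

```r
library(dplyr)

# Map the numeric cyl values 4, 6, 8 to "a", "b", "c" using recode() only.
# Backticks let the numeric old values serve as argument names.
recoded <- mtcars %>%
  transmute(cyl = recode(cyl, `4` = "a", `6` = "b", `8` = "c"))

head(recoded$cyl)
```

Every value of cyl is covered by the mapping here, so no value is left unmatched.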

0
votes

I was a heavy plyr::mapvalues() user. I used it for replacing old values in strings with new ones. Something like:

set.seed(1)
data <- data.frame(name = sample(letters[1:5], 100, replace = TRUE))
check_list <- data.frame(old = letters[1:5], new = LETTERS[1:5])

data$name
#> [1] "a" "d" "a" "b" "e" "c" "b" "c" "c" "a" "e" "e" "b" "b" ...

plyr::mapvalues(data$name, check_list$old, check_list$new)
#> [1] "A" "D" "A" "B" "E" "C" "B" "C" "C" "A" "E" "E" "B" "B" ...

Please correct me if I am wrong, but there is not an equally short and tidy dplyr way of doing this. You can still do it with dplyr::recode(), however:

dplyr::recode(data$name, !!!setNames(check_list$new, check_list$old))
#> [1] "A" "D" "A" "B" "E" "C" "B" "C" "C" "A" "E" "E" "B" "B" ...

As the documentation says, the named vector follows the order old (name) = new (value), which is the opposite of dplyr::mutate() and dplyr::rename() (true at the time of writing; it might have been changed since).
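To make the old = new ordering concrete, here is a small sketch of what the lookup looks like before it is spliced into recode(). The names `lookup`, `x` and `result` are illustrative, and `stringsAsFactors = FALSE` is set on the assumption that the code may also be run on R versions before 4.0:

```r
library(dplyr)

check_list <- data.frame(old = letters[1:5], new = LETTERS[1:5],
                         stringsAsFactors = FALSE)

# Names are the old values, elements are the new values.
lookup <- setNames(check_list$new, check_list$old)

# !!! splices the named vector into recode() as old = new arguments.
x <- c("a", "d", "b")
result <- recode(x, !!!lookup)
result
```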

Adding this as an answer because I keep Googling how to do it when I forget and could not find the answer quickly. Perhaps now I can. The solution is modified from the last two lines of Examples in the function documentation.