3
votes

I've been working my way through the R purrr package, and I've hit a roadblock. I created some mock data below that represents a very small snippet of what my data actually looks like.

library(tidyverse)

my_data <- tribble(
  ~lookup_lists, ~old_vectors,

  # Observation 1
  list(
    "X1" = "one",
    "X7" = "two", 
    "X16" = "three"
  ), 

  c("Col1", "Col2", "Col3", "X1", "X7", "X16"),

  # Observation 2
  list(
    "X3" = "one",
    "X8" = "two", 
    "X22" = "three"
  ), 

  c("Col1", "Col2", "Col3", "X3", "X8", "X22")
)

At this point, I want to make a new column that holds the same values as old_vectors, except that the values starting with X are recoded according to the named list in lookup_lists. For example, I want the first row to go from:

c("Col1", "Col2", "Col3", "X1", "X7", "X16")

to

c("Col1", "Col2", "Col3", "one", "two", "three")

and be saved to a new column in the nested tibble. Here is my attempt using the map2 function:

# Add a third column that has the recoded vectors

my_data <- my_data %>%
  mutate(new_vectors = map2(.x = old_vectors, .y = lookup_lists, .f = ~recode(.x, .y)))

#> Error in mutate_impl(.data, dots): Evaluation error: Argument 2 must be named, not unnamed.

I don't understand this because the second argument IS named. Here is the first observation's lookup_list to show my point:

my_data$lookup_lists[[1]]
$X1
[1] "one"

$X7
[1] "two"

$X16
[1] "three"

I think I'm missing something pretty obvious, and it probably has something to do with this. Any help would be greatly appreciated!

2 Answers

3
votes

Since each element of 'lookup_lists' is a named list, we can unlist it into a named vector and index that vector with 'old_vectors'. Elements of 'old_vectors' that match a name in the lookup are replaced by the corresponding value; elements with no match come back as NA. We drop those with na.omit and prepend the 'Col' elements, which we extract from 'old_vectors' with grep.

out <- my_data %>%
  mutate(new_vectors = map2(
    old_vectors, lookup_lists,
    ~ c(grep('Col', .x, value = TRUE), unname(na.omit(unlist(.y)[.x])))
  ))
out$new_vectors
#[[1]]
#[1] "Col1"  "Col2"  "Col3"  "one"   "two"   "three"

#[[2]]
#[1] "Col1"  "Col2"  "Col3"  "one"   "two"   "three"
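
Note that the approach above always puts the 'Col' elements first and the recoded elements after them, which matches the example data. If the two kinds of elements can be interleaved, a variant that replaces matches in place may be safer. The following is only a sketch in base R; the helper name recode_in_place is made up for illustration and is not part of purrr or dplyr.

```r
# Hypothetical helper (not from any package): replace the elements of `x`
# that appear as names in `lookup`, leaving every other element and the
# original order untouched.
recode_in_place <- function(x, lookup) {
  lookup <- unlist(lookup)          # named list -> named character vector
  hits <- x %in% names(lookup)      # which elements have a replacement?
  if (any(hits)) {
    x[hits] <- lookup[x[hits]]      # look up replacements by name
  }
  x
}

old <- c("X1", "Col1", "X16", "Col2")
lookup <- list(X1 = "one", X7 = "two", X16 = "three")
recode_in_place(old, lookup)
#[1] "one"   "Col1"  "three" "Col2"
```

Assuming the tibble from the question, it should then be possible to use it as mutate(new_vectors = map2(old_vectors, lookup_lists, recode_in_place)).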
2
votes

It doesn't work because recode does not accept a list of replacements as its second argument. To see what happens, it helps to simplify your example:

x <- my_data[["old_vectors"]]
y <- my_data[["lookup_lists"]]
recode(x[[1]], y[[1]])
## Error: Argument 2 must be named, not unnamed

As described in ?recode, the function expects not a named list of replacements, but a series of named arguments. That is, instead of recode(x[[1]], y[[1]]) it wants

recode(x[[1]], X1 = "one", X7 = "two", X16 = "three")
## [1] "Col1"  "Col2"  "Col3"  "one"   "two"   "three"

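Having a list where a function expects separate arguments is a common situation, and purrr's invoke() handles it by calling the function with the list elements as its arguments: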

invoke(recode, .x = y[[1]], x[[1]])
## [1] "Col1"  "Col2"  "Col3"  "one"   "two"   "three"

Now that we know how to pass a named list of arguments to a function that expects multiple (possibly named) arguments, we can apply this knowledge to solve the original problem:

my_data <- my_data %>%
    mutate(new_vectors = map2(.x = old_vectors, .y = lookup_lists,
                              .f = ~invoke(recode, .x = .y, .x)))
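
As a side note, recent purrr releases (1.0.0 and later, as far as I know) mark invoke() as deprecated, so it may be worth having an alternative. One option is base R's do.call(), which likewise calls a function with a list of arguments. The sketch below assumes dplyr is installed; the helper name recode_from_list is hypothetical and only used for illustration.

```r
library(dplyr)

# Hypothetical helper: build the argument list (the vector first, then the
# named replacements) and hand it to recode() via do.call().
recode_from_list <- function(x, lookup) {
  do.call(recode, c(list(x), lookup))
}

old <- c("Col1", "Col2", "Col3", "X1", "X7", "X16")
lookup <- list(X1 = "one", X7 = "two", X16 = "three")
recode_from_list(old, lookup)
## [1] "Col1"  "Col2"  "Col3"  "one"   "two"   "three"
```

Because the helper is an ordinary two-argument function, it should drop straight into the pipeline as map2(old_vectors, lookup_lists, recode_from_list).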