Select variables from list of tibbles using names in each tibble

Question

I have a list of tibbles or data frames (each one with a name in the list) and I want to: (1) create new columns with the same name long.col in each tibble from variables with different names; (2) do this by matching from another tibble with the key linking tibble name and column name in each tibble and finally; (3) binding rows for all the newly created columns with the same name long.col in each tibble as well as identifying the original tibble where they came from.

I would like to do this preferably using tidyverse functions. Here is a an example of both: a) the list of tibbles; b) the key tibble identifying tibble names and variables to select in each one

df1 <- tibble(v1 = c(rep("A", 5), rep("B", 5)),
          v2 = 1:10)
df2 <- tibble(v1 = c(rep("C", 6), rep("D", 6)),
          v3 = 11:22)
df3 <- tibble(v1 = c(rep("E", 4), rep("F", 4)),
          v4 = 23:30)

list.df <- list(df1, df2, df3)
names(list.df) <- c("data1", "data2", "data3")

key <- tibble(data = c("data1", "data2", "data3"),
          vars = c("v2", "v3", "v4"))

The final output should look like:

 final.df <- tibble(data = c(rep("data1", 10), rep("data2", 12), rep("data3", 8)),
               long.col = 1:30)

I need to do this in a much longer list using multiple columns, so it is not feasible to do it separately for each column in each tibble.

markus markus · Accepted Answer · 2018-11-25T07:45:14

You can use map2 here

library(purrr)
library(tibble)
out <- map2_df(.x = list.df,
               .y = names(list.df),
               .f = ~ {
                 temp <- key[["vars"]][key[['data']] == .y]
                 tibble(data = .y, long.col = .x[[temp]])
                 })

Check the output

identical(final.df, out)
#[1] TRUE

Select variables from list of tibbles using names in each tibble

2 Answers