library(purrr)
library(tibble)
library(dplyr)

Starting list of dataframes

lst <- list(df1 = data.frame(X.1 = as.character(1:2),
                             heading = letters[1:2]),
            df2 =  data.frame(X.32 = as.character(3:4),
                              another.topic = paste("Line ", 1:2)))

lst
#> $df1
#>   X.1 heading
#> 1   1       a
#> 2   2       b
#> 
#> $df2
#>   X.32 another.topic
#> 1    3       Line  1
#> 2    4       Line  2

Expected "combined" dataframe, with new consistent variable names, and old variable names in the first row of each constituent dataframe.

#>    id   h1            h2
#> 1 df1  X.1       heading
#> 2 df1    1             a
#> 3 df1    2             b
#> 4 df2 X.32 another.topic
#> 5 df2    3       Line  1
#> 6 df2    4       Line  2

add_row requires "Name-value pairs, passed on to tibble(). Values can be defined only for columns that already exist in .data and unset columns will get an NA value."

Which is what I think I have achieved with this:

df_nms <- 
  map(lst, names) %>% 
  map(set_names)

#> $df1
#>       X.1   heading 
#>     "X.1" "heading" 
#> 
#> $df2
#>            X.32   another.topic 
#>          "X.32" "another.topic"

But I cannot tie up the last bit: using a purrr function to add the names to the head of each dataframe. I've tried numerous variations with map2 and pmap; the attempt below is the closest I can get at present. (If I treat add_row as a formula, prefixing it with ~, and remove the .y, I get a new first row populated with NAs.) I think I'm missing how to pass the name-value pairs to the add_row function.

map2(lst, df_nms, add_row(.x, .y, .before = 1)) %>% 
  map(set_names, c("h1", "h2")) %>% 
  map_dfr(bind_rows, .id = "id")
#> Error in add_row(.x, .y, .before = 1): object '.x' not found

A pointer to resolve this last step would be most appreciated.


3 Answers

3
votes

Not quite sure how to do this via purrr map functions, but here is an alternative:

library(dplyr)
bind_rows(lapply(lst, function(i) {
  # one-row data frame holding the old column names (columns V1, V2, ...)
  d1 <- as.data.frame(matrix(names(i), ncol = ncol(i)))
  rbind(d1, setNames(i, names(d1)))
}), .id = 'id')

#   id   V1            V2
#1 df1  X.1       heading
#2 df1    1             a
#3 df1    2             b
#4 df2 X.32 another.topic
#5 df2    3       Line  1
#6 df2    4       Line  2
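
For the purrr route the question asks about, here is a minimal sketch that is not part of the answer above. It assumes the columns are character (stringsAsFactors = FALSE) and that the missing pieces in the original attempt are the ~ lambda prefix and splicing the name-value pairs into add_row with !!!. The names vector is built inside the lambda, so map2 is not needed.

```r
library(purrr)
library(tibble)
library(dplyr)

# Same sample data as in the question, with character columns made explicit
lst <- list(
  df1 = data.frame(X.1 = as.character(1:2),
                   heading = letters[1:2],
                   stringsAsFactors = FALSE),
  df2 = data.frame(X.32 = as.character(3:4),
                   another.topic = paste("Line ", 1:2),
                   stringsAsFactors = FALSE)
)

combined <- lst %>%
  # splice a named list (name = name) into add_row as name-value pairs
  map(~ add_row(.x, !!!as.list(set_names(names(.x))), .before = 1)) %>%
  # give every data frame the same column names before stacking
  map(set_names, c("h1", "h2")) %>%
  # stack them, keeping the list names in an "id" column
  bind_rows(.id = "id")

combined
```

If this behaves as intended, combined should have the id, h1 and h2 columns and six rows, with the old variable names in the first row of each group.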
2
votes

I altered your sample data a bit, setting stringsAsFactors to FALSE when creating the data.frames in lst.

Here is a solution using data.table::rbindlist():

#sample data
lst <- list(df1 = data.frame(X.1 = as.character(1:2),
                             heading = letters[1:2], 
                             stringsAsFactors = FALSE),   # !! <--
            df2 =  data.frame(X.32 = as.character(3:4),
                              another.topic = paste("Line ", 1:2),
                              stringsAsFactors = FALSE)   # !! <--
            )

DT <- data.table::rbindlist( lapply( lst, function(x) rbind( names(x), x ) ), 
                             use.names = FALSE, idcol = "id" )
# data.table is not attached above, so call setnames() with its namespace
data.table::setnames(DT, names( lst[[1]] ), c("h1", "h2") ) 

#     id   h1            h2
# 1: df1  X.1       heading
# 2: df1    1             a
# 3: df1    2             b
# 4: df2 X.32 another.topic
# 5: df2    3       Line  1
# 6: df2    4       Line  2
2
votes

Here's an approach using map, rbindlist from data.table and some base R functions:

library(purrr)
library(dplyr)
library(data.table)
map(lst, ~ as.data.frame(unname(rbind(colnames(.x),as.matrix(.x))))) %>%
  rbindlist(idcol = "id")
#    id   V1            V2
#1: df1  X.1       heading
#2: df1    1             a
#3: df1    2             b
#4: df2 X.32 another.topic
#5: df2    3       Line  1
#6: df2    4       Line  2

Alternatively we could use map_df if we use colnames<-:

map_df(lst, ~ as.data.frame(rbind(colnames(.x),as.matrix(.x))) %>%
         `colnames<-`(.,paste0("h",seq(1,dim(.)[2]))), .id = "id")
#   id   h1            h2
#1 df1  X.1       heading
#2 df1    1             a
#3 df1    2             b
#4 df2 X.32 another.topic
#5 df2    3       Line  1
#6 df2    4       Line  2

Key things here are:

  1. Use as.matrix to get rid of the factor / character incompatibility.
  2. Remove names with unname or set them with colnames<-
  3. Use the idcol = (rbindlist) or .id = (map_df) argument to get the names of the list as a column.
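
As a small illustration of points 1 and 2, the sketch below uses a single data frame with factor columns (forced here with stringsAsFactors = TRUE, an assumption chosen to mimic older R defaults). It shows that going through as.matrix yields plain character columns once the names are bound on top.

```r
# A data frame whose columns are factors
df <- data.frame(X.1 = as.character(1:2),
                 heading = letters[1:2],
                 stringsAsFactors = TRUE)

# as.matrix() converts the factor columns to a character matrix, so the
# column names can be bound on top without factor-level problems
m <- rbind(colnames(df), as.matrix(df))

# unname() drops the old dimnames; as.data.frame() then assigns V1, V2
out <- as.data.frame(unname(m), stringsAsFactors = FALSE)

str(out)
```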