13
votes

I know that there are many related questions here on SO, but I am looking for a purrr solution, please, not one from the apply family of functions or cbind/rbind (I want to take this opportunity to get to know purrr better).

I have a list of dataframes and I would like to add a new column to each dataframe in the list. The value of the column will be the name of the dataframe, i.e. the name of each element in the list.

There is something similar here, but it involves the use of a function and mutate_each(), whereas I need just mutate().

To give you an idea of the list (called comentarios), here is the first line of str() on the first element:

> str(comentarios[1])
List of 1
 $ 166860353356903_661400323902901:'data.frame':    13 obs. of  7 variables:

So I would like my new variable to contain 166860353356903_661400323902901 for 13 lines in the result, as an ID for each dataframe.

What I am trying is:

dff <- map_df(comentarios, 
              ~ mutate(ID = names(comentarios)),
              .id = "Group"
              )

However, mutate() needs the name of the dataframe in order to work:

Error in mutate_(.data, .dots = lazyeval::lazy_dots(...)) : 
  argument ".data" is missing, with no default

It doesn't make sense to type in each name by hand; I'd be straying into loop territory and losing the advantages of purrr (and R, more generally). If the list were smaller, I'd use reshape::merge_all(), but it has over 2000 elements. Thanks in advance for any help.

edit: some data to make the problem reproducible, as per alistaire's comments

# install.packages("tidyverse")
library(tidyverse)
df <- data_frame(one = rep("hey", 10), two = seq(1:10), etc = "etc")

list_df <- list(df, df, df, df, df)
names(list_df) <- c("first", "second", "third", "fourth", "fifth")
dfs <- map_df(list_df, 
              ~ mutate(id = names(list_df)),
              .id = "Group"
              )
You need to make your example reproducible by adding data. - alistaire
I don't think that's necessary here, alistaire, it's a question about syntax more than anything, as Jake's answer showed. - RobertMyles
It is always necessary, or the question will be closed. How to Ask - alistaire
My mistake, fixed now. - RobertMyles
Better, though you should show your desired output, as well. Assuming a bit, you can just do dplyr::bind_rows(list_df, .id = 'id'). - alistaire

2 Answers

19
votes

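Your issue is that mutate() needs the data frame passed explicitly as its first argument when you are not piping into it. Inside the formula, .x is never handed to mutate(), which is why .data is reported as missing. To pass both the data frame and its name, I'd suggest using map2_df: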

dff <- map2_df(comentarios, names(comentarios), ~ mutate(.x, ID = .y)) 
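
As an alternative sketch of the same idea, purrr's imap() supplies each element as .x and its name as .y, so the names do not need to be passed separately. The list comentarios_demo below is a hypothetical stand-in for comentarios, not the asker's data, and binding with bind_rows() is one assumed way to get a single data frame:

```r
library(dplyr)
library(purrr)

# Hypothetical stand-in for `comentarios`: a small named list of data frames
comentarios_demo <- list(
  first  = data.frame(x = 1:2),
  second = data.frame(x = 3:5)
)

# imap() calls the function with .x = the element and .y = its name
dff_demo <- comentarios_demo %>%
  imap(~ mutate(.x, ID = .y)) %>%
  bind_rows()
```

Every row of dff_demo should carry the name of the list element it came from in the ID column.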
5
votes

Using the OP's data, the answer would be:

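library(tidyverse)

# tibble() replaces the deprecated data_frame(); 1:10 is equivalent to seq(1:10)
df <- tibble(one = rep("hey", 10), two = 1:10, etc = "etc")

list_df <- list(df, df, df, df, df)
dfnames <- c("first", "second", "third", "fourth", "fifth")

# .x is each data frame, .y is the matching name
dfs <- list_df %>% map2_df(dfnames, ~ mutate(.x, name = .y))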
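
For comparison, here is a small sketch that sets the map2_df() approach beside the dplyr::bind_rows(list_df, .id = 'id') call suggested in alistaire's comment. The two-element list is a cut-down, hypothetical version of the OP's data, used only to keep the example short:

```r
library(dplyr)
library(purrr)

# Cut-down version of the OP's example data (assumed shape, fewer rows)
df <- data.frame(one = rep("hey", 3), two = 1:3)
list_df <- list(first = df, second = df)

# Approach from the answers: add the name column inside the map
via_map2 <- map2_df(list_df, names(list_df), ~ mutate(.x, id = .y))

# Approach from the comment: let bind_rows() create the name column
via_bind <- bind_rows(list_df, .id = "id")

# Compare after putting the columns in the same order
all.equal(via_map2[, names(via_bind)], via_bind)
```

Both routes label each row with its list element's name. bind_rows() places the id column first, while the mutate() version appends it last.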