0
votes

I have several data frames named a32, a33, ..., a63 in my workspace that I need to rbind into a single data frame. Each has about 20 columns. They were supposed to share the same column names, but a few are missing some columns, so rbind fails with an error.

l <- 32:63
l <- as.character(l)       ## convert to a character vector

A <- do.call(rbind.data.frame, mget(paste0("a", l))) ## fails with the error below

Error in (function (..., deparse.level = 1, make.row.names = TRUE, stringsAsFactors = default.stringsAsFactors(),  : 
  numbers of columns of arguments do not match

I want to rbind them using only the columns they all have in common. I tried using paste0 inside a for loop to list the column names of every data frame and see which ones have missing columns, but got nowhere. How can I avoid searching for the missing columns manually by listing the column names of each data frame one by one?

As a small example, say:

a32 <- data.frame(AB = 1, CD = 2, EF = 3, GH = 4)
a33 <- data.frame(AB = 6,         EF = 7)
a34 <- data.frame(AB = 8, CD = 9, EF = 10, GH = 11)
a35 <- data.frame(AB = 12,CD = 13,        GH = 14)
a36 <- data.frame(AB = 15,CD = 16,EF = 17,GH = 18)
and so on

Is there an efficient way to rbind all 32 data frames in the workspace?

2
Does this answer your question? stackoverflow.com/a/8605132/14425671 - Aman
The number of data frames is large so the mget+Reduce+intersect seems more efficient to me. - user11254108

2 Answers

1
votes
  • Get the data frames into a list.
  • Find the common columns using Reduce + intersect.
  • Subset each data frame in the list to the common columns.
  • Combine everything into one data frame with rbind.
list_data <- mget(paste0("a",l))
common_cols <- Reduce(intersect, lapply(list_data, colnames))
result <- do.call(rbind, lapply(list_data, `[`, common_cols))
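
For a self-contained check, here is a minimal sketch of the same steps run on the sample data from the question. The helper name rbind_common is hypothetical (it is not part of base R), and the list is built by hand here instead of with mget so that the snippet runs on its own.

```r
# Hypothetical helper (not from base R): row-bind a list of data frames,
# keeping only the columns that are present in every one of them.
rbind_common <- function(dfs) {
  common_cols <- Reduce(intersect, lapply(dfs, colnames))
  if (length(common_cols) == 0L) {
    stop("the data frames share no column names")
  }
  # `[` with a character vector keeps each piece a data frame,
  # even when only one column survives.
  do.call(rbind, lapply(dfs, `[`, common_cols))
}

# Sample data from the question
dfs <- list(
  a32 = data.frame(AB = 1,  CD = 2,  EF = 3,  GH = 4),
  a33 = data.frame(AB = 6,           EF = 7),
  a34 = data.frame(AB = 8,  CD = 9,  EF = 10, GH = 11),
  a35 = data.frame(AB = 12, CD = 13,          GH = 14),
  a36 = data.frame(AB = 15, CD = 16, EF = 17, GH = 18)
)

result <- rbind_common(dfs)
colnames(result)  # only "AB" is shared by all five sample data frames
```

With the real data, the list would come from mget(paste0("a", l)) as above. Stopping early when nothing is shared is a design choice of this sketch, so that an empty intersection does not silently produce a data frame with no columns.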

You can also use purrr::map_df, which makes this shorter (it row-binds the results with dplyr, so dplyr must be installed).

result <- purrr::map_df(list_data, `[`, common_cols)
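
To see which data frames are missing columns without inspecting them one by one, something along these lines might help. It is a sketch on the question's sample data; the names all_cols and missing_cols are made up for illustration.

```r
# Sample data from the question, collected in a named list
dfs <- list(
  a32 = data.frame(AB = 1,  CD = 2,  EF = 3,  GH = 4),
  a33 = data.frame(AB = 6,           EF = 7),
  a34 = data.frame(AB = 8,  CD = 9,  EF = 10, GH = 11),
  a35 = data.frame(AB = 12, CD = 13,          GH = 14),
  a36 = data.frame(AB = 15, CD = 16, EF = 17, GH = 18)
)

# Every column name that appears in at least one data frame
all_cols <- Reduce(union, lapply(dfs, colnames))

# For each data frame, the columns it lacks relative to that full set
missing_cols <- lapply(dfs, function(d) setdiff(all_cols, colnames(d)))

# Keep only the data frames that are actually missing something
missing_cols <- Filter(length, missing_cols)
missing_cols
# $a33
# [1] "CD" "GH"
#
# $a35
# [1] "EF"
```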
1
votes

A base R solution:

# get the names of the data frames from the workspace
dat_names <- ls(pattern = "^a[0-9]{2}$")

# get the data
df <- lapply(dat_names, get)

# get the common columns (lapply always returns a list; sapply may simplify to a matrix)
common_col <- Reduce(intersect, lapply(df, colnames))

# select the common columns and rbind; drop = FALSE keeps a single column as a data frame
dat <- lapply(df, function(x) x[, common_col, drop = FALSE])
dat <- do.call("rbind", dat)
dat

#   AB
# 1  1
# 2  6
# 3  8
# 4 12
# 5 15
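
A note on single-column results: when only one common column remains, the form of the subsetting decides whether rbind receives data frames or plain vectors. The toy data frame below is an assumption for illustration, not taken from the question.

```r
# Toy data frame, purely for illustration
toy <- data.frame(AB = 1:2, CD = 3:4)
cols <- "AB"

as_vector  <- toy[, cols]                # one column selected: dropped to a vector
as_frame   <- toy[, cols, drop = FALSE]  # stays a one-column data frame
also_frame <- toy[cols]                  # list-style indexing never drops

# Binding vectors gives a matrix; binding data frames gives a data frame
bound_vectors <- do.call(rbind, list(as_vector, as_vector))
bound_frames  <- do.call(rbind, list(as_frame, as_frame))

is.matrix(bound_vectors)     # TRUE
is.data.frame(bound_frames)  # TRUE
```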