Add (not merge!) two data frames with unequal rows and columns

Question

I want to efficiently sum the entries of two data frames, though the data frames are not guaranteed to have the same dimensions or column names. Merge isn't really what I'm after here. Instead I want to create an output object with all of the row and column names that belong to either of the added data frames. In each position of that output, I want to use the following logic for the computed value:

If a row/column pairing belongs to both input data frames, I want the output to include their sum.
If a row/column pairing belongs to just one input data frame, I want to include that value in the output.
If a row/column pairing does not belong to either input data frame, I want to have 0 in that position in the output.

As an example, consider the following input data frames:

df1 = data.frame(x = c(1,2,3), y = c(4,5,6))
rownames(df1) = c("a", "b", "c")
df2 = data.frame(x = c(7,8), z = c(9,10), w = c(2, 3))
rownames(df2) = c("a", "d")
> df1
  x y
a 1 4
b 2 5
c 3 6
> df2
  x  z  w 
a 7  9  2
d 8 10  3

I want the final result to be

> result
   x  y   z  w
a  8  4   9  2
b  2  5   0  0
c  3  6   0  0
d  8  0  10  3

What I've tried so far:

bind_rows / bind_cols in dplyr can throw the following: "Error: incompatible number of rows (3, expecting 2)"

I have duplicated column names across the two data frames, so 'merge' isn't working for my purposes either; it returns an empty data frame for some reason.

eipi10 · Accepted Answer · 2016-02-02T20:35:02

Seems like you could merge on the rownames, then take care of the sums and conversion of NA to zero with some additional munging:

library(dplyr)

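# Move the row names into a "rowname" column, join on it, zero-fill the NAs,
# then add the two copies of the shared column x (x.x from df1, x.y from df2)
df.new = df1 %>% add_rownames %>%
  full_join(df2 %>% add_rownames, by="rowname") %>%
  mutate_each(funs(replace(., which(is.na(.)), 0))) %>%
  mutate(x = x.x + x.y) %>%
  select(rowname,x,y,z,w)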
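
For readers on a recent dplyr, where add_rownames, mutate_each and funs are deprecated, the same pipeline can be sketched with current verbs. This is only an illustration, assuming dplyr 1.0 or later plus the tibble package; the name df.new2 is made up here and is not part of the original answer.

```r
library(dplyr)
library(tibble)

# Inputs from the question
df1 <- data.frame(x = c(1, 2, 3), y = c(4, 5, 6), row.names = c("a", "b", "c"))
df2 <- data.frame(x = c(7, 8), z = c(9, 10), w = c(2, 3), row.names = c("a", "d"))

# df.new2 is a hypothetical name used only for this sketch
df.new2 <- rownames_to_column(df1, "rowname") %>%
  full_join(rownames_to_column(df2, "rowname"), by = "rowname") %>%
  # replace NA with 0 in every numeric column
  mutate(across(where(is.numeric), ~ coalesce(.x, 0))) %>%
  # x appears in both inputs, so the join produced x.x and x.y
  mutate(x = x.x + x.y) %>%
  select(rowname, x, y, z, w)
```

As in the original, the shared column has to be named by hand (x.x + x.y), so this version does not extend automatically to inputs that share more columns.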

Or, with @DavidArenburg's much more elegant and extensible solution:

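# No `by` is given, so the join matches on every shared column (rowname and x);
# rows whose x values differ stay separate and are then summed within each rowname.
# Caveat: a row with the same x in both inputs is matched and counted only once.
df.new = df1 %>% add_rownames %>%
  full_join(df2 %>% add_rownames) %>%
  group_by(rowname) %>%
  summarise_each(funs(sum(., na.rm = TRUE)))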

df.new

  rowname     x     y     z     w
1       a     8     4     9     2
2       b     2     5     0     0
3       c     3     6     0     0
4       d     8     0    10     3
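
A variant of the grouping approach stacks the two data frames instead of joining them, so the result does not depend on the shared column's values differing between the inputs. The sketch below assumes dplyr 1.0 or later and the tibble package; df.stacked is a name invented for this example, and every column other than the row names is assumed to be numeric.

```r
library(dplyr)
library(tibble)

# Inputs from the question
df1 <- data.frame(x = c(1, 2, 3), y = c(4, 5, 6), row.names = c("a", "b", "c"))
df2 <- data.frame(x = c(7, 8), z = c(9, 10), w = c(2, 3), row.names = c("a", "d"))

# df.stacked is a hypothetical name used only for this sketch
df.stacked <- bind_rows(
  rownames_to_column(df1, "rowname"),   # columns missing from one input
  rownames_to_column(df2, "rowname")    # are filled with NA by bind_rows
) %>%
  group_by(rowname) %>%
  # sum each column within a row name; NA counts as 0
  summarise(across(everything(), ~ sum(.x, na.rm = TRUE)))
```

Because nothing is matched on x, a row that has the same x value in both inputs is still added twice, which is what the question asks for.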
