2
votes

I am trying to add two dataframes together that share the same row/column names, but have different number of columns/rows but am struggling.

Create Dataframes:

df1 = data_frame('Type' = 'Apple', '18/19' = 5)
df2 = data_frame('Type' = c('Apple', 'Pear', 'Banana'), 
                 '16/17' = c(4,5,6), '17/18' = c(0,2,5), '18/19' = c(2,6,7))


df1:

Type    18/19
Apple   5


df2:

Type    16/17   17/18   18/19
Apple      4       0       2
Pear       5       2       6
Banana     6       5       7



What I want to end up with is this:

dfFinal:

Type    16/17   17/18   18/19
Apple      4       0       7
Pear       5       2       6
Banana     6       5       7

I have tried:

dfFinal = merge(df1, df2, all=TRUE)

But that just creates two 'Apple' rows.

And also this:

dfFinal = aggregate(.~Type,rbind(df1,setNames(df2,names(df1))),sum)

But that just gives me an error of 'numbers of columns of arguments do not match'

dfFinal = cbind(df1[1], df1[-1] + df2[-1])

Gives me an error of '+ only defined for equally-sized data frames'

dfFinal = merge(data.frame(df1, row.names=NULL), data.frame(df2, row.names=NULL), 
                by = 0, all = TRUE)[-1]

breaks the 'type' column into two.



Any suggestions? This should be easy, but I am unable to get it to work.

2

2 Answers

2
votes

I am guessing like this? I am not sure if you want to order Type according to the order in df2..

library(dplyr)
library(tibble)
merge(df1, df2, all=TRUE) %>% group_by(Type) %>% summarise_all(sum,na.rm=TRUE)
# A tibble: 3 x 4
  Type   `18/19` `16/17` `17/18`
  <chr>    <dbl>   <dbl>   <dbl>
1 Apple        7       4       0
2 Banana       7       6       5
3 Pear         6       5       2

If you need to, then you have to do it

rowlvl <- df2$Type
collvl <- colnames(df2)
merge(df1, df2, all=TRUE) %>% select(collvl) %>% mutate(Type=factor(Type,levels=rowlvl)) %>%
group_by(Type) %>% summarise_all(sum,na.rm=TRUE)

# A tibble: 3 x 4
  Type   `16/17` `17/18` `18/19`
  <fct>    <dbl>   <dbl>   <dbl>
1 Apple        4       0       7
2 Pear         5       2       6
3 Banana       6       5       7
1
votes

The answer is probably a lot easier to see if you convert this data from wide to long and then merge.

This solution requires you to have tidyr version 1 installed.


library(tidyr)
library(dplyr)

df1 <- data_frame("Type" = "Apple", "18/19" = 5)
df2 <- data_frame(
  "Type" = c("Apple", "Pear", "Banana"),
  "16/17" = c(4, 5, 6), "17/18" = c(0, 2, 5), "18/19" = c(2, 6, 7)
)


df_final <- bind_rows(
  df1 %>%
    # pivoting to make the shapes of both data frames the same
    pivot_longer(
      cols = -Type,
      names_to = "years",
      values_to = "count"
    ),
  df2 %>%
    # pivoting to make the shapes of both data frames the same
    pivot_longer(
      cols = -Type,
      names_to = "years",
      values_to = "count"
    )
) %>%
  group_by(Type, years) %>%
  summarise(count = sum(count)) %>%
  # pivot again to convert back to wide format as answer required
  pivot_wider(
    names_from = years,
    values_from = count
  )