R, Union two dataframe with different column names and no match

Question

I want to combine two distinct dataframes(completely different columns) in R, into one inclusive data frame.

Lets say data frame "df_a" has columns A & B:

df_a <- read.table(header=TRUE, text='
    A   B
    1   x1
    2   y1
    3   z1
')

And dataframe "df_b" has columns C & D.

df_b <- read.table(header=TRUE, text='
          C     D
          1     6.7
          1     4.5
          1     3.7
          2     3.3
          2     4.1
          2     5.2
')

Therefore the resultant dataframe "df_c" will have columns A,B,C,D, see below:

df_c
  A     B       C   D
1 1     x1      1   6.7
2 2     y1      1   4.5
3 3     z1      1   3.7
4 NA    NA      2   3.3
5 NA    NA      2   4.1
6 NA    NA      2   5.2

Approach #1:

I first tried using rbind() but that function requires matching column names, however that is not what I'm looking for.

Approach #2:

I used df_c <- merge(df_a,df_b), however merge seems to be doing a Cartesian product, see below:

df_c <-  merge(df_a,df_b)
df_c
   A  B C   D
1  1 x1 1 6.7
2  2 y1 1 6.7
3  3 z1 1 6.7
4  1 x1 1 4.5
5  2 y1 1 4.5
6  3 z1 1 4.5
7  1 x1 1 3.7
8  2 y1 1 3.7
9  3 z1 1 3.7
10 1 x1 2 3.3
11 2 y1 2 3.3
12 3 z1 2 3.3
13 1 x1 2 4.1
14 2 y1 2 4.1
15 3 z1 2 4.1
16 1 x1 2 5.2
17 2 y1 2 5.2
18 3 z1 2 5.2

Approach #3:

Than I used df_c <- union(df_a,df_b), but the result is no longer a data frame. Its turned into a list of lists, see below:

[[1]]
[1] 1 2 3

[[2]]
[1] x1 y1 z1
Levels: x1 y1 z1

[[3]]
[1] 1 1 1 2 2 2

[[4]]
[1] 6.7 4.5 3.7 3.3 4.1 5.2

Approach #4

I created my own function called unionNoMatch(), that attempts to append columns from df_2 to df_1 input paramters:

unionNoMatch <- function(df_1, df_2)
{
df_3 <- df_1; 
for (name in names(df_2))
{
cbind(df_2$name,df_3) 
}

return (df_3);

}

df_c <- unionNoMatch (df_a,df_b)

However, I got this error:

Error in data.frame(..., check.names = FALSE) : arguments imply differing number of rows: 0, 3

How can I achieve my task of combining 2 data frames with non matching columns into a single data frame?

Thanks

So you literally have no way of matching the rows in df_a to those of df_b even though the number of rows differs? — MichaelChirico
This is yet another cbind.fill question, one of which was asked earlier today and closed as a duplicate of one of the many other versions of this question. Just google cbind.fill — joran

jeremycg jeremycg · Accepted Answer · 2015-08-18T01:15:03

R treats variables on the same row as related, so it doesn't want to put things on the same row unless it is told you want them there. In general, this is to prevent mistakes. If you know what you are doing, you can manually give an id to merge by:

df_a$row <- 1:nrow(df_a)
df_b$row <- 1:nrow(df_b)
merge(df_a, df_b, by = "row", all = TRUE)

R, Union two dataframe with different column names and no match

3 Answers