11
votes

I tried to find the answer in the official Pandas documentation, but found it more confusing than helpful. Basically I have two dataframes with overlapping, but not identical column lists:

df1:
   A   B
0  22  34
1  78  42

df2:
   B   C
0  76  29
1  11  67

I want to merge/concatenate/append them so that the result is

df3:
   A   B   C
0  22  34  nan
1  78  42  nan
2  nan 76  29
3  nan 11  67

Should be fairly simple, but I've tried several intuitive approaches and always got errors. Can anybody help me?

3

3 Answers

11
votes

If you just want to concatenate the dataframes you can use.

pd.concat([df1,df2])

output:

      A   B     C
0  22.0  34   NaN
1  78.0  42   NaN
0   NaN  76  11.0
1   NaN  11  67.0

Then you can reset_index to recreate a simple incrementing index.

pd.concat([df,df2]).reset_index(drop = True)

Output:

      A   B     C
0  22.0  34   NaN
1  78.0  42   NaN
2   NaN  76  11.0
3   NaN  11  67.0
10
votes

You need merge with parameter how = outer

df3 = df1.merge(df2, how = 'outer')

    A       B   C
0   22.0    34  NaN
1   78.0    42  NaN
2   NaN     76  29.0
3   NaN     11  67.0
0
votes

Both @vaishali and @scott-boston solution work. Prefer the merge function as it allow more flexibility on your result with the how parameter. Howerver concat can achieve better performance if few columns are involved

To optimize @scott-boston answer, you can also use the internal concat parameter igonore_index that automatically resize the index without calling another function the code would be like :

pd.concat([df1,df2],ignore_index=True)

Output

      A   B     C
0  22.0  34   NaN
1  78.0  42   NaN
2   NaN  76  11.0
3   NaN  11  67.0

Python (version 3.8.5) | pandas(version 1.1.3)