4
votes

I want to merge two pandas data frames that share the same index as well as some columns. pd.merge creates duplicate columns, but I would like to merge on both axes at the same time.

tried pd.merge and pd.concat but did not get the right result.

my try: df3=pd.merge(df1, df2, left_index=True, right_index=True, how='left')

df1

     Var#1  Var#2  Var#3  Var#4  Var#5  Var#6  Var#7
ID                                                  
323      7      6      8    7.0    2.0    2.0   10.0
324      2      1      5    3.0    4.0    2.0    1.0
675      9      8      1    NaN    NaN    NaN    NaN
676      3      7      2    NaN    NaN    NaN    NaN

df2

     Var#6  Var#7  Var#8  Var#9
ID                             
675      1      9      2      8
676      3      2      0      7

ideally I would get:

df3

     Var#1  Var#2  Var#3  Var#4  Var#5  Var#6  Var#7 Var#8   Var#9
ID                                                  
323      7      6      8    7.0    2.0    2.0   10.0  NaN    NaN
324      2      1      5    3.0    4.0    2.0    1.0  NaN    NaN
675      9      8      1    NaN    NaN    1      9    2      8
676      3      7      2    NaN    NaN    3      2    0      7
2
thanks a lot, that did the trick. I was not aware of that method. - Nicolas

2 Answers

3
votes

IIUC, use df.combine_first():

df3=df1.combine_first(df2)
print(df3)

      Var#1  Var#2  Var#3  Var#4  Var#5  Var#6  Var#7  Var#8  Var#9
ID                                                                
323      7      6      8    7.0    2.0    2.0   10.0    NaN    NaN
324      2      1      5    3.0    4.0    2.0    1.0    NaN    NaN
675      9      8      1    NaN    NaN    1.0    9.0    2.0    8.0
676      3      7      2    NaN    NaN    3.0    2.0    0.0    7.0
2
votes

You can concat and group the data

pd.concat([df1, df2], 1).groupby(level = 0, axis = 1).first()

    Var#1   Var#2   Var#3   Var#4   Var#5   Var#6   Var#7   Var#8   Var#9
ID                                  
323 7.0     6.0     8.0     7.0     2.0     2.0     10.0    NaN     NaN
324 2.0     1.0     5.0     3.0     4.0     2.0     1.0     NaN     NaN
675 9.0     8.0     1.0     NaN     NaN     1.0     9.0     2.0     8.0
676 3.0     7.0     2.0     NaN     NaN     3.0     2.0     0.0     7.0