1
votes

Suppose I have the two following data frames df_1 and df_2 and want to merge them with an additional column which states if the row is present in df_1,df_2 or both. Is there any way I could do this directly while using merge?

df_1

Data Frame 1

df_2

enter image description here

The data frame I would like to have:

enter image description here

1
If my answer was helpful, don't forget accept it - click on the check mark () beside the answer to toggle it from greyed out to filled in. Thanks. - jezrael

1 Answers

3
votes

Use parameter indicator in merge and for expected output add map by dict:

df = pd.merge(df1, df2, how='outer', indicator='origin')
print (df)
   Index   x  y      origin
0     20   5  8   left_only
1     35   4  7   left_only
2     55   3  2        both
3     60  12  1        both
4     22   4  3  right_only
5     30  12  6  right_only

d = {'left_only':'1', 'right_only':'2', 'both':'1 & 2'}
df['origin'] = df['origin'].map(d)
print (df)
   Index   x  y origin
0     20   5  8      1
1     35   4  7      1
2     55   3  2  1 & 2
3     60  12  1  1 & 2
4     22   4  3      2
5     30  12  6      2