I have two dataframes where multiple operations are to be implemented, for example:
old_DF
id col1 col2 col3
-------------------------
1 aaa
2 bbb 123
new_DF
id col1 col2 col3
-------------------------
1 xxx 999
2 xxx kkk
The following operations need to be performed on these dataframes:
- Merging the two dataframes
- Replacing only the blanks (NAs) cells in the old_DF with corresponding values from new_DF
- Cells from both the dataframes where the values are contradicting should be reported in a new dataframe
Desired results:
updated_df
id col1 col2 col3
-------------------------
1 aaa xxx 999
2 xxx bbb 123
conflicts_df
id col1 col2 col3
-------------------------
2 bbb
2 kkk
I can use .append() method to join the two dataframes and I guess one can use .bfil() or .ffil() methods to fill in the missing values. But I am unsuccessful with both .bfil() and .ffil(). I have tried df.groupby('id').apply(lambda x: x.ffill().bfill()).drop_duplicates() but I do not get the desired results. Additionally, I do not understand how to perform step 3 mentioned above. Is there anyone who can help with this problem?