I have two dataframes of different shapes. I want to fill in missing data in my df1 from data that exists in df2.

How do I join these two datasets while keeping the original shape and columns of df1?

I have tried pd.merge, but I don't think I have the syntax right. It creates new columns in the dataframe, and I am not able to add data only to the NaN values.

I have also tried combine_first, but I don't think I am using it correctly either.

import pandas as pd

df1 = pd.DataFrame({'a': ["dogs","cats","birds","turtles"], 'b': [1,5,"NA",10]})
print(df1)

df2 = pd.DataFrame({'a': ["birds"],'b': [6]})
print(df2)

df_Final = pd.DataFrame({'a': ["dogs","cats","birds","turtles"], 'b': [1,5,6,10]})
print(df_Final)

I expect the output to be the df_Final dataframe shown here, where the "birds" value is populated from df2.

1 Answer

fuelbaby

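How about this? It keeps every value of b that is not 'NA' and, for the remaining rows, looks up the matching a in df2.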

df1['b'] = df1['b'].where(df1['b'] != 'NA', df1['a'].map(df2.set_index('a')['b']))
df1

Out[166]: 
         a   b
0     dogs   1
1     cats   5
2    birds   6
3  turtles  10
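
If you would rather work with real missing values than the 'NA' string, here is a rough sketch of the same idea using fillna. It assumes the 'NA' in df1 is the literal string from the question and that each value of a appears at most once in df2 (map needs a unique index to look up against). The names b and lookup are just illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({'a': ["dogs", "cats", "birds", "turtles"], 'b': [1, 5, "NA", 10]})
df2 = pd.DataFrame({'a': ["birds"], 'b': [6]})

# Assumption: "NA" is a placeholder string, so coerce it to a real NaN
# and make the column numeric at the same time.
b = pd.to_numeric(df1['b'], errors='coerce')

# Look up each value of a in df2; rows with no match in df2 get NaN.
lookup = df1['a'].map(df2.set_index('a')['b'])

# Fill only the missing entries; values already present in df1 are untouched.
# The cast to int is safe here because every gap gets filled.
df1['b'] = b.fillna(lookup).astype(int)
print(df1)
```

This leaves the shape and columns of df1 unchanged. If some gaps could stay unfilled, drop the astype(int) call, since NaN cannot be stored in a plain integer column.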