I have two dataframes df1 and df2 and would like to merge them to a df3, based on the values in one of the columns, as shown below.
Please, how do I accomplish this?
df1:
+---+----+
| | b |
+---+----+
| 1 | 3 |
| 2 | 4 |
| 3 | 7 |
| 4 | 8 |
| 5 | 10 |
+---+----+
df2:
+---+-------+-----+
| | x | y |
+---+-------+-----+
| 3 | True | 5.4 |
| 3 | False | 6.9 |
| 4 | True | 9.8 |
| 7 | True | 7.8 |
| 8 | False | 5.6 |
+---+-------+-----+
df3:
+---+---+--------+-----+
| | b | y_notx | y_x |
+---+---+--------+-----+
| 1 | 3 | 6.9 | 5.4 |
| 2 | 4 | NaN | 9.8 |
| 3 | 7 | NaN | 7.8 |
| 4 | 8 | 5.6 | NaN |
+---+---+--------+-----+
The code:
import pandas as pd
t1 = {'b': [3, 4, 7, 8, 10]}
df1 = pd.DataFrame(t1, index=[1,2,3,4,5])
t2 = {'x' : [True, False, True, True, False],
'y' : [5.4,6.9,9.8,7.8,5.6]}
df2 = pd.DataFrame(t2, index=[3,3,4,7,8])
t3 = {'b': [3, 4, 7, 8],
'y_x': [5.4, 9.8, 7.8, pd.np.nan],
'y_notx': [6.9, pd.np.nan, pd.np.nan, 5.6]}
df3 = pd.DataFrame(t3, index=[1, 2, 3, 4])
df3? The termbdetermines which index to look for underdf2and returns and x True term undery_xand a x False term undery_notx. And thebcolumn is determined bydf1? Correct? - akozi