1
votes

I have two DataFrames (example below). I would like to delete any row in df1 with a value equal to df2[patnum] if df2[city] is 'nan'.

For example: I would want to drop rows 2 and 3 in df1 since they contain '4' and patnum '4' in df2 has a missing value in df2['city'].

How would I do this?

    df1
    Citer  Citee
0   1      2
1   2      4
2   3      5
3   4      7

    df2
    Patnum City
0   1      new york
1   2      amsterdam
2   3      copenhagen
3   4      nan
4   5      sydney

expected result:
    df1
    Citer  Citee
0   1      2
1   3      5

1
Do you mean to drop rows 1 and 3 (assuming zero-based indexing)?cs95
Please put the exact expected output.Mayank Porwal

1 Answers

0
votes

IIUC stack isin and dropna

the idea is to return a True/False boolean based on matches then drop those rows after we unstack the dataframe.

val = df2[df2['City'].isna()]['Patnum'].values

df3 = df1.stack()[~df1.stack().isin(val)].unstack().dropna(how="any")

   Citer  Citee
0    1.0    2.0
2    3.0    5.0

Details

df1.stack()[~df1.stack().isin(val)]

0  Citer    1
   Citee    2
1  Citer    2
2  Citer    3
   Citee    5
3  Citee    7
dtype: int64

print(df1.stack()[~df1.stack().isin(val)].unstack())


   Citer  Citee
0    1.0    2.0
1    2.0    NaN
2    3.0    5.0
3    NaN    7.0