How to subtract data frames with different fill values

Question

I need to subtract two DataFrames with different indexes (which produces NaN values wherever a row is missing from one of them), and I want to replace the missing values from each DataFrame with a different number (fill value). For example, let's say I have df1 and df2:

df1:

    A   B   C
0   0   3   0
1   0   0   4
2   4   0   2

df2:

    A   B   C
0   0   3   0
1   1   2   0
3   1   2   0

subtracted = df1.sub(df2):

    A   B   C
0   0   0   0
1   -1  -2  4
2   NaN NaN NaN
3   NaN NaN NaN

I want row 2 of subtracted (present only in df1) to keep the values from row 2 of df1, and row 3 of subtracted (present only in df2) to have the value 5.

I expect:

subtracted:

    A   B   C
0   0   0   0
1   -1  -2  4
2   4   0   2
3   5   5   5

I tried using the sub method with fill_value=5, but then in both rows 2 and 3 I'll get 0.

I understand why you want row 2 to behave as if df2 had a row 2 filled with 0, but what is the rationale behind the 5s in row 3? (If it has nothing to do with the original values, you can just assign to that row after .sub.) — Adam.Er8

yatu · Accepted Answer · 2019-07-01T10:19:25

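One way would be to reindex df2 over the full range of row labels with fill_value=0 before subtracting, then subtract and use fillna(5) for the rows that are missing from df1:

# Index.union is the explicit spelling of the set union; `|` on an Index no longer means union in recent pandas
ix = pd.RangeIndex(df1.index.union(df2.index).max() + 1)
df1.sub(df2.reindex(ix, fill_value=0)).fillna(5).astype(df1.dtypes)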

   A  B  C
0  0  0  0
1 -1 -2  4
2  4  0  2
3  5  5  5
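
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same approach. The frames are rebuilt from the tables in the question; the intermediate name df2_filled is only for illustration, and the sketch assumes the row labels are non-negative integers (which is what makes pd.RangeIndex a reasonable choice here).

```python
import pandas as pd

# Frames rebuilt from the question; note the differing row labels (0, 1, 2 vs 0, 1, 3)
df1 = pd.DataFrame({"A": [0, 0, 4], "B": [3, 0, 0], "C": [0, 4, 2]}, index=[0, 1, 2])
df2 = pd.DataFrame({"A": [0, 1, 1], "B": [3, 2, 2], "C": [0, 0, 0]}, index=[0, 1, 3])

# Full range of row labels covering both frames (0 up to the largest label)
ix = pd.RangeIndex(df1.index.union(df2.index).max() + 1)

# Rows missing from df2 are treated as 0, so df1's values pass through unchanged
df2_filled = df2.reindex(ix, fill_value=0)

# Rows missing from df1 are still NaN after the subtraction; fill those with 5,
# then restore df1's integer dtypes (NaN forces floats during the subtraction)
subtracted = df1.sub(df2_filled).fillna(5).astype(df1.dtypes)
print(subtracted)
```

The two fill values are applied at different stages: 0 goes in through reindex before the subtraction, and 5 goes in through fillna afterwards, which is why a single fill_value argument to sub cannot express both.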
