3
votes

I have a pandas DataFrame with a pair of columns where, on every row, one cell is NaN and the other is not. It looks like this:

Var1 Var2
0 3 nan
1 8 nan
2 nan 6
3 4 nan
4 nan 2
5 nan 6

I would like to merge these two columns into one, without the NaNs:

Var1
0 3
1 8
2 6
3 4
4 2
5 6

Any ideas?

5
What about df.max(1)? - Henry Yik
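
A minimal sketch of the comment's suggestion, assuming the frame from the question (the sample data is retyped here, and the name `merged` is only illustrative). It relies on `max` skipping NaN by default, so with exactly one non-NaN value per row the row-wise maximum is that value:

```python
import numpy as np
import pandas as pd

# Sample frame mirroring the question: exactly one non-NaN value per row.
df = pd.DataFrame({
    "Var1": [3, 8, np.nan, 4, np.nan, np.nan],
    "Var2": [np.nan, np.nan, 6, np.nan, 2, 6],
})

# Row-wise maximum; NaN is skipped by default (skipna=True).
merged = df.max(axis=1)
print(merged)
```

The result stays float, because the NaNs forced both source columns to float.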

5 Answers

2
votes

Use fillna, passing the other Series as the value parameter.

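# Assign the result back; inplace=True on a column selection triggers a chained-assignment warning in newer pandas.
df['Var1'] = df['Var1'].fillna(df['Var2'])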
df['Var1']

#    Var1
# 0  3
# 1  8
# 2  6
# 3  4
# 4  2
# 5  6
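
For anyone who wants to run this end to end, here is a self-contained sketch under the assumption that the frame matches the question's sample (the names `filled`, `combined` and `result` are illustrative). It also shows `combine_first`, which should give the same values here, and a cast back to integers once no NaNs remain:

```python
import numpy as np
import pandas as pd

# Sample frame mirroring the question: exactly one non-NaN value per row.
df = pd.DataFrame({
    "Var1": [3, 8, np.nan, 4, np.nan, np.nan],
    "Var2": [np.nan, np.nan, 6, np.nan, 2, 6],
})

# Fill the gaps in Var1 with the values from Var2 at the same index labels.
filled = df["Var1"].fillna(df["Var2"])

# combine_first: keep Var1 where present, fall back to Var2 where Var1 is NaN.
combined = df["Var1"].combine_first(df["Var2"])

# The columns are float because of the NaNs; cast once every gap is filled.
result = filled.astype(int).to_frame("Var1")
print(result)
```

Both calls align on the index, so they only behave as expected when the two columns share the same row labels.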
0
votes

You can use numpy for this, as follows:

import numpy as np
import pandas as pd
df = pd.DataFrame({'v1':[np.nan,2,np.nan,4],'v2':[1,np.nan,3,np.nan]})
df['v'] = np.where(np.isnan(df.v1),df.v2,df.v1)
print(df)

Output:

    v1   v2    v
0  NaN  1.0  1.0
1  2.0  NaN  2.0
2  NaN  3.0  3.0
3  4.0  NaN  4.0
0
votes

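You can stack the frame, which drops the NaNs by default in the classic implementation; the explicit dropna() also covers pandas versions where stack keeps them. Then you can construct a new frame from its values:

>>> new_df = pd.DataFrame(df.stack().dropna().to_numpy(), columns=["Var1"])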
>>> new_df

   Var1
0   3.0
1   8.0
2   6.0
3   4.0
4   2.0
5   6.0
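
Note that building a new frame from the raw values resets the index to 0..n-1. If the original row labels matter, a variant that keeps them might look like the sketch below. It assumes the question's sample data, and the names `stacked` and `merged` are illustrative:

```python
import numpy as np
import pandas as pd

# Sample frame mirroring the question: exactly one non-NaN value per row.
df = pd.DataFrame({
    "Var1": [3, 8, np.nan, 4, np.nan, np.nan],
    "Var2": [np.nan, np.nan, 6, np.nan, 2, 6],
})

# stack() walks the frame row by row and yields a (row, column) MultiIndex.
# The explicit dropna() removes NaNs whether or not stack already dropped them.
stacked = df.stack().dropna()

# Drop the column level of the MultiIndex to recover the original row labels.
merged = stacked.droplevel(1).rename("Var1").to_frame()
print(merged)
```

This only yields one value per row because each row has exactly one non-NaN cell; rows with two values would appear twice.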
0
votes

Another version:

df = df.transform(sorted, key=pd.isna, axis=1).dropna(axis=1, how="all")
print(df)

Prints:

      ID  var1
0  103.0  11.0
1  104.0  16.0
2  105.0  71.0
3  106.0   2.0
3  107.0   1.0
0
votes

Another method, if you only have two columns and exactly one of them is non-NaN on each row: sum across the row, since sum skips NaNs.

df.assign(Var1=df.sum(axis=1))[['Var1']]

   Var1
0   3.0
1   8.0
2   6.0
3   4.0
4   2.0
5   6.0
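
One caveat with the sum approach: a row where both cells are NaN sums to 0.0 rather than NaN. The question says that never happens, so the all-NaN row below is purely hypothetical, but if it could occur in your data, `min_count=1` keeps such rows as NaN. A small sketch (the names `plain` and `guarded` are illustrative):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Var1": [3, np.nan, np.nan],
    "Var2": [np.nan, 6, np.nan],  # last row is all NaN (hypothetical, not in the question)
})

# Default behaviour: NaNs are skipped, and an all-NaN row sums to 0.0.
plain = df.sum(axis=1)

# min_count=1 requires at least one non-NaN value, otherwise the result is NaN.
guarded = df.sum(axis=1, min_count=1)

print(plain)
print(guarded)
```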