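To see a non-null summary of each column, use df.info(show_counts=True). The keyword was called null_counts before pandas 1.2; the old name was deprecated in 1.2 and removed in 2.0, so on older versions write df.info(null_counts=True) instead.
Example 1:
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(10, 5), columns=list('abcde'))
# Blank out a decreasing number of leading rows in columns a-d; column e stays complete.
df.iloc[:4, 0] = np.nan
df.iloc[:3, 1] = np.nan
df.iloc[:2, 2] = np.nan
df.iloc[:1, 3] = np.nan
df.info(show_counts=True)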
output:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10 entries, 0 to 9
Data columns (total 5 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   a       6 non-null      float64
 1   b       7 non-null      float64
 2   c       8 non-null      float64
 3   d       9 non-null      float64
 4   e       10 non-null     float64
dtypes: float64(5)
memory usage: 528.0 bytes
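Note that df.info() only prints this report and returns None, so the counts cannot be reused from its return value. If you need them as data, a minimal sketch is to call df.count(), which returns the non-null count per column as a Series. The frame is rebuilt here only so the snippet runs on its own:

```python
import numpy as np
import pandas as pd

# Rebuild a frame shaped like the example above (the random values do not matter).
df = pd.DataFrame(np.random.randn(10, 5), columns=list('abcde'))
df.iloc[:4, 0] = np.nan
df.iloc[:3, 1] = np.nan
df.iloc[:2, 2] = np.nan
df.iloc[:1, 3] = np.nan

# count() gives the number of non-null values in each column as a Series.
non_null = df.count()
print(non_null)
```

df.notna().sum() should give the same numbers if you prefer to spell out the mask.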
In addition, if you want to customize the result, for example by adding a NaN rate, I wrote a small helper:
def describe_nan(df):
    # One row per column: its name, the number of NaNs, and the fraction of rows that are NaN.
    return pd.DataFrame(
        [(c, df[c].isna().sum(), df[c].isna().mean()) for c in df.columns],
        columns=['column', 'nan_counts', 'nan_rate'])
describe_nan(df)
Output:
  column  nan_counts  nan_rate
0      a           4       0.4
1      b           3       0.3
2      c           2       0.2
3      d           1       0.1
4      e           0       0.0
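The same table can also be built without looping over the columns. This is only a sketch of an alternative, not a replacement for the helper above; the name describe_nan_vectorized is made up for illustration. It computes the NaN mask once and lets pandas sum and average it column-wise:

```python
import numpy as np
import pandas as pd

def describe_nan_vectorized(df):
    # Hypothetical helper name; builds the same columns as describe_nan.
    mask = df.isna()  # boolean frame, True where a value is NaN
    summary = pd.DataFrame({
        'nan_counts': mask.sum(),   # number of NaNs per column
        'nan_rate': mask.mean(),    # fraction of rows that are NaN per column
    })
    # Move the column names out of the index into a regular 'column' field.
    return summary.rename_axis('column').reset_index()

# Same test frame as in Example 1.
df = pd.DataFrame(np.random.randn(10, 5), columns=list('abcde'))
df.iloc[:4, 0] = np.nan
df.iloc[:3, 1] = np.nan
df.iloc[:2, 2] = np.nan
df.iloc[:1, 3] = np.nan

result = describe_nan_vectorized(df)
print(result)
```

On wide frames this avoids filtering the whole DataFrame once per column, which is likely to matter more as the number of columns grows.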