I have a two-dimensional NumPy array of shape (200, 1500), and I want to visualise summary statistics for this data. Because there are too many columns (1500), I can't plot all of them. My questions are:
- Which summary statistics should I visualise?
- Should I visualise all columns?
I thought of randomly choosing N columns from the data and showing a distribution plot and a box plot for each. The example below is for the second column of the array X. However, I can't figure out how to show both plots for N columns in a single figure. Can someone help me with this?
Dist plot:
    plt.figure(figsize=(20,4))
    plt.subplot(121)
    ax = sns.distplot(X[:,1])
Box plot:
    plt.subplot(122)
    plt.xlim(X[:,1].min()*1.1, X[:,1].max()*1.1)
    sns.boxplot(x=X[:,1])