what you can do is ->
from pandas import plotting as pdplt
pdplt.scatter_matrix(dataframe, pass the remaining required parameters)
by this you will get a matrix of plots(in your case it's 6X6) which will exactly show how each column in your dataframe relates to the other columns and you can clearly visualise which feature dominates the result and also how the features are correlated to each other.
If you ask me this is the first thing I used to do with such types of problems and then remove all correlated features and select the features which best approximate the output.
But as you have to plot a 2d plot and in the above approach you might get more than a single feature which dominate the output then what you can do is a miracle named PCA.
If you ask me PCA is one of the most beautiful thing in machine learning. What it will do that is somehow merges all your feautres in some magical ratio which will generate principle components for your data. Principal components are those components which govern/major contribution to your model. You apply pca by simply importing from sklearn and then select the first principle component(as you need a 2d plot) or might select 2 priciple components and plot a 3d graph. But always remember this that these pricipal components are not the real features of your model but they are some magical combination and how PCA did so is very very interesting(by using concepts like eigen values and vectors) and you can build by your own also.
Apart from all these you can apply Singular Value decomposition(SVD) to your model which is the essence of whole linear algebra which is a type of matrix decomposition existing for all matrix. What this do is decompose your matrix into three matrix out of which the diagonal matrix which consists of singular values(a scaling factor) in descending order and what you have to do is that select the top singular values (in your case only the first one having highest magnitude) and construct back a feature matrix from 5 columns to 1 columns and then plot that. You can do svd by using the numpy.linalg
Once you applied any one of these methods then what you can do is learn your hypothesis with only the single most important selected feature and finally plot the graph. But take a tip, just for plotting a 2d graph you should avoid other important features beacuse maybe you have 3 principal components all having almost the same contribution and may the top three singular values are very close to each other. So take my words and take all important features into account and if you need the visualisation of these important features then use scatter matrix
Summary ->
All I want to mention is that you can do the same process with all these things and also can invent your own statistical or mathematical model for compressing your feature space.
But for me I prefer to go with PCA and in such type of problems I even first plot the scatter matrix to get an visual intuition to the data. And also PCA and SVD helps to remove redundancy and hence overfitting.
For rest details refer to docs.
Happy machine learning...