I would like to find the importance of each feature in my dataframe using scikit-learn.
I want to do this in scikit-learn rather than with Info Gain in WEKA, which reports a score with the feature name next to it.
I implemented the method below, but it only gives me a ranking number for each feature, and I don't know how to get a score instead.
For example, I don't want to see:
- feature 6
- feature 4
...
Instead, I would prefer:
0.4 feature 6
0.233 feature 4
...
Here is my method:
def _rank_features(self, dataframe, targeted_class):
    from sklearn.feature_selection import RFE
    from sklearn.linear_model import LinearRegression

    feature_names = list(dataframe.columns.values)
    # use linear regression as the model
    lr = LinearRegression()
    # rank all features, i.e. continue the elimination until only one is left
    rfe = RFE(lr, n_features_to_select=1)
    rfe.fit(dataframe, targeted_class)
    print("Features sorted by their rank:")
    # ranking_ holds integer ranks (1 = eliminated last), not scores
    print(sorted(zip(map(lambda x: round(x, 4), rfe.ranking_), feature_names)))
Does anyone know how to convert the ranking into a score?
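
For reference, here is a rough sketch of the kind of output I am after. It does not convert the RFE ranking, since `ranking_` only records the order in which features were eliminated. Instead it swaps in a different technique, `mutual_info_classif` from `sklearn.feature_selection`, which is scikit-learn's mutual information estimate and probably the closest thing to WEKA's Info Gain. The helper name `score_features`, the toy data, and the assumption that the target is a discrete class label are all made up for illustration.

```python
import pandas as pd
from sklearn.feature_selection import mutual_info_classif


def score_features(dataframe, targeted_class, random_state=0):
    """Return (score, feature_name) pairs, highest score first.

    Hypothetical helper, not part of scikit-learn. Assumes numeric
    features and a discrete class label as the target.
    """
    # One non-negative mutual information estimate per column
    scores = mutual_info_classif(dataframe, targeted_class,
                                 random_state=random_state)
    pairs = zip((round(float(s), 4) for s in scores), dataframe.columns)
    return sorted(pairs, reverse=True)


# Toy data invented for illustration: "signal" determines the class,
# "noise" is unrelated to it.
df = pd.DataFrame({
    "signal": [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1] * 5,
    "noise": [0, 1, 2, 3, 4, 5] * 10,
})
target = df["signal"]

for score, name in score_features(df, target):
    print(score, name)
```

This prints one "score name" line per feature, which is the layout shown above. The scores are estimates in nats, not normalized weights, so they will not sum to 1.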