2
votes

I have a set of data that is accessible on:

http://textuploader.com/df5nt

There are 4 columns in my data set corresponding to 4 different features. I can calculate the first and second principal components using this code:

import pandas as pd
from sklearn.decomposition import PCA as sklearnPCA
from sklearn.preprocessing import StandardScaler

data = pd.read_csv('rr.txt')
X = data.ix[:,0:4].values
X_std = StandardScaler().fit_transform(X)
sklearn_pca = sklearnPCA(n_components=2)
Y_sklearn = sklearn_pca.fit_transform(X_std)

print (Y_sklearn)

Now I want to draw the features weights for these data. Something like this: features weight

I know I need to use the explained variance ratio in scikit-learn but I could not figure out how to implement it in my code to get it. I was hoping if someone could help me with that. Thanks!

1

1 Answers

2
votes

use components_ attribute

http://scikit-learn.org/stable/modules/generated/sklearn.decomposition.PCA.html

feature_weights = sklearn_pca.components_
import matplotlib.pyplot as plt
%matplotlib inline
fig, (ax1,ax2) = plt.subplots(1,2)
ax1.bar(range(4),feature_weights[0],color=['b','g','r','c'])
ax2.bar(range(4),feature_weights[1],color=['b','g','r','c'])

enter image description here