3
votes

I have a problem with PCA in R, probably a simple one:

I have 10 vectors a, b, c, d, e, f, g, h, i, j and bind them with cbind.

I then perform a PCA on the result using prcomp. I get the scores all right, and I also get the principal components in descending order.

Only: how on earth do I know which of the original vectors a to j is the first component, which the second, and so on?

Probably really a beginner's question, but I still cannot solve it and would appreciate some help.

2
There's no one-to-one correspondence between original dimensions and principal components -- that's the point of PCA :) The help for prcomp tells you what components its return value has; these should help you to figure out which original dimension contributes to which principal component. - Lars Kotthoff
But where is the point if, in the end, you do not know which components of the original data set are most important? I thought the whole idea was to find them? - user1862770
What you want to do sounds more like feature selection. Principal component analysis can help you to identify important features, but its purpose is really to transform the feature space. - Lars Kotthoff
Each principal component is a linear combination of all your original components. For example, prcomp(...)$rotation[,"PC1"] shows you how much each of your original dimensions contributes to the first component. So it is not a one-to-one map like you thought, but more like a many-to-many mapping. - flodel
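
To make the "linear combination" point concrete, here is a minimal sketch. It assumes the built-in USArrests data set as a stand-in for cbind(a, ..., j), since the question's actual data is not shown. It checks that the scores returned by prcomp are simply the centred and scaled data multiplied by the rotation matrix.

```r
# Assumption: USArrests (shipped with R) stands in for the asker's data.
pca <- prcomp(USArrests, scale. = TRUE)

# Loadings of PC1: one coefficient per original variable.
pca$rotation[, "PC1"]

# Scores are the standardised data times the rotation matrix,
# so every PC is a weighted sum of all original variables.
manual_scores <- scale(USArrests) %*% pca$rotation
all.equal(unname(manual_scores), unname(pca$x))
```

If the comparison holds, each column of pca$x mixes all original columns, which is why no single original vector "is" the first component.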

2 Answers

4
votes

The rotation matrix tells you which original variables matter in each principal component. For example, its first column holds the coefficients (loadings) for PC1. A value in the first row that is large in absolute terms, relative to the other coefficients, means that the first original variable is important in the first principal component. Say the first column has high positive values in its first five rows and high negative values in its last five rows. The PC axis can then be interpreted as a contrast (a weighted difference) between those two groups.
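
As a rough illustration of reading the rotation matrix this way, the sketch below ranks the original variables by the absolute size of their PC1 coefficient and picks the dominant variable for each component. USArrests is an assumed stand-in for the asker's data, and top_variable is a name introduced here only for the example.

```r
# Assumption: USArrests (shipped with R) stands in for the asker's data.
pca <- prcomp(USArrests, scale. = TRUE)

# One column per principal component, one row per original variable.
loadings <- pca$rotation

# Original variables ordered by the size of their PC1 coefficient.
pc1 <- loadings[, "PC1"]
pc1[order(abs(pc1), decreasing = TRUE)]

# For each PC, the name of the variable with the largest absolute loading.
top_variable <- rownames(loadings)[apply(abs(loadings), 2, which.max)]
names(top_variable) <- colnames(loadings)
top_variable
```

Keep in mind that the overall sign of each column is arbitrary, so only the relative signs within a column carry meaning.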

3
votes

It's an old question, but maybe someone will need this in the future:

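data(USArrests)
# PCA on Murder, Assault and Rape, standardised to unit variance
PCA.USA <- prcomp(USArrests[, c(1, 2, 4)], scale. = TRUE)
# Absolute loadings: the sign only gives the direction of a contribution
proporcionDeInfluencia <- abs(PCA.USA$rotation)
# Divide each column by its sum so the shares within each PC add up to 1
sweep(proporcionDeInfluencia, 2, colSums(proporcionDeInfluencia), "/")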

More info in the question "Principal Components Analysis - how to get the contribution (%) of each parameter to a Prin.Comp.?"
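
A possible variation, not the method used in the answer above: each column of the rotation matrix is a unit vector, so the squared loadings already sum to 1 within a component and can be read as shares without any extra normalisation. The sketch reuses the same three USArrests columns; share_sq is a name introduced here for illustration.

```r
# Same three-variable subset as in the answer above.
pca <- prcomp(USArrests[, c(1, 2, 4)], scale. = TRUE)

# Each rotation column has unit length, so its squared entries sum to 1
# and need no further normalisation.
share_sq <- pca$rotation^2
colSums(share_sq)

# Proportion of total variance carried by each component.
pca$sdev^2 / sum(pca$sdev^2)
```

Which version to prefer is a judgment call: absolute loadings are easier to explain, while squared loadings tie in directly with the variance decomposition.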