I'm running a principal component analysis. Once I have the result, how do I identify the first couple of principal predictors? The plot is too cluttered to read the predictor names:
[Image: biplot of the PCA result]

Which part of the PCA result should I look at? The question is really how to determine the most important predictors, those that explain, let's say, 80% of the variance in the data. We may know, for example, that the first 5 components achieve this, but each principal component is just a combination of predictors. How do I identify those "important" predictors?

Please provide a reproducible example when you ask a question. The code you used to run the PCA is more important than the biplot it generated. Also, please define what you mean by 'first couple of principal predictors'. - Adam Quek
@Adam Quek, this is more about how to determine the most important predictors, those that explain, let's say, 80% of the variance in the data. We may know, for example, that the first 5 components achieve this, but each principal component is just a combination of predictors. How do I identify those "important" predictors? Is that clear? - Demo

1 Answer

See this answer: "Principal Components Analysis - how to get the contribution (%) of each parameter to a Prin.Comp.?"

The information is stored in your PCA result. If you used prcomp(), then $rotation is what you are after; if you used princomp(), look at $loadings. For example:

require(graphics)
data("USArrests")

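# PCA on standardized variables; $rotation holds the loadings (variables x PCs)
pca_1 <- prcomp(USArrests, scale. = TRUE)
load_1 <- with(pca_1, unclass(rotation))
# Absolute loadings, rescaled so that each PC's column sums to 1
aload_1 <- abs(load_1)
sweep(aload_1, 2, colSums(aload_1), "/")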
#               PC1       PC2       PC3        PC4
#Murder   0.2761363 0.2540139 0.1890303 0.40186493
#Assault  0.3005008 0.1141873 0.1485443 0.46016113
#UrbanPop 0.1433452 0.5301651 0.2094067 0.08286886
#Rape     0.2800177 0.1016337 0.4530187 0.05510509


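# Same analysis with princomp() on the correlation matrix; loadings are in $loadings
pca_2 <- princomp(USArrests, cor = TRUE)
load_2 <- with(pca_2, unclass(loadings))
# Absolute loadings, rescaled so that each component's column sums to 1
aload_2 <- abs(load_2)
sweep(aload_2, 2, colSums(aload_2), "/")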

#            Comp.1    Comp.2    Comp.3     Comp.4
#Murder   0.2761363 0.2540139 0.1890303 0.40186493
#Assault  0.3005008 0.1141873 0.1485443 0.46016113
#UrbanPop 0.1433452 0.5301651 0.2094067 0.08286886
#Rape     0.2800177 0.1016337 0.4530187 0.05510509

As you can see, Murder, Assault, and Rape each account for roughly 28-30% of the absolute loadings on PC1, whereas UrbanPop accounts for only ~14% of PC1 yet is the major contributor to PC2 (~53%).
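
To tie this back to the 80% threshold in the question, here is a minimal sketch of one way to go from "which components reach 80%" to "which predictors matter most". It is a heuristic, not the only option: it assumes squared loadings as the per-component contribution (each column of squared loadings sums to 1) and weights them by the variance each retained component explains. The names threshold, n_pc, contrib, and ranking are made up for illustration.

```r
# Sketch: find the PCs that reach a variance threshold, then rank the variables
pca <- prcomp(USArrests, scale. = TRUE)

# Proportion of total variance explained by each component
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
cum_var <- cumsum(var_explained)

# Smallest number of components whose cumulative share reaches the threshold
threshold <- 0.80  # illustrative value taken from the question
n_pc <- which(cum_var >= threshold)[1]

# Squared loadings: share of each variable within each component
# (the columns of $rotation have unit length, so each column sums to 1)
contrib <- pca$rotation^2

# Assumed heuristic: weight each retained PC's shares by its explained variance
keep <- seq_len(n_pc)
overall <- contrib[, keep, drop = FALSE] %*% var_explained[keep]

# Variables ordered from most to least important under this heuristic
ranking <- sort(overall[, 1], decreasing = TRUE)
print(n_pc)
print(round(ranking, 3))
```

For USArrests the first two components together cover about 87% of the variance, so n_pc is 2 here. With many predictors, printing head(ranking, 10) is usually easier to read than the biplot.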