Many functions can perform Principal Component Analysis (PCA) on raw data in R. By raw data I mean any data frame or matrix whose rows correspond to observations and whose columns correspond to measurements. Can we carry out PCA on a correlation matrix in R? Which function accepts a correlation matrix as its input?

Have a look at this question. princomp can take a covmat argument (a covariance matrix rather than a correlation matrix) instead of the initial data frame. - Lamia

1 Answer

As mentioned in the comments, you can use

ii <- as.matrix(iris[,1:4])
princomp(covmat=cor(ii))

This will give you results equivalent to princomp(ii,cor=TRUE) (which is not what you want: the latter uses the full data matrix, but returns the values computed after the covariance matrix is converted to a correlation matrix).
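
As a quick check of that equivalence, here is a minimal sketch that fits both versions and compares the component standard deviations. The object names p_cor, p_raw and same_sdev are made up for illustration. Note that the fit built from covmat alone carries no scores, since it never sees the observations.

```r
# Numeric columns of the built-in iris data
ii <- as.matrix(iris[, 1:4])

# PCA from the correlation matrix alone (no raw data needed)
p_cor <- princomp(covmat = cor(ii))

# PCA from the raw data, converted to a correlation matrix internally
p_raw <- princomp(ii, cor = TRUE)

# The component standard deviations should agree
same_sdev <- isTRUE(all.equal(unname(p_cor$sdev), unname(p_raw$sdev)))
print(same_sdev)

# Only the raw-data fit can return scores
print(is.null(p_cor$scores))
print(dim(p_raw$scores))
```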


You can also do all the relevant computations by hand if you have the correlation matrix:

cc <- cor(ii)
e1 <- eigen(cc)

Standard deviations:

sqrt(e1$values)
[1] 1.7083611 0.9560494 0.3830886 0.1439265

Proportion of variance:

e1$values/sum(e1$values)
[1] 0.729624454 0.228507618 0.036689219 0.005178709
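
The hand computation can be cross-checked against princomp. This is only a sketch: the names sdev_hand, prop_var, cum_var and sdev_pc are illustrative, and symmetric = TRUE is passed to eigen() simply to make the symmetry of the correlation matrix explicit.

```r
ii <- as.matrix(iris[, 1:4])
cc <- cor(ii)

# Eigen-decomposition of the correlation matrix
e1 <- eigen(cc, symmetric = TRUE)

sdev_hand <- sqrt(e1$values)               # component standard deviations
prop_var  <- e1$values / sum(e1$values)    # proportion of variance
cum_var   <- cumsum(prop_var)              # cumulative proportion

# The eigenvalues of a p x p correlation matrix sum to p (its trace)
print(sum(e1$values))

# Same standard deviations as princomp() fed the correlation matrix
sdev_pc <- unname(princomp(covmat = cc)$sdev)
print(all.equal(sdev_hand, sdev_pc))
```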

You can get the loadings via e1$vectors. If you also have the raw data, compute the scores (according to this CV question) via scale(ii) %*% e1$vectors. This will not give numerically identical answers to princomp(ii,cor=TRUE)$scores, because princomp standardizes with divisor n rather than n-1 and the signs of the eigenvectors are arbitrary, but it gives equivalent results.
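
A sketch of the score computation and how it lines up with princomp, assuming the raw data are available and are standardized with scale() before projecting. The names scores_hand, scores_pc and max_diff are illustrative. princomp uses divisor n when standardizing, so its scores are rescaled by sqrt((n - 1) / n) before comparing, and the comparison is on absolute values because the sign of each eigenvector is arbitrary.

```r
ii <- as.matrix(iris[, 1:4])
e1 <- eigen(cor(ii), symmetric = TRUE)

# Standardize the columns (divisor n - 1), then project onto the eigenvectors
scores_hand <- scale(ii) %*% e1$vectors

# Reference fit on the raw data
p_raw <- princomp(ii, cor = TRUE)
n <- nrow(ii)

# princomp standardizes with divisor n; rescale its scores to match
scores_pc <- p_raw$scores * sqrt((n - 1) / n)

# Compare up to the sign of each component
max_diff <- max(abs(abs(scores_hand) - abs(scores_pc)))
print(max_diff < 1e-8)

# The variance of each hand-computed score column is the matching eigenvalue
print(all.equal(unname(apply(scores_hand, 2, var)), e1$values))
```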