
After performing a principal component analysis (PCA) of a first data set (a), I projected a second data set (b) into the PCA space of the first data set.

From this, I want to extract the variable loadings for the projected analysis of (b). The variable loadings of the PCA of (a) are returned by prcomp(). How can I retrieve the variable loadings of (b), projected into the PCA space of (a)?

# set seed and define variables
set.seed(1)
a = replicate(10, rnorm(10))
b = replicate(10, rnorm(10))

# pca of data A and project B into PCA space of A
pca.a = prcomp(a)
project.b = predict(pca.a, b)

# variable loadings
loads.a = pca.a$rotation
The loadings are specified by the original PCA. The PC scores, however, will be different. Is that what you want? - Lyngbakr
So the same rotation matrix is applied to (b), in the projection? I guess that means that project.b contains the principal components of the projected data frame. - user 123342
Yep. So, predict is just calculating the PC scores based on the loadings matrix and new data. If you look at project.b you'll see each column refers to a PC. - Lyngbakr
Great, thanks for clearing that up. Clearly revealing my ignorance here. If you write that as an answer I can mark the question as resolved. - user 123342

1 Answer


Here's an annotated version of your code to make it clear what is happening at each step. First, the original PCA is performed on matrix a:

pca.a = prcomp(a)

This calculates the loadings for each principal component (PC), which are stored in pca.a$rotation. At the next step, these same loadings, together with a new data set, b, are used to calculate PC scores:

project.b = predict(pca.a, b)
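To make the projection step concrete, here is a minimal sketch of what predict() appears to do for a prcomp object fitted with the default settings (no scaling): it centres b using the column means of a and multiplies by the rotation matrix of a. The name manual.b is just an illustrative variable, not part of the original code.

```r
set.seed(1)
a <- replicate(10, rnorm(10))
b <- replicate(10, rnorm(10))

pca.a <- prcomp(a)
project.b <- predict(pca.a, b)

# Centre b with the column means of a (stored in pca.a$center),
# then rotate with the loadings of a (pca.a$rotation).
# Assumes prcomp() was called without scale. = TRUE.
manual.b <- scale(b, center = pca.a$center, scale = FALSE) %*% pca.a$rotation

# Compare the manual projection with the result of predict()
all.equal(unname(project.b), unname(manual.b), check.attributes = FALSE)
```

If the two agree, that confirms no new loadings are estimated for b; it is only re-expressed in the axes found for a.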

So, the loadings are the same (they are still those in pca.a$rotation), but the PC scores are different. If we look at project.b, we see that each column corresponds to a PC:

            PC1         PC2         PC3        PC4         PC5          PC6         PC7         PC8
 [1,] -0.2922447  0.10253581  0.55873366  1.3168437  1.93686163  0.998935945  2.14832483 -1.43922296
 [2,]  0.1855480 -0.97631967 -0.06419207  0.6375200 -1.63994127  0.110028191 -0.27612541 -0.37640710
 [3,] -1.5924242  0.31368878 -0.63199409 -0.2535251  0.59116005  0.214116915  1.20873962 -0.64494388
 [4,]  1.2117977  0.29213928  1.53928110 -0.7755299  0.16586295  0.030802395  0.63225374 -1.72053189
 [5,]  0.5637298  0.13836395 -1.41236348  0.2931681 -0.64187233  1.035226594  0.67933996 -1.05234872
 [6,]  0.2874210  1.18573157  0.04358772 -1.1941734 -0.04399808 -0.113752847 -0.33507195 -1.34592414
 [7,]  0.5629731 -1.02835365  0.36218131  1.4117908 -0.96923175 -1.213684882  0.02221423  1.14483112
 [8,]  1.2854406  0.09373952 -1.46038333  0.6885674  0.39455369  0.756654205  1.97699073 -1.17281174
 [9,]  0.8573656  0.07810452 -0.06576772 -0.5200661  0.22985518  0.007571489  2.29289637 -0.79979214
[10,]  0.1650144 -0.50060018 -0.14882996  0.2065622  2.79581428  0.813803739  0.71632238  0.09845912
              PC9      PC10
 [1,] -0.19795112 0.7914249
 [2,]  1.09531789 0.4595785
 [3,] -1.50564724 0.2509829
 [4,]  0.05073079 0.6066653
 [5,] -1.62126318 0.1959087
 [6,]  0.14899277 2.9140809
 [7,]  1.81473300 0.0617095
 [8,]  1.47422298 0.6670124
 [9,] -0.53998583 0.7051178
[10,]  0.80919039 1.5207123
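As a quick sanity check, the following sketch verifies the two claims above: the loadings of pca.a are untouched by predict(), while the scores for b differ from the scores for a (stored in pca.a$x). The names loads.before, same.loadings and same.scores are illustrative only.

```r
set.seed(1)
a <- replicate(10, rnorm(10))
b <- replicate(10, rnorm(10))

pca.a <- prcomp(a)
loads.before <- pca.a$rotation      # loadings from the PCA of a
project.b <- predict(pca.a, b)      # scores of b in the PCA space of a

# predict() does not modify the fitted object, so the loadings are unchanged
same.loadings <- identical(loads.before, pca.a$rotation)

# The scores of b are not the same as the scores of a
same.scores <- isTRUE(all.equal(unname(pca.a$x), unname(project.b)))

same.loadings
same.scores
colnames(project.b)
```

If you wanted loadings that are specific to b, you would have to run a separate prcomp(b), but that would be a different PCA rather than a projection into the space of a.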

Hopefully, that makes sense, but I'm yet to finish my first coffee of the day, so no guarantees.