I ran a PCA on 9 variables and then wanted to fit all possible linear models using the top 3 principal components. However, when I fit the 8 different linear models, the intercept and coefficient estimates are exactly the same regardless of which principal components I include as predictors.

I've already updated R and R Studio but still get the same results. If anyone has dealt with this issue before or has any suggestions, I would really appreciate the help. Thank you!

The code I used to get the principal component values and fit the linear models is below.

MOOPCA <- prcomp(MOOSE[, -1], scale. = TRUE)  # prcomp() has no `cor` argument; `scale. = TRUE` already standardizes the variables
PCApredict <- predict(MOOPCA)
PC1 <- PCApredict[, 1]
PC2 <- PCApredict[, 2]
PC3 <- PCApredict[, 3]

Full <- lm(Density ~ PC1 + PC2 + PC3)
summary(Full)

MOO1 <- lm(Density ~ PC1)
summary(MOO1)

MOO2 <- lm(Density ~ PC1 + PC2)
summary(MOO2)

All models have the same regression coefficients for the intercept and PC1. Why?

1 Answer

Principal components are orthogonal to each other, i.e., there is no linear correlation between them.

set.seed(0)
X <- matrix(runif(50), 10, 5)
pca <- prcomp(X, scale = TRUE)  ## no "cor" argument to `prcomp`
XO <- pca$x  ## or `XO <- predict(pca)`
round(crossprod(XO), 6)
#         PC1      PC2      PC3      PC4      PC5
#PC1 18.35253  0.00000 0.000000 0.000000 0.000000
#PC2  0.00000 11.24924 0.000000 0.000000 0.000000
#PC3  0.00000  0.00000 7.893672 0.000000 0.000000
#PC4  0.00000  0.00000 0.000000 4.180975 0.000000
#PC5  0.00000  0.00000 0.000000 0.000000 3.323583

Furthermore, they are orthogonal to the intercept, because `prcomp` centers each variable by default, so every column of scores sums to zero:

round(crossprod(cbind(1, XO)), 6)
#            PC1      PC2      PC3      PC4      PC5
#    10  0.00000  0.00000 0.000000 0.000000 0.000000
#PC1  0 18.35253  0.00000 0.000000 0.000000 0.000000
#PC2  0  0.00000 11.24924 0.000000 0.000000 0.000000
#PC3  0  0.00000  0.00000 7.893672 0.000000 0.000000
#PC4  0  0.00000  0.00000 0.000000 4.180975 0.000000
#PC5  0  0.00000  0.00000 0.000000 0.000000 3.323583
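
The zeros in the first row and column come from centering. As a quick check (a sketch reusing the same simulated `X` as above; `XO_nc` is a name introduced here only for illustration), the column means of the scores are zero up to floating-point error, and switching centering off generally breaks that:

```r
set.seed(0)
X <- matrix(runif(50), 10, 5)
pca <- prcomp(X, scale. = TRUE)   # center = TRUE is the default
XO <- pca$x

# Column means of the scores are zero up to floating-point error
round(colMeans(XO), 6)

# Without centering, the scores are generally not orthogonal to a column of ones
XO_nc <- prcomp(X, center = FALSE, scale. = TRUE)$x
round(colSums(XO_nc), 6)
```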

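So if you fit a linear regression model y ~ 1 + XO, the estimate for each term does not depend on which other components are in the model. Because the design matrix has mutually orthogonal columns, X'X is diagonal, and each least-squares coefficient reduces to x_j'y / x_j'x_j, which involves only that one column. The intercept is always mean(y), and the coefficient of PC1 is the same whether or not PC2 and PC3 are included. Standard errors, t-values and R-squared can still differ between models, since they depend on the residual variance.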
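
To see this with `lm` directly, here is a minimal sketch using the simulated `X` from above. The response `y` is random noise made up purely for illustration (it is not the asker's `Density`), and `dat`, `fit1`, `fit2`, `fit3` are hypothetical names:

```r
set.seed(0)
X <- matrix(runif(50), 10, 5)
pca <- prcomp(X, scale. = TRUE)
XO <- pca$x
y <- rnorm(10)  # hypothetical response, simulated only for illustration

# Response and the first three component scores in one data frame
dat <- data.frame(y = y, XO[, 1:3])

fit1 <- lm(y ~ PC1, data = dat)
fit2 <- lm(y ~ PC1 + PC2, data = dat)
fit3 <- lm(y ~ PC1 + PC2 + PC3, data = dat)

# Shared coefficients agree across the nested fits (up to floating point)
all.equal(coef(fit1), coef(fit3)[1:2])
all.equal(coef(fit2), coef(fit3)[1:3])

# The intercept is the mean of the response
all.equal(unname(coef(fit1)[1]), mean(y))

# The PC1 slope is x'y / x'x for that single column
all.equal(unname(coef(fit3)["PC1"]),
          drop(crossprod(XO[, 1], y) / crossprod(XO[, 1])))

# Standard errors still change from model to model
summary(fit1)$coefficients
summary(fit3)$coefficients
```

Each `all.equal` call should return `TRUE`, while the standard-error columns of the two summaries will generally differ. Keeping the scores in a data frame and passing `data = dat` also avoids relying on loose vectors such as `PC1` in the global environment.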