I am running a principal component analysis in R on vectors that contain missing data. I want to extract the scores for the first principal component and match them back to the observations that are not missing in the original frame, but I can't figure out how to extract the scores and match them on the right identifiers. For example:
x1 <- c(1, 2, 3, NA, 5, 6, 7)
x2 <- c(7, NA, 6, NA, 4, 3, 2)
frame <- cbind(x1, x2)
pca_ob <- princomp(~frame)
pca_ob$scores[, 1]
This produces the following output:
1 3 5 6 7
4.273146 2.104705 -0.715732 -2.125950 -3.536168
I would like to bind pca_ob$scores[,1] to the original frame based on those identifiers (the row numbers shown above the scores) and fill the remaining rows with NAs, so that it produces the following matrix:
  x1 x2        x3
1  1  7  4.273146
2  2 NA        NA
3  3  6  2.104705
4 NA NA        NA
5  5  4 -0.715732
6  6  3 -2.125950
7  7  2 -3.536168
That is, the first set of scores is matched back to the frame row by row, with NA in every row that has no PCA score. Any thoughts? Thanks.
You can run
goodFrame <- na.omit(frame)
and obtain the same PCA scores from goodFrame, so R is dropping the rows with missing data completely when it calculates the PCA. – Paul
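Since the rows with missing values are simply dropped, one possible way to get the matrix asked for is to use the row names that princomp keeps on the scores (the 1 3 5 6 7 in the output above) as positions in the original frame. This is only a minimal sketch: it assumes frame has no row names of its own, so the names on the scores are plain row numbers, and the helper names kept, x3 and result are made up for illustration.

```r
x1 <- c(1, 2, 3, NA, 5, 6, 7)
x2 <- c(7, NA, 6, NA, 4, 3, 2)
frame <- cbind(x1, x2)

pca_ob <- princomp(~frame)

# Row names of the scores are the row numbers of the complete cases
# (assumption: frame itself carries no row names).
kept <- as.integer(rownames(pca_ob$scores))

# Start with all NA, then fill in the rows that actually have a score.
x3 <- rep(NA_real_, nrow(frame))
x3[kept] <- pca_ob$scores[, 1]

result <- cbind(frame, x3)
result
```

It may also be worth trying princomp(~frame, na.action = na.exclude): as far as I know, the formula method then pads the scores with NA rows itself, so cbind(frame, x3 = pca_ob$scores[, 1]) would line up directly, but check that on your R version before relying on it.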