I'm using factanal in R to reduce a 30-variable dataset to 7 factors, then using the factor scores it returns (fitted_data$scores in the code below) in an lm model. So far, so straightforward.
However, the independent variables I'm using are lagged one period relative to my dependent variable (the model is meant to predict the future). I now have all 30 input variables I need to predict the value of next period's dependent variable, so my question is this: how do I use the factanal output from the work I've already done to calculate the 7 factor scores for these 30 new values? Once I have these, I can use the lm model to predict the next period.
Example of the code I'm using below (target var is in the first column of mydata):
#extract factors
fitted_data <- factanal(mydata[,-1],7,rotation="varimax",lower=0.05,scores="regression")
#add factor scores back to main dataset
mydata <- cbind(mydata,fitted_data$scores)
#linear regression model to predict my target variable using the factors I've extracted
mod1 <- lm(Target_Var ~ Factor1+ Factor2 + Factor3 + Factor4 + Factor5 + Factor6 + Factor7,data=mydata)
I have the latest 30 independent variables in a dataset called "new_data", and I'm just looking to calculate the 7 factor scores using the factor loadings already calculated, but I can't for the life of me figure out how.
Any help greatly appreciated.
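For reference, here is a minimal sketch of one way the scores could be computed by hand. It assumes regression (Thomson) scores with an orthogonal rotation such as varimax, in which case the scores are the standardised data times the inverse correlation matrix times the loadings. The helper score_new_data is hypothetical (it is not part of stats), and the data are synthetic stand-ins for mydata and new_data, with 6 variables and 2 factors to keep the example small. The key assumption is that the new rows must be standardised with the means and standard deviations of the original data, not their own.

```r
# Hypothetical helper (not part of stats): regression-method factor scores
# for new observations, reusing an existing factanal() fit.
# Assumes an orthogonal rotation (e.g. varimax) and scores = "regression".
score_new_data <- function(fit, train_x, new_x) {
  train_x <- as.matrix(train_x)
  new_x   <- as.matrix(new_x)[, colnames(train_x), drop = FALSE]
  # Standardise the new rows with the TRAINING means and standard deviations
  z_new <- scale(new_x,
                 center = colMeans(train_x),
                 scale  = apply(train_x, 2, sd))
  # Regression scores: Z %*% solve(R) %*% Lambda, with R the correlation
  # matrix used in the fit and Lambda the (rotated) loadings
  z_new %*% solve(fit$correlation, unclass(loadings(fit)))
}

# Synthetic stand-in data: 6 observed variables driven by 2 latent factors
set.seed(42)
n <- 300
latent <- matrix(rnorm(n * 2), n, 2)
true_loadings <- matrix(c(0.8, 0.7, 0.6, 0,   0,   0,
                          0,   0,   0,   0.8, 0.7, 0.6), ncol = 2)
x <- latent %*% t(true_loadings) + matrix(rnorm(n * 6, sd = 0.5), n, 6)
colnames(x) <- paste0("V", 1:6)
target <- drop(latent %*% c(1, -0.5)) + rnorm(n, sd = 0.3)

# Hold out the last row to play the role of new_data
train_x <- x[-n, ]
new_x   <- x[n, , drop = FALSE]

fit <- factanal(train_x, 2, rotation = "varimax", lower = 0.05,
                scores = "regression")
train_df <- data.frame(Target_Var = target[-n], fit$scores)
mod <- lm(Target_Var ~ Factor1 + Factor2, data = train_df)

# Sanity check: the helper reproduces factanal's own scores on the training data
stopifnot(isTRUE(all.equal(unname(score_new_data(fit, train_x, train_x)),
                           unname(fit$scores))))

# Scores for the new row, then the lm prediction
new_scores <- score_new_data(fit, train_x, new_x)
predict(mod, newdata = as.data.frame(new_scores))
```

With the objects from the question, the call would presumably be score_new_data(fitted_data, train_x, new_data), where train_x holds the original 30 predictor columns as they were passed to factanal (that is, before the cbind step appends the factor scores to mydata).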
dput(head(mydata)) and dim(mydata) would be useful for replication – Jonny Phelps