I'm trying to write a loop that fits a linear regression for each of several independent variables (n = 6) against many dependent variables (n = 1000).
Here is some example data, with age, sex, and education representing my independent variables of interest and testscore_* being my dependent variables.
df = data.frame(ID = c(1001, 1002, 1003, 1004, 1005, 1006, 1007, 1008, 1009, 1010, 1011),
                age = as.numeric(c('56', '43','59','74','61','62','69','80','40','55','58')),
                sex = as.numeric(c('0','1','0','0','1','1','0','1','0','1','0')),
                testscore_1 = as.numeric(c('23','28','30','15','7','18','29','27','14','22','24')),
                testscore_2 = as.numeric(c('1','3','2','5','8','2','5','6','7','8','2')),
                testscore_3 = as.numeric(c('18','20','19','15','20','23','19','25','10','14','12')),
                education = as.numeric(c('5','4','3','5','2', '1','4','4','3','5','2')))
I have working code that allows me to run a regression model for multiple DVs (which I'm sure more experienced R users will dislike for its lack of efficiency):
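y <- as.matrix(df[4:6])  # columns 4 to 6: testscore_1, testscore_2, testscore_3
# model for age
lm_results <- lm(y ~ age, data = df)
# tidy the model output once, then write it out
regression_results <- broom::tidy(lm_results)
write.csv(regression_results, "lm_results_age.csv")
# standardized coefficients (lm.beta package, called by namespace like broom)
standardized_coefficients <- lm.beta::lm.beta(lm_results)
age_standardize_results <- coef(standardized_coefficients)
write.csv(age_standardize_results, "lm_results_age_standardized_coefficients.csv")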
I would then repeat all of this, manually replacing age with sex and then education.
Does anyone have a more elegant way of running this - for example, by way of a loop for all IVs of interest (i.e. age, sex and education)?
I would also greatly appreciate a quick way of combining the broom::tidy(lm_results) output with the standardized coefficients from lm.beta::lm.beta, i.e. adding the standardized regression coefficients to the main model output.
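To make the target concrete, here is a rough base-R sketch of the kind of output I'm after: one row per IV/DV pair, with the raw and standardized slope side by side. It assumes one single-predictor model per pair. The names fit_one, ivs, dvs, pairs and results are placeholders of mine, and the standardized slope is computed by hand as b * sd(x) / sd(y) instead of calling lm.beta, so treat it as an illustration of the shape and not a drop-in replacement.

```r
# Same example data as above, written compactly so the sketch runs on its own.
df <- data.frame(
  ID = 1001:1011,
  age = c(56, 43, 59, 74, 61, 62, 69, 80, 40, 55, 58),
  sex = c(0, 1, 0, 0, 1, 1, 0, 1, 0, 1, 0),
  testscore_1 = c(23, 28, 30, 15, 7, 18, 29, 27, 14, 22, 24),
  testscore_2 = c(1, 3, 2, 5, 8, 2, 5, 6, 7, 8, 2),
  testscore_3 = c(18, 20, 19, 15, 20, 23, 19, 25, 10, 14, 12),
  education = c(5, 4, 3, 5, 2, 1, 4, 4, 3, 5, 2)
)

ivs <- c("age", "sex", "education")                      # independent variables
dvs <- grep("^testscore_", names(df), value = TRUE)      # dependent variables

# Hypothetical helper: fit dv ~ iv and return one row of results.
fit_one <- function(iv, dv, data) {
  fit <- lm(reformulate(iv, response = dv), data = data)
  est <- summary(fit)$coefficients[iv, ]                 # predictor row, not the intercept
  data.frame(
    iv = iv,
    dv = dv,
    estimate = est[["Estimate"]],
    std.error = est[["Std. Error"]],
    statistic = est[["t value"]],
    p.value = est[["Pr(>|t|)"]],
    # standardized slope for a single-predictor model (assumption: b * sd(x) / sd(y))
    std.estimate = est[["Estimate"]] * sd(data[[iv]]) / sd(data[[dv]])
  )
}

# Every IV/DV combination, fitted one at a time and stacked into one table.
pairs <- expand.grid(iv = ivs, dv = dvs, stringsAsFactors = FALSE)
results <- do.call(rbind, lapply(seq_len(nrow(pairs)), function(i) {
  fit_one(pairs$iv[i], pairs$dv[i], df)
}))

# write.csv(results, "lm_results_all.csv", row.names = FALSE)
```

With the real data, dvs would hold the 1000 outcome columns and ivs the six predictors, so results would have one row per combination. Fitting each DV separately is slower than a single matrix-response lm() call, but it keeps the per-model output easy to stack.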