I have a data frame named “dat” with 10 numeric variables (var1, var2, var3, var4, var5, … var10), each with several observations:
dat
var1 var2 var3 var4 var5 var6 var7 var8 var9 var10
1 12 5 18 19 12 17 11 16 18 10
2 3 2 10 6 13 17 11 16 18 10
3 13 15 14 13 1 17 11 16 18 10
4 17 11 16 18 10 17 11 16 18 10
5 9 13 8 8 7 17 11 16 18 10
6 15 6 20 17 3 17 11 16 18 10
7 12 5 18 19 12 17 11 16 18 10
8 3 2 10 6 13 17 11 16 18 10
9 13 15 14 13 1 17 11 16 18 10
...
I would like to write code that repeats the same analysis for every variable in the data frame except the first. It should fit a linear regression between var1 and each of the other variables (var2, var3, var4, var5, …), one at a time, using the lm() function.
e.g. cycle 1: linear regression between var1 and var2
lm(var1 ~ var2, data = dat)
cycle 2: linear regression between var1 and var3
lm(var1 ~ var3, data = dat)
cycle 3: linear regression between var1 and var4
lm(var1 ~ var4, data = dat)
and so on…
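To make the cycles concrete, here is a rough sketch of the kind of loop I have in mind. It is only an illustration under some assumptions: the toy "dat" contains just var1 to var4 from the rows shown above, and the names "predictors" and "fits" are placeholders I made up. It uses lapply() with reformulate() to build each formula from the column name.

```r
# Toy data: var1..var4 from the rows shown above (other columns omitted)
dat <- data.frame(
  var1 = c(12, 3, 13, 17, 9, 15, 12, 3, 13),
  var2 = c(5, 2, 15, 11, 13, 6, 5, 2, 15),
  var3 = c(18, 10, 14, 16, 8, 20, 18, 10, 14),
  var4 = c(19, 6, 13, 18, 8, 17, 19, 6, 13)
)

# Every column except the first is treated as a predictor
predictors <- names(dat)[-1]

# One cycle per predictor: reformulate() turns the string "var2"
# into the formula var1 ~ var2, which is then passed to lm()
fits <- lapply(predictors, function(p) {
  lm(reformulate(p, response = "var1"), data = dat)
})
names(fits) <- predictors

# Same coefficients as lm(var1 ~ var2, data = dat)
coef(fits$var2)
```

Each element of "fits" is an ordinary lm object, so summary(fits$var3) and similar calls should work as usual.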
I would also like the results from each cycle to be saved in a new data frame named “results”, with the following structure:
Var_tested Correlation_coefficient P_value_correlation R_squared
Var2 corr_coeff_var2 p_value_var2 R_sq_var2
Var3 corr_coeff_var3 p_value_var3 R_sq_var3
Var4 corr_coeff_var4 p_value_var4 R_sq_var4
Each row would report the results for one tested variable. Is this possible?
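For illustration, this is roughly the shape of result I am after, as a sketch rather than a finished solution. I am assuming that "correlation coefficient" means Pearson's r from cor.test() and that R squared comes from summary() of the lm() fit; "predictors" and "rows" are placeholder names. The toy data uses only var1 to var5 from the rows shown, because var6 to var10 are constant in those rows and a constant predictor has no slope to estimate.

```r
# Toy data: var1..var5 from the rows shown above
dat <- data.frame(
  var1 = c(12, 3, 13, 17, 9, 15, 12, 3, 13),
  var2 = c(5, 2, 15, 11, 13, 6, 5, 2, 15),
  var3 = c(18, 10, 14, 16, 8, 20, 18, 10, 14),
  var4 = c(19, 6, 13, 18, 8, 17, 19, 6, 13),
  var5 = c(12, 13, 1, 10, 7, 3, 12, 13, 1)
)

predictors <- names(dat)[-1]

# Build one single-row data frame per predictor
rows <- lapply(predictors, function(p) {
  fit <- lm(reformulate(p, response = "var1"), data = dat)
  ct  <- cor.test(dat$var1, dat[[p]])  # Pearson correlation by default
  data.frame(
    Var_tested              = p,
    Correlation_coefficient = unname(ct$estimate),
    P_value_correlation     = ct$p.value,
    R_squared               = summary(fit)$r.squared
  )
})

# Stack the rows into the final table
results <- do.call(rbind, rows)
results
```

With a single predictor and an intercept, R_squared should equal the square of Correlation_coefficient, which gives a quick sanity check on the table.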
Thank you so much for your help!