I'm looking to conduct a linear regression in R to model the effects of 5 independent variables on 376 columns of data.
I have a large matrix (541 rows and 402 columns) named 'dd', and I want to use only certain columns from it as IVs and DVs in the regression: 376 specific columns as my DVs and 5 columns as my IVs. I have used the column names (for example 'column_42') as indices, separately for the IVs and DVs:
IVind=paste0('column_',c(4,14,15,24,43)) #index for IV
DVind=paste0('column_',c(10:13, 17:18, 26, 28, 49:54, 58, 60, 1001:1180, 2001:2180)) #index for DV
IV <-(dd[,IVind]) #save independent variables in 'IV'
DV <-(dd[,DVind]) #save dependent variables in 'DV'
I have tried plugging IV and DV into a linear regression like so:
try <- lm(DV~IV)
but have received the following error:
Error in `[[<-.data.frame`(`*tmp*`, i, value = c(2113L, 2031L, 1971L, :
  replacement has 203040 rows, data has 540
Is there any way I can get around this error? I suspect it may be because my IVs and DVs are saved in separate matrices.
I've tried to index dd directly in the regression function:
lm(dd[,DVind]~dd[,IVind])
only to receive the same error.
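For reference, here is a minimal sketch of the setup on toy data, along with one thing that may be worth trying. It assumes dd is actually a data frame rather than a matrix (the `[[<-.data.frame` in the error suggests so). lm() is documented to fit one model per column when the response is a matrix, so the sketch coerces both sides with as.matrix() first. The names dd_toy, IVind_toy and DVind_toy are hypothetical stand-ins, not my real data:

```r
# Toy stand-in for dd (hypothetical data; the real dd has 541 rows and 402 columns)
set.seed(1)
dd_toy <- as.data.frame(matrix(rnorm(20 * 6), nrow = 20))
names(dd_toy) <- paste0("column_", 1:6)

IVind_toy <- paste0("column_", 1:2)  # index for the toy IVs
DVind_toy <- paste0("column_", 3:6)  # index for the toy DVs

# Coerce both sides to matrices: with a matrix response, lm() fits a
# separate least-squares model to each response column.
Y <- as.matrix(dd_toy[, DVind_toy])
X <- as.matrix(dd_toy[, IVind_toy])
fit <- lm(Y ~ X)

# One column of coefficients per DV: (intercept + 2 IVs) rows by 4 DVs
dim(coef(fit))
```

If this carries over to the real data, the equivalent call would presumably be lm(as.matrix(dd[,DVind]) ~ as.matrix(dd[,IVind])), though I have not confirmed that it behaves the same with all 376 columns.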
Any help is highly appreciated, thank you!