I have a data frame with 376 columns and 2700 rows. Each block of 270 consecutive rows holds one subject's data (rows 1:270 are subject 1, rows 271:540 are subject 2, and so on), so there are 10 subjects in total.

I have a separate dataframe (2700 x 8) with my independent variables, once again 270 rows per subject.

I want to regress the 8 IVs out of my 376 DVs and obtain the residuals. The tricky part is that I want to run the regression separately per subject (i.e. a separate regression for every 270 rows). Here is some simulated data:

DV = matrix(rnorm(2700*376),ncol=376) #construct matrix for dependent variables

IV =  matrix(rnorm(2700*8),ncol=8) #matrix for independent variables

To obtain the residuals from the data as a whole, I would simply do

resid = residuals(lm(DV~IV))

But how would I adapt this so that the regression is done per subject (for every 270 rows)? Would it be easier if my IV and DV data frames were combined?

I'm new to R and any help is much appreciated, thank you.

Hi Aleya, it is hard to help without a reprex, but your case is actually quite well explained. I usually follow the approach used in this book: r4ds.had.co.nz/many-models.html - Bruno
Hi, why does your dependent variable have 376 columns? I would expect one column of y-values? - Valeri Voev
Hi Valeri, I have several dependent variables, and I want to put all of them into the model simultaneously. So each column is one DV. - Aleya Marzuki

1 Answer

A suggestion using a for loop and base R:

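result_list <- vector("list", 10)  # one slot per subject
first <- 1
for (i in 1:10) {
  last <- i * 270
  # fit the multivariate regression on this subject's 270 rows only
  result_list[[i]] <- residuals(lm(DV[first:last, ] ~ IV[first:last, ]))
  first <- last + 1
}
# stack the per-subject residuals back into one 2700 x 376 matrix
resid_by_subject <- do.call(rbind, result_list)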
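
The same idea can also be written without tracking row offsets by hand. The following is only a sketch, assuming the rows are ordered by subject as described in the question; the names subject, rows_by_subject, resid_list and resid_by_subject are made up for illustration and do not come from the original data.

```r
set.seed(1)                                   # simulated data, as in the question
n_subjects <- 10
n_rows     <- 270
DV <- matrix(rnorm(n_subjects * n_rows * 376), ncol = 376)
IV <- matrix(rnorm(n_subjects * n_rows * 8),   ncol = 8)

# Assumed layout: 270 consecutive rows per subject (hypothetical grouping vector)
subject <- rep(seq_len(n_subjects), each = n_rows)

# Row indices belonging to each subject
rows_by_subject <- split(seq_len(nrow(DV)), subject)

# One multivariate regression per subject; each element is a 270 x 376 matrix
resid_list <- lapply(rows_by_subject, function(idx) {
  Y <- DV[idx, , drop = FALSE]
  X <- IV[idx, , drop = FALSE]
  residuals(lm(Y ~ X))
})

# Stack the residuals back together in the original row order
resid_by_subject <- do.call(rbind, resid_list)
dim(resid_by_subject)
```

If the real data carries an actual subject ID column, that column could probably replace the constructed subject vector, provided the rows are sorted by subject before the residuals are stacked.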