I set up the problem as follows:
1) I have n (in this case 2)-matrices in a list
2) each matrix has the same number of rows (508) but different number of columns
3) For each matrix in the list I want to group the data by "Cohort Number". I then want to move to column 3 ("Alpha(t-1)") and within each group that exists identify the row with the highest "Alpha(t-1)".
For example, say i have 3 groups as identified by "Cohort Number" (as occurs in Mat2 below). I group the data into subsets based on this number. Within all rows that correspond to cohort number =1, i look at column "Alpha (t-1)" and select the row with the highest value of "Alpha(t-1)". I store this row of data in Output1[[2]] (let Output1 be a list of output matrices). I store all remaining rows in Output2[[2]] (let Output2 be a list of all "remaining output" matrices). I then repeat the process for Cohort number = 2. I store the row with the highest alpha(t-1) in Output1[[2]], and all remaining rows in Output2... and so on and so forth. So Output1[[2]] would have 3 rows (and x columns depending on length of Mat2). Output2[[2]] would have 508-3 rows and x columns depending on length of Mat2. In this case Mat2 has 142 columns. so ncol(Output1[[2]])==ncol(Output2[[2]])=142. To reaffirm. Output1[[2]] is 3*142 and Output2[[2]] is 505*142.
To take a second example - Mat1. Mat1 is 508 *55. There is only one cohort number (Cohort Number ==1). This implies Output1[[1]] is 1*55 and Output2[[1]] is 507*55.
I need to apply this process for all matrices in the list. In my actual example i have over 100 matrices and would prefer not doing this one-by-one).
Setup:
mat1 <- data.frame(matrix(nrow=508, ncol =55))
colnames(mat1)[c(1:3,55)] <- c("ID", "Alpha(t)", "Alpha(t-1)", "Cohort Number")
mat1[,1] = 1:508
mat1[1:20,2] = rnorm(20, 0,1)
mat1[1:20,3] = rnorm(20,0,1)
mat1[1:20,55] = rep(1,20)
mat1[1:20, 4:54] = rnorm(20*(55-4),0,0.1)
mat2 <- data.frame(matrix(nrow=508, ncol =142))
colnames(mat2)[c(1:3,142)] <- c("ID", "Alpha(t)", "Alpha(t-1)", "Cohort Number")
seq3 <- seq(1,3,1)
mat2[,1] = 1:508
mat2[1:77,2] = rnorm(77, 0,1)
mat2[1:77,3] = rnorm(77,0,1)
mat2[1:77, 142] = c(rep(seq3,25),1,2)
mat2[1:77,4:141] = rnorm(77*(142-4),0,0.1)
list1 <- list(mat1,mat2)
Output1 <- c()
Output2 <- c()