1
votes

I am a beginner in R. I have two matrices with the same number of rows (let's say 10) and many columns. I want to fit a linear regression, using glm, between each row of matrixA and the corresponding row of matrixB, and store the residuals in a new matrix with the same number of rows as the original matrices:

matrixA <- read.table("fileA.txt", header=TRUE, row.names=1)
matrixB <- read.table("fileB.txt", header=TRUE, row.names=1)

for(i in 1:10) {
    response = as.matrix(matrixA)[i,]
    predictor = as.matrix(matrixB)[i,]
    fit <- glm(response ~ predictor)
    residuals[i,] <- fit$residuals
}

However, I am getting this error:

Error in residuals[1, ] <- fit$residuals : 
  incorrect number of subscripts on matrix

I looked up this error a bit, and thought that maybe it did not recognize fit$residuals as a vector, so I tried to specify it (as.vector(fit$residuals)), but that did not fix it.

Any idea on how I can fix this? Thank you!

Format of the matrices (both have the same format)

    a   b   c   d   f
A   1.0 2.0 3.0 4.0 5.0
B       …
C
D
E
F
G
H
I
J
Can I ask why my question is being downvoted? - arielle
Where does residuals come from? Do you really want it to be 2 Dimensional? - Benjamin Mohn
You can, but only the downvoter can answer. However, I can tell you I (and others) would be more willing to help if you provided data also. See here: stackoverflow.com/questions/5963269/… - sindri_baldur
I am not sure that I understand your question. Do you mean "residuals[i,] <- ..."? I am just naming the new matrix "residuals" because it will contain residuals values... - arielle
I posted the format of my data. Sorry I just didn't see how this would help - arielle

1 Answer

3
votes

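You need to preallocate the output as a matrix before the loop: an assignment like residuals[i,] <- ... fails with "incorrect number of subscripts on matrix" when residuals is not a two-dimensional object. However, it's easier and cleaner to use mapply. If you pass it two vectors (or lists), it iterates over both simultaneously and applies the function to each pair of elements. Thus we only need to split the matrices into lists, with one element per row.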

A <- matrix(1:9, 3)
B <- A * 3 + 2 + 1/A

t(mapply(function(a, b) {
  fit <- lm(b ~ a)
  residuals(fit)
}, split(A, letters[1:3]), split(B, letters[1:3])))
#           1           2          3
#a 0.10714286 -0.21428571 0.10714286
#b 0.03750000 -0.07500000 0.03750000
#c 0.01851852 -0.03703704 0.01851852

residuals(lm(B[1,] ~ A[1,]))
#        1          2          3 
#0.1071429 -0.2142857  0.1071429 
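
The letters[1:3] grouping above works because the example has exactly three rows. For matrices of other sizes, one possible sketch is to let row() supply the grouping factor. This assumes both inputs are plain numeric matrices with identical dimensions; the helper name row_residuals is made up for illustration.

```r
# Hypothetical helper: residuals of lm(B[i, ] ~ A[i, ]) for every row i.
row_residuals <- function(A, B) {
  stopifnot(is.matrix(A), is.matrix(B), identical(dim(A), dim(B)))
  # row(A) labels each element with its row index, so split() returns
  # one vector per row, however many rows there are.
  per_row <- mapply(function(a, b) residuals(lm(b ~ a)),
                    split(A, row(A)), split(B, row(B)),
                    SIMPLIFY = FALSE)
  # Stack the per-row residual vectors back into a matrix.
  do.call(rbind, per_row)
}

A <- matrix(1:9, 3)
B <- A * 3 + 2 + 1/A
res <- row_residuals(A, B)
```

Using SIMPLIFY = FALSE with rbind keeps the result a matrix with one row per regression, so no transpose is needed.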

Here is a for loop that does the same:

# Preallocate a matrix with one row per regression
result <- matrix(NA, nrow = nrow(A), ncol = ncol(A))
for (i in seq_len(nrow(A))) {
  result[i,] <- residuals(lm(B[i,] ~ A[i,]))
}
result
#           [,1]        [,2]       [,3]
#[1,] 0.10714286 -0.21428571 0.10714286
#[2,] 0.03750000 -0.07500000 0.03750000
#[3,] 0.01851852 -0.03703704 0.01851852
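
Applied to the setup in the question, a minimal sketch might look like the following. The two data frames are made-up stand-ins for whatever fileA.txt and fileB.txt contain, and the name res_mat is my own choice so that the output does not share a name with the residuals() function.

```r
# Made-up stand-ins for the data frames that read.table() would return.
set.seed(1)
dn <- list(LETTERS[1:10], c("a", "b", "c", "d", "f"))
matrixA <- as.data.frame(matrix(rnorm(50), nrow = 10, dimnames = dn))
matrixB <- as.data.frame(matrix(rnorm(50), nrow = 10, dimnames = dn))

# Convert to matrices once, outside the loop.
A <- as.matrix(matrixA)
B <- as.matrix(matrixB)

# Preallocate a matrix (not a vector) so that res_mat[i, ] is a valid target.
res_mat <- matrix(NA_real_, nrow = nrow(A), ncol = ncol(A),
                  dimnames = dimnames(A))

for (i in seq_len(nrow(A))) {
  response  <- A[i, ]
  predictor <- B[i, ]
  fit <- glm(response ~ predictor)  # gaussian family by default
  res_mat[i, ] <- residuals(fit)
}
```

With the default gaussian family, residuals(fit) gives the same values as the ordinary residuals from lm, so res_mat has the same layout as the loop above, one row per regression.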