Subsetting matrix using numerical index - subset command and for loop with error

Question

I have the following matrix "m" (nrow=2504, ncol=2) with two columns; one called ind (from index) and the other called headerline (IDs of samples):

> head(m)
     ind headerline
[1,] "1" "HG00096" 
[2,] "2" "HG00097" 
[3,] "3" "HG00099" 
[4,] "4" "HG00100" 
[5,] "5" "HG00101" 
[6,] "6" "HG00102" ...

And the following index vector called "index" (nr=385, nc=1):

> head(index)
  V1
1  1
2  4
3  9
4 12
5 13
6 16 ...

I want to subset the samples in the row positions marked by index (I want a new matrix with sample in row 1, sample in row 4, sample in row 9 and so forth). I came up with the following code:

 for i in index { dudosos<-subset(headerline,ind==i, select=c(headerline)) }

but it yields the following error:

Error: unexpected symbol in "for i"

I don't know what that error is telling me, it's too vague. Help? Thanks!

Desired output:

> head(m)               #or other name
         ind headerline
         "1" "HG00096"   
         "4" "HG00100" 
         "9" ...

emilliman5 · Accepted Answer · 2017-02-25T13:09:52

You can do this all in base:

m <- matrix(c("1", "2", "3", "4", "5", "6", "7", "8", "9", 
              "HG00096", "HG00097", "HG00098", "HG00099", "HG00100", "HG00101","HG00102", "HG00103", "HG00103"), ncol=2)
index <- c("1", "4", "9") 

m[m[, 1] %in% index, ]

Either this or @Gin_Salmon's answer is the best way to achieve your goal.
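In the question, index is a one-column data frame rather than a plain vector, so the same idea needs index$V1. Here is a minimal sketch, reusing the toy matrix from this answer and assuming index holds numeric row positions in a column named V1, as in the question's printout. It also shows plain positional indexing, which gives the same rows only when the ind column is simply the row number.

```r
# Toy matrix from the answer, with the column names used in the question
m <- matrix(c("1", "2", "3", "4", "5", "6", "7", "8", "9",
              "HG00096", "HG00097", "HG00098", "HG00099", "HG00100",
              "HG00101", "HG00102", "HG00103", "HG00103"),
            ncol = 2, dimnames = list(NULL, c("ind", "headerline")))

# Assumption: index is a data frame with a numeric column V1, as printed in the question
index <- data.frame(V1 = c(1, 4, 9))

# Keep the rows whose "ind" value appears in index$V1
# (%in% coerces the numbers to character before comparing)
by_value <- m[m[, "ind"] %in% index$V1, , drop = FALSE]

# Pick rows by position; equivalent here because ind equals the row number
by_position <- m[index$V1, , drop = FALSE]
```

drop = FALSE keeps the result a matrix even if only one row is selected.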

This is an explanation of why your code was not working:

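There are a few problems with your code:
1. The loop specification must be wrapped in parentheses: for (i in index) { ... }. The missing parenthesis is what triggers the "unexpected symbol" error, because R expects "(" right after "for".
2. subset() needs the data as its first argument, and the condition can only refer to columns by name if that data is a data frame rather than a matrix: subset(as.data.frame(m), ind == i, select = headerline)
3. Your loop overwrites dudosos at each iteration, so only the last match would survive. Append each result to dudosos instead, for example with rbind() as in the loop below.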
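Two of these points can be checked directly. The sketch below first captures the parser's message for the original loop header, then shows why the loop further down iterates over index$V1 rather than index. It assumes index is a one-column data frame as in the question; msg, n_df and n_col are just illustrative names.

```r
# 1. The original loop header fails at parse time, before any code runs
msg <- tryCatch(
  {
    parse(text = "for i in index { }")
    "parsed without error"
  },
  error = function(e) conditionMessage(e)
)

# 2. Looping over a data frame visits its columns, not its values
index <- data.frame(V1 = c(1, 4, 9))

n_df <- 0
for (i in index) n_df <- n_df + 1      # one pass: i is the whole V1 column

n_col <- 0
for (i in index$V1) n_col <- n_col + 1 # one pass per row position
```

Printing msg, n_df and n_col afterwards shows the parse error text and how many times each loop body ran.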

m <- matrix(c("1", "2", "3", "4", "5", "6", "7", "8", "9", 
              "HG00096", "HG00097", "HG00098", "HG00099", "HG00100", "HG00101","HG00102", "HG00103", "HG00103"), ncol=2)
index <- data.frame(V1= c("1", "4", "9"))
colnames(m) <- c("ind","headerline")
dudosos <- data.frame()
for (i in index$V1) { 
    dudosos <- rbind(dudosos, subset(x = as.data.frame(m) , 
                         subset = ind == i, select=headerline)) 
 }
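If you do want to keep the one-lookup-per-index structure, a variant that avoids growing dudosos inside the loop is to collect the pieces in a list and bind them once at the end. This is only a sketch on the same toy data; df and pieces are hypothetical helper names.

```r
m <- matrix(c("1", "2", "3", "4", "5", "6", "7", "8", "9",
              "HG00096", "HG00097", "HG00098", "HG00099", "HG00100",
              "HG00101", "HG00102", "HG00103", "HG00103"),
            ncol = 2, dimnames = list(NULL, c("ind", "headerline")))
index <- data.frame(V1 = c("1", "4", "9"))

# Convert once so columns can be referenced by name
df <- as.data.frame(m)

# One single-column data frame per requested index value
pieces <- lapply(index$V1, function(i) df[df$ind == i, "headerline", drop = FALSE])

# Bind all pieces together in a single call
dudosos <- do.call(rbind, pieces)
```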

Again, the other solutions are much better, but sometimes it also helps to see why the code you originally wrote was not working.
