1
votes

I am trying to determine which columns were sampled when I randomly sample within each row of a matrix. The function sample does not appear to report which positions were actually sampled. A simple matching routine would solve the problem if all values were unique, but they are not in my case, so that will not work.

x <- c(2,3,5,1,6,7,2,3,5,6,3,5)
y <- matrix(x,ncol=4,nrow=3)
random <- t(apply(y,1,sample,2,replace=FALSE))

y

     [,1] [,2] [,3] [,4]
[1,]    2    1    2    6
[2,]    3    6    3    3
[3,]    5    7    5    5

random

     [,1] [,2]
[1,]    2    6
[2,]    3    3
[3,]    5    5

With repeated values in the original matrix, I cannot tell if random[1,1] was sampled from column 1 or column 3, since they both have a value of 2. Hence, matching won't work here.

Alongside the matrix "random", I would also like an identically sized matrix that gives the column from which each value was sampled. For example:

     [,1] [,2]
[1,]    1    4
[2,]    1    3
[3,]    3    4

Thanks!

1 Answer

5
votes

Sample the column indices rather than the values, and keep those indices in their own matrix, so there is nothing to match afterwards. E.g., using y again:

y
#     [,1] [,2] [,3] [,4]
#[1,]    2    1    2    6
#[2,]    3    6    3    3
#[3,]    5    7    5    5

set.seed(42)
randkey <- t(replicate(nrow(y),sample(1:ncol(y),2)))
#     [,1] [,2]
#[1,]    4    3
#[2,]    2    3
#[3,]    3    2

random <- matrix(y[cbind(c(row(randkey)), c(randkey))], nrow(y))
#     [,1] [,2]
#[1,]    6    2
#[2,]    6    3
#[3,]    5    7
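
If this is needed more than once, the two steps above could be wrapped in a small helper. The following is only a sketch: the name sample_cols_by_row and its arguments are hypothetical, and it uses sample.int plus an explicit matrix() call instead of replicate so that the result keeps its shape when only one column is drawn per row. No expected output is shown because the draws depend on the seed and R version.

```r
# Hypothetical helper (not part of the original answer): draw k columns per row
# without replacement and return both the values and the columns they came from.
sample_cols_by_row <- function(m, k) {
  # One set of k column indices per row of m, stacked row by row
  idx <- matrix(
    unlist(lapply(seq_len(nrow(m)), function(i) sample.int(ncol(m), k))),
    nrow = nrow(m), byrow = TRUE
  )
  # Index m with (row, column) pairs; c() flattens both matrices column-wise,
  # so the values land in the same positions as their indices
  vals <- matrix(m[cbind(c(row(idx)), c(idx))], nrow = nrow(m))
  list(values = vals, columns = idx)
}

y <- matrix(c(2, 3, 5, 1, 6, 7, 2, 3, 5, 6, 3, 5), nrow = 3, ncol = 4)
res <- sample_cols_by_row(y, 2)

# Sanity check: each sampled value equals y at its recorded (row, column)
ok <- all(sapply(seq_len(nrow(y)), function(i) {
  all(res$values[i, ] == y[i, res$columns[i, ]])
}))
ok
```

Here res$values plays the role of random and res$columns the role of randkey, so duplicated values in y no longer cause any ambiguity.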