Subset data frame with matrix of logical values

Question

Problem

I have data on two measures for four individuals, stored in wide format. The measures are x and y and the individuals are A, B, C, D. The data frame looks like this:

d <- data.frame(matrix(sample(1:100, 40, replace = FALSE), ncol = 8))
colnames(d) <- paste(rep(c("x.", "y."),each = 4), rep(LETTERS[1:4], 2), sep ="")
d

  x.A x.B x.C x.D y.A y.B y.C y.D
1  56  65  42  96 100  76  39  26
2  19  93  94  75  63  78   5  44
3  22  57  15  62   2  29  89  79
4  49  13  95  97  85  81  60  37
5  45  38  24  91  23  82  83  72

Now, what I would like to obtain for each row is the value of y for the individual with the lowest value of x.

So in the example above, the lowest value of x in row 1 is for individual C. Hence, for row 1 I would like to obtain y.C which is 39.

In the example, the resulting vector should be 39, 63, 89, 81, 83.

Approach

I tried to get there by first building a logical matrix from the x columns of d that marks, for each row, which column holds the minimum.

t(apply(d[,1:4], 1, function(x) min(x) == x))

       x.A   x.B   x.C   x.D
[1,] FALSE FALSE  TRUE FALSE
[2,]  TRUE FALSE FALSE FALSE
[3,] FALSE FALSE  TRUE FALSE
[4,] FALSE  TRUE FALSE FALSE
[5,] FALSE FALSE  TRUE FALSE

Now I want to use this matrix to subset the y columns of the data frame, but I cannot find a way to do it.

Any help is much appreciated. Suggestions for a totally different - more elegant - approach are highly welcome too.

Thanks a lot!

You can subset the set of y values directly with the Boolean mask: d[,5:8][t(apply(d[,1:4], 1, function(x) min(x) == x))] — alistaire
You can check the solution below. It should be faster compared to the apply method in your post — akrun
@alistaire That method wouldn't give the expected output in the correct order. I would use t(d[,5:8])[apply(d[,1:4], 1, function(x) min(x) == x)] — akrun

akrun · Accepted Answer · 2016-03-09T09:50:32

We split the dataset into the columns starting with 'x' ('dx') and those starting with 'y' ('dy'). Calling max.col on the negated 'dx' gives the column index of the minimum value in each row. We then cbind that with the row index and use the resulting two-column matrix to extract the corresponding elements of 'dy'.

 dx <- d[grep('^x', names(d))]
 dy <- d[grep('^y', names(d))]
 dy[cbind(1:nrow(dx),max.col(-dx, 'first'))]
 #[1] 39 63 89 81 83
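
To see what the one-liner does, it can help to look at the intermediate pieces. This is only a sketch on the data from this post, rebuilt with data.frame() so it runs on its own. The names col_of_min, idx and res are made up for illustration.

```r
d <- data.frame(
  x.A = c(56, 19, 22, 49, 45), x.B = c(65, 93, 57, 13, 38),
  x.C = c(42, 94, 15, 95, 24), x.D = c(96, 75, 62, 97, 91),
  y.A = c(100, 63, 2, 85, 23), y.B = c(76, 78, 29, 81, 82),
  y.C = c(39, 5, 89, 60, 83),  y.D = c(26, 44, 79, 37, 72)
)
dx <- d[grep('^x', names(d))]
dy <- d[grep('^y', names(d))]

# Negating dx turns each row's minimum into its maximum, so max.col()
# returns the column position of the smallest x in every row.
col_of_min <- max.col(-dx, 'first')
col_of_min
#[1] 3 1 3 2 3

# Two-column matrix of (row, column) pairs, one pair per row of dy
idx <- cbind(seq_len(nrow(dx)), col_of_min)

# Indexing with that matrix picks exactly one element per pair
res <- dy[idx]
res
#[1] 39 63 89 81 83
```

Position 3 corresponds to individual C, 1 to A and 2 to B, which matches the logical matrix in the question.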

The above can easily be converted to a function

get_min <- function(dat){
    dx <- dat[grep('^x', names(dat))]
    dy <- dat[grep('^y', names(dat))]
    dy[cbind(1:nrow(dx), max.col(-dx, 'first'))]
}
get_min(d)
#[1] 39 63 89 81 83
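
A note on the 'first' argument: it decides what happens when two individuals share the lowest x in a row. Without it, max.col breaks ties at random (its default ties.method), so the result could change between runs. A small sketch with a made-up data frame (tied is hypothetical and not part of the question's data):

```r
get_min <- function(dat){
  dx <- dat[grep('^x', names(dat))]
  dy <- dat[grep('^y', names(dat))]
  dy[cbind(seq_len(nrow(dx)), max.col(-dx, 'first'))]
}

# Hypothetical data: row 1 has a tie for the smallest x (x.A == x.B)
tied <- data.frame(x.A = c(1, 5), x.B = c(1, 2),
                   y.A = c(10, 30), y.B = c(20, 40))

# 'first' resolves the tie in row 1 in favour of the leftmost column (A)
out <- get_min(tied)
out
#[1] 10 40
```

Whether the leftmost individual is the right choice for ties depends on the data; 'last' is the other deterministic option.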

Or, using the OP's apply-based method

t(d[,5:8])[apply(d[,1:4], 1, function(x) min(x) == x)] 
#[1] 39 63 89 81 83
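
The transpose is what keeps the result in row order. apply() over rows returns one column per row of d, so the mask is 4 x 5, and logical indexing walks a matrix column by column. A sketch comparing both orientations on the data from this post (mask, in_row_order and in_col_order are illustrative names):

```r
d <- data.frame(
  x.A = c(56, 19, 22, 49, 45), x.B = c(65, 93, 57, 13, 38),
  x.C = c(42, 94, 15, 95, 24), x.D = c(96, 75, 62, 97, 91),
  y.A = c(100, 63, 2, 85, 23), y.B = c(76, 78, 29, 81, 82),
  y.C = c(39, 5, 89, 60, 83),  y.D = c(26, 44, 79, 37, 72)
)

# One column per row of d: a 4 x 5 logical matrix
mask <- apply(d[, 1:4], 1, function(x) min(x) == x)
dim(mask)
#[1] 4 5

# t(d[, 5:8]) is also 4 x 5, so extraction goes through d row by row
in_row_order <- t(d[, 5:8])[mask]
in_row_order
#[1] 39 63 89 81 83

# With the 5 x 4 orientation, extraction runs down each y column instead
in_col_order <- d[, 5:8][t(mask)]
in_col_order
#[1] 63 81 39 89 83
```

Both versions pick the same five values, but only the transposed one returns them in the order of the rows.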

data

d <- structure(list(
    x.A = c(56L, 19L, 22L, 49L, 45L),
    x.B = c(65L, 93L, 57L, 13L, 38L),
    x.C = c(42L, 94L, 15L, 95L, 24L),
    x.D = c(96L, 75L, 62L, 97L, 91L),
    y.A = c(100L, 63L, 2L, 85L, 23L),
    y.B = c(76L, 78L, 29L, 81L, 82L),
    y.C = c(39L, 5L, 89L, 60L, 83L),
    y.D = c(26L, 44L, 79L, 37L, 72L)),
    .Names = c("x.A", "x.B", "x.C", "x.D", "y.A", "y.B", "y.C", "y.D"),
    class = "data.frame",
    row.names = c("1", "2", "3", "4", "5"))
