Create matrix with aggregated values for row groups

Question

I have a large matrix that contains various features extracted from microscopic cell images. The features are distributed across the columns, the individual cells across the rows. However, the measurements come from time-lapse microscopy, so each individual cell occupies 90 rows (time points) in that matrix. The matrix therefore has the dimensions [cell_amount*90; feature_amount].

My goal is to:

1. calculate the difference of subsequent time points for each cell (the "derivative" of the time series), and then
2. create a new matrix that contains an aggregation of those differences for each cell (so that new matrix has the dimensions [cell_amount; feature_amount]).

I set up some code in R to test my problem, with 4 cells, 4 features (columns) and 3 time points per cell. The first cell is on rows 1-3, the second on rows 4-6, and so on. From this I calculate the absolute difference between consecutive values:

A <- matrix(sample(1:100, 4*12), ncol = 4)
B <- abs( A - dplyr::lag(A) )
B[seq(1,nrow(B), 3),] <- NA

This results in a matrix where the first row of each cell contains NA values:

       [,1] [,2] [,3] [,4]
[1,]    NA   NA   NA   NA
[2,]    82   29   54   22
[3,]    32   44   18   31
[4,]    NA   NA   NA   NA
[5,]    22   61   10   33
[6,]    19   64   54   35
[7,]    NA   NA   NA   NA
[8,]    59   18    6   10
[9,]    34   47   70    6
[10,]   NA   NA   NA   NA
[11,]   60   23   68   22
[12,]   17   13   12    9

The resulting matrix containing an aggregation for those values for each cell, in this case the variance, should then look like:

       [,1]   [,2]  [,3]  [,4]
[1,]    1250  112.5 648   40.5
[2,]    4.5   4.5   968   2
[3,]    312.5 420.5 2048  8
[4,]    924.5 50    1568  84.5

How can I calculate this new matrix in R? Any help is appreciated.

So what's the desired output for this input? Can you give specific values so that possible solutions can be tested? — MrFlick
Thank you for your input. I reformulated the problem and gave an expected output matrix. The values are for testing purposes, exactly. — user3182899

MrFlick · Accepted Answer · 2017-04-12T14:49:07

Because you used a random sample without a seed, I can't re-create your A matrix. However, here's a recreation of your B matrix.

B <- matrix(scan(text="
NA   NA   NA   NA
82   29   54   22
32   44   18   31
NA   NA   NA   NA
22   61   10   33
19   64   54   35
NA   NA   NA   NA
59   18    6   10
34   47   70    6
NA   NA   NA   NA
60   23   68   22
17   13   12    9"), ncol=4, byrow=T)

If you really want to keep this a matrix, you can reshape it into a three-dimensional array (time point x cell x feature) and then use apply over the cell and feature margins to get the value of interest, for example

apply(array(B, dim=c(3,4,4)), 2:3, var, na.rm=TRUE)
#        [,1]  [,2] [,3] [,4]
# [1,] 1250.0 112.5  648 40.5
# [2,]    4.5   4.5  968  2.0
# [3,]  312.5 420.5 2048  8.0
# [4,]  924.5  50.0 1568 84.5
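
The reshape works because R stores matrices column by column, so consecutive blocks of rows fall into the second array dimension. A small sketch of that index mapping, assuming the rows are ordered cell by cell with the same number of time points per cell (the names `n_time`, `n_cell`, `n_feat`, `M` and `arr` are made up for this illustration):

```r
# Assumed toy dimensions matching the example: 3 time points, 4 cells, 4 features
n_time <- 3; n_cell <- 4; n_feat <- 4

# Dummy matrix whose entries are just their own linear index
M <- matrix(seq_len(n_time * n_cell * n_feat), ncol = n_feat)

# Same data, viewed as [time point, cell, feature]
arr <- array(M, dim = c(n_time, n_cell, n_feat))

# Time point t of cell k sits in row (k - 1) * n_time + t of the matrix
k <- 3; t <- 2; f <- 4
stopifnot(arr[t, k, f] == M[(k - 1) * n_time + t, f])

# All time points of cell 3 for feature 4 are rows 7-9 of column 4
stopifnot(identical(arr[, 3, 4], M[7:9, 4]))

# Aggregating over margins 2:3 therefore yields one value per cell and feature
cell_sums <- apply(arr, 2:3, sum)
dim(cell_sums)
```

For the real data, `n_time` would presumably be 90 and `n_cell` would be `nrow(M) / 90`; this only holds if no cell has missing rows.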

You could also create a proper grouping variable and use aggregate()

row_sample <- rep(seq_len(nrow(B)/3), each=3)  # one group id per cell, 3 rows each
aggregate(B, list(row_sample), var, na.rm=TRUE)
#   Group.1     V1    V2   V3   V4
# 1       1 1250.0 112.5  648 40.5
# 2       2    4.5   4.5  968  2.0
# 3       3  312.5 420.5 2048  8.0
# 4       4  924.5  50.0 1568 84.5
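
For completeness, here is a rough end-to-end sketch in base R that goes from a raw matrix straight to the per-cell aggregate without the intermediate NA rows. It assumes the rows are ordered cell by cell with a fixed number of time points; the seed and the helper name `aggregate_diffs` are additions for illustration only, not part of the question.

```r
set.seed(42)  # assumption: seed added only to make the example reproducible
n_time <- 3   # time points per cell (90 in the real data)
A <- matrix(sample(1:100, 4 * 12), ncol = 4)

# Grouping variable: one id per cell, repeated for each of its time points
cell_id <- rep(seq_len(nrow(A) / n_time), each = n_time)

# Hypothetical helper: aggregate the absolute successive differences
# per cell (rows of the result) and per feature (columns of the result)
aggregate_diffs <- function(m, groups, fun = var) {
  per_cell <- lapply(split(seq_len(nrow(m)), groups), function(rows) {
    # diff() on a matrix differences each column, staying within this cell
    d <- abs(diff(m[rows, , drop = FALSE]))
    apply(d, 2, fun)
  })
  do.call(rbind, per_cell)
}

res <- aggregate_diffs(A, cell_id)
dim(res)
```

Because the differences are taken inside each cell's block of rows, no value from a neighbouring cell can leak in, and `fun` does not need `na.rm`.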
