I have a large distance matrix and a corresponding data frame. A miniature example:

A = matrix( c(0, 1, 2, 1, 0, 2, 2, 2, 0), nrow=3, ncol=3, byrow = TRUE)        
dimnames(A) = list(c("A1", "B1", "C1"), c("A1", "B1", "C1"))
df <- data.frame("ID" = c("A1", "B1", "C1"), "Triplicate" = c("T1", "T1", "T1"))

A1, B1 and C1 are technical replicates of each other, as indicated by their identical value in the Triplicate column of df. Matrix A holds the distance (dissimilarity) between each pair of samples. How can I use the grouping in df to append columns to df such that, for each sample, the new value is:

a. the minimum of the two distances to the other samples in its triplicate. For example, in matrix A the A1:B1 distance is 1 and the A1:C1 distance is 2, so the value for A1 in a new column minimum is 1. Likewise, B1 gets the minimum of its distances to A1 and C1, and C1 gets the minimum of its distances to A1 and B1, giving:

df$minimum <- c(1, 1, 2) 
df

b. Similarly, I would like to append a column average holding the mean of the same two distances. For A1, the average of its distances to B1 and C1 is (1+2)/2 = 1.5, and likewise for B1 and C1, giving:

df$average <- c(1.5, 1.5, 2)
df

I hope this is clear. I have many such triplicates, so it is important that distances are matched using the Triplicate column. I will answer any questions promptly.

Thanks!

1 Answer

How about

df$minimum <- apply(A, 1, function(x) min(x[x > 0]))
df$average <- apply(A, 1, function(x) mean(x[x > 0]))
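Note that the two lines above take the minimum and mean over every non-zero entry in a row, so they match the expected output only while the matrix contains a single triplicate. With many triplicates, one possible approach is to restrict each row to the sample's own triplicate mates first. The following is a minimal sketch under some assumptions: the second triplicate (A2, B2, C2 in T2) and all of its distances are made up for illustration, within_dist and mates are hypothetical names, and every ID in df is assumed to appear in the dimnames of A.

```r
# Toy data from the question, extended with a made-up second triplicate T2
ids <- c("A1", "B1", "C1", "A2", "B2", "C2")
A <- matrix(c(0, 1, 2, 5, 5, 5,
              1, 0, 2, 5, 5, 5,
              2, 2, 0, 5, 5, 5,
              5, 5, 5, 0, 3, 4,
              5, 5, 5, 3, 0, 6,
              5, 5, 5, 4, 6, 0),
            nrow = 6, byrow = TRUE, dimnames = list(ids, ids))
df <- data.frame(ID = ids,
                 Triplicate = rep(c("T1", "T2"), each = 3),
                 stringsAsFactors = FALSE)

# For each sample, pull its distances to the other members of its triplicate
within_dist <- lapply(seq_len(nrow(df)), function(i) {
  id    <- as.character(df$ID[i])
  same  <- df$Triplicate == df$Triplicate[i] & df$ID != id
  mates <- as.character(df$ID[same])
  A[id, mates]
})

df$minimum <- sapply(within_dist, min)
df$average <- sapply(within_dist, mean)
df
```

Because the lookup goes through the row and column names of A, the rows of df do not have to be in the same order as the matrix. It also keeps a genuine zero distance between two different samples, which the x > 0 filter would drop. A sample with no mates in its group would give Inf (with a warning) for minimum and NaN for average.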