I have a large distance matrix and a corresponding data frame. A miniature example:
A = matrix( c(0, 1, 2, 1, 0, 2, 2, 2, 0), nrow=3, ncol=3, byrow = TRUE)
dimnames(A) = list(c("A1", "B1", "C1"), c("A1", "B1", "C1"))
df <- data.frame("ID" = c("A1", "B1", "C1"), "Triplicate" = c("T1", "T1", "T1"))
A1, B1 and C1 are technical replicates of each other, as indicated by the identical value in the Triplicate column of df. Matrix A gives the distance (dissimilarity) between each pair of samples. How can I use the matrix to append columns to df such that, for each sample, the new column holds:
a. The minimum of the two distances to the other samples in its triplicate. For example, in matrix A the A1:B1 distance is 1 and the A1:C1 distance is 2, so the value for A1 in the df column minimum is 1. Likewise, B1 gets the minimum of its distances to A1 and C1, and C1 gets the minimum of its distances to A1 and B1, giving:
df$minimum <- c(1, 1, 2)
df
b. The average of the same two distances, in another column named average. For A1, the average of its distances to B1 and C1 is (1+2)/2 = 1.5, and likewise for B1 and C1, giving:
df$average <- c(1.5, 1.5, 2)
df
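To make the intended computation concrete, here is a rough base R sketch of what I mean on the toy data. The helper name within_group_stat is made up for illustration, and it assumes every ID in df appears in the dimnames of A. I have not checked how well it scales to the full matrix.

```r
# Toy data from the example above
A <- matrix(c(0, 1, 2, 1, 0, 2, 2, 2, 0), nrow = 3, ncol = 3, byrow = TRUE)
dimnames(A) <- list(c("A1", "B1", "C1"), c("A1", "B1", "C1"))
df <- data.frame(ID = c("A1", "B1", "C1"), Triplicate = c("T1", "T1", "T1"))

# Hypothetical helper: for each sample, apply `fun` to its distances to the
# other samples that share its Triplicate value.
# Assumes all IDs are present in rownames/colnames of `dist_mat`.
within_group_stat <- function(dist_mat, ids, groups, fun) {
  ids <- as.character(ids)
  vapply(seq_along(ids), function(i) {
    # Other members of the same triplicate group, excluding the sample itself
    mates <- setdiff(ids[groups == groups[i]], ids[i])
    if (length(mates) == 0) return(NA_real_)  # sample with no replicates
    fun(dist_mat[ids[i], mates])
  }, numeric(1))
}

df$minimum <- within_group_stat(A, df$ID, df$Triplicate, min)
df$average <- within_group_stat(A, df$ID, df$Triplicate, mean)
df
```

Because the group lookup goes through the Triplicate column, the same call should work when df contains many different triplicates, as long as the assumptions above hold.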
I hope this is clearer now. I have many such Triplicate groups, so the solution needs to refer to the Triplicate column when matching sample distances. I will answer any questions right away.
Thanks!