0
votes

Let's say I have a matrix that looks like this, and I convert it into a dist class object (without diagonal), and then into a vector for later purposes.

m  = matrix(c(0,1,2,3, 1,0,3,4, 2,3,0,5, 3,4,5,0), nrow=4)
#m:
     [,1] [,2] [,3] [,4]
[1,]    0    1    2    3
[2,]    1    0    3    4
[3,]    2    3    0    5
[4,]    3    4    5    0
md = as.dist(m, diag=F)
# md:
   1  2  3
2  1      
3  2  3   
4  3  4  5

mdv = as.vector(md)
# 1 2 3 3 4 5

I can access the original matrix as usual with [], and I could easily access the one-dimensional index (of, for example, row 3, col 2) using m[ 3+((2-1)*4) ]. The dist object (and the vector) is also one-dimensional, but it consists only of the lower triangle of the original matrix (and also lacks one element from each original col/row, since the diagonal was removed).
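To make the one-dimensional indexing described above concrete, here is a minimal sketch. The variable names i, j, k and kept are only illustrative; the sketch shows the column-major position of m[3, 2] in the full matrix and which full-matrix positions survive in the lower triangle.

```r
m <- matrix(c(0,1,2,3, 1,0,3,4, 2,3,0,5, 3,4,5,0), nrow = 4)

i <- 3; j <- 2
k <- i + (j - 1) * nrow(m)   # column-major position of m[i, j] in the full matrix
m[k]                         # same element as m[i, j]

# positions of the full matrix that lie strictly below the diagonal,
# listed column by column
kept <- which(lower.tri(m))
kept
```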

How can I later access the equivalent element in the vector mdv? For example, how could I access the equivalent of m[3,2] (value 3) in the object mdv? I need to do this by index, not by value, since there can be duplicate values. Related Q&A resolve similar problems with as.matrix on the dist object, but that doesn't do it for me (since I need to deal with the vector).

2
You can convert the dist to a matrix with as.matrix. (comment by akrun)

2 Answers

0
votes

How about this function:

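fun <- function(r, c, n){
  # as.dist() stores the lower triangle column by column, so the position
  # depends on n, the number of rows of the original matrix
  stopifnot(r != c)
  i <- max(r, c); j <- min(r, c)   # row i > column j in the lower triangle
  n*(j-1) - j*(j-1)/2 + i - j
}

mdv[fun(1, 2, 4)] # 1
mdv[fun(2, 3, 4)] # 3
mdv[fun(3, 4, 4)] # 5
mdv[fun(2, 1, 4)] # 1
mdv[fun(3, 2, 4)] # 3
mdv[fun(1, 1, 4)] # stop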

Cases with r == c should be handled before applying fun. For convenience, you can write another function that handles this case.
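One possible way to handle the r == c case is a small wrapper that returns 0 on the diagonal and looks up the vector otherwise. This is only a sketch: the names dist_index and dist_get are hypothetical, r and c are assumed to have the same length, n is the number of rows of the original matrix, and the vector is assumed to come from as.vector() on a dist object (lower triangle stored column by column, as described in ?dist).

```r
# Hypothetical helper: position of m[r, c] (r != c) in as.vector(as.dist(m)).
# Assumes column-by-column storage of the lower triangle.
dist_index <- function(r, c, n) {
  i <- pmax(r, c)  # row in the lower triangle
  j <- pmin(r, c)  # column in the lower triangle
  n * (j - 1) - j * (j - 1) / 2 + i - j
}

# Hypothetical wrapper: 0 on the diagonal, the stored distance otherwise.
dist_get <- function(v, r, c, n) {
  out <- numeric(length(r))   # zeros already cover the r == c case
  off <- r != c
  out[off] <- v[dist_index(r[off], c[off], n)]
  out
}

m   <- matrix(c(0,1,2,3, 1,0,3,4, 2,3,0,5, 3,4,5,0), nrow = 4)
mdv <- as.vector(as.dist(m))
dist_get(mdv, c(3, 1, 4), c(2, 1, 3), n = 4)
# 3 0 5
```

Keeping the index calculation separate from the lookup means the same dist_index can be reused on any vector of the right length.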

0
votes

Given the vector of distances ("mdv"), i.e. the lower.tri(, diag = FALSE) part of the matrix, you could (1) recover the dimension of the distance matrix ("m") from the length of the vector and (2) convert the usual i + (j - 1)*nrow indices by subtracting the number of missing "upper.tri" elements that precede them.

ff = function(x, i, j) 
{
    #assumes that 'x' is a valid distances vector that results in correct 'n'
    n = (1 + sqrt(1 + 8 * length(x))) / 2 

    #make sure i >= j
    ii = pmax(i, j); jj = pmin(i, j)

    #insert 0s to handle 'i == j'
    x = c(unlist(lapply(split(x, rep(seq_len(n - 1), (n - 1):1)), 
                        function(X) c(0, X)), FALSE, FALSE), 0)

    #subtract the missing `upper.tri` elements
    x[(ii + (jj - 1L) * n) - cumsum(0:(n - 1))[jj]]
}

E.g.:

n = 3
m = matrix(0, n, n); m[lower.tri(m)] = runif(choose(n, 2)); m = m + t(m); x = c(as.dist(m))
m
#          [,1]      [,2]      [,3]
#[1,] 0.0000000 0.3796833 0.5199015
#[2,] 0.3796833 0.0000000 0.4770344
#[3,] 0.5199015 0.4770344 0.0000000
m[cbind(c(2, 2, 3, 1), c(3, 2, 1, 2))]
#[1] 0.4770344 0.0000000 0.5199015 0.3796833
ff(x, c(2, 2, 3, 1), c(3, 2, 1, 2))
#[1] 0.4770344 0.0000000 0.5199015 0.3796833

n = 23
m = matrix(0, n, n); m[lower.tri(m)] = runif(choose(n, 2)); m = m + t(m); x = c(as.dist(m))
i = sample(seq_len(n), 25, TRUE); j = sample(seq_len(n), 25, TRUE)
all.equal(m[cbind(i, j)], ff(x, i, j))
#[1] TRUE

etc...
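
The same index arithmetic can also be written in closed form, without inserting zeros into the vector. The following is a minimal sketch under the same assumption that x is a valid distances vector; the name ff2 is hypothetical, and the diagonal is handled separately by returning 0.

```r
# Hypothetical variant of ff(): closed-form index, diagonal handled separately.
ff2 <- function(x, i, j) {
  # recover the matrix size n from length(x) = n * (n - 1) / 2
  n <- (1 + sqrt(1 + 8 * length(x))) / 2
  stopifnot(n == round(n))             # x must have a valid length
  ii <- pmax(i, j); jj <- pmin(i, j)   # make sure ii >= jj
  k <- n * (jj - 1) - jj * (jj - 1) / 2 + ii - jj
  # diagonal positions get an NA index so the result keeps its length,
  # then ifelse() replaces them with 0
  ifelse(ii == jj, 0, x[replace(k, ii == jj, NA)])
}

n <- 6
m <- matrix(0, n, n); m[lower.tri(m)] <- runif(choose(n, 2)); m <- m + t(m)
x <- as.vector(as.dist(m))
idx <- expand.grid(i = seq_len(n), j = seq_len(n))   # every (i, j) pair
all.equal(m[cbind(idx$i, idx$j)], ff2(x, idx$i, idx$j))
```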