0
votes

Say I have two or more matrices. The number of rows and columns is the same across matrices, but the matrices are not necessarily square.

Matrix1
     a        b        c
1    0.911    0.067    0.023
2    0.891    0.089    0.019
3    0.044    0.931    0.025
4    0.919    0.058    0.023

Matrix2
     a        b        c
1    0.024    0.070    0.906
2    0.020    0.090    0.891
3    0.025    0.930    0.045
4    0.024    0.058    0.918

Rows always sum to 1. Columns may change position from matrix to matrix, so the column names carry no meaning. In the example above, column 'a' in mat1 corresponds to column 'c' in mat2. The values are not identical, but they are similar.

What sort of approach/algorithm can I use to align the columns across many such matrices?

The desired result would be something like below

Matrix1
     a        b        c
1    0.911    0.067    0.023
2    0.891    0.089    0.019
3    0.044    0.931    0.025
4    0.919    0.058    0.023

Matrix2
     c        b        a
1    0.906    0.070    0.024
2    0.891    0.090    0.020
3    0.045    0.930    0.025
4    0.918    0.058    0.024

Here the columns are aligned: 'a' in mat1 corresponds to 'c' in mat2, and so on. In this one possible result, mat1 is the reference and mat2 was aligned to it.

I am using R if anyone wants to try something.

mat1 <-
 matrix(c(0.911,0.891,0.044,0.919,0.067,0.089,0.931,0.058,0.023,0.019,0.025,0.023),nrow=4)
mat2 <-
 matrix(c(0.024,0.020,0.025,0.024,0.070,0.090,0.930,0.058,0.906,0.891,0.045,0.918),nrow=4)
Please show the desired result, as "align the columns across many such matrices" is not clear. - Parfait
Basically, you want to identify columns in matrix 2 and name them after their most similar column in matrix 1? - Roman Luštrik
I have edited to add the desired result. - rmf
Maybe as simple as mat2[,order(mat1[1,]- mat2[1,])] or mat2[,max.col(-outer(mat1[1,], mat2[1,], function(i, j) abs(i-j)))] - Sotos

2 Answers

1
vote

You could do something like this. The function returns the column indices of mat in the order that best matches (by Euclidean distance) the columns of m.base.

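col.order <- function(m.base, mat){
  no.cols <- ncol(mat)
  col.ord <- rep(NA_integer_, no.cols)
  used <- rep(FALSE, no.cols)  # columns of mat already assigned
  for(i in seq_len(no.cols)){
    vec <- m.base[, i]
    # squared Euclidean distance from base column i to every column of mat
    col.dists <- apply(mat, 2, function(x) sum((x-vec)^2))
    col.dists[used] <- Inf     # never pick the same column twice
    best.col <- which.min(col.dists)
    # position i of the result holds the column of mat matching base column i
    col.ord[i] <- best.col
    used[best.col] <- TRUE
  }
  return(col.ord)
}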

mat2[, col.order(mat1,mat2)]

      [,1]  [,2]  [,3]
[1,] 0.906 0.070 0.024
[2,] 0.891 0.090 0.020
[3,] 0.045 0.930 0.025
[4,] 0.918 0.058 0.024
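
Since the question asks about many matrices, one way to extend this is to pick a reference and align every other matrix to it. The following is only a minimal sketch under that assumption: align.to.base and mats are hypothetical names not taken from the question, and the greedy matching is the same idea as above, rewritten so that the block runs by itself.

```r
# Hypothetical helper: greedily match each column of m.base to its nearest
# unused column of mat (squared Euclidean distance) and return mat reordered.
align.to.base <- function(m.base, mat) {
  # d[i, j] = squared distance between base column i and mat column j
  d <- outer(seq_len(ncol(m.base)), seq_len(ncol(mat)),
             Vectorize(function(i, j) sum((m.base[, i] - mat[, j])^2)))
  ord <- integer(ncol(m.base))
  for (i in seq_len(ncol(m.base))) {
    ord[i] <- which.min(d[i, ])
    d[, ord[i]] <- Inf  # mark this column of mat as taken
  }
  mat[, ord, drop = FALSE]
}

mat1 <- matrix(c(0.911,0.891,0.044,0.919,0.067,0.089,0.931,0.058,
                 0.023,0.019,0.025,0.023), nrow = 4)
mat2 <- matrix(c(0.024,0.020,0.025,0.024,0.070,0.090,0.930,0.058,
                 0.906,0.891,0.045,0.918), nrow = 4)

# Assumed layout: all matrices collected in a list, the first one is the reference.
mats <- list(mat1, mat2)
aligned <- lapply(mats, function(m) align.to.base(mats[[1]], m))
```

Greedy matching is order-dependent, so with noisier data than in the example it may not give the assignment with the smallest total distance.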
0
votes

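Assuming that each column of Matrix1 always has a clear best match in Matrix2, this should work. Each match is chosen independently, so nothing prevents the same column of Matrix2 from being selected twice if that assumption fails.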

Matrix2[, sapply(1:ncol(Matrix1), 
     function(i) which.min(colSums(abs(Matrix2 - Matrix1[,i]))))]
      c     b     a
1 0.906 0.070 0.024
2 0.891 0.090 0.020
3 0.045 0.930 0.025
4 0.918 0.058 0.024
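
Because which.min is applied separately for each reference column, it may be worth checking that the selected indices form a permutation before using them. A small sketch of such a guard, assuming the data from the question with dimnames added; idx is a name introduced here for illustration.

```r
Matrix1 <- matrix(c(0.911,0.891,0.044,0.919,0.067,0.089,0.931,0.058,
                    0.023,0.019,0.025,0.023), nrow = 4,
                  dimnames = list(as.character(1:4), c("a", "b", "c")))
Matrix2 <- matrix(c(0.024,0.020,0.025,0.024,0.070,0.090,0.930,0.058,
                    0.906,0.891,0.045,0.918), nrow = 4,
                  dimnames = list(as.character(1:4), c("a", "b", "c")))

# For each column of Matrix1, index of the closest column of Matrix2
# (sum of absolute differences).
idx <- sapply(seq_len(ncol(Matrix1)),
              function(i) which.min(colSums(abs(Matrix2 - Matrix1[, i]))))

# Each column of Matrix2 should be used exactly once.
if (anyDuplicated(idx) > 0) {
  warning("some columns of Matrix2 were matched more than once")
}

result <- Matrix2[, idx]
```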