Matching columns across matrices

Question

Say I have two or more matrices. The numbers of rows and columns are the same across matrices, but the matrices are not necessarily square.

Matrix1
     a        b        c
1    0.911    0.067    0.023
2    0.891    0.089    0.019
3    0.044    0.931    0.025
4    0.919    0.058    0.023

Matrix2
     a        b        c
1    0.024    0.070    0.906
2    0.020    0.090    0.891
3    0.025    0.930    0.045
4    0.024    0.058    0.918

Rows always sum to 1. Columns may shift position from matrix to matrix, so the column names do not mean much. In the example above, column 'a' in mat1 is column 'c' in mat2. The values will not be identical, but they will be similar.

What sort of approach/algorithm can I use to align the columns across many such matrices?

The desired result would be something like below

Matrix1
     a        b        c
1    0.911    0.067    0.023
2    0.891    0.089    0.019
3    0.044    0.931    0.025
4    0.919    0.058    0.023

Matrix2
     c        b        a
1    0.906    0.070    0.024
2    0.891    0.090    0.020
3    0.045    0.930    0.025
4    0.918    0.058    0.024

That is, the columns are aligned: 'a' in mat1 corresponds to 'c' in mat2, and so on. In this one possible result, mat1 is the reference and mat2 was aligned to it.

I am using R if anyone wants to try something.

mat1 <- matrix(c(0.911, 0.891, 0.044, 0.919,
                 0.067, 0.089, 0.931, 0.058,
                 0.023, 0.019, 0.025, 0.023), nrow = 4)
mat2 <- matrix(c(0.024, 0.020, 0.025, 0.024,
                 0.070, 0.090, 0.930, 0.058,
                 0.906, 0.891, 0.045, 0.918), nrow = 4)

Please show the desired result, as it is not clear what aligning the columns across many such matrices means. — Parfait
Basically you want to identify columns in matrix 2 and name them as their most similar column in matrix 1? — Roman Luštrik
Maybe as simple as mat2[,order(mat1[1,]- mat2[1,])] or mat2[,max.col(-outer(mat1[1,], mat2[1,], function(i, j) abs(i-j)))] — Sotos

Andrew Gustar · Accepted Answer · 2017-08-13T13:41:00

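You could do something like this. The function returns the column indices of mat in the order that best matches (by squared Euclidean distance) the columns of m.base. The matching is greedy: each column of m.base in turn takes the closest column of mat that has not been used yet.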

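col.order <- function(m.base, mat){
  no.cols <- ncol(mat)
  col.ord <- rep(NA_integer_, no.cols)
  for(i in 1:no.cols){
    vec <- m.base[, i]
    # squared Euclidean distance from base column i to every column of mat
    col.dists <- apply(mat, 2, function(x) sum((x-vec)^2))
    # exclude columns of mat that have already been assigned
    col.dists[col.ord[!is.na(col.ord)]] <- Inf
    # position i takes the closest remaining column of mat
    col.ord[i] <- which.min(col.dists)
  }
  return(col.ord)
}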

mat2[, col.order(mat1,mat2)]

      [,1]  [,2]  [,3]
[1,] 0.906 0.070 0.024
[2,] 0.891 0.090 0.020
[3,] 0.045 0.930 0.025
[4,] 0.918 0.058 0.024
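
To align many matrices rather than just two, one option is to pick one matrix as the reference and align every other matrix to it. Below is a minimal sketch of that idea in base R. The helper names col_dist and align_cols are hypothetical (they are not part of the answer above), and the matching used here is a global greedy variant: it repeatedly pairs the closest remaining (reference column, matrix column) pair. It assumes all matrices have the same dimensions and at least two columns.

```r
mat1 <- matrix(c(0.911, 0.891, 0.044, 0.919,
                 0.067, 0.089, 0.931, 0.058,
                 0.023, 0.019, 0.025, 0.023), nrow = 4)
mat2 <- matrix(c(0.024, 0.020, 0.025, 0.024,
                 0.070, 0.090, 0.930, 0.058,
                 0.906, 0.891, 0.045, 0.918), nrow = 4)

# Hypothetical helper: squared Euclidean distance between every column of
# `ref` (rows of the result) and every column of `mat` (columns of the result).
col_dist <- function(ref, mat) {
  sapply(seq_len(ncol(mat)), function(j) colSums((ref - mat[, j])^2))
}

# Hypothetical helper: reorder the columns of `mat` to line up with `ref`.
# Greedy: take the closest remaining pair, then block its row and column.
align_cols <- function(ref, mat) {
  d <- col_dist(ref, mat)
  ord <- integer(ncol(ref))
  for (k in seq_len(ncol(ref))) {
    idx <- arrayInd(which.min(d), dim(d))  # (ref column, mat column)
    ord[idx[1]] <- idx[2]
    d[idx[1], ] <- Inf
    d[, idx[2]] <- Inf
  }
  mat[, ord, drop = FALSE]
}

# Align a list of matrices to the first one; the third matrix is mat1 with
# its columns rotated, added here only as an extra test case.
mats <- list(mat1, mat2, mat1[, c(2, 3, 1)])
aligned <- lapply(mats, align_cols, ref = mats[[1]])
```

Greedy matching is not guaranteed to give the overall best assignment when columns are similar to each other. If that matters, the distance matrix from col_dist could instead be passed to a linear assignment solver such as solve_LSAP in the clue package.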
