Identify duplicate islands in a matrix and change their values

Question

I have a matrix:

m <- matrix(c(
  1,    1,    1,    0,    0,    0,
  0,    0,    0,    0,    0,    0,
  3,    0,    0,    0,    0,    0,
  3,    0,    0,    0,    0,    2,
  3,    0,    0,    0,    0,    0,
  3,    0,    0,    0,    2,    2),
  ncol = 6, byrow = TRUE)

     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    1    1    1    0    0    0
[2,]    0    0    0    0    0    0
[3,]    3    0    0    0    0    0
[4,]    3    0    0    0    0    2 # <- island 3, value 2
[5,]    3    0    0    0    0    0
[6,]    3    0    0    0    2    2 # <- island  4, also value 2

In this matrix, there are four 'islands', i.e. non-zero values separated by zeros:

(1) an island composed of three 1's, (2) four 3's, (3) one 2, and (4) two 2's.

Thus, two islands are composed of the value 2. I want to identify such 'duplicate' islands and change the values of one of the 'islands' (either will do) to the next available number (4 in this case):

     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    1    1    1    0    0    0
[2,]    0    0    0    0    0    0
[3,]    3    0    0    0    0    0
[4,]    3    0    0    0    0    2
[5,]    3    0    0    0    0    0
[6,]    3    0    0    0    4    4

Not clear (I see 3 'islands', -- why is the next value 4?, ...). Please explain it more and give an expected output as well. Also make sure your examples are reproducible — Sotos
There are 4 islands: one made up of 1's, one made up of 3's and 2 made up of 2's. 4 is a number that hasn't already been taken and is one more than the maximum number already used. — user3651829
As I said in the original question, change either of the islands labelled '2' to '4' but not both. — user3651829

Julius Vainora Julius Vainora · Accepted Answer · 2018-11-20T12:18:55

Fun question! Let's take a more involved case

(M <- matrix(c(1, 0, 3, 3, 3, 3, 1, 0, 0, 0, 0, 0, 1, 0, 3, 0, 2, 
               0, 0, 0, 3, 0, 0, 0, 0, 0, 0, 0, 0, 2, 1, 0, 0, 2, 0, 2), 6, 6))
#      [,1] [,2] [,3] [,4] [,5] [,6]
# [1,]    1    1    1    0    0    1
# [2,]    0    0    0    0    0    0
# [3,]    3    0    3    3    0    0
# [4,]    3    0    0    0    0    2
# [5,]    3    0    2    0    0    0
# [6,]    3    0    0    0    2    2

Here's a graph-based solution.

library(igraph)
# Indices of nonzero matrix elements
idx <- which(M != 0, arr.ind = TRUE)
# Adjacency matrix for matrix entries
# Two entries are adjacent if their column or row number differs by one
# Also, due to idx, an implicit condition is also that the two entries are the same
adj <- 1 * (as.matrix(dist(idx, method = "manhattan")) == 1)
# Creating loops as to take into account singleton islands
diag(adj) <- 1
# A corresponding graphs
g <- graph_from_adjacency_matrix(adj, mode = "undirected")
# Connected components of this graph
cmps <- clusters(g)
# Going over unique values of M
for(i in 1:max(M)) {
  # Islands of value i
  un <- unique(cmps$membership[M[idx] == i])
  # More than one island?
  if(length(un) > 1)
    # If so, let's go over islands 2, 3, ...
    for(cmp in un[-1])
      # ... and replace corresponding matrix entries by max(M) + 1
      M[idx[cmps$membership == cmp, , drop = FALSE]] <- max(M) + 1
}

M
#      [,1] [,2] [,3] [,4] [,5] [,6]
# [1,]    1    1    1    0    0    4
# [2,]    0    0    0    0    0    0
# [3,]    3    0    7    7    0    0
# [4,]    3    0    0    0    0    6
# [5,]    3    0    2    0    0    0
# [6,]    3    0    0    0    5    5

Also note that using adj alone we could find all the islands if we could find its permutation leading to a block-diagonal matrix with the maximal number of blocks. Then each block would correspond to an island. However, I couldn't find an R implementation of a relevant procedure.

Identify duplicate islands in a matrix and change their values

4 Answers