
I've got two different vectors: one with zeros and random real numbers between 0 and 0.5 (vec1), and another ordered vector (vec2):

vec1 <- c(0.42887017, 0.26703377, 0, 0, 0, 0.33203175, 0.16787991, 0, 0, 0.19483491, 0.41869476, 0.05820833, 0.37449489, 0, 0, 0, 0, 0, 0.44390140, 0.19483491, 0.06736238, 0.31630117, 0, 0, 0, 0, 0, 0, 0.27121130, 0)
vec2 <- c(-0.1, -0.1, -0.1, -0.1, -0.1, 1.2, 1.2, 1.2, 1.2, 1.2, 0.5, 0.5, 0.5, 0.5, 0.5, 2.0, 2.0, 2.0, 2.0, 2.0, -0.6, -0.6, -0.6, -0.6, -0.6, 0.25, 0.25, 0.25, 0.25, 0.25)

For the first vector, vec1, I want to create clusters of consecutive elements > 0, and in the second vector, vec2, I want the elements at the same positions to be clustered in the same way:

vec1 -> 0.42887017, 0.26703377, 0, 0, 0, 0.33203175, 0.16787991, 0, 0, 0.19483491, 0.41869476, 0.05820833, 0.37449489, 0, 0, 0, 0, 0, 0.44390140, 0.19483491, 0.06736238, 0.31630117, 0, 0, 0, 0, 0, 0, 0.27121130, 0

vec2 -> -0.1, -0.1, -0.1, -0.1, -0.1, 1.2, 1.2, 1.2, 1.2, 1.2, 0.5, 0.5, 0.5, 0.5, 0.5, 2.0, 2.0, 2.0, 2.0, 2.0, -0.6, -0.6, -0.6, -0.6, -0.6, 0.25, 0.25, 0.25, 0.25, 0.25

Preferably the output should be a list of matrices with the equivalent indices:

          [,1] [,2]
[1,] 0.4288702 -0.1
[2,] 0.2670338 -0.1

          [,1] [,2]
[1,] 0.3320318  1.2
[2,] 0.1678799  1.2

           [,1] [,2]
[1,] 0.19483491  1.2
[2,] 0.41869476  0.5
[3,] 0.05820833  0.5
[4,] 0.37449489  0.5

           [,1] [,2]
[1,] 0.44390140  2.0
[2,] 0.19483491  2.0
[3,] 0.06736238 -0.6
[4,] 0.31630117 -0.6

          [,1] [,2]
[1,] 0.2712113 0.25

Has anybody got some ideas on how to do this?


What do you mean by equally indexed? From your example, it looks like you want to retrieve the indices of the elements of vec1 that are greater than 9? - monte
Sorry, maybe my wording isn't right. I want the elements of vec2 to be clustered at the same positions as vec1. Then I want to combine the corresponding clusters into two-column matrices. - AlexLee
@Huntmerson please see my solution below, and upvote and accept it if it does what your original question requires. - hello_friend

2 Answers


(Current Question) Base R solution:

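# Cluster the data into groups, one for each run of values above 0.
# diff(vec1) == vec1[-1] is TRUE where the previous element is 0,
# so the group id increases at the start of every new run:
clustered <- subset(within(data.frame(cbind(vec1, vec2)), {
                             grp <- cumsum(c(TRUE, diff(vec1) == vec1[-1]))
                           }), vec1 > 0)

# Split the data frame into a list for each group, remove the group column,
# and renumber the list elements 1..n:
setNames(split(within(clustered, rm("grp")), clustered$grp),
         seq_along(unique(clustered$grp)))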

Current Data:

vec1 <- c(0.42887017, 0.26703377, 0, 0, 0, 0.33203175, 0.16787991, 0, 0, 0, 0.41869476, 0.05820833, 0.37449489, 0, 0, 0, 0, 0, 0.44390140, 0.19483491, 0.06736238, 0.31630117, 0, 0, 0, 0, 0, 0, 0.27121130, 0)
vec2 <- c(-0.1, -0.1, -0.1, -0.1, -0.1, 1.2, 1.2, 1.2, 1.2, 1.2, 0.5, 0.5, 0.5, 0.5, 0.5, 2.0, 2.0, 2.0, 2.0, 2.0, -0.6, -0.6, -0.6, -0.6, -0.6, 0.25, 0.25, 0.25, 0.25, 0.25)
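
As a rough illustration of why the grouping line works: diff(vec1) == vec1[-1] is TRUE exactly where the previous element is 0, so cumsum() bumps the group id at the start of each new run. The sketch below uses a made-up vector x (not the question's data), and the names prev_is_zero and runs are only for illustration. It also assumes the zeros in the data are exact zeros, since the comparison uses == on doubles.

```r
# Toy vector (made up for illustration): two runs of positive values
x <- c(0.3, 0.2, 0, 0, 0.5, 0.1, 0)

# TRUE where the previous element is exactly 0 (first element always TRUE)
prev_is_zero <- c(TRUE, diff(x) == x[-1])

# Running count of those flags gives a group id for every position
grp <- cumsum(prev_is_zero)

# Keep only the positive elements; each run keeps its own group id
runs <- split(x[x > 0], grp[x > 0])
print(runs)
```

Note that the ids that survive the filter are not consecutive, because the zeros in between also advance the counter. That is why the solution renumbers the list with setNames().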

(Original Question) Base R solution:

clustered <- subset(within(data.frame(cbind(vec1, vec2)), 
       {grp <- cumsum(c(TRUE, abs(diff(vec1 > 9))))}), vec1 > 9)

setNames(Map(function(x){within(x, rm("grp"))}, 
    split(clustered, clustered$grp)), c(1:length(unique(clustered$grp))))
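
Since the question asks for a list of matrices rather than data frames, here is a minimal sketch of a variant built on rle(), which labels every run of consecutive equal values. This is not the approach used above, just an alternative under the same idea. The helper name split_runs, its threshold argument, and the toy vectors a and b are all assumptions made for illustration.

```r
# Made-up data, not the vectors from the question
a <- c(0.4, 0.2, 0, 0, 0.3, 0, 0.1)
b <- c(-0.1, -0.1, -0.1, 1.2, 1.2, 1.2, 0.5)

# Hypothetical helper: one two-column matrix per run of v1 > threshold
split_runs <- function(v1, v2, threshold = 0) {
  keep <- v1 > threshold
  # rle() gives the length of each run of TRUE/FALSE in keep
  r <- rle(keep)
  # Label every position with the number of the run it belongs to
  run_id <- rep(seq_along(r$lengths), r$lengths)
  # Keep only the rows above the threshold
  m <- cbind(v1, v2)[keep, , drop = FALSE]
  # Group the remaining row numbers by run and cut the matrix accordingly
  idx <- split(seq_len(nrow(m)), run_id[keep])
  unname(lapply(idx, function(i) m[i, , drop = FALSE]))
}

res <- split_runs(a, b)
print(res)
```

Passing threshold = 9 would correspond to the cut-off used for the original question, and drop = FALSE keeps single-row clusters as matrices instead of collapsing them to vectors.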

I managed to find a way that works myself, though it may be a bit complex:

list1 = list()

clust = c()
clust2 = c()

x = 1

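for (i in seq_along(vec1)) {
  if (vec1[i] > 0 && i != length(vec1)) {
    # Inside a run of positive values: extend the current cluster
    clust = c(clust, vec1[i])
    clust2 = c(clust2, vec2[i])
  } else if (vec1[i] == 0 && length(clust) > 0) {
    # A zero ends the run: store the cluster and start a new one
    list1[[x]] <- cbind(clust, clust2)
    x = x + 1
    clust = c()
    clust2 = c()
  } else if (i == length(vec1) && vec1[i] > 0) {
    # The last element is positive: close the final cluster
    clust = c(clust, vec1[i])
    clust2 = c(clust2, vec2[i])
    list1[[x]] <- cbind(clust, clust2)
  }
  # Otherwise (a zero with no open cluster) there is nothing to do
}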

The output is:

> list1
         clust clust2
[1,] 0.4288702   -0.1
[2,] 0.2670338   -0.1

         clust clust2
[1,] 0.3320318    1.2
[2,] 0.1678799    1.2

          clust clust2
[1,] 0.19483491    1.2
[2,] 0.41869476    0.5
[3,] 0.05820833    0.5
[4,] 0.37449489    0.5

          clust clust2
[1,] 0.44390140    2.0
[2,] 0.19483491    2.0
[3,] 0.06736238   -0.6
[4,] 0.31630117   -0.6

         clust clust2
[1,] 0.2712113   0.25
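
A quick way to sanity-check a result like this is to stack the matrices back together and compare them with the positive elements of the input. The sketch below is only a rough check on made-up data; the helper name check_clusters and the toy objects v, w and cl are assumptions for illustration.

```r
# Hypothetical helper: TRUE if stacking the clusters reproduces the
# positive elements of v (column 1) and the matching elements of w (column 2)
check_clusters <- function(clusters, v, w) {
  rows <- do.call(rbind, clusters)
  identical(unname(rows[, 1]), v[v > 0]) &&
    identical(unname(rows[, 2]), w[v > 0])
}

# Made-up data and a hand-written clustering of it
v <- c(0.4, 0.2, 0, 0.3)
w <- c(1, 1, 2, 2)
cl <- list(cbind(clust = c(0.4, 0.2), clust2 = c(1, 1)),
           cbind(clust = 0.3, clust2 = 2))

ok <- check_clusters(cl, v, w)
print(ok)
```

This only confirms that no positive element was dropped or reordered; it does not verify where the cluster boundaries fall.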