
I thought that this

 kmeans(x = matrix(1:50, 5), centers = 2, iter.max = 10)

could be written as:

matrix(1:50, 5) %>% 
map( ~kmeans(x = .x, centers = 2, iter.max = 10))

Error in sample.int(m, k) : 
  cannot take a sample larger than the population when 'replace = FALSE'

But the second version does not work; it fails with the error shown above. How do I use kmeans() in combination with purrr::map()?

Why do you need map here? matrix(1:50, 5) %>% kmeans(., centers = 2, iter.max = 10) works on its own. A matrix is a vector with a dim attribute, so when you call map on it, it goes through each individual element. - akrun
@akrun Because in my original example I have several matrices (scaled, with/without certain variables, etc.), and I would like to compare the clustering results against each other. - Dambo
Not sure I get it. If you have several matrices in a list, then map can be applied. - akrun
Your approach would work if the matrices are in a list, i.e. list(matrix(1:50, 5), matrix(51:100, 5)) %>% map( ~kmeans(x = .x, centers = 2, iter.max = 10)) - akrun

1 Answer


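A matrix is, by itself, a vector with a dim attribute. So when we apply map directly to the matrix, it goes through each of the individual elements, and kmeans is asked to find 2 centers in a single value, which produces the error from the question. Instead, place the matrix in a list: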

library(purrr)  # provides map() and re-exports %>%

list(matrix(1:50, 5)) %>%
  map(~ kmeans(x = .x, centers = 2, iter.max = 10))
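
To see why the list wrapper matters, here is a small base R check. It is only a sketch: it does not call map itself, it just looks at what element-wise access to a matrix returns, and the variable names m and single_cell are made up for illustration.

```r
m <- matrix(1:50, 5)

# Underneath, the matrix is an integer vector of 50 cells plus a dim attribute
length(m)        # 50
m[[1]]           # a single integer, i.e. one cell

# Wrapped in a list, the whole matrix becomes one element
length(list(m))  # 1

# Clustering a single cell means asking for 2 centers from 1 data point,
# so kmeans() stops with an error
single_cell <- tryCatch(
  kmeans(x = m[[1]], centers = 2, iter.max = 10),
  error = function(e) e
)
inherits(single_cell, "error")  # TRUE
```

So the list has one element, the full matrix, and that is what kmeans receives as .x.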

Note that for a single matrix, we don't need map:

matrix(1:50, 5) %>%
  kmeans(., centers = 2, iter.max = 10)

map becomes useful when we have a list of matrices:

list(matrix(1:50, 5), matrix(51:100, 5)) %>%
  map(~ kmeans(x = .x, centers = 2, iter.max = 10))
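
For the use case from the comments (several versions of the same data, compared against each other), a named list keeps the results easy to tell apart. The following is a rough sketch, assuming purrr is installed; the names raw, variants and fits are hypothetical, and the scaled variant is just one possible preprocessing step.

```r
library(purrr)

set.seed(42)  # kmeans() picks random starting centers, so fix the seed

# Hypothetical variants of one data set; the names are illustrative only
raw <- matrix(1:50, 5)
variants <- list(raw = raw, scaled = scale(raw))

# One kmeans fit per variant; the result keeps the list names
fits <- map(variants, ~ kmeans(x = .x, centers = 2, iter.max = 10))

# One number per variant: total within-cluster sum of squares
map_dbl(fits, ~ .x$tot.withinss)

# Cluster assignment of each row, per variant
map(fits, ~ .x$cluster)
```

Keep in mind that tot.withinss depends on the scale of the data, so comparing it between a raw and a scaled matrix is not meaningful by itself; comparing the cluster assignments is usually the safer check.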