3
votes

Given a set of points (rather small) in 2D and a maximum distance, find the minimum number of clusters where for each cluster, all the points inside have a distance between them of radius at maximum distance. The center of the clusters should be one of those point. Moreover, give the number of points inside each cluster.

It seems to be similar to k-means and this kind of stuff, but the maximum distance is given. I don't see a better solution than testing all the possibilities (maximum number of clusters is number of points). Is there a better solution ?

2
Sounds more like the set cover problem to me, than clustering. - Has QUIT--Anony-Mousse
So you have a set of points, and you want to draw circles with diameter D around some of the points, so that all the points are inside a circle, and you want to find the solution with the smallest number of circles? - m69 ''snarky and unwelcoming''
Exactly, in each disk each point have to be far at most of D (which is done if we use the diameter rather than radius) - XogoX

2 Answers

1
votes

As Anony-Mousse said, your problem seems to be "related" to the set cover problem. Using the same notation and terminology as in the Wikipedia page, the universe U is your set of points while the collection S contains |U| sets, one such set for each point u in U, containing all the points that are inside the disk centered at u of radius equal to your maximum distance. So your problem is to find the minimum number of sets in S (equivalent to finding the points that should be the centers of the clusters) such that the union of these sets is U.

Now, what I did above is reducing your problem to the set cover problem. This is the wrong direction we would like to perform the reduction in order to show that your problem is probably "difficult". To accomplish this, one would need to show that every instance of the set cover problem can be rephrased as an instance of your problem.

However, you said you have a rather small set of points. You could just define the collection S as above and then brute force it (i.e. try all possible sub-collections of S and take the one with the minimum number of sets in it, resulting in exponential complexity in the size of |S|). Regarding this approach, you would actually want to have a rather small number of sets in S, the number of points is not that important.

0
votes

Your problem can be solved in R by using the leaderCluster package. It uses Hartigan’s leader algorithm.