2
votes

I have 4-dimensional data which needs to be clustered to build minimum-volume bounding ellipsoids for each cluster. I don't want any single-point clusters, or at least as few of them as possible, because we can't build an ellipsoidal confidence region from a single point. In my problem the number of clusters is not given in advance, so I am using scikit-learn's Affinity Propagation (http://scikit-learn.org/stable/modules/clustering.html#affinity-propagation) to estimate the number of clusters and perform the clustering. But this approach gives me a lot of single-point clusters. Can you give me some insight on how to solve this problem?

P.S.: To give you even more information, I am working on ellipsoidal nested sampling for Bayesian evidence calculation.
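For reference, a minimal sketch of the setup described above, assuming the data is in an array X of shape (n_samples, 4) (the name X is a placeholder). The preference parameter is scikit-learn's own knob for how many exemplars get chosen; lowering it tends to produce fewer, larger clusters and hence fewer singletons:

    import numpy as np
    from sklearn.cluster import AffinityPropagation

    # X: (n_samples, 4) data array (assumed name)
    af = AffinityPropagation().fit(X)  # default preference = median similarity
    labels = af.labels_
    n_clusters = len(af.cluster_centers_indices_)
    # Lowering `preference` (e.g. AffinityPropagation(preference=-50)) makes
    # fewer points exemplars, which usually means fewer single-point clusters.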

2
Maybe remove outliers first? - Has QUIT--Anony-Mousse
Removing the outliers would certainly help. You might also want to consider using a Gaussian mixture model; the number of component Gaussians can be selected with the AIC or BIC criterion. - hrs
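A minimal sketch of that suggestion, selecting the number of Gaussian components by BIC with scikit-learn's GaussianMixture (the function name and the component range are placeholders):

    import numpy as np
    from sklearn.mixture import GaussianMixture

    def fit_gmm_bic(X, max_components=10):
        # Fit GMMs with 1..max_components components and keep the lowest-BIC
        # model (swap gmm.bic(X) for gmm.aic(X) to use the AIC criterion).
        best, best_bic = None, np.inf
        for k in range(1, max_components + 1):
            gmm = GaussianMixture(n_components=k).fit(X)
            bic = gmm.bic(X)
            if bic < best_bic:
                best, best_bic = gmm, bic
        return best

    # labels = fit_gmm_bic(X).predict(X)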

2 Answers

1
vote

I don't know whether you insist on Affinity Propagation, but with DBSCAN you can achieve what you want through its two algorithm parameters, eps and minPts.

A greater eps means that clusters of lower density can be detected, and nearby clusters will also be merged.

A greater minPts means that more points will be marked as noise.
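A minimal sketch with scikit-learn, where minPts is called min_samples; the eps and min_samples values below are just the library defaults and need tuning to the data's scale:

    import numpy as np
    from sklearn.cluster import DBSCAN

    # X: (n_samples, 4) data array (assumed name)
    db = DBSCAN(eps=0.5, min_samples=5).fit(X)  # tune eps and min_samples
    labels = db.labels_  # label -1 marks noise points instead of singleton clusters
    n_clusters = len(set(labels)) - (1 if -1 in labels else 0)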

0
votes

Affinity Propagation doesn't have a concept of ellipsoids, so I am not sure it does what you want.

MultiNest solves this with a variant of X-means clustering. Basically, you first build a wrapping ellipsoid covering all points. Then you identify the most important axis, place two k-means centres at its ends, and run k-means to convergence. Now you can decide whether the two ellipsoids use much less volume than the single ellipsoid; if so, the process is repeated recursively. nestle is an open-source implementation with quite clear code: https://github.com/kbarbary/nestle/
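A simplified sketch of that recursive splitting, not MultiNest's or nestle's actual code: the sample-covariance enclosing ellipsoid is a crude stand-in for a proper minimum-volume fit, and min_points and the gain threshold are made-up parameters:

    import numpy as np
    from sklearn.cluster import KMeans

    def enclosing_ellipsoid(points):
        # Crude enclosing ellipsoid: scale the sample covariance so that the
        # largest Mahalanobis distance of any point from the centre equals 1.
        center = points.mean(axis=0)
        cov = np.cov(points, rowvar=False)
        inv = np.linalg.inv(cov)
        d = np.einsum('ij,jk,ik->i', points - center, inv, points - center)
        return center, cov * d.max()

    def ellipsoid_volume(cov):
        # Proportional to the true volume; constants cancel in comparisons.
        return np.sqrt(np.linalg.det(cov))

    def split_recursively(points, min_points=5, gain=0.5):
        # Split with 2-means seeded at the ends of the principal axis, and
        # keep the split only if it saves a lot of volume.
        center, cov = enclosing_ellipsoid(points)
        if len(points) < 2 * min_points:
            return [(center, cov)]
        w, v = np.linalg.eigh(cov)
        axis = v[:, -1] * np.sqrt(w[-1])  # longest principal axis
        init = np.vstack([center + axis, center - axis])
        labels = KMeans(n_clusters=2, init=init, n_init=1).fit_predict(points)
        parts = [points[labels == k] for k in (0, 1)]
        if min(len(p) for p in parts) < min_points:
            return [(center, cov)]
        subs = [enclosing_ellipsoid(p) for p in parts]
        if sum(ellipsoid_volume(c) for _, c in subs) < gain * ellipsoid_volume(cov):
            return [e for p in parts
                    for e in split_recursively(p, min_points, gain)]
        return [(center, cov)]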

Alternatively, you can also put an ellipsoid around every point and determine a radius using K-fold cross-validation or bootstrapping. This is what MLFriends uses; implementation, explanation and animation are at https://johannesbuchner.github.io/UltraNest/method.html. Here the ellipsoid covariance can first be chosen from the sample covariance, and once clusters are detected, they can be co-centred and a common sample covariance determined. For determining the radius, you leave out some test points and make sure the ellipsoids around the training points are big enough to recover the test points. This is quite stable and parameter-free.
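A sketch of that K-fold radius selection, not UltraNest's implementation; the function name is made up, and a single shared covariance estimated from all points is assumed:

    import numpy as np

    def mlfriends_radius(points, k=5, seed=None):
        # Choose the Mahalanobis radius around the training points so that
        # every held-out test point is recovered, over k folds.
        rng = np.random.default_rng(seed)
        idx = rng.permutation(len(points))
        folds = np.array_split(idx, k)
        inv_cov = np.linalg.inv(np.cov(points, rowvar=False))
        radius = 0.0
        for f in folds:
            test = points[f]
            train = points[np.setdiff1d(idx, f)]
            diff = test[:, None, :] - train[None, :, :]
            # squared Mahalanobis distance from each test point to each training point
            d2 = np.einsum('ijk,kl,ijl->ij', diff, inv_cov, diff)
            radius = max(radius, np.sqrt(d2.min(axis=1)).max())
        return radius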

DBSCAN could also be interesting to explore. However, in the end you need hard, not fuzzy, proposal surfaces for nested sampling.