There are a few well-known clustering quality measures, such as the silhouette width (SW), the Davies-Bouldin index (DB), the Calinski-Harabasz index (CH), and the Dunn index. How can we say that a clustering quality measure is good?
Is there some kind of metric for judging whether a clustering quality measure is itself good?

Also,

"algorithms that produce clusters with high Dunn index are more desirable" -Wikipedia

"Objects with a high silhouette value are considered well clustered" -Wikipedia

"clustering algorithm that produces a collection of clusters with the smallest Davies–Bouldin index is considered the best algorithm" -Wikipedia

How high or low should these values be? Is there a specific number to aim for?

Can anyone provide a small example that applies a clustering quality measure to a dataset, such as the Iris dataset, and shows that the particular measure is good?

1 Answer

Maybe a simple starting point would be:

"Are the elements within a cluster alike, and are they different from the elements in other clusters?"

There are obviously a variety of metrics that quantify similarity versus difference, as well as considerations like density versus distance.
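To make the "alike within, different between" idea concrete, here is a minimal sketch of the mean silhouette width written from its textbook definition, s(i) = (b(i) - a(i)) / max(a(i), b(i)), where a(i) is the mean distance from point i to the rest of its own cluster and b(i) is the mean distance to the nearest other cluster. The toy points and the two labelings are made up for illustration, and Euclidean distance is an assumption. A sensible measure should rank the obvious grouping above the scrambled one.

```python
from math import dist  # Euclidean distance, Python 3.8+


def mean_silhouette(points, labels):
    """Mean silhouette width; assumes at least two clusters and no duplicate points."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    total = 0.0
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            continue  # by convention a singleton contributes s(i) = 0
        # a: mean distance to the other members of the same cluster
        a = sum(dist(point, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(dist(point, q) for q in other) / len(other)
            for key, other in clusters.items()
            if key != label
        )
        total += (b - a) / max(a, b)
    return total / len(points)


# Hypothetical toy data: two visibly separate groups of three points each.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
good_labels = [0, 0, 0, 1, 1, 1]  # matches the visible groups
bad_labels = [0, 1, 0, 1, 0, 1]   # mixes the two groups

good_score = mean_silhouette(points, good_labels)
bad_score = mean_silhouette(points, bad_labels)
print(f"good partition: {good_score:.3f}")
print(f"bad partition:  {bad_score:.3f}")
```

The score always lies between -1 and 1. Values close to 1 mean points sit much nearer to their own cluster than to any other, and values around 0 or below suggest overlapping or misassigned points. There is no universal cutoff, so the number is most useful for comparing alternative clusterings of the same data.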

The Stanford NLP group hosts an approachable reference, the "Evaluation of clustering" chapter of its information retrieval book: http://nlp.stanford.edu/IR-book/html/htmledition/evaluation-of-clustering-1.html
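Since the question asks about Iris, here is a rough sketch of how these indices are typically compared across candidate cluster counts. It assumes scikit-learn is installed, and uses k-means with a fixed `random_state` purely as an example clusterer; none of that comes from the question. scikit-learn provides the silhouette, Davies-Bouldin, and Calinski-Harabasz scores, but as far as I know it does not provide a Dunn index, so that one is left out.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.metrics import (
    calinski_harabasz_score,
    davies_bouldin_score,
    silhouette_score,
)

X = load_iris().data  # 150 samples, 4 features; species labels are not used

scores = {}
for k in range(2, 7):
    # Example clusterer only; any algorithm that returns labels would do.
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = (
        silhouette_score(X, labels),         # higher is better, range [-1, 1]
        davies_bouldin_score(X, labels),     # lower is better, minimum 0
        calinski_harabasz_score(X, labels),  # higher is better, unbounded
    )

print(" k  silhouette  davies_bouldin  calinski_harabasz")
for k, (sil, db, ch) in scores.items():
    print(f"{k:2d}  {sil:10.3f}  {db:14.3f}  {ch:17.1f}")
```

None of the three comes with an absolute "good enough" threshold. The usual practice is to compare the values across k (or across algorithms) on the same data and pick the clustering that the indices favor, keeping in mind that the indices can disagree with each other and with any known ground truth.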