Mathematically how does one compare classification result to clustering results

Question

Is there a standard methodology to compare results (for accuracy) of a classification algorithm against a clustering algorithm? I have data that has only two true labels. Easy enough to check accuracy when I run a binary classification on it, but if I run clustering, where I ask it to cluster the data into 5 groups, how can I check the accuracy and compare it to the binary classification. I know clustering is not suitable for (two label) data but how can one prove this mathematically?

user2566092 user2566092 · Accepted Answer · 2014-04-17T20:11:25

Clustering into more than two clusters is one way to do 2-class classification (just pick which ever label is more common in each cluster to be the predicted label for the cluster). However it's a very strange approach because it ignores the labels until the very end after the clustering is computed. Supervised learning (i.e. classification) provides much more powerful tools like random forests for classification.

Mathematically how does one compare classification result to clustering results

2 Answers

Don't approach clustering as classification

Don't try to stick a number on everything