I am trying to cluster two-dimensional user data using k-means in scikit-learn (Python). I used the elbow method (picking the point where adding more clusters no longer produces a significant drop in the sum of squared errors) and settled on 50 clusters.
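A minimal sketch of the elbow procedure described above. The random array `X`, the range of `k` values, and the `n_init`/`random_state` settings are placeholders chosen for illustration, not the actual data or configuration:

```python
import numpy as np
from sklearn.cluster import KMeans

# Placeholder 2-D data standing in for the real user data (assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 2))

# Fit k-means for a range of cluster counts and record the sum of squared
# errors, which scikit-learn exposes as the fitted model's `inertia_`.
ks = list(range(1, 11))
sse = []
for k in ks:
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    sse.append(model.inertia_)

# The "elbow" is the k after which the SSE stops dropping sharply.
for k, value in zip(ks, sse):
    print(k, round(value, 2))
```

Plotting `sse` against `ks` and looking for the bend is the usual way to read off the elbow.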
After applying k-means, I want to understand how similar the data points within each cluster are. Since I have 50 clusters, is there a way to get one number per cluster (something like the within-cluster variance) that tells me how close the data points in that cluster are to each other? For example, a value like 0.8 would mean the records in that cluster have high variance, while 0.2 would mean they are closely "related".
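To make the quantity concrete, here is a rough sketch of one way a per-cluster number could be computed: the mean squared distance from each point to its cluster's centroid. The data `X`, the choice of 5 clusters, and the names `cluster_sse` and `cluster_var` are illustrative assumptions; only `labels_`, `cluster_centers_`, and `inertia_` come from scikit-learn's `KMeans`:

```python
import numpy as np
from sklearn.cluster import KMeans

# Placeholder 2-D data and a small cluster count for illustration (assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))
km = KMeans(n_clusters=5, n_init=10, random_state=0).fit(X)

labels = km.labels_
centers = km.cluster_centers_

# Squared Euclidean distance from every point to its own cluster centroid.
sq_dist = ((X - centers[labels]) ** 2).sum(axis=1)

# Per-cluster point counts and per-cluster sum of squared distances.
counts = np.bincount(labels, minlength=km.n_clusters)
cluster_sse = np.bincount(labels, weights=sq_dist, minlength=km.n_clusters)

# Mean squared distance to the centroid: a within-cluster variance per cluster.
cluster_var = cluster_sse / counts

for c in range(km.n_clusters):
    print(c, counts[c], round(cluster_var[c], 4))
```

Note that these values are in squared units of the data, so they are not on a fixed 0-to-1 scale unless the features are scaled first. The per-cluster sums add up to the model's overall `inertia_`.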
To summarize: is there a way to get a single number that indicates how "good" each k-means cluster is? One can argue that goodness is relative, but let's assume I care mainly about within-cluster variance as the measure of how good a particular cluster is.