Comparison of different ways of implementing the elbow method

Question

I am confused because I see different ways to implement the elbow method for identifying the right number of clusters in k-means, and they produce slightly different results.

One method is described in "Sklearn kmeans equivalent of elbow method" and uses the inertia_ attribute of the fitted KMeans model. The other method is described at https://pythonprogramminglanguage.com/kmeans-elbow-method/ and uses the following command:

distortions.append(sum(np.min(cdist(X, kmeanModel.cluster_centers_, 'euclidean'), axis=1)) / X.shape[0])

What does inertia_ actually compute, and are both implementations correct?
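To make the comparison concrete, the two quantities can be computed side by side. The following is only a minimal sketch: the synthetic data from make_blobs, the range of k, and the n_init and random_state values are illustrative assumptions, not part of either linked method. In scikit-learn, inertia_ is the sum of squared distances from each sample to its closest cluster center, while the second command averages the unsquared Euclidean distances.

```python
import numpy as np
from scipy.spatial.distance import cdist
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic data, purely illustrative (not taken from the question).
X, _ = make_blobs(n_samples=300, centers=4, random_state=0)

ks = range(1, 10)
inertias = []      # KMeans.inertia_: sum of squared distances to nearest center
squared_sums = []  # the same quantity, recomputed by hand as a check
distortions = []   # mean unsquared Euclidean distance to nearest center

for k in ks:
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    # Distance from every sample to its closest cluster center.
    nearest = np.min(cdist(X, model.cluster_centers_, "euclidean"), axis=1)
    inertias.append(model.inertia_)
    squared_sums.append(np.sum(nearest ** 2))
    distortions.append(np.sum(nearest) / X.shape[0])

for k, inertia, distortion in zip(ks, inertias, distortions):
    print(f"k={k}  inertia={inertia:.2f}  mean distance={distortion:.4f}")
```

The two curves live on different scales (one is a sum of squares, the other a mean of unsquared distances), so a visually chosen "elbow" may fall at slightly different values of k.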

Has QUIT--Anony-Mousse · Accepted Answer · 2018-05-25T17:42:42

There is no "correct" for something that is not at all well-defined.

The elbow method is an extremely crude heuristic for which I am not aware of any formal definition, nor a reference.

Both methods will presumably yield the same k most of the time...

But given how k-means is defined, the "correct" quantity to use is the squared error, not the Euclidean distance. K-means minimizes squared errors; it does not minimize Euclidean distances (try to prove that it does! You can't, because there are counterexamples).
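One such counterexample can be sketched in one dimension with a single cluster. The three points below are a hypothetical choice made only for illustration. The cluster mean (which is what k-means uses as the center) gives the smaller sum of squared errors, but the median gives the smaller sum of plain distances.

```python
import numpy as np

# Hypothetical 1-D data set with a single cluster (k = 1).
points = np.array([0.0, 0.0, 10.0])

mean_center = points.mean()        # the center k-means would choose
median_center = np.median(points)  # an alternative candidate center


def sum_squared_errors(center):
    """Sum of squared distances from every point to the center."""
    return float(np.sum((points - center) ** 2))


def sum_distances(center):
    """Sum of unsquared (Euclidean) distances from every point to the center."""
    return float(np.sum(np.abs(points - center)))


print("squared errors:", sum_squared_errors(mean_center), sum_squared_errors(median_center))
print("distances:     ", sum_distances(mean_center), sum_distances(median_center))
```

Here the mean wins on squared error while the median wins on summed distance, so a center that is optimal for one objective is not optimal for the other.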
