2
votes

I am trying to do k-means clustering in scikit-learn. Code:

from sklearn.cluster import KMeans
kmeans = KMeans(n_clusters = 10)
x = df.values
kmeans.fit(x.reshape(-1, 1))

If the parameter init is set to 'random', KMeans chooses random initial centroids. Is there a way to fetch the initial centroids that were used?

2 Answers

2
votes

You can only get your cluster centers after fitting the KMeans object to your data.
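
To illustrate, here is a minimal sketch showing that cluster_centers_ is only present once fit() has run. The synthetic array x is an assumption standing in for df.values from the question, and the variable names are only illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical 1-D data standing in for df.values from the question.
rng = np.random.default_rng(0)
x = rng.normal(size=200).reshape(-1, 1)

kmeans = KMeans(n_clusters=10, n_init=10, random_state=0)

# Before fit(), the estimator has no fitted attributes yet.
available_before = hasattr(kmeans, "cluster_centers_")

kmeans.fit(x)

# After fit(), cluster_centers_ holds one row per cluster.
available_after = hasattr(kmeans, "cluster_centers_")
centers_shape = kmeans.cluster_centers_.shape
```

Accessing kmeans.cluster_centers_ directly before fit() would raise an AttributeError, which is why the sketch checks with hasattr instead.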

A little trick!

So what you can do is set the parameter max_iter to 1. By default it is set to 300, and the centers may change at each iteration.

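If you use only one iteration, the algorithm assigns each sample to an initial center and then stops. Be aware that scikit-learn still recomputes the centers once within that single iteration.

Thus calling .cluster_centers_ gives back centers that are only one update step away from the initial centroids, not the untouched initial centroids themselves. Setting n_init to 1 as well ensures they come from a single initialization.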
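
If you would rather not rely on the iteration count, a different technique is to pick the initial centroids yourself and pass them through the init parameter, which accepts an array of shape (n_clusters, n_features). The sketch below is only an illustration under some assumptions: the data is synthetic in place of df.values, the names are made up, and drawing 10 distinct rows at random is my reading of what init='random' is documented to do.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical 1-D data standing in for df.values from the question.
rng = np.random.default_rng(0)
x = rng.normal(size=200).reshape(-1, 1)

n_clusters = 10

# Pick 10 distinct rows at random to serve as the initial centroids.
chosen_rows = rng.choice(len(x), size=n_clusters, replace=False)
initial_centroids = x[chosen_rows].copy()

# With an explicit init array, a single run (n_init=1) is enough,
# because every run would start from the same centroids anyway.
kmeans = KMeans(n_clusters=n_clusters, init=initial_centroids, n_init=1)
kmeans.fit(x)

final_centroids = kmeans.cluster_centers_
```

Here initial_centroids holds the exact starting points no matter how many iterations fit() runs, and final_centroids holds the fitted result.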

-2
votes

Yes, I assume you can try centroids = kmeans.cluster_centers_ before calling fit()