Suppose I apply PCA to my feature vectors and then cluster the result, like the following:
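from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

# Project the feature vectors onto the first two principal components
reduced_data = PCA(n_components=2).fit_transform(data)
kmeans = KMeans(init='k-means++', n_clusters=n_digits, n_init=10)
kmeans.fit(reduced_data)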
The reduced data is expressed in terms of the PCA components, so after clustering with k-means I get a label for each point in reduced_data. How do I know which point in the original data each label belongs to?
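For reference, here is a minimal sketch of the setup behind the first question. It relies on the fact that fit_transform and KMeans both preserve row order, so kmeans.labels_[i] should belong to data[i]. The digits dataset, random_state=0 and the names cluster0_rows and cluster0_original are assumptions added for illustration only.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

# Stand-in data (an assumption): the scikit-learn digits set, 1797 rows x 64 features.
data = load_digits().data
n_digits = 10

reduced_data = PCA(n_components=2).fit_transform(data)
kmeans = KMeans(init='k-means++', n_clusters=n_digits, n_init=10, random_state=0)
kmeans.fit(reduced_data)

# Row i of reduced_data was computed from row i of data, and labels_ keeps
# that order, so labels[i] is the cluster assigned to data[i].
labels = kmeans.labels_

# Indices and original (unreduced) feature vectors of the points in cluster 0.
cluster0_rows = np.where(labels == 0)[0]
cluster0_original = data[cluster0_rows]
print(cluster0_rows[:5], cluster0_original.shape)
```

So as long as the rows are not shuffled between the PCA step and the clustering step, the same index can be used in data, reduced_data and labels.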
Also, how should I choose the number of PCA components in relation to the number of clusters? Thanks.
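To make the second question concrete, this is a rough sketch of one way to experiment: keep the number of clusters fixed, vary n_components, and compare how much variance PCA retains and how well the resulting clusters separate the original data. As far as I know there is no fixed rule tying the number of components to the number of clusters, so the candidate values (2, 5, 10, 20), the digits dataset and the use of the silhouette score are all assumptions, not a recommended recipe.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score

# Stand-in data (an assumption): the scikit-learn digits set.
data = load_digits().data
n_digits = 10

results = {}
for n_components in (2, 5, 10, 20):
    pca = PCA(n_components=n_components)
    reduced = pca.fit_transform(data)

    kmeans = KMeans(init='k-means++', n_clusters=n_digits, n_init=10, random_state=0)
    labels = kmeans.fit_predict(reduced)

    # Fraction of the total variance kept by the retained components.
    kept = pca.explained_variance_ratio_.sum()
    # Score the labels against the original features so that runs with
    # different n_components are measured in the same space.
    score = silhouette_score(data, labels)

    results[n_components] = (kept, score)
    print(f"{n_components:2d} components: variance kept={kept:.2f}, silhouette={score:.3f}")
```

The retained variance always grows with n_components, while the cluster quality may level off, which gives a starting point for picking a value.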