I'm trying to apply PCA dimensionality reduction to a dataset that is 684 x 1800 (observations x features). I want to reduce the number of features. When I perform the PCA, it tells me that 684 components are needed to explain 100% of the variance, so my reduced data would be 684 x 684. Isn't that strange? I mean, exactly the same number as the number of observations...
Is there an explanation for this, or am I applying PCA incorrectly?
I know that 684 components are needed to explain the whole variance because I plotted the cumulative sum of .explained_variance_ratio_ and it reaches 1 at 684 components, and also because of the code below.
My code is basically:
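from sklearn.decomposition import PCA

# Keep enough components to explain (almost) all of the variance
pca = PCA(0.99999999999)
pca.fit(data_rescaled)
reduced = pca.transform(data_rescaled)
print(reduced.shape)      # shape of the reduced data
print(pca.n_components_)  # number of components kept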
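The cumulative-sum check I mentioned looks roughly like the sketch below. It uses random stand-in data with the same shape as my dataset rather than the real values, so the array X, the seed, and the name pca_full are assumptions made only for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA

# Stand-in data (assumption): random values with the same shape as my dataset.
rng = np.random.default_rng(0)
X = rng.normal(size=(684, 1800))

# Fit PCA without asking for a specific number of components;
# scikit-learn then keeps min(n_samples, n_features) of them.
pca_full = PCA()
pca_full.fit(X)

# Cumulative share of the variance explained by the first k components.
cumulative = np.cumsum(pca_full.explained_variance_ratio_)

print(pca_full.n_components_)  # number of components kept
print(cumulative[-1])          # total share of variance explained
```

Plotting cumulative against the component index gives the curve I described.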
Of course, I don't need to keep all of the variance; 95% is also acceptable. Is this just a wonderful coincidence?
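For the 95% case, I would expect the call to look roughly like this. Again this is only a sketch on random stand-in data (X, the seed, and the names pca_95 and reduced_95 are assumptions), not my real dataset, so the number of components it keeps says nothing about my actual data.

```python
import numpy as np
from sklearn.decomposition import PCA

# Stand-in data (assumption): random values with the same shape as my dataset.
rng = np.random.default_rng(0)
X = rng.normal(size=(684, 1800))

# A float between 0 and 1 asks PCA to keep the smallest number of
# components whose cumulative explained variance exceeds that fraction.
pca_95 = PCA(n_components=0.95)
reduced_95 = pca_95.fit_transform(X)

print(reduced_95.shape)                        # (observations, components kept)
print(pca_95.n_components_)                    # number of components kept
print(pca_95.explained_variance_ratio_.sum())  # share of variance retained
```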
Thank you so much