I was running PCA on a dataset. To find the optimal number of principal components, I wanted to start with as many components as there are features. However, when I looked at the explained variance ratios, the number of components was not what I expected. The dataset is 200 * 300 (rows * features), so fitting PCA for all 300 components should give me back 300 principal components and their corresponding variance ratios, but I got only 200.
Code is here:
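from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Standardize the data (X_train is the 200 * 300 training matrix)
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)

# Fit PCA without setting n_components, then read the explained variance ratios
pca = PCA()
pca.fit(X_train_scaled)
ratios = pca.explained_variance_ratio_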
I just figured out why, so I will answer this question below.
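For anyone who wants to reproduce this without my data, here is a minimal sketch. It assumes a random matrix of the same shape as a stand-in for X_train (the names `rng` and `X_demo` are made up for this example). As far as I can tell from the scikit-learn docs, when `n_components` is not set, `PCA` keeps `min(n_samples, n_features)` components, which would explain getting 200 instead of 300.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for X_train: 200 samples, 300 features (assumed shape)
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 300))

# Same preprocessing as in the question
X_demo_scaled = StandardScaler().fit_transform(X_demo)

# n_components is left unset, as in the question
pca = PCA()
pca.fit(X_demo_scaled)

n_samples, n_features = X_demo_scaled.shape
n_ratios = len(pca.explained_variance_ratio_)

# Compare the number of ratios with min(n_samples, n_features)
print(n_ratios, min(n_samples, n_features))
```

With more features than samples, the number of components that can be extracted is capped by the number of samples, so the count of ratios follows the smaller dimension of the matrix rather than the feature count.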