Like all preprocessing modules in scikit-learn nowadays, PCA includes a transform
method that does exactly that: it transforms new samples according to an already fitted PCA transformation. From the docs:
transform(self, X)
Apply dimensionality reduction to X.
X is projected on the first principal components previously extracted from a training set.
Here is a short demo with dummy data, adapting the example from the documentation:
import numpy as np
from sklearn.decomposition import PCA
X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])
pca = PCA(n_components=2)
pca.fit(X)
X_new = [[1, -1]] # new data; note the double brackets (one row per sample)
X_new_pca = pca.transform(X_new)
X_new_pca
# array([[-0.2935787 , 1.38340578]])
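The double brackets matter because transform expects a 2D input of shape (n_samples, n_features). The following is a minimal sketch of that behaviour, assuming a reasonably recent scikit-learn version; the names flat_sample, batch and raised are only illustrative. It shows that a flat 1D sample is rejected, while several new samples can be transformed in one call:

```python
import numpy as np
from sklearn.decomposition import PCA

X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])
pca = PCA(n_components=2).fit(X)

# A flat sample has shape (2,), which is not (n_samples, n_features)
flat_sample = np.array([1, -1])
try:
    pca.transform(flat_sample)
    raised = False
except ValueError:
    raised = True  # scikit-learn rejects 1D input

# Several new samples at once: one row per sample
batch = np.array([[1, -1], [0, 2], [3, 3]])
batch_pca = pca.transform(batch)

print(raised, batch_pca.shape)
```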
If you would rather avoid the double brackets for a single new sample, convert it to a NumPy array and reshape it into a single row:
X_new = np.array([1, -1])
X_new_pca = pca.transform(X_new.reshape(1, -1))
X_new_pca
# array([[-0.2935787 , 1.38340578]]) # same result
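To see what "projected on the principal components" means in practice, here is a rough sketch that reproduces transform by hand. It assumes the default whiten=False (with whitening, an extra scaling step would be needed); the variable name manual is just for illustration. The new sample is centred with the mean learned during fit and then projected onto the fitted components:

```python
import numpy as np
from sklearn.decomposition import PCA

X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])
pca = PCA(n_components=2).fit(X)

X_new = np.array([[1, -1]])

# Centre with the training mean, then project onto the fitted components
manual = (X_new - pca.mean_) @ pca.components_.T

# Should agree with the built-in method (assuming whiten=False)
print(np.allclose(manual, pca.transform(X_new)))
```

Note that nothing is re-estimated from X_new here: both mean_ and components_ come from the training data, which is why new samples end up in the same coordinate system as the transformed training set.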