I would like to apply PCA to heatmaps with 18 dimensions.
dim(heatmaps)=(224,224,18)
Since PCA only accepts data with at most 2 dimensions, I reshape my heatmaps as follows:
heatmaps=heatmaps.reshape(-1,18)
heatmaps.shape
(50176, 18)
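(To make this reproducible, here is a minimal sketch of this reshaping step, with random data standing in for my real heatmaps:)

import numpy as np

# random stand-in for my real heatmaps: 224 x 224 pixels, 18 channels
heatmaps = np.random.rand(224, 224, 18)

# flatten the spatial dimensions: each of the 224*224 = 50176 pixels
# becomes one sample with 18 features
heatmaps = heatmaps.reshape(-1, 18)
print(heatmaps.shape)  # (50176, 18)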
Now, I would like to apply PCA and keep the first components that preserve 95% of the variance.
from sklearn.decomposition import PCA
pca = PCA(n_components=18)
reduced_heatmaps=pca.fit_transform(heatmaps)
However, the shape of reduced_heatmaps remains the same as that of the original heatmaps: (50176, 18).
My question is as follows: how can I reduce the dimensionality of my heatmaps while preserving 95% of the variance?
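(For context, what I am aiming for is something like the sketch below; as far as I understand, scikit-learn's PCA also accepts a fractional n_components and then keeps just enough components to reach that explained-variance level:)

from sklearn.decomposition import PCA

# keep just enough components to explain 95% of the variance
pca = PCA(n_components=0.95)
reduced_heatmaps = pca.fit_transform(heatmaps)  # heatmaps of shape (50176, 18)
print(pca.n_components_, reduced_heatmaps.shape)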
Strange thing:
pca.explained_variance_ratio_.cumsum()
array([ 0.05744624, 0.11482341, 0.17167621, 0.22837643, 0.284996 ,
0.34127299, 0.39716828, 0.45296374, 0.50849681, 0.56382308,
0.61910508, 0.67425335, 0.72897448, 0.78361028, 0.83813329,
0.89247688, 0.94636864, 1. ])
This means I would need to keep 17 of the 18 components to preserve 95% of the variance, so the dimensionality is barely reduced at all.
What is wrong?
EDIT: following the suggestion of Eric Yang, I reshaped my heatmaps as follows:
heatmaps=heatmaps.reshape(18,-1)
heatmaps.shape
(18,50176)
Then I applied PCA as follows:
pca = PCA(n_components=11)
reduced_heatmaps=pca.fit_transform(heatmaps)
pca.explained_variance_ratio_.cumsum()
which gives the following:
array([ 0.21121199, 0.33070526, 0.44827572, 0.55748779, 0.64454442,
0.72588593, 0.7933346 , 0.85083687, 0.89990991, 0.9306283 ,
0.9596194 ], dtype=float32)
So 11 components are needed to explain 95% of the variance in my data.
reduced_heatmaps.shape
(18, 11)
Hence we go from (18, 50176) to (18, 11).
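For completeness, here is a minimal end-to-end sketch of this edited pipeline, again with random data standing in for my real heatmaps (so the exact variance ratios will differ):

import numpy as np
from sklearn.decomposition import PCA

heatmaps = np.random.rand(224, 224, 18)  # stand-in for my real heatmaps
heatmaps = heatmaps.reshape(18, -1)      # 18 rows of 224*224 = 50176 values

pca = PCA(n_components=11)
reduced_heatmaps = pca.fit_transform(heatmaps)
print(reduced_heatmaps.shape)                      # (18, 11)
print(pca.explained_variance_ratio_.cumsum()[-1])  # cumulative variance of the 11 components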
Thank you for your help