I have a dataset of pictures in which each column is a vector that can be reshaped into a 32x32 picture. The dataset has dimensions 1024 x 20000, i.e. 20000 image samples.
When I look at ways of doing PCA without the built-in functions from something like scikit-learn, people sometimes take the mean over the rows (axis 0) and subtract it from the original matrix before computing the covariance matrix, i.e. the following:
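import numpy as np
A = np.zeros((1024, 20000))  # placeholder with the dimensions of the numpy array
mean_rows = A.mean(0)        # shape (20000,): one mean per column
new_A = A - mean_rows        # broadcasts across the 1024 rows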
Other times, people take the mean over the columns (axis 1) and then subtract that from the original matrix:
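import numpy as np
A = np.zeros((1024, 20000))          # placeholder with the dimensions of the numpy array
mean_rows = A.mean(1, keepdims=True) # shape (1024, 1): one mean per row
new_A = A - mean_rows                # without keepdims=True this subtraction fails to broadcast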
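To make the difference between the two variants concrete, here is a minimal sketch on a small stand-in array with the same layout as my data (rows are pixels, columns are samples). The 4 x 6 shape, the random values, and the names `mean_per_sample` and `mean_per_pixel` are placeholders I made up for illustration; they are not from either tutorial.

```python
import numpy as np

# Small stand-in with the same layout as the dataset: rows are pixels
# (features), columns are samples. The real array would be 1024 x 20000.
rng = np.random.default_rng(0)
A = rng.random((4, 6))

# Variant 1: average over axis 0 -> one mean per column (per sample).
mean_per_sample = A.mean(axis=0)          # shape (6,)
centered_samples = A - mean_per_sample    # broadcasts across the rows

# Variant 2: average over axis 1 -> one mean per row (per pixel).
# keepdims=True keeps shape (4, 1) so the subtraction broadcasts across columns.
mean_per_pixel = A.mean(axis=1, keepdims=True)
centered_pixels = A - mean_per_pixel

print(centered_samples.mean(axis=0))  # approximately zero for every column
print(centered_pixels.mean(axis=1))   # approximately zero for every row
```

As far as I understand, PCA conventionally centers each feature, so which axis that corresponds to depends on whether the samples are stored as rows or as columns. With samples as columns, as here, the per-pixel mean is the one taken over axis 1.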
My question is: when are you supposed to do which? For a dataset laid out like mine, which of the two methods should I use?
I have looked at a variety of websites, such as https://machinelearningmastery.com/calculate-principal-component-analysis-scratch-python/ and http://sebastianraschka.com/Articles/2014_pca_step_by_step.html