I am trying to apply principal component analysis (PCA) to reduce the dimensionality of my data. The data is 200x146: 200 observations (samples) with 146 features (dimensions), and each observation belongs to one of three classes. What I am trying to do is visualize the data, to see how the class centroids move after adding new samples. Since it is impossible to plot such high-dimensional data directly, I am looking for a low-dimensional representation in which the data forms almost separate class clusters.
I know that PCA computes the eigenvectors and eigenvalues of the data's covariance matrix, where each eigenvalue gives the variance along its eigenvector. The higher the variance, the more the data is spread out along that direction and the better it is for visualization. The eigenvector with the highest eigenvalue is the first principal component; the next component is then the direction of maximal remaining variance orthogonal to it, and so on. (Did I understand the basic idea of PCA correctly?)
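To check that I have the idea right, I tried to reproduce it by hand with eig on the covariance matrix. This is only a rough sketch, assuming trndata is my 200x146 matrix with the class label in the last column and that the features need to be mean-centered first:

X = trndata(:, 1:end-1);           % feature columns only
Xc = X - mean(X);                  % center each feature (implicit expansion, R2016b+)
C = cov(Xc);                       % covariance matrix of the features
[V, D] = eig(C, 'vector');         % eigenvectors (columns of V) and eigenvalues
[D, order] = sort(D, 'descend');   % sort by variance, largest first
V = V(:, order);
proj = Xc * V(:, 1:2);             % coordinates along the top two principal axes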
However, I don't understand what information I actually get when I use the MATLAB function pca(). I get the coefficients, but what do they tell me, and how do I proceed afterwards?
data=trndata;                               % 200x146, class label in the last column
[coeff,score]=pca(data(:,1:end-1));         % PCA on the feature columns only
newinputdata=coeff(:, 1:3)*score(:, 1:3)';  % rank-3 reconstruction of the centered data
newinputdata=newinputdata';                 % back to observations in rows
% row indices of each class, taken from the label column
class1i=find(data(:,end)==1);
class2i=find(data(:,end)==2);
class3i=find(data(:,end)==3);
class1=newinputdata(class1i,:);
class2=newinputdata(class2i,:);
class3=newinputdata(class3i,:);
x=1;   % which two columns of newinputdata to plot
y=2;
figure;
hold on
plot(class1(:,x), class1(:,y),'ro')
plot(class2(:,x), class2(:,y),'go')
plot(class3(:,x), class3(:,y),'bo')
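From the documentation, my understanding is that score is already the centered data expressed in principal-component coordinates (score = (X - mean(X)) * coeff), so I have also been wondering whether I should simply plot the first two score columns directly instead. Something like this (just a sketch, I am not sure this is the right way to proceed):

labels = data(:, end);
figure;
gscatter(score(:,1), score(:,2), labels, 'rgb', 'o')   % one colour per class
xlabel('PC 1'); ylabel('PC 2');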