Principal Component Analysis?

Question

I am struggling with PCA.

For example, I have:

Data=100*3
substractdata=data-mean (the size will be same 100*3)
covariance=3*3
EigenVector=3*3
EigenValue=3*3

To reduce the dimensionality of our data, we have to keep only k of the eigenvalues and eigenvectors.

For example, with k=2, the sizes will become:

EigenValue = 2*2
EigenVector = 2*2

1st ques: is that right?

And then we have to project our matrix:

project=EigenVector (which is 2*2) *substractdata (100*3)

2nd ques: How can we calculate this, given that the sizes of EigenVector and substractdata are different?

And some more questions:

3rd ques: if we want to use the reduced data, should we use project?

4th ques: if we want to show the principal components (the first and second columns of the eigenvector matrix), should we plot them along with Data (the initial data) or with substractdata?

cyon cyon · Accepted Answer · 2014-03-06T10:36:04

Your 3*3 eigenvalue matrix is diagonal; the eigenvalues are the scalars along its diagonal. To reduce the dimensionality, you keep the k=2 eigenvectors that correspond to the two largest eigenvalues. So you need to sort your eigenvectors by their corresponding eigenvalues and pick the two with the largest ones.

So you will have EigenValue = 2*2 (only two eigenvalues) and EigenVector 3*2 after reduction.
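
As a rough illustration of the sorting and selection step, here is a minimal sketch in Python with NumPy. The random 100*3 data and the names top_values and top_vectors are assumptions made up for this example; note that NumPy returns the eigenvalues as a 1-D array rather than a diagonal matrix.

```python
import numpy as np

# Hypothetical 100*3 data, generated only for illustration.
rng = np.random.default_rng(0)
data = rng.normal(size=(100, 3))

substractdata = data - data.mean(axis=0)           # 100*3, mean-centered
covariance = np.cov(substractdata, rowvar=False)   # 3*3

# eigh is meant for symmetric matrices such as a covariance matrix.
# It returns the eigenvalues in ascending order, with the matching
# eigenvectors stored as columns.
eigenvalues, eigenvectors = np.linalg.eigh(covariance)

# Sort from largest to smallest eigenvalue and keep the first k.
k = 2
order = np.argsort(eigenvalues)[::-1]
top_values = eigenvalues[order[:k]]        # the 2 largest eigenvalues
top_vectors = eigenvectors[:, order[:k]]   # 3*2, one eigenvector per column

print(top_values.shape, top_vectors.shape)  # (2,) (3, 2)
```

The important detail is that the same index order is applied to both the eigenvalues and the columns of the eigenvector matrix, so each kept eigenvector stays paired with its eigenvalue.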

Since your eigenvectors are now 3*2, you can project the data onto the 2-dimensional subspace using substractdata * eigenvector, which gives (100*3) * (3*2) = 100*2. You will need to add the mean back after reconstruction to show the data along with the principal components.
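
The projection and the reconstruction with the mean added back could look roughly like this in Python with NumPy. This is only a sketch under assumptions: the random data and the names mean and reconstructed are invented for the example, and the reconstruction is taken to be project * eigenvector' + mean.

```python
import numpy as np

# Hypothetical 100*3 data, generated only for illustration.
rng = np.random.default_rng(0)
data = rng.normal(size=(100, 3))

mean = data.mean(axis=0)
substractdata = data - mean                        # 100*3

# Eigen-decomposition of the 3*3 covariance matrix, keeping the
# eigenvectors of the two largest eigenvalues.
eigenvalues, eigenvectors = np.linalg.eigh(np.cov(substractdata, rowvar=False))
order = np.argsort(eigenvalues)[::-1]
eigenvector = eigenvectors[:, order[:2]]           # 3*2

# Projection: (100*3) * (3*2) = 100*2. This is the reduced data.
project = substractdata @ eigenvector

# Reconstruction back in the original 3-D space, with the mean added
# back so it can be plotted next to the initial data.
reconstructed = project @ eigenvector.T + mean     # 100*3

print(project.shape, reconstructed.shape)  # (100, 2) (100, 3)
```

The sizes only match because the centered data comes first in the product; eigenvector * substractdata would be (3*2) * (100*3), which is not defined.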
