I am struggling with PCA.

For example, I have:

Data=100*3
substractdata=Data-mean (the size stays the same, 100*3)
covariance=3*3
EigenVector=3*3
EigenValue=3*3

To reduce the data, we have to keep only some of the eigenvalues and eigenvectors, based on k.

For example k=2

so the sizes will become

  • EigenValue will become 2*2
  • EigenVector = 2*2

1st question: Is that right?

Then we have to project our matrix:

project=EigenVector (which is 2*2) *substractdata (100*3)

2nd question: How can we calculate this, given that the sizes of EigenVector and substractdata do not match?

And some further questions:

3rd question: If we want to use the reduced data, should we use project?

4th question: If we want to show the principal components (the first and second columns of EigenVector), do we have to plot them along with Data (the initial data) or with substractdata?

2 Answers

0
votes

Your eigenvalue 3*3 matrix is a diagonal matrix. The eigenvalues are scalars along the diagonal. To reduce the dimensionality you pick the k=2 eigenvectors that correspond to the two largest eigenvalues. So you need to sort your eigenvectors based on their corresponding eigenvalues and pick the two that have the two largest eigenvalues.

So you will have EigenValue = 2*2 (only two eigenvalues) and EigenVector 3*2 after reduction.

Since your eigenvectors are now 3*2, you can project the data onto the 2-dim subspace using substractdata * EigenVector, which gives a 100*2 matrix. To show the data along with the principal components, reconstruct it in the original space (project * EigenVector', which is 100*3 again) and add the mean back.
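The steps in this answer can be sketched in NumPy as follows. This is only an illustration under stated assumptions: the data is synthetic, and the names `data`, `top_vectors`, `project` and `reconstructed` are made up for the example rather than taken from the question's code.

```python
import numpy as np

# Synthetic 100x3 data, purely illustrative (not the asker's data).
rng = np.random.default_rng(0)
data = rng.normal(size=(100, 3)) @ np.diag([3.0, 1.0, 0.2]) + np.array([5.0, -2.0, 1.0])

mean = data.mean(axis=0)
substractdata = data - mean                        # 100x3, centered

covariance = np.cov(substractdata, rowvar=False)   # 3x3
# eigh is meant for symmetric matrices; it returns eigenvalues in ascending order.
eigenvalues, eigenvectors = np.linalg.eigh(covariance)

# Sort by descending eigenvalue and keep the k largest.
k = 2
order = np.argsort(eigenvalues)[::-1]
top_values = eigenvalues[order[:k]]                # k eigenvalues
top_vectors = eigenvectors[:, order[:k]]           # 3x2: eigenvectors are columns

project = substractdata @ top_vectors              # 100x2 reduced data
reconstructed = project @ top_vectors.T + mean     # 100x3, back in the original space

print(top_vectors.shape, project.shape, reconstructed.shape)
```

The shapes line up because the 100*3 centered data is multiplied on the right by the 3*2 eigenvector matrix; the reduced data in `project` is what you would feed into later processing.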

0
votes

Let X denote the normalized (mean-subtracted) 100x3 data matrix. Then the eigenvalue decomposition is X'*X=V*D*V', where V is an orthogonal and D a diagonal 3x3 matrix. The columns of X*V are mutually orthogonal, because (X*V)'*(X*V)=V'*X'*X*V=D is diagonal. Scaling them to unit length gives a matrix U with X*V=U*S, and therefore X=U*S*V', where S (the diagonal matrix of singular values) is the square root of D. This is the singular value decomposition, and it can be computed directly, without forming the (numerically bad) product X'*X.

Now you want the first two columns of U*S, which are the projected data. SVD routines return S with descending diagonal entries; eigenvalue routines do not always sort D that way, so check the order before picking columns. Using the SVD you have direct access to these columns as U12*S12. Using the eigenvalue decomposition you compute X*V12=U12*S12, that is, as per cyon, the projection is obtained from the submatrix V12 of V containing the first two columns (right singular vectors) of V, while U12 is the submatrix of U containing the first two columns (left singular vectors).
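As a rough check of the relation between the two routes, here is a small NumPy sketch. It assumes synthetic, mean-subtracted data, and the names `scores_svd` and `scores_proj` are hypothetical, chosen only for this example.

```python
import numpy as np

# Synthetic 100x3 data, purely illustrative; centered so that X plays the role above.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3)) @ np.diag([3.0, 1.0, 0.2])
X = X - X.mean(axis=0)

# Thin SVD: U is 100x3, s holds the singular values (descending), Vt is V'.
U, s, Vt = np.linalg.svd(X, full_matrices=False)

V12 = Vt[:2].T                    # 3x2, first two right singular vectors
scores_svd = U[:, :2] * s[:2]     # U12 * S12, 100x2
scores_proj = X @ V12             # X * V12, 100x2

# Eigenvalue route on X'X; eigh returns ascending eigenvalues, so reverse them.
D, V = np.linalg.eigh(X.T @ X)
D_desc = D[::-1]

print(np.allclose(scores_svd, scores_proj))   # both routes give the same projection
print(np.allclose(s ** 2, D_desc))            # singular values are square roots of D
```

Only the eigenvalues are compared between the two routes here, because the eigenvectors are determined only up to sign and the two routines may pick opposite signs.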