Principal Component Analysis

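In order to maximize variance, the first weight vector $\mathbf{w}_{(1)}$ thus has to satisfy

$$\mathbf{w}_{(1)} = \arg\max_{\|\mathbf{w}\|=1} \sum_i \left(\mathbf{x}_{(i)} \cdot \mathbf{w}\right)^2 = \arg\max_{\|\mathbf{w}\|=1} \|\mathbf{X}\mathbf{w}\|^2 = \arg\max_{\|\mathbf{w}\|=1} \mathbf{w}^T \mathbf{X}^T \mathbf{X} \mathbf{w},$$

where the rows $\mathbf{x}_{(i)}$ of the data matrix $\mathbf{X}$ are the mean-centered samples.

Since $\mathbf{w}_{(1)}$ has been defined to be a unit vector, it equivalently also satisfies

$$\mathbf{w}_{(1)} = \arg\max \frac{\mathbf{w}^T \mathbf{X}^T \mathbf{X} \mathbf{w}}{\mathbf{w}^T \mathbf{w}}.$$

The quantity to be maximised can be recognised as a Rayleigh quotient. A standard result for a positive semidefinite matrix such as $\mathbf{X}^T \mathbf{X}$ is that the quotient's maximum possible value is the largest eigenvalue of the matrix, which occurs when $\mathbf{w}$ is the corresponding eigenvector.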
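The Rayleigh quotient result can be checked numerically. The following is a minimal sketch, assuming NumPy and a randomly generated, mean-centered data matrix (the data, the seed, and the helper name `rayleigh` are illustrative choices, not part of the original notes). It compares the quotient at the top eigenvector with the largest eigenvalue and with the quotient at many random directions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical centered data matrix X: 200 samples, 5 features.
X = rng.normal(size=(200, 5))
X -= X.mean(axis=0)

A = X.T @ X  # symmetric positive semidefinite


def rayleigh(A, w):
    """Rayleigh quotient w^T A w / w^T w."""
    return (w @ A @ w) / (w @ w)


# eigh returns eigenvalues in ascending order for symmetric matrices.
eigvals, eigvecs = np.linalg.eigh(A)
lam_max = eigvals[-1]
w_top = eigvecs[:, -1]

# Quotient evaluated at the eigenvector of the largest eigenvalue.
q_top = rayleigh(A, w_top)

# Largest quotient found among 1000 random directions.
random_ws = rng.normal(size=(1000, 5))
q_random_max = max(rayleigh(A, w) for w in random_ws)

print(q_top, lam_max, q_random_max)
```

The quotient is invariant to the scale of `w`, which is why the random directions do not need to be normalised before the comparison.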

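Let $\mathbf{x}_1, \dots, \mathbf{x}_n$ be the original samples, and let the projection vector $\mathbf{w}$ satisfy

$$\mathbf{w}^T \mathbf{w} = 1.$$

Denote the covariance matrix of the original samples by $\mathbf{C}$; the variance of the projected samples $\mathbf{w}^T \mathbf{x}_i$ is then

$$\mathbf{w}^T \mathbf{C} \mathbf{w}.$$

The Lagrangian multiplier form,

$$L(\mathbf{w}, \lambda) = \mathbf{w}^T \mathbf{C} \mathbf{w} - \lambda \left(\mathbf{w}^T \mathbf{w} - 1\right),$$

leads, on setting the gradient with respect to $\mathbf{w}$ to zero, to

$$\mathbf{C} \mathbf{w} = \lambda \mathbf{w}.$$

Substituting back gives $\mathbf{w}^T \mathbf{C} \mathbf{w} = \lambda$, so the variance is maximized when $\mathbf{w}$ is the eigenvector of $\mathbf{C}$ with the largest eigenvalue, and the maximum value is that eigenvalue $\lambda$.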

PCA is done by either

  • singular value decomposition of a design matrix;
  • eigenvalue decomposition on the covariance matrix.
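The two routes can be compared side by side. The sketch below assumes NumPy, a synthetic design matrix with samples as rows, and the unbiased covariance normalisation 1/(n - 1); the function names `pca_svd` and `pca_eig` are hypothetical helpers introduced only for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical design matrix: rows are samples, columns are features.
X = rng.normal(size=(100, 3)) @ np.array([[2.0, 0.0, 0.0],
                                          [0.5, 1.0, 0.0],
                                          [0.0, 0.3, 0.2]])
Xc = X - X.mean(axis=0)  # center each feature


def pca_svd(Xc):
    """Principal axes (columns) and variances via SVD of the centered data."""
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt.T, s**2 / (Xc.shape[0] - 1)


def pca_eig(Xc):
    """Principal axes (columns) and variances via the covariance matrix."""
    cov = Xc.T @ Xc / (Xc.shape[0] - 1)
    vals, vecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    order = np.argsort(vals)[::-1]     # reorder to descending
    return vecs[:, order], vals[order]


axes_svd, var_svd = pca_svd(Xc)
axes_eig, var_eig = pca_eig(Xc)

# Axes are only defined up to sign, so compare the absolute
# inner products of matching columns (1.0 means same direction).
alignment = np.abs(np.sum(axes_svd * axes_eig, axis=0))

print(var_svd, var_eig, alignment)
```

In practice the SVD route is often preferred because it avoids forming the covariance matrix explicitly, though for small, well-conditioned problems both give the same axes and variances up to floating-point error.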

Eigenvalue Decomposition

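A (non-zero) vector $\mathbf{v}$ is an eigenvector of a square matrix $\mathbf{A}$ if it satisfies the linear equation

$$\mathbf{A} \mathbf{v} = \lambda \mathbf{v},$$

where the scalar $\lambda$ is termed the eigenvalue corresponding to $\mathbf{v}$.

Since $\mathbf{A} \mathbf{q}_i = \lambda_i \mathbf{q}_i$ for each eigenpair,

$$\mathbf{A} \mathbf{Q} = \mathbf{Q} \mathbf{\Lambda}.$$

A diagonalisable matrix can therefore be factorized as

$$\mathbf{A} = \mathbf{Q} \mathbf{\Lambda} \mathbf{Q}^{-1},$$

where $\Lambda_{ii} = \lambda_i$ and $\mathbf{q}_i$ is the $i$-th column of $\mathbf{Q}$.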
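As a small worked instance of this factorization, the sketch below uses NumPy on an arbitrarily chosen 2×2 matrix (the matrix is an illustrative assumption; its eigenvalues are 5 and 2). It checks the eigenvector equation column by column and rebuilds the matrix from its factors.

```python
import numpy as np

# Hypothetical diagonalisable matrix with distinct eigenvalues 5 and 2.
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

eigvals, Q = np.linalg.eig(A)  # columns of Q are eigenvectors
Lam = np.diag(eigvals)

# Each column q_i should satisfy A q_i = lambda_i q_i.
residuals = [np.linalg.norm(A @ Q[:, i] - eigvals[i] * Q[:, i])
             for i in range(len(eigvals))]

# Reassemble A from Q, Lambda and the inverse of Q.
A_rebuilt = Q @ Lam @ np.linalg.inv(Q)

print(eigvals, residuals, A_rebuilt)
```

For a symmetric matrix such as a covariance matrix, `np.linalg.eigh` is the more appropriate call: it returns orthonormal eigenvectors, so the inverse of `Q` is simply its transpose.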

Singular Value Decomposition

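The singular value decomposition (SVD) of a matrix $\mathbf{X}$ is

$$\mathbf{X} = \mathbf{U} \mathbf{\Sigma} \mathbf{V}^T,$$

where $\mathbf{U}^T \mathbf{U} = \mathbf{I}$, $\mathbf{V}^T \mathbf{V} = \mathbf{I}$, and $\mathbf{\Sigma}$ is diagonal with non-negative entries, the singular values.

The right singular vectors of $\mathbf{X}$ are the eigenvectors of $\mathbf{X}^T \mathbf{X}$, while the singular values of $\mathbf{X}$ are equal to the square roots of the eigenvalues of $\mathbf{X}^T \mathbf{X}$.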
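One way to see this is to substitute the decomposition into the product. Assuming the thin SVD $\mathbf{X} = \mathbf{U} \mathbf{\Sigma} \mathbf{V}^T$ with $\mathbf{U}^T \mathbf{U} = \mathbf{I}$ and a square diagonal $\mathbf{\Sigma}$ holding the singular values $\sigma_i$ (notation assumed here for illustration):

$$\mathbf{X}^T \mathbf{X} = \left(\mathbf{U} \mathbf{\Sigma} \mathbf{V}^T\right)^T \left(\mathbf{U} \mathbf{\Sigma} \mathbf{V}^T\right) = \mathbf{V} \mathbf{\Sigma} \mathbf{U}^T \mathbf{U} \mathbf{\Sigma} \mathbf{V}^T = \mathbf{V} \mathbf{\Sigma}^2 \mathbf{V}^T.$$

Multiplying on the right by the $i$-th column $\mathbf{v}_i$ of $\mathbf{V}$ and using $\mathbf{V}^T \mathbf{V} = \mathbf{I}$ gives

$$\mathbf{X}^T \mathbf{X} \mathbf{v}_i = \sigma_i^2 \mathbf{v}_i,$$

so each right singular vector is an eigenvector of $\mathbf{X}^T \mathbf{X}$ with eigenvalue $\sigma_i^2$. This is why the SVD of the centered design matrix and the eigenvalue decomposition of the covariance matrix yield the same principal directions.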