
Principal Components

A quick summary; everything is detailed below:

Maximization of $u^\prime Au$ subject to $u^\prime Mu=1$.

1) $X$ centered; all points (observations) carry the same weight.

\begin{displaymath}
1^\prime X=0\quad\mbox{and}\quad x_{ij} = \sum^r_{t=1}u_{it}s_t v_{jt},
\end{displaymath}

The $p$ variables can be replaced by the $r$ columns of $V$.

\begin{displaymath}
x^{(k)}_{ij} = \sum^k_{t=1}u_{it}s_t v_{jt}
\end{displaymath}

is the best rank-$k$ approximation of $X$ (optimal in the least-squares sense).
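As a numerical check of this optimality, the following sketch (numpy assumed, random data for illustration) truncates the SVD and verifies the Eckart--Young property: the squared Frobenius error of the rank-$k$ truncation equals the sum of the discarded squared singular values.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))
X = X - X.mean(axis=0)             # center: each column has mean 0

# Full SVD: X = U S V'
U, s, Vt = np.linalg.svd(X, full_matrices=False)

# Best rank-k approximation keeps the k largest singular values
k = 2
Xk = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Squared Frobenius error = sum of the discarded squared singular values
err = np.linalg.norm(X - Xk, "fro") ** 2
print(np.isclose(err, np.sum(s[k:] ** 2)))   # True
```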

The columns of $V$ are the directions along which the variances are maximal.

Definition: The principal components are the coordinates of the observations in the basis of the new variables (namely the columns of $V$): the coordinates of observation $i$ form the $i$-th row of $C=XV=US$, and the $t$-th component is the $t$-th column of $C$. The components are orthogonal and their squared lengths are the squared singular values: $C^\prime C=SU^\prime US=S^2$.

In the same way the principal axes are defined as the rows of the matrix $Z=U^\prime X=SV^\prime$: they give the coordinates of the variables in the basis formed by the columns of $U$.

\begin{displaymath}
Z=S^{-1}C^\prime X \quad\mbox{and}\quad C=XZ^\prime S^{-1}.
\end{displaymath}
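A small numerical sketch (numpy assumed, random data for illustration) verifying these relations between components $C$, axes $Z$, and the SVD factors:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(15, 4))
X = X - X.mean(axis=0)             # centered data

U, s, Vt = np.linalg.svd(X, full_matrices=False)
S = np.diag(s)
V = Vt.T

C = X @ V                          # principal components, C = XV = US
Z = S @ Vt                         # principal axes,       Z = U'X = SV'

# Components are orthogonal with squared lengths s_t^2: C'C = S^2
print(np.allclose(C.T @ C, S ** 2))               # True
# Transition formulas: Z = S^{-1} C' X  and  C = X Z' S^{-1}
print(np.allclose(np.diag(1 / s) @ C.T @ X, Z))   # True
print(np.allclose(X @ Z.T @ np.diag(1 / s), C))   # True
```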

However, this decomposition depends heavily on the units of measurement (scales) of the variables. In practice it is used only when the $X_{\cdot k}$ are all of the same order of magnitude. Usually a weight is assigned to each variable to account for its overall scale, very often inversely proportional to the variable's variance. This amounts to defining a different metric between observations:

\begin{eqnarray*}
d^2\left( x_{i\cdot}, x_{j\cdot}\right) &=& \left( x_{i\cdot}-x_{j\cdot}\right)
\left(
\begin{array}{ccc}
\frac{1}{\sigma^2_1}\\
& \ddots\\
& & \frac{1}{\sigma^2_p}
\end{array}\right)
\left( x_{i\cdot}-x_{j\cdot}\right)^\prime.
\end{eqnarray*}
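This weighted distance is just the ordinary Euclidean distance computed on standardized variables, as the following sketch checks (numpy assumed, random data for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(10, 3)) * np.array([1.0, 10.0, 100.0])  # very different scales
sigma2 = X.var(axis=0)
Q = np.diag(1.0 / sigma2)            # weight each variable by 1/variance

d = X[0] - X[1]
d2_weighted = d @ Q @ d              # (x_i - x_j) Q (x_i - x_j)'

# Same as the plain squared Euclidean distance after dividing each column by sigma
Xs = X / np.sqrt(sigma2)
print(np.isclose(d2_weighted, np.sum((Xs[0] - Xs[1]) ** 2)))  # True
```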



The same can be said of the observations: some may be more ``important'' than others (for instance, an observation that is itself the mean of a larger group).


2) General PCA $(X,Q,D)$

$X$ centered with respect to $D$: $X^\prime D\,1_n=0$.

Generalized singular value decomposition.

\begin{displaymath}
\begin{array}{lcrclcr}
X & = & USV^\prime & \mbox{with} & V^\prime QV & = & I\\
&&&& U^\prime DU & = & I.
\end{array}\end{displaymath}

Practical Computation:

\begin{eqnarray*}
B &=& D^{\frac{1}{2}}X\, Q^{\frac{1}{2}} = ESA^\prime\quad\mbox{with}\quad A^\prime A=E^\prime E=I\\
U &=& D^{-\frac{1}{2}}E\\
V &=& Q^{-\frac{1}{2}}A\\ [2ex]
X^{(K)} &=& US^{(K)}V^\prime.
\end{eqnarray*}
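The practical computation can be sketched numerically as follows (numpy assumed; the diagonal choices of $D$ and $Q$ are illustrative, any positive-definite metrics would do). An ordinary SVD of $B$ recovers $U$ and $V$ satisfying the generalized orthogonality conditions:

```python
import numpy as np

rng = np.random.default_rng(3)
n, p = 12, 4
X = rng.normal(size=(n, p))

# Illustrative metrics: uniform row weights D, inverse-variance column metric Q
D = np.diag(np.full(n, 1.0 / n))
mu = X.T @ D @ np.ones(n)            # D-weighted column means
X = X - mu                           # center with respect to D
Q = np.diag(1.0 / X.var(axis=0))

B = np.sqrt(D) @ X @ np.sqrt(Q)      # B = D^{1/2} X Q^{1/2} (diagonal square roots)
E, s, At = np.linalg.svd(B, full_matrices=False)
A = At.T

U = np.diag(1.0 / np.sqrt(np.diag(D))) @ E   # U = D^{-1/2} E
V = np.diag(1.0 / np.sqrt(np.diag(Q))) @ A   # V = Q^{-1/2} A

print(np.allclose(U.T @ D @ U, np.eye(p)))   # U'DU = I
print(np.allclose(V.T @ Q @ V, np.eye(p)))   # V'QV = I
print(np.allclose(U @ np.diag(s) @ V.T, X))  # X = USV'
```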



Principal Components are the columns of: $C=US=XQV$.


The columns of $V$ are eigenvectors of $X^\prime DXQ$. In particular, for

\begin{displaymath}Q=
\left(
\begin{array}{ccc}
\frac{1}{\sigma^2_1}\\
& \ddots\\
& & \frac{1}{\sigma^2_p}
\end{array}\right).
\end{displaymath}

the columns of $V$ are eigenvectors of the correlation matrix.
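A quick sketch of this last fact (numpy assumed, random data for illustration): with uniform weights $D=I/n$ and $Q=\mathrm{diag}(1/\sigma^2_k)$, the matrix $X^\prime DXQ$ is similar to the correlation matrix via $Q^{1/2}(X^\prime DX)Q^{1/2}=R$, so the two share their spectrum.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 200
X = rng.normal(size=(n, 3)) @ rng.normal(size=(3, 3))  # correlated columns
X = X - X.mean(axis=0)

D = np.eye(n) / n                    # uniform observation weights
Q = np.diag(1.0 / X.var(axis=0))     # inverse-variance metric

M = X.T @ D @ X @ Q                  # its eigenvectors are the columns of V
R = np.corrcoef(X, rowvar=False)     # correlation matrix

# M = Sigma.Q is similar to Q^{1/2} Sigma Q^{1/2} = R, so the eigenvalues agree
print(np.allclose(np.sort(np.linalg.eigvals(M).real), np.linalg.eigvalsh(R)))  # True
```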


Susan Holmes 2002-01-12