====== Principal component analysis ======

Principal component analysis (PCA) boils down data with many dimensions (aka columns) into a few dimensions while keeping **most** of the information.

Given $n$ $m$-dimensional vectors $x_1, \ldots, x_n$, the steps to find the top $k$ principal components are:

  - Calculate the component-wise average of all of the vectors: $\bar{x} = \frac{1}{n} \sum_{i=1}^n x_i$
  - Form the $m \times m$ covariance matrix $S = \frac{1}{n}\sum_{i=1}^n (x_i - \bar{x})(x_i - \bar{x})^T$
  - Calculate the $m$-dimensional eigenvectors $v_1, \ldots, v_k$ associated with the $k$ largest eigenvalues $\lambda_1, \ldots, \lambda_k$ of $S$
  - The $k$-dimensional representation of $x_i$ is then $\hat{x}_i = ((x_i - \bar{x})^T v_1, \ldots, (x_i - \bar{x})^T v_k)$

Another way to state the objective: writing $\hat{x}_i$ for the orthogonal projection of the centered $x_i$ onto the span of $v_1, \ldots, v_k$, PCA minimizes the reconstruction error

$$ \min \sum_{i=1}^n || x_i - \hat{x}_i ||^2 $$

or, equivalently, maximizes the variance retained:

$$ \max \sum_{i=1}^n || \hat{x}_i ||^2 $$

The two objectives are equivalent because the projection is orthogonal, so $||x_i||^2 = ||x_i - \hat{x}_i||^2 + ||\hat{x}_i||^2$, and $\sum_{i=1}^n ||x_i||^2$ is fixed by the data.
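The steps above can be sketched in NumPy. This is a minimal illustration, not part of the original page: the function name ''pca'' is my choice, and it uses ''numpy.linalg.eigh'' (which returns eigenvalues of a symmetric matrix in ascending order) to get the eigenvectors of $S$.

```python
import numpy as np

def pca(X, k):
    """Return the k-dimensional PCA representation of the rows of X (an n x m array)."""
    x_bar = X.mean(axis=0)                # component-wise average of the n vectors
    Xc = X - x_bar                        # centered vectors x_i - x_bar
    S = (Xc.T @ Xc) / len(X)              # m x m covariance matrix
    eigvals, eigvecs = np.linalg.eigh(S)  # eigh gives ascending eigenvalues for symmetric S
    V = eigvecs[:, ::-1][:, :k]           # eigenvectors of the k largest eigenvalues
    return Xc @ V                         # row i is ((x_i - x_bar)^T v_1, ..., (x_i - x_bar)^T v_k)
```

Because the columns are projections onto the top eigenvectors, the first output column carries at least as much variance as the second, and so on.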