Principles of PCA and Its Application to Face Recognition

Source: Internet
Author: User

This article explains the principle of PCA in detail, drawing mainly on the PRML book (Pattern Recognition and Machine Learning).
PCA (principal component analysis), also known as the Karhunen-Loève transform (KL transform) or the Hotelling transform, is an unsupervised learning method that is often used for dimensionality reduction of high-dimensional data. It applies a linear transformation that maps the original data to a set of linearly uncorrelated representations, which can be used to extract the main feature components of the data.
The principle of PCA has two equivalent interpretations: maximum variance and minimum projection error. Both project the data orthogonally onto a low-dimensional linear subspace, called the principal subspace. The maximum variance view requires that the projected data retain as much variance as possible along the projection directions. The minimum projection error view requires that the mean squared error between the reconstructed data and the original data be as small as possible. The former was proposed by Hotelling in 1933 and the latter by Pearson in 1901.

1. Maximum Variance method
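As a rough illustration of the description above, the following sketch reduces synthetic 3-dimensional data to 2 dimensions and checks that the resulting coordinates are uncorrelated. The helper name `pca`, the synthetic data, and the choice of an eigendecomposition of the covariance matrix are assumptions made for this example, not something prescribed by the article.

```python
import numpy as np

def pca(X, n_components):
    """Hypothetical helper: project the rows of X onto the top principal directions."""
    mean = X.mean(axis=0)
    Xc = X - mean                              # center the data
    cov = Xc.T @ Xc / X.shape[0]               # covariance with 1/N normalization
    eigvals, eigvecs = np.linalg.eigh(cov)     # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    components = eigvecs[:, order]             # columns are orthonormal directions
    return Xc @ components, components, eigvals[order]

# Synthetic data: 2 latent factors mixed into 3 observed dimensions plus small noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 2))
mixing = np.array([[2.0, 0.5, 1.0],
                   [0.3, 1.5, -0.7]])
X = latent @ mixing + 0.05 * rng.normal(size=(500, 3))

Z, components, variances = pca(X, n_components=2)

# Covariance of the reduced data: off-diagonal entries should be ~0 (uncorrelated).
cov_Z = np.cov(Z, rowvar=False, bias=True)
print(np.round(cov_Z, 6))
print(variances)
```

The diagonal of `cov_Z` should match `variances`, the variance captured along each retained direction.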

The maximum variance idea can be understood intuitively through a simple example: the shadow cast by a hand under a light. The hand is a three-dimensional object and its shadow lies in a plane, a two-dimensional space, so the projection already achieves dimensionality reduction. When the palm is perpendicular to the light, the full shadow of the hand falls on the ground, and this direction retains the most characteristics of the hand. If the palm is held upright, parallel to the light, the shadow on the ground is just a thick line, and it is impossible to tell what the object is.

As a second example, in Figure 1 two datasets are generated from two Gaussian distributions. The figure shows that the projection of the data onto line B preserves the cluster structure of the two classes, and the variance of the projection onto line B is also larger. Variance measures how spread out the data is, so a projection direction with large variance helps preserve the clustering characteristics of the data.

The low-dimensional subspace onto which the data is projected is called the principal subspace, and its complement relative to the original space is called the residual subspace. The principal subspace is required to retain the main characteristics of the data, and variance, as the measure of data spread, is the key statistic that determines the projection direction. The derivation below makes the maximum variance method precise.
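The two-cluster example can be sketched numerically. The cluster centers, covariances, and the two candidate directions below are made up for illustration and are not the data behind Figure 1. The sketch compares the projected variance along each direction with the mean squared distance left in the residual subspace.

```python
import numpy as np

# Two synthetic Gaussian clusters separated along the first axis (assumed setup).
rng = np.random.default_rng(1)
cov = np.array([[0.3, 0.0],
                [0.0, 0.3]])
class_a = rng.multivariate_normal([-2.0, 0.0], cov, size=300)
class_b = rng.multivariate_normal([2.0, 0.0], cov, size=300)
X = np.vstack([class_a, class_b])
Xc = X - X.mean(axis=0)                       # centered data

def projection_stats(Xc, direction):
    """Return (projected variance, mean squared residual) for one direction."""
    u = direction / np.linalg.norm(direction)  # unit vector
    z = Xc @ u                                 # scalar coordinate along u
    residual = Xc - np.outer(z, u)             # part lying in the residual subspace
    return z.var(), np.mean(np.sum(residual ** 2, axis=1))

var_along, err_along = projection_stats(Xc, np.array([1.0, 0.0]))
var_across, err_across = projection_stats(Xc, np.array([0.0, 1.0]))
total = np.mean(np.sum(Xc ** 2, axis=1))      # total variance of the data

print("along the cluster axis :", var_along, err_along)
print("across the cluster axis:", var_across, err_across)
print("total variance         :", total)
```

For any unit direction, the projected variance and the residual error add up to the total variance, so the direction with the largest projected variance is also the one with the smallest projection error.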

Fig. 1 Projection comparison of data in different directions

Given a set of D-dimensional data $\{x_n\}_{n=1}^N$, $x_n \in R^D$, our goal is to project the data onto an M-dimensional space ($M < D$) such that the variance of the projected data is maximized. First consider projection onto a one-dimensional space. Let $u_1 \in R^D$ be the projection direction; without loss of generality, take $u_1$ to be a unit vector, that is, $u_1^T u_1 = 1$. The projection of any point $x_n$ is then the scalar $u_1^T x_n$. The mean of the original data is $\bar x = \frac{1}{N}\sum_{n=1}^N x_n$, the mean of the projected data is $u_1^T \bar x$, and the variance of the projected data is $\frac{1}{N}\sum_{n=1}^N (u_1^T x_n - u_1^T \bar x)^2$.
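The quantities in this setup can be checked numerically. The sketch below uses random data and a random unit vector, both chosen only for illustration. The matrix `S`, the data covariance with 1/N normalization, is a name introduced here; the check relies on the standard identity that the variance of the projections equals u1^T S u1.

```python
import numpy as np

rng = np.random.default_rng(2)
N, D = 1000, 5
X = rng.normal(size=(N, D)) @ rng.normal(size=(D, D))   # correlated synthetic data

# Random projection direction, normalized so that u1^T u1 = 1.
u1 = rng.normal(size=D)
u1 /= np.linalg.norm(u1)

x_bar = X.mean(axis=0)                                   # mean of the original data
S = (X - x_bar).T @ (X - x_bar) / N                      # data covariance (assumed name)

proj = X @ u1                                            # scalars u1^T x_n
proj_mean = proj.mean()
proj_var = np.mean((proj - proj_mean) ** 2)

print(proj_mean, u1 @ x_bar)       # mean of projections vs. u1^T x_bar
print(proj_var, u1 @ S @ u1)       # variance of projections vs. u1^T S u1
```

Both printed pairs should agree up to floating-point error.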
