Feature Selection (Dimensionality Reduction): Linear Discriminant Analysis (LDA)


I previously used LDA for classification and PCA for dimensionality reduction. PCA reduces dimensionality in order to cut the cost of later computation; it does nothing to improve the ability to distinguish between classes, because it is unsupervised. LDA, by contrast, is supervised: it projects the data onto the direction that best separates the classes, making the distance between the two classes as large as possible so that they are easy to tell apart. The blog post below explains the principles of the LDA algorithm well and is worth reading. Original address: https://www.cnblogs.com/kemaswill/archive/2013/01/27/2879018.html

===========================================================================

Feature selection (i.e. dimensionality reduction) is a very important step in data preprocessing. For classification, feature selection picks out the features that matter most for the classification task from a large set of features, removing the noise in the original data. Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA) are two of the most commonly used feature selection algorithms. For an introduction to PCA, see my other blog post. This article mainly introduces Linear Discriminant Analysis (LDA), and is based mainly on two papers: Fisher Discriminant Analysis with Kernels [1] and Fisher Linear Discriminant Analysis [2].

One of the major differences between LDA and PCA is that LDA is supervised while PCA is unsupervised: PCA does not consider the labels (classes) of the data, but simply maps the original data onto the directions (bases) of largest variance, whereas LDA takes the labels into account. Reference [2] gives a very illustrative example showing that in some cases PCA performs poorly, as shown below:

The two classes $C_1$ and $C_2$ are drawn in different colors. According to PCA, the data should be mapped onto the direction of largest variance, i.e. the y-axis; but if the data are projected onto the y-axis, the two classes $C_1$ and $C_2$ become completely mixed together and are hard to separate, so reducing the dimension with PCA and then classifying gives very poor results. The LDA algorithm, however, maps the data onto the x-axis direction, where the two classes remain well separated.
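A minimal sketch of this contrast with scikit-learn, on synthetic two-class data shaped like the example above (the data layout, sample sizes, and the separation measure are illustrative assumptions of mine, not taken from the original figure):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)

# Two classes stretched along the y-axis (large variance) but
# separated along the x-axis (the discriminative direction).
c1 = rng.normal(loc=[-2.0, 0.0], scale=[0.5, 5.0], size=(200, 2))
c2 = rng.normal(loc=[+2.0, 0.0], scale=[0.5, 5.0], size=(200, 2))
X = np.vstack([c1, c2])
y = np.array([0] * 200 + [1] * 200)

# PCA picks the direction of largest variance (roughly the y-axis),
# which mixes the two classes together.
z_pca = PCA(n_components=1).fit_transform(X).ravel()

# LDA picks the direction that best separates the classes
# (roughly the x-axis).
z_lda = LinearDiscriminantAnalysis(n_components=1).fit_transform(X, y).ravel()

def separation(z, y):
    """Gap between the projected class means, in units of overall spread."""
    return abs(z[y == 0].mean() - z[y == 1].mean()) / z.std()

print("PCA separation:", separation(z_pca, y))   # close to 0
print("LDA separation:", separation(z_lda, y))   # clearly larger
```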

The LDA algorithm takes the class labels of the data into account. Given two classes $C_1$ and $C_2$, we want to find a vector $w$ such that, when the data are projected onto the direction of $w$, the data of the two classes are as far apart as possible while the data within each class are as compact as possible. The projection of a data point $x$ onto $w$ is $z = w^T x$, which maps the $d$-dimensional $x$ to a one-dimensional $z$.

Let $M_1$ and $M_2$ denote the means of the $C_1$ and $C_2$ data before projection, and $m_1$, $m_2$ their means after projection. It is easy to see that $m_1 = w^T M_1$ and, similarly, $m_2 = w^T M_2$.

Let $s_1^2$ and $s_2^2$ denote the scatter of the $C_1$ and $C_2$ data after projection, i.e. $s_1^2 = \sum_t (w^T x^t - m_1)^2 r^t$ and $s_2^2 = \sum_t (w^T x^t - m_2)^2 (1 - r^t)$, where $r^t = 1$ if $x^t \in C_1$ and $r^t = 0$ otherwise.

We want $|m_1 - m_2|$ to be as large as possible and $s_1^2 + s_2^2$ to be as small as possible; Fisher's linear discriminant is the $w$ that maximizes:

$J(w) = \frac{(m_1 - m_2)^2}{s_1^2 + s_2^2}$ (Equation 1)
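As a small numerical sketch of these definitions (the two-class sample data and the candidate directions are arbitrary illustrative values of mine):

```python
import numpy as np

rng = np.random.default_rng(1)

# Two classes of d-dimensional data (d = 2 here).
X1 = rng.normal(loc=[-2.0, 0.0], scale=1.0, size=(100, 2))  # class C1
X2 = rng.normal(loc=[+2.0, 0.0], scale=1.0, size=(100, 2))  # class C2

def fisher_criterion(w, X1, X2):
    """J(w) = (m1 - m2)^2 / (s1^2 + s2^2) for a candidate direction w."""
    z1 = X1 @ w                           # projections of C1: z = w^T x
    z2 = X2 @ w                           # projections of C2
    m1, m2 = z1.mean(), z2.mean()         # projected means
    s1_sq = ((z1 - m1) ** 2).sum()        # projected scatter of C1
    s2_sq = ((z2 - m2) ** 2).sum()        # projected scatter of C2
    return (m1 - m2) ** 2 / (s1_sq + s2_sq)

# The x-axis direction separates the two classes far better than the y-axis.
print(fisher_criterion(np.array([1.0, 0.0]), X1, X2))  # clearly larger
print(fisher_criterion(np.array([0.0, 1.0]), X1, X2))  # near zero
```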

Rewriting the numerator of Equation 1: $(m_1 - m_2)^2 = (w^T M_1 - w^T M_2)^2 = w^T (M_1 - M_2)(M_1 - M_2)^T w = w^T S_B w$,

where $S_B = (M_1 - M_2)(M_1 - M_2)^T$ (Equation 2)

is the between-class scatter matrix.

Rewriting the denominator of Equation 1:

$s_1^2 = \sum_t (w^T x^t - m_1)^2 r^t = \sum_t w^T (x^t - M_1)(x^t - M_1)^T w \, r^t = w^T S_1 w$, where $S_1 = \sum_t r^t (x^t - M_1)(x^t - M_1)^T$ is the within-class scatter matrix of $C_1$.

Let $S_W = S_1 + S_2$, the total within-class scatter; then $s_1^2 + s_2^2 = w^T S_W w$.

So Equation 1 can be rewritten as:

$J(w) = \frac{w^T S_B w}{w^T S_W w}$
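A short sketch of this matrix form on the same kind of synthetic two-class data as above; the closed-form direction $w \propto S_W^{-1}(M_1 - M_2)$ used at the end is the standard maximizer of this quotient for two classes (a well-known result, quoted here rather than derived):

```python
import numpy as np

rng = np.random.default_rng(1)
X1 = rng.normal(loc=[-2.0, 0.0], scale=1.0, size=(100, 2))  # class C1
X2 = rng.normal(loc=[+2.0, 0.0], scale=1.0, size=(100, 2))  # class C2

M1, M2 = X1.mean(axis=0), X2.mean(axis=0)   # class means before projection

# Between-class scatter matrix S_B = (M1 - M2)(M1 - M2)^T  (Equation 2)
d = (M1 - M2).reshape(-1, 1)
S_B = d @ d.T

# Within-class scatter matrices S_1, S_2 and their sum S_W
S_1 = (X1 - M1).T @ (X1 - M1)
S_2 = (X2 - M2).T @ (X2 - M2)
S_W = S_1 + S_2

def J(w):
    """Rayleigh-quotient form: J(w) = (w^T S_B w) / (w^T S_W w)."""
    return (w @ S_B @ w) / (w @ S_W @ w)

# Standard maximizer of the quotient: w proportional to S_W^{-1} (M1 - M2).
w_star = np.linalg.solve(S_W, M1 - M2)
print(J(w_star))                    # at least as large as any other direction
print(J(np.array([0.0, 1.0])))      # e.g. the y-axis scores much lower
```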
