Linear Discriminant Analysis (LDA)

Source: Internet
Author: User
I. Basic Idea of LDA

Linear Discriminant Analysis (LDA), also known as the Fisher linear discriminant, is a classic algorithm for pattern recognition; Belhumeur introduced it into the pattern recognition and AI community in 1996. The basic idea of LDA is to project high-dimensional pattern samples onto an optimal discriminant vector space, which both extracts classification information and compresses the dimensionality of the feature space. After projection, the pattern samples have maximum between-class distance and minimum within-class distance; that is, the patterns have the best separability in that space.

As an example, suppose people are classified as white or black based on skin color and nose height. In the sample, the white people's features are mainly concentrated in group A, and the black people's features are mainly concentrated in group B. Groups A and B are clearly separated in the feature space: projecting the points of group A and group B onto a line L places them in different regions of L, so that the two classes become linearly separable. Once an unknown sample needs to be classified, one only needs to plug its skin color and nose height into the linear equation of L to determine the class of the unknown sample.
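As a toy illustration of that last step (the coefficients below are made up for illustration and are not from the article), classifying an unknown sample reduces to checking which side of the line L its features fall on:

```python
# Hypothetical, hand-picked coefficients for the separating line L
w_skin, w_nose, b = 0.8, 0.6, -1.0

def classify(skin, nose):
    """Return the group based on which side of line L the point lies."""
    return "A" if w_skin * skin + w_nose * nose + b > 0 else "B"

print(classify(0.9, 0.9))  # a point on A's side of L
print(classify(0.1, 0.2))  # a point on B's side of L
```

Learning the coefficients of L from the labeled sample is exactly what LDA does.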

Therefore, the key step of LDA is to select an appropriate projection direction, that is, to establish an appropriate linear discriminant function (the non-linear case is not the focus of this article).

II. LDA Calculation Process

1. Algebraic Computation Process

Suppose two populations, A and B, are known, each described by the same m features, and samples are drawn from each population. The sample data of A and B are

    A: x_1^{(A)}, x_2^{(A)}, ..., x_{n_A}^{(A)}    and    B: x_1^{(B)}, x_2^{(B)}, ..., x_{n_B}^{(B)},

where each sample x is an m-dimensional feature vector.

If such a linear function (projection direction) exists, the samples of A and B can be projected onto it so that the projections satisfy two conditions: (1) the centers of the two projected classes are as far apart as possible; (2) the projections within the same class are as close together as possible. The linear function is expressed as

    y = w^T x = w_1 x_1 + w_2 x_2 + ... + w_m x_m.

Projecting the i-th sample point of population A onto this direction gives the projection point

    y_i^{(A)} = w^T x_i^{(A)},

and the center of gravity of population A in the projection is

    ȳ^{(A)} = w^T x̄^{(A)},

where

    x̄^{(A)} = (1/n_A) Σ_{i=1}^{n_A} x_i^{(A)}.

Similarly, the projection points of population B are y_j^{(B)} = w^T x_j^{(B)}, and the center of gravity of population B in the projection is

    ȳ^{(B)} = w^T x̄^{(B)},

where

    x̄^{(B)} = (1/n_B) Σ_{j=1}^{n_B} x_j^{(B)}.

According to Fisher's idea, the projection points of the different populations A and B should be separated as far as possible, expressed mathematically as maximizing the between-class term (ȳ^{(A)} − ȳ^{(B)})², while the projection points within the same population should be as close as possible, expressed as minimizing the within-class terms

    S_A² = Σ_{i=1}^{n_A} (y_i^{(A)} − ȳ^{(A)})²    and    S_B² = Σ_{j=1}^{n_B} (y_j^{(B)} − ȳ^{(B)})².

The goal is therefore to choose w so that the ratio

    J(w) = (ȳ^{(A)} − ȳ^{(B)})² / (S_A² + S_B²)

attains its maximum value; w can be obtained by taking derivatives with respect to each component and setting them to zero. The detailed steps are not shown here.
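The omitted steps lead to a well-known closed form; the following is a standard reconstruction (not taken verbatim from the source), writing S_b for the between-class scatter and S_w for the pooled within-class scatter of A and B:

```latex
J(w) = \frac{\bigl(\bar{y}^{(A)} - \bar{y}^{(B)}\bigr)^{2}}{S_A^{2} + S_B^{2}}
     = \frac{w^{\mathsf T} S_b\, w}{w^{\mathsf T} S_w\, w},
\qquad
\frac{\partial J}{\partial w} = 0
\;\Longrightarrow\;
w^{*} \propto S_w^{-1}\bigl(\bar{x}^{(A)} - \bar{x}^{(B)}\bigr)
```

In other words, for two classes the optimal direction is the difference of the class means, rescaled by the inverse within-class scatter.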

2. Calculation Process in Vector Representation

The algebraic expression of LDA above is intuitive but cumbersome, and it does not extend to more than two classes. Vectors and matrices carry more information and have better expressive power, so this section describes the calculation process of LDA in vector and matrix representation.

Assume there are N samples x_1, x_2, ..., x_N in an n-dimensional space; that is, each sample is an n × 1 column matrix. Assume the samples belong to C classes, and let N_i denote the number of samples belonging to class i.

The agreed mathematical symbols and their meanings are as follows:

-- N_i: the number of samples in class i;

-- μ: the mean of all samples;

-- μ_i: the sample mean of class i;

-- S_b: the between-class scatter matrix, a covariance-type matrix;

-- S_i: the scatter matrix of class i;

-- S_w: the within-class scatter matrix, the sum of the scatter matrices of all classes;

-- N: the total number of samples.

The between-class scatter matrix S_b is in essence a covariance matrix: it depicts the relationship between the classes and the sample population. The elements on its diagonal represent the variance (dispersion) of each class mean relative to the overall mean, and the off-diagonal elements represent the covariance between feature components of the class means (that is, their correlation or redundancy). Accumulating the contribution of each class therefore gives a macroscopic description of the scatter and redundancy between all classes and the population. Similarly, the within-class scatter matrix S_w, which sums over all classes the covariance-type matrices between the samples and their own class, depicts how the samples of each class scatter around the class's characteristics (here the class characteristics are the mean of the samples in the class). In both cases a mean acts as the mediator: the class means for the within-class matrix, and the overall sample mean for the between-class matrix. The two matrices thus characterize, from a macroscopic perspective, the scatter between classes and the scatter of samples within each class.
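In symbols, the scatter matrices discussed above are commonly defined as follows (a standard formulation, assumed here because the source formulas did not survive; ω_i denotes class i):

```latex
\mu = \frac{1}{N}\sum_{k=1}^{N} x_k,
\qquad
\mu_i = \frac{1}{N_i}\sum_{x \in \omega_i} x,
\qquad
S_b = \sum_{i=1}^{C} N_i\,(\mu_i - \mu)(\mu_i - \mu)^{\mathsf T},
\qquad
S_w = \sum_{i=1}^{C} \sum_{x \in \omega_i} (x - \mu_i)(x - \mu_i)^{\mathsf T}
```

The diagonal of S_b carries the per-feature variance of the class means around μ; its off-diagonal entries carry their covariances.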

As a classification algorithm, LDA naturally requires low coupling between classes and high aggregation within each class; that is, the values in the within-class scatter matrix should be small and those in the between-class scatter matrix should be large, so that the classification effect is good. Here we introduce the Fisher discriminant criterion expression:

    J_F(w) = (w^T S_b w) / (w^T S_w w),

where w is any n-dimensional column vector. Fisher linear discriminant analysis selects the w that maximizes J_F(w) as the projection direction; the physical meaning is that the projected samples have maximum between-class scatter and minimum within-class scatter. Substituting the definitions of S_b and S_w into the criterion, we obtain:

    J_F(w) = [ Σ_{i=1}^{C} N_i (w^T μ_i − w^T μ)² ] / [ Σ_{i=1}^{C} Σ_{x ∈ ω_i} (w^T x − w^T μ_i)² ]

An important observation: w^T x can be viewed as the projection of the sample x onto the direction w (the counterpart of the linear function in the algebraic expression), which is what connects the criterion to the eigenvalue formulation below. When the samples are column vectors, a term such as (w^T μ_i − w^T μ)² is the square of a geometric distance in the projected space. Therefore, the numerator of the Fisher criterion is the (weighted) sum of squared geometric distances between the projected class means and the projected overall mean, and likewise the denominator is the sum of squared geometric distances of the projected samples from their projected class means. The classification problem is thus converted into finding a low-dimensional space such that, once the samples are projected into it, the ratio of the between-class distance sum to the within-class distance sum is maximized; that is, the best classification effect is achieved.

Therefore, based on the above idea, we can optimize the following criterion function to find the projection matrix W composed of a set of optimal discriminant vectors:

    J(W) = |W^T S_b W| / |W^T S_w W|.

(Here we can also see that a common normalization factor such as 1/N cancels between numerator and denominator, so the normalized and unnormalized forms of the scatter matrices yield the same optimum.)

It can be proved that, when implementing the LDA algorithm, a PCA dimensionality reduction is usually performed on the samples first to eliminate redundancy and thus ensure that S_w is nonsingular (even the singular case can be resolved, in two ways that we will not discuss here). In the nonsingular case, the column vectors of the optimal projection matrix are exactly the eigenvectors corresponding to the d largest eigenvalues of the generalized eigenvalue equation S_b w = λ S_w w (that is, the eigenvectors of the matrix S_w⁻¹ S_b), and the number of optimal projection axes satisfies d ≤ C − 1.

Based on the method of Lagrange multipliers, we maximize w^T S_b w subject to the constraint w^T S_w w = 1. Because

    ∂/∂w [ w^T S_b w − λ (w^T S_w w − 1) ] = 0,

substitution yields the formula:

    S_b w = λ S_w w,    i.e.    S_w⁻¹ S_b w = λ w.

According to this formula, to maximize J we obtain the following conclusion: the column vectors of the projection matrix are the eigenvectors corresponding to the d largest eigenvalues (with d chosen as needed).
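The eigenvalue computation above can be sketched in NumPy as follows (function and variable names such as `lda_projection` are my own, not from the source): build S_w and S_b from labeled data, then take the eigenvectors of S_w⁻¹ S_b with the largest eigenvalues as the columns of the projection matrix.

```python
import numpy as np

def lda_projection(X, y, d):
    """Return the d columns of the optimal projection matrix W:
    eigenvectors of S_w^{-1} S_b with the largest eigenvalues."""
    classes = np.unique(y)
    n = X.shape[1]
    mu = X.mean(axis=0)
    S_w = np.zeros((n, n))
    S_b = np.zeros((n, n))
    for c in classes:
        Xc = X[y == c]
        mu_c = Xc.mean(axis=0)
        S_w += (Xc - mu_c).T @ (Xc - mu_c)       # within-class scatter
        diff = (mu_c - mu).reshape(-1, 1)
        S_b += len(Xc) * (diff @ diff.T)         # between-class scatter
    # Generalized eigenproblem S_b w = lambda S_w w  <=>  eig(S_w^{-1} S_b)
    eigvals, eigvecs = np.linalg.eig(np.linalg.solve(S_w, S_b))
    order = np.argsort(eigvals.real)[::-1]       # sort eigenvalues descending
    return eigvecs[:, order[:d]].real

# Toy data: two Gaussian clusters in 3-D, projected to d = 1 (d <= C - 1)
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.5, (40, 3)), rng.normal(4, 0.5, (40, 3))])
y = np.array([0] * 40 + [1] * 40)
W = lda_projection(X, y, d=1)
z = X @ W                                        # projected samples
print(z[y == 0].max() < z[y == 1].min() or z[y == 1].max() < z[y == 0].min())
```

For well-separated clusters like these, the single projection axis cleanly separates the two projected classes, so the printed check is True.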

III. Application Instance Analysis of LDA

The MASS package in the R language contains an LDA implementation; the function name is lda. For a specific application example, see the article:

http://xccds1977.blogspot.tw/2011/12/r_27.html
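A Python analogue, for comparison (this example uses scikit-learn's LinearDiscriminantAnalysis on the bundled iris data; it is my own illustration, not from the linked article):

```python
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_iris(return_X_y=True)                 # 3 classes, 4 features
lda = LinearDiscriminantAnalysis(n_components=2)  # d <= C - 1 = 2
Z = lda.fit_transform(X, y)                       # project onto 2 discriminant axes
print(Z.shape)                                    # (150, 2)
print(lda.score(X, y) > 0.9)                      # training accuracy is high on iris
```

Note that n_components is capped at C − 1, matching the d ≤ C − 1 bound derived above.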

------------------------------------

Most of the content in this article comes from the Internet and serves as my learning notes only.
