ML: Dimensionality reduction algorithm - LDA

Source: Internet
Author: User

Discriminant analysis is a classification technique. It uses training samples of known classes to build a discrimination rule, and then assigns data of unknown class to a category based on its predictor variables. There are three kinds of discriminant analysis: Fisher discriminant, Bayes discriminant, and distance discriminant.

    • The idea of the Fisher discriminant is dimensionality reduction by projection, so that a multidimensional problem can be simplified to a one-dimensional one. An appropriate projection axis is chosen and all sample points are projected onto it to obtain projected values. The requirement on the direction of the projection axis is that, for the projected values, the dispersion within each group is as small as possible while the separation between the projected values of different groups is as large as possible.
    • The idea of the Bayes discriminant is to derive posterior probabilities from prior probabilities, and to make statistical inferences based on the posterior probability distribution.
    • The idea of the distance discriminant is to compute the centroid of each class from the data of known classes; for a data point of unknown class, its distance to each class centroid is computed and it is assigned to the class whose centroid is nearest, as in the sketch after this list.
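To make the distance-discriminant idea concrete, here is a minimal R sketch (not part of the original text) that classifies the iris samples by their nearest class centroid. Plain Euclidean distance is used for simplicity; in practice the Mahalanobis distance is the more common choice.

# Minimal sketch of distance discriminant (nearest centroid) on iris.
# Simplification: plain Euclidean distance instead of Mahalanobis distance.
X <- as.matrix(iris[, 1:4])
centroids <- apply(X, 2, function(col) tapply(col, iris$Species, mean))
# centroids is a 3 x 4 matrix: one row per class

classify <- function(x) {
  d <- apply(centroids, 1, function(m) sqrt(sum((x - m)^2)))
  names(which.min(d))          # label of the nearest centroid
}
pred <- apply(X, 1, classify)
table(iris$Species, pred)      # confusion matrix of the nearest-centroid rule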

Linear discriminant analysis (LDA) is a classical algorithm for pattern recognition, introduced into the field of pattern recognition and artificial intelligence in 1996 by Belhumeur. The basic idea of LDA is to project high-dimensional pattern samples onto the optimal discriminant vector space so as to extract the classification information and compress the dimensionality of the feature space. After projection, the samples have the largest between-class distance and the smallest within-class distance in the new subspace, that is, the best separability in that space.
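As a rough illustration of this idea (a sketch only, not the MASS implementation), the within-class and between-class scatter matrices can be computed by hand on the iris data, and the discriminant directions obtained from the eigenvectors of solve(Sw) %*% Sb:

# Sketch of the LDA idea on iris: maximize between-class scatter
# relative to within-class scatter in the projected space.
X  <- as.matrix(iris[, 1:4])
y  <- iris$Species
mu <- colMeans(X)

Sw <- matrix(0, 4, 4)   # within-class scatter
Sb <- matrix(0, 4, 4)   # between-class scatter
for (cl in levels(y)) {
  Xc  <- X[y == cl, , drop = FALSE]
  muc <- colMeans(Xc)
  Sw  <- Sw + crossprod(scale(Xc, center = muc, scale = FALSE))
  Sb  <- Sb + nrow(Xc) * tcrossprod(muc - mu)
}

# Discriminant directions: eigenvectors of solve(Sw) %*% Sb;
# at most (number of classes - 1) = 2 of them have non-zero eigenvalues.
ev <- eigen(solve(Sw) %*% Sb)
w1 <- Re(ev$vectors[, 1])   # first discriminant direction
scores <- X %*% w1          # 1-D projection of every sample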

Feature selection (i.e., dimensionality reduction) is a very important step in data preprocessing. For classification, dimensionality reduction can extract the features most relevant to the classes from a large number of features and remove noise from the original data. Principal component analysis (PCA) and linear discriminant analysis (LDA) are two of the most commonly used dimensionality-reduction algorithms, but their goals are essentially opposite. The differences between LDA and PCA are as follows.

    1. The starting point is different. PCA looks at the features from the covariance angle and seeks a good projection, namely the directions in which the projected sample points have the largest variance. LDA additionally uses the class label information and looks for a projection in which data points of different classes are as far apart as possible while data points of the same class are as close together as possible; that is, it selects the projection with the best classification performance.
    2. The learning paradigm is different. PCA is unsupervised learning, so in most scenarios it is only one step of the data-processing pipeline and has to be combined with other algorithms, for example PCA followed by clustering, discriminant analysis, or regression analysis. LDA is a supervised learning method that can be used for prediction as well as for dimensionality reduction, so it can be combined with other models or used on its own.
    3. The number of dimensions available after reduction is different. LDA can generate at most C-1 discriminant dimensions (number of class labels minus 1), so it does not depend on the number of original dimensions, only on the number of classes; PCA can retain up to n dimensions, where n is the number of original features.

Applied to the same example, the two dimensionality-reduction methods give clearly different results, as the following comparison shows:
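As a stand-in for the visual comparison, a small sketch such as the following (using prcomp and MASS::lda on iris; the scaling choice is an illustrative assumption) shows both the difference in the number of available dimensions and the difference in the resulting projections:

# PCA vs. LDA on the same data (iris). PCA (unsupervised) can keep up to
# 4 components here; LDA (supervised) yields at most 3 - 1 = 2 discriminants.
library(MASS)

pca <- prcomp(iris[, 1:4], scale. = TRUE)
dim(pca$x)                      # 150 x 4: up to n = 4 principal components

fit <- lda(Species ~ ., data = iris)
dim(predict(fit)$x)             # 150 x 2: at most C - 1 = 2 discriminants

# First two PCA scores vs. the two LDA scores, side by side
par(mfrow = c(1, 2))
plot(pca$x[, 1:2], col = iris$Species, pch = 19, main = "PCA")
plot(predict(fit)$x, col = iris$Species, pch = 19, main = "LDA")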

Because of its simplicity and effectiveness, the LDA algorithm has been widely used in many fields and is a classic and popular algorithm in machine learning and data mining. But the algorithm itself still has some limitations:

    • When the number of samples is much smaller than the feature dimension, distances between samples grow and the distance metric loses meaning, so the within-class and between-class scatter matrices in LDA become singular and the optimal projection direction cannot be obtained; this is especially common in face recognition (see the sketch after this list).
    • LDA is not suitable for dimensionality reduction of samples with non-Gaussian distributions.
    • LDA does not work well when the class information lies in the variance rather than in the mean.
    • LDA may overfit the data.
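A small simulated sketch (illustrative assumptions: 20 samples, 50 features, two classes) shows the first limitation, the within-class scatter matrix becoming singular when the sample size is much smaller than the feature dimension:

# Small-sample problem: with more features (p = 50) than samples (n = 20),
# the within-class scatter matrix is rank-deficient and cannot be inverted.
set.seed(1)
n <- 20; p <- 50
X <- matrix(rnorm(n * p), n, p)
y <- factor(rep(c("A", "B"), each = n / 2))

Sw <- matrix(0, p, p)
for (cl in levels(y)) {
  Xc <- X[y == cl, , drop = FALSE]
  Sw <- Sw + crossprod(scale(Xc, center = TRUE, scale = FALSE))
}
qr(Sw)$rank   # at most n - 2 = 18, far below p = 50, so Sw is singular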

Application Scenarios for LDA:

    • Dimensionality reduction or pattern recognition in face recognition
    • Economic forecasting based on macroeconomic characteristics of a market
    • Market research based on different attributes of markets or users
    • Predicting medical conditions based on characteristics of patient cases

MASS::lda

In R, linear discriminant analysis is carried out with the lda function from the MASS package. The lda function is based on Bayes discriminant theory; when there are only two classes and the populations follow a multivariate normal distribution, the Bayes discriminant is equivalent to the Fisher discriminant and the distance discriminant. Code example:

> if (require(MASS) == FALSE) {
+   install.packages("MASS")
+ }
> model1 <- lda(Species ~ ., data = iris)
> table <- table(iris$Species, predict(model1)$class)
> table
             setosa versicolor virginica
  setosa         50          0         0
  versicolor      0         48         2
  virginica       0          1        49
> sum(diag(prop.table(table)))   ### correct classification rate
[1] 0.98

From the result, only three samples are misclassified. Once the discriminant functions are established, the discriminant scores can be plotted, much like the scores in principal component analysis.

> LD <- predict(model1)$x   # scores: each sample projected onto the discriminant vectors
> DS <- cbind(iris, as.data.frame(LD))
> head(DS)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species      LD1        LD2
1          5.1         3.5          1.4         0.2  setosa 8.061800  0.3004206
2          4.9         3.0          1.4         0.2  setosa 7.128688 -0.7866604
3          4.7         3.2          1.3         0.2  setosa 7.489828 -0.2653845
4          4.6         3.1          1.5         0.2  setosa 6.813201 -0.6706311
5          5.0         3.6          1.4         0.2  setosa 8.132309  0.5144625
6          5.4         3.9          1.7         0.4  setosa 7.701947  1.4617210
> library(ggplot2)
> p <- ggplot(DS, mapping = aes(x = LD1, y = LD2))
> p + geom_point(aes(colour = Species), alpha = 0.8, size = 3)

Next, refit the model using the discriminant scores (LD1 and LD2) as predictors and look at the resulting predictions.

> model2 <- lda(Species ~ LD1 + LD2, DS)
> table(iris$Species, predict(model2)$class)
             setosa versicolor virginica
  setosa         50          0         0
  versicolor      0         48         2
  virginica       0          1        49

When the covariance matrices of the different classes are not equal, quadratic discriminant analysis (QDA) should be used instead. Note that both the lda and qda functions assume the populations follow a multivariate normal distribution; if this assumption is not satisfied, the results should be used with caution.

> iris.qda <- qda(Species ~ ., data = iris)
> table <- table(iris$Species, predict(iris.qda, iris)$class)
> table
             setosa versicolor virginica
  setosa         50          0         0
  versicolor      0         48         2
  virginica       0          1        49
> sum(diag(prop.table(table)))   ### correct classification rate
[1] 0.98

If the CV argument is set to TRUE, lda and qda perform leave-one-out cross-validation and return the cross-validated predictions directly in the $class component of the result, without any call to predict; the confusion matrix generated this way is more reliable. The posterior probabilities can be extracted with predict(model)$posterior (or from the $posterior component when CV = TRUE).
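A short sketch of this leave-one-out variant, using the same iris data as above:

# Leave-one-out cross-validated QDA (CV = TRUE): the predictions come back
# directly in $class and $posterior, so predict() is not called.
library(MASS)
iris.qda.cv <- qda(Species ~ ., data = iris, CV = TRUE)
table(iris$Species, iris.qda.cv$class)   # LOO confusion matrix
head(iris.qda.cv$posterior)              # LOO posterior probabilities

# For a model fitted without CV, posteriors come from predict():
iris.qda <- qda(Species ~ ., data = iris)
head(predict(iris.qda, iris)$posterior)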
