Principle and difference of the dimensionality reduction of LDA and PCA

Source: Internet
Author: User

The main advantages of the LDA algorithm are:

    • LDA can use prior knowledge of the class labels during dimensionality reduction, whereas unsupervised methods such as PCA cannot.
    • LDA outperforms PCA when the information that separates the classes lies in the class means rather than in the variance.

The main drawbacks of the LDA algorithm are:

    • LDA is poorly suited to reducing samples that do not follow a Gaussian distribution; PCA has the same problem.
    • LDA can reduce the data to at most k-1 dimensions, where k is the number of classes, so it cannot be used when the target dimensionality is greater than k-1. Some extended versions of LDA get around this limit.
    • LDA does not work well when the information that separates the classes lies in the variance rather than in the class means.
    • LDA may overfit the data.
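
The k-1 limit comes from the between-class scatter matrix: the class-mean deviations, weighted by class size, sum to zero, so the matrix has rank at most k-1 and yields at most k-1 useful projection directions. The following is a minimal sketch of that fact, not a full LDA implementation. The toy data (k = 3 classes, 4 features) and the helper names `mean`, `rank`, and `between_class_scatter` are made up for illustration; exact fractions are used so the rank is not blurred by rounding.

```python
from fractions import Fraction


def mean(rows):
    """Column-wise mean of a list of equal-length rows."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def rank(matrix):
    """Matrix rank by Gaussian elimination on exact Fractions."""
    m = [row[:] for row in matrix]
    n_rows, n_cols = len(m), len(m[0])
    r = 0
    for c in range(n_cols):
        pivot = next((i for i in range(r, n_rows) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, n_rows):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def between_class_scatter(classes):
    """S_b = sum over classes of n_c * (m_c - m)(m_c - m)^T."""
    overall = mean([row for rows in classes for row in rows])
    d = len(overall)
    s_b = [[Fraction(0)] * d for _ in range(d)]
    for rows in classes:
        diff = [a - b for a, b in zip(mean(rows), overall)]
        for i in range(d):
            for j in range(d):
                s_b[i][j] += len(rows) * diff[i] * diff[j]
    return s_b


# Hypothetical toy data: k = 3 classes, 3 samples each, 4 features.
raw = (
    [[1, 2, 0, 1], [2, 1, 1, 0], [0, 3, 1, 2]],
    [[5, 5, 4, 6], [6, 4, 5, 5], [4, 6, 6, 4]],
    [[9, 0, 8, 1], [8, 1, 9, 0], [7, 2, 7, 3]],
)
classes = [[[Fraction(v) for v in row] for row in rows] for rows in raw]

s_b = between_class_scatter(classes)
scatter_rank = rank(s_b)
print("features:", len(s_b), "classes:", len(classes), "rank of S_b:", scatter_rank)
```

Although the data has 4 features, the rank of the scatter matrix cannot exceed the number of classes minus one, which is why LDA cannot produce more than k-1 dimensions.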

The main advantages of PCA algorithm are:

    • It measures the amount of information by variance alone, so it is not affected by factors outside the data set.
    • The principal components are orthogonal to one another, which removes the correlations among the original data components.
    • The computation is simple: the main operation is an eigenvalue decomposition, which is easy to implement.
    • When the data is noisy, the eigenvectors corresponding to the smallest eigenvalues are often associated with the noise, so discarding them can reduce noise to some extent.
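
The eigendecomposition and noise-reduction points can be sketched on a small two-dimensional example. This is an illustrative sketch under stated assumptions, not a general PCA routine: it handles only 2-D data (where the eigenvalues of the covariance matrix have a closed form), and the data set, the noise values, and the names `pca_2d` and `denoise` are invented for the example.

```python
import math


def pca_2d(points):
    """Return (mean, eigenvalues, top eigenvector) of the 2x2 sample covariance."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    # Covariance matrix [[a, b], [b, c]].
    a = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    c = sum((y - my) ** 2 for _, y in points) / (n - 1)
    b = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    # Closed-form eigenvalues of a symmetric 2x2 matrix.
    half_trace = (a + c) / 2
    radius = math.hypot((a - c) / 2, b)
    lam1, lam2 = half_trace + radius, half_trace - radius
    # Eigenvector for the larger eigenvalue lam1.
    if b != 0:
        vx, vy = lam1 - c, b
    elif a >= c:
        vx, vy = 1.0, 0.0
    else:
        vx, vy = 0.0, 1.0
    norm = math.hypot(vx, vy)
    return (mx, my), (lam1, lam2), (vx / norm, vy / norm)


def denoise(points, mean, direction):
    """Project onto the top component, then map back to the original space."""
    mx, my = mean
    vx, vy = direction
    cleaned = []
    for x, y in points:
        score = (x - mx) * vx + (y - my) * vy
        cleaned.append((mx + score * vx, my + score * vy))
    return cleaned


# Hypothetical data: points on the line y = 2x plus small made-up noise.
noise = [0.1, -0.1, 0.05, -0.05, 0.1, -0.1]
points = [(float(x), 2.0 * x + e) for x, e in zip(range(6), noise)]

mean, (lam1, lam2), direction = pca_2d(points)
explained = lam1 / (lam1 + lam2)
cleaned = denoise(points, mean, direction)
print("share of variance on first component:", round(explained, 4))
print("first component direction:", direction)
```

Here almost all of the variance falls on the first component, which follows the underlying line; the discarded second component carries mostly the added noise, so the reconstructed points in `cleaned` lie exactly on a straight line.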

The main drawbacks of PCA algorithms are:

    • The meaning of each principal component is somewhat ambiguous, so the components are harder to interpret than the original sample features.
    • A non-principal component with small variance may still contain important information about the differences between samples, so discarding it during dimensionality reduction may affect subsequent data processing.

LDA and PCA

Similarities:

    • Both can reduce the dimensionality of the data.
    • Both use matrix eigendecomposition to perform the dimensionality reduction.
    • Both assume that the data follows a Gaussian distribution.

Differences:

    • LDA is a supervised dimensionality reduction method, whereas PCA is unsupervised. (LDA's input data is labeled; PCA's input data is unlabeled.)
    • LDA can reduce the data to at most k-1 dimensions, where k is the number of classes; PCA has no such limit. (PCA projects onto the eigenvectors corresponding to the largest eigenvalues, so the number of dimensions kept equals the number of eigenvectors selected.)
    • LDA can also be used for classification, in addition to dimensionality reduction. (To decide which class an unknown sample belongs to, apply the same linear transformation to the sample and assign it according to where its projection falls; this is the discriminant analysis problem.)
    • LDA chooses the projection direction that gives the best classification performance, whereas PCA chooses the projection direction along which the sample points have the maximum variance.
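
The last difference can be illustrated with a two-class, two-dimensional sketch in which the classes are spread widely along x but separated only along y. This is a minimal example under stated assumptions: it uses the classic two-class Fisher direction w = S_w^{-1}(m1 - m0), the data is made up, and the helper names (`mean_2d`, `scatter_2d`, `top_eigvec`, `pca_direction`, `lda_direction`) are hypothetical.

```python
import math


def mean_2d(points):
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def scatter_2d(points, center):
    """Entries (sxx, sxy, syy) of the 2x2 scatter matrix around center."""
    cx, cy = center
    sxx = sum((x - cx) ** 2 for x, _ in points)
    sxy = sum((x - cx) * (y - cy) for x, y in points)
    syy = sum((y - cy) ** 2 for _, y in points)
    return sxx, sxy, syy


def top_eigvec(a, b, c):
    """Unit eigenvector of [[a, b], [b, c]] for its largest eigenvalue."""
    lam1 = (a + c) / 2 + math.hypot((a - c) / 2, b)
    if b != 0:
        vx, vy = lam1 - c, b
    elif a >= c:
        vx, vy = 1.0, 0.0
    else:
        vx, vy = 0.0, 1.0
    norm = math.hypot(vx, vy)
    return vx / norm, vy / norm


def pca_direction(points):
    """PCA ignores labels: direction of maximum variance of all points."""
    return top_eigvec(*scatter_2d(points, mean_2d(points)))


def lda_direction(class0, class1):
    """Two-class Fisher direction w = S_w^{-1} (m1 - m0), normalized."""
    m0, m1 = mean_2d(class0), mean_2d(class1)
    a0, b0, c0 = scatter_2d(class0, m0)
    a1, b1, c1 = scatter_2d(class1, m1)
    a, b, c = a0 + a1, b0 + b1, c0 + c1  # within-class scatter S_w
    det = a * c - b * b
    dx, dy = m1[0] - m0[0], m1[1] - m0[1]
    wx = (c * dx - b * dy) / det
    wy = (-b * dx + a * dy) / det
    norm = math.hypot(wx, wy)
    return wx / norm, wy / norm


# Hypothetical data: large spread along x, class separation along y.
class0 = [(-4.0, 0.0), (-2.0, 0.2), (0.0, -0.1), (2.0, 0.1), (4.0, -0.2)]
class1 = [(-4.0, 1.1), (-2.0, 0.9), (0.0, 1.0), (2.0, 1.2), (4.0, 0.8)]

pca_dir = pca_direction(class0 + class1)
lda_dir = lda_direction(class0, class1)
print("PCA direction:", pca_dir)
print("LDA direction:", lda_dir)
```

On this data PCA picks a direction close to the x axis, where the variance is largest but the classes overlap completely, while LDA picks a direction close to the y axis, where the classes separate. The same example also shows how a low-variance component can carry the information that matters for later processing.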
