Feature Extraction and Feature Selection

Both feature extraction and feature selection are ways of obtaining the most effective features from the original features: features that are invariant across similar samples, discriminative between different samples, and robust to noise.
Feature extraction: transform the original features into a new group of features with clear physical meaning (e.g., Gabor features; geometric features such as corner points and invariant moments; texture features such as HSV and HOG) or clear statistical meaning.
Feature selection: select the most statistically significant subset of features from the feature set in order to reduce the dimensionality. The benefits are: 1. reduced data storage and input bandwidth; 2. reduced redundancy; 3. better classification in the lower-dimensional space; 4. discovery of more meaningful latent variables, which helps to gain a deeper understanding of the data.
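As an illustration of selecting features by a statistical criterion, the sketch below keeps the k features with the highest ANOVA F-score using scikit-learn. The synthetic dataset and the choice of SelectKBest/f_classif with k=5 are illustrative assumptions, not something prescribed by the text.

```python
# Illustrative sketch (not from the source): statistical feature selection
# with scikit-learn's SelectKBest on a labelled dataset (X, y).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif

# Synthetic data: 20 features, only 5 of which are informative.
X, y = make_classification(n_samples=200, n_features=20,
                           n_informative=5, random_state=0)

# Keep the 5 features with the highest ANOVA F-score (class separability).
selector = SelectKBest(score_func=f_classif, k=5)
X_reduced = selector.fit_transform(X, y)

print("original dimension:", X.shape[1])           # 20
print("selected dimension:", X_reduced.shape[1])   # 5
print("selected feature indices:", np.flatnonzero(selector.get_support()))
```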

Linear Feature Extraction
PCA (Principal Component Analysis): find the subspace that best represents the data distribution (dimensionality reduction; the resulting components are uncorrelated). In practice, the projection matrix is formed by the eigenvectors corresponding to the s largest eigenvalues of the covariance matrix. The attached article describes this very intuitively and in detail.
Attachment: Principal Component Analysis (PCA) Theoretical Analysis and Application (.doc, 561.5 KB)
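A minimal sketch of the eigen-decomposition view of PCA described above (my own illustrative implementation, not the attached document's code): the projection matrix is built from the eigenvectors of the covariance matrix with the s largest eigenvalues.

```python
# Minimal PCA sketch: top-s eigenvectors of the covariance matrix.
import numpy as np

def pca(X, s):
    """Project the n x d data matrix X onto its first s principal components."""
    X_centered = X - X.mean(axis=0)                # remove the mean
    cov = np.cov(X_centered, rowvar=False)         # d x d covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)         # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:s]          # indices of the s largest
    W = eigvecs[:, order]                          # d x s projection matrix
    return X_centered @ W, W, eigvals[order]

# Usage: reduce 10-dimensional data to 2 uncorrelated components.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10)) @ rng.normal(size=(10, 10))
Z, W, top_eigvals = pca(X, s=2)
print(Z.shape, np.round(np.cov(Z, rowvar=False), 3))  # off-diagonals ~ 0
```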



LDA (Linear Discriminant Analysis): find the subspace that maximizes the discriminant criterion. The Fisher idea is to find a projection such that, after dimensionality reduction, the within-class scatter is minimized and the between-class scatter is maximized. In practice, the projection matrix is formed by the eigenvectors corresponding to the first s eigenvalues of Sw^-1 Sb. Page 96 of the DHS Pattern Classification book gives a detailed, easy-to-follow derivation. See Reference [1].
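The following is a minimal sketch of this construction (an assumed implementation, not the book's or the paper's code): build Sw and Sb, then take the leading eigenvectors of Sw^-1 Sb as the projection matrix.

```python
# Fisher LDA sketch: projection from the top-s eigenvectors of inv(Sw) @ Sb.
import numpy as np

def lda(X, y, s):
    """Return the d x s projection matrix from the top-s eigenvectors of Sw^-1 Sb."""
    classes = np.unique(y)
    d = X.shape[1]
    mean_total = X.mean(axis=0)
    Sw = np.zeros((d, d))   # within-class scatter
    Sb = np.zeros((d, d))   # between-class scatter
    for c in classes:
        Xc = X[y == c]
        mean_c = Xc.mean(axis=0)
        Sw += (Xc - mean_c).T @ (Xc - mean_c)
        diff = (mean_c - mean_total).reshape(-1, 1)
        Sb += len(Xc) * diff @ diff.T
    eigvals, eigvecs = np.linalg.eig(np.linalg.pinv(Sw) @ Sb)
    order = np.argsort(eigvals.real)[::-1][:s]
    return eigvecs[:, order].real                  # d x s projection matrix

# Usage: project 3-class, 4-dimensional data onto 2 discriminant directions.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=m, size=(50, 4)) for m in (0, 3, 6)])
y = np.repeat([0, 1, 2], 50)
W = lda(X, y, s=2)
print((X @ W).shape)   # (150, 2)
```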
ICA (Independent Component Analysis): PCA reduces the dimensionality of the raw data and extracts uncorrelated components; ICA reduces the dimensionality of the raw data and extracts independent components. The goal is to find a linear transformation z = Wx that makes the components of z as independent as possible, i.e., that minimizes the mutual information I(z) = E[ln( p(z) / (p(z_1) ··· p(z_d)) )]. The derivation and computation are covered in Machine Learning: A Probabilistic Perspective. See Reference [2].
Note on PCA and ICA: PCA is essentially a change of basis that gives the transformed data the largest variance. Variance describes how much information a variable carries. When we talk about the stability of something, we often want to reduce variance: a model with large variance is unstable. For the data used in machine learning (mainly the training data), however, large variance is desirable; if every input were the same point, the variance would be 0 and multiple input samples would be equivalent to a single one.
ICA is used to find the mutually independent (not necessarily orthogonal) components of a signal, and corresponds to higher-order statistical analysis. ICA assumes that the observed mixed data X is obtained by linearly mixing the independent sources S through a mixing matrix A. The goal is to estimate from X a separating matrix W such that the signal Y obtained by applying W to X is the best approximation of the independent sources S. The relationship can be expressed by the following formula:
Y = WX = WAS,   A = W^-1
Compared with PCA, ICA better characterizes the statistical properties of random variables and can suppress Gaussian noise.
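The mixing model X = AS and its inversion Y = WX can be illustrated as follows. The use of scikit-learn's FastICA as the estimator of W, and the particular sources and mixing matrix, are illustrative assumptions, not prescribed by the text.

```python
# Sketch of the ICA mixing model X = AS and its inversion Y = WX,
# with FastICA estimating the separating matrix W.
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
t = np.linspace(0, 8, 2000)

# Two independent (non-Gaussian) sources S and a mixing matrix A.
S = np.column_stack([np.sign(np.sin(3 * t)),      # square wave
                     np.sin(5 * t)])              # sinusoid
A = np.array([[1.0, 0.5],
              [0.7, 1.2]])
X = S @ A.T                                       # observed mixtures, X = A S

# Estimate the separating matrix W and the recovered sources Y = W X.
ica = FastICA(n_components=2, random_state=0)
Y = ica.fit_transform(X)                          # recovered sources (up to scale/order)
print("estimated unmixing matrix W:\n", ica.components_)
```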





Two-dimensional PCA (2DPCA): see Reference [3].
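A rough sketch of the 2DPCA idea from Reference [3], paraphrased rather than taken from the paper: the image covariance matrix is built directly from the 2-D image matrices (no vectorization), and each image is projected onto its leading eigenvectors. The synthetic "images" below are illustrative.

```python
# 2DPCA sketch: eigen-decompose the image covariance matrix G built
# from 2-D image matrices, then project each image onto its eigenvectors.
import numpy as np

def two_d_pca(images, d):
    """images: array of shape (M, h, w). Returns the w x d projection matrix."""
    mean_image = images.mean(axis=0)
    centered = images - mean_image
    # G = (1/M) * sum_i (A_i - mean)^T (A_i - mean), shape (w, w).
    G = np.einsum('mhw,mhv->wv', centered, centered) / len(images)
    eigvals, eigvecs = np.linalg.eigh(G)
    return eigvecs[:, np.argsort(eigvals)[::-1][:d]]

# Usage: project 32x32 "images" onto their 4 leading 2DPCA directions.
rng = np.random.default_rng(0)
imgs = rng.normal(size=(100, 32, 32))
X_proj = two_d_pca(imgs, d=4)
features = imgs @ X_proj          # each image becomes a 32 x 4 feature matrix
print(features.shape)             # (100, 32, 4)
```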




CCA (Canonical Correlation Analysis): find two sets of bases such that the correlation between the projections of the two data sets onto their respective bases is maximized. It is used to describe the linear relationship between two groups of high-dimensional variables. PLS (partial least squares) can be used to solve this kind of problem. See Reference [4].
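A small illustrative example of finding paired projections of two views whose correlation is maximal; the synthetic two-view data and the use of scikit-learn's CCA implementation are assumptions for the sake of the demo.

```python
# CCA sketch: maximally correlated projections of two views X and Y.
import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(0)
latent = rng.normal(size=(300, 2))                 # shared latent structure

# Two "views" of the same underlying signal, each with its own noise.
X = latent @ rng.normal(size=(2, 6)) + 0.1 * rng.normal(size=(300, 6))
Y = latent @ rng.normal(size=(2, 4)) + 0.1 * rng.normal(size=(300, 4))

cca = CCA(n_components=2)
X_c, Y_c = cca.fit_transform(X, Y)

# Correlation of the paired canonical variates should be close to 1.
for k in range(2):
    print(f"canonical correlation {k}:", np.corrcoef(X_c[:, k], Y_c[:, k])[0, 1])
```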



Nonlinear Feature Extraction
Kernel PCA: see Reference [5].
Kernel FDA: see Reference [6].
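As a sketch of nonlinear feature extraction, the example below contrasts linear PCA with kernel PCA on data that is not linearly separable; the RBF kernel, the gamma value, and the dataset are illustrative choices, not taken from Reference [5].

```python
# Kernel PCA sketch: nonlinear feature extraction on concentric circles.
import numpy as np
from sklearn.datasets import make_circles
from sklearn.decomposition import KernelPCA, PCA

# Two concentric circles: not linearly separable in the original 2-D space.
X, y = make_circles(n_samples=400, factor=0.3, noise=0.05, random_state=0)

linear = PCA(n_components=2).fit_transform(X)
kernel = KernelPCA(n_components=2, kernel="rbf", gamma=10).fit_transform(X)

# In the kernel PCA space the first component already separates the classes.
print("linear PCA, 1st-component class means:",
      linear[y == 0, 0].mean(), linear[y == 1, 0].mean())
print("kernel PCA, 1st-component class means:",
      kernel[y == 0, 0].mean(), kernel[y == 1, 0].mean())
```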

Manifold Learning: find the low-dimensional coordinates of the manifold by exploiting its local structure, and use them for dimensionality reduction. Methods include Isomap, LLE, Laplacian Eigenmaps, and LPP. See References [7]-[10].
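The sketch below runs three of the listed methods on the same data via their scikit-learn implementations (LPP has no scikit-learn implementation, so it is omitted); the swiss-roll dataset and the neighborhood size are assumptions for illustration.

```python
# Manifold learning sketch: Isomap, LLE, and Laplacian eigenmaps on a swiss roll.
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import Isomap, LocallyLinearEmbedding, SpectralEmbedding

# Points sampled from a 2-D manifold (swiss roll) embedded in 3-D.
X, _ = make_swiss_roll(n_samples=1000, random_state=0)

embeddings = {
    "Isomap":             Isomap(n_neighbors=10, n_components=2),
    "LLE":                LocallyLinearEmbedding(n_neighbors=10, n_components=2),
    "Laplacian eigenmap": SpectralEmbedding(n_neighbors=10, n_components=2),
}

for name, method in embeddings.items():
    Z = method.fit_transform(X)    # unroll the manifold into 2-D coordinates
    print(name, Z.shape)           # (1000, 2) for each method
```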

Summary of criteria

Feature selection criteria are divided into three types:
1. Criteria based on Euclidean distance (scatter matrices; see the sketch below)
2. Criteria based on probabilistic distance
3. Criteria based on entropy
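A sketch of the first type of criterion, under assumed synthetic data: score a candidate feature subset by J = trace(Sw^-1 Sb) computed from the scatter matrices, where a larger J means better class separability.

```python
# Scatter-matrix (Euclidean-distance) criterion sketch: J = trace(Sw^-1 Sb).
import numpy as np

def scatter_criterion(X, y):
    """Class-separability score J = trace(pinv(Sw) @ Sb) for the given features."""
    classes = np.unique(y)
    d = X.shape[1]
    mean_total = X.mean(axis=0)
    Sw = np.zeros((d, d))
    Sb = np.zeros((d, d))
    for c in classes:
        Xc = X[y == c]
        mean_c = Xc.mean(axis=0)
        Sw += (Xc - mean_c).T @ (Xc - mean_c)
        diff = (mean_c - mean_total).reshape(-1, 1)
        Sb += len(Xc) * diff @ diff.T
    return np.trace(np.linalg.pinv(Sw) @ Sb)

# Two informative features (0, 1) and two pure-noise features (2, 3).
rng = np.random.default_rng(0)
X = np.vstack([np.hstack([rng.normal(loc=m, size=(60, 2)),
                          rng.normal(size=(60, 2))]) for m in (0, 4)])
y = np.repeat([0, 1], 60)

print("informative subset J:", scatter_criterion(X[:, [0, 1]], y))
print("noise subset J:      ", scatter_criterion(X[:, [2, 3]], y))
```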


Corresponding criteria (table not reproduced in the source)









References
[1] Hua Yu and Jie Yang, "A Direct LDA Algorithm for High-Dimensional Data with Application to Face Recognition," Pattern Recognition, vol. 34, no. 10, pp. 2067-2070, October 2001.
[2] A. Hyvärinen and E. Oja, "Independent Component Analysis: Algorithms and Applications," Neural Networks, 13(4-5): 411-430, 2000.
[3] J. Yang, D. Zhang, A. F. Frangi, and J. Y. Yang, "Two-Dimensional PCA: A New Approach to Appearance-Based Face Representation and Recognition," IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 26, no. 1, pp. 131-137, Jan. 2004.
[4] D. R. Hardoon, S. Szedmak, and J. Shawe-Taylor, "Canonical Correlation Analysis: An Overview with Application to Learning Methods," Technical Report CSD-TR-03-02, 2003.
[5] B. Schölkopf, A. Smola, and K.-R. Müller, "Nonlinear Component Analysis as a Kernel Eigenvalue Problem," Neural Computation, 10(5): 1299-1319, 1998.
[6] S. Mika, G. Rätsch, J. Weston, B. Schölkopf, and K.-R. Müller, "Fisher Discriminant Analysis with Kernels," Neural Networks for Signal Processing IX: Proceedings of the IEEE Signal Processing Society Workshop, pp. 41-48, 1999.
[7] J. B. Tenenbaum, V. de Silva, and J. C. Langford, "A Global Geometric Framework for Nonlinear Dimensionality Reduction," Science, 290, pp. 2319-2323, 2000.
[8] S. T. Roweis and L. K. Saul, "Nonlinear Dimensionality Reduction by Locally Linear Embedding," Science, 22 December 2000.
[9] M. Belkin and P. Niyogi, "Laplacian Eigenmaps for Dimensionality Reduction and Data Representation," Neural Computation, 2003.
[10] X. He and P. Niyogi, "Locality Preserving Projections," Advances in Neural Information Processing Systems 16 (NIPS 2003), Vancouver, Canada, 2003.
