Algorithm Overview
Principal Component Analysis (PCA) is a common method for processing, compressing, and extracting information from data, based on the covariance matrix of the variables. It is mainly used for dimensionality reduction of features.
Algorithm Assumptions
The data is assumed to follow a Gaussian or exponential probability distribution. Directions along which the data has high variance are treated as the principal components.
Algorithm Input
A data set containing N records
Algorithm Output
The data set after dimensionality reduction or compression
Algorithm Idea
1. Calculate the mean m and the covariance matrix S of all samples.
2. Calculate the eigenvalues of S and sort them in descending order.
3. Select the eigenvectors corresponding to the first n' eigenvalues to form the transformation matrix E = [E1, E2, ..., En'].
4. Finally, each n-dimensional feature vector X can be converted to a new n'-dimensional feature vector Y (n' <= n):
Y = transpose(E) (X - m)
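The four steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not WEKA's implementation: the function name pca_transform, its parameters, and the sample data are assumptions made for the example. Each row of X is one record, so the projection is written row-wise as (X - m) E, which is the same as Y = transpose(E) (X - m) applied to every record.

```python
import numpy as np


def pca_transform(X, n_components):
    """Project the rows of X (N records, n features) onto the top principal axes.

    Hypothetical helper for illustration; returns (Y, E, m, sorted_eigenvalues).
    """
    X = np.asarray(X, dtype=float)

    # Step 1: mean m and covariance matrix S of all samples.
    m = X.mean(axis=0)
    S = np.atleast_2d(np.cov(X - m, rowvar=False))

    # Step 2: eigenvalues of S. eigh returns them in ascending order,
    # so reverse the order to get the largest variance first.
    eigvals, eigvecs = np.linalg.eigh(S)
    order = np.argsort(eigvals)[::-1]

    # Step 3: the eigenvectors of the first n' eigenvalues form E (n x n').
    E = eigvecs[:, order[:n_components]]

    # Step 4: Y = transpose(E) (X - m), applied to every record at once.
    Y = (X - m) @ E
    return Y, E, m, eigvals[order]


# Usage: reduce 3-dimensional synthetic records to 2 dimensions.
rng = np.random.default_rng(0)
data = rng.normal(size=(200, 3)) * np.array([3.0, 1.0, 0.1])
Y, E, m, eigvals = pca_transform(data, 2)
print(Y.shape)  # (200, 2)
```

np.linalg.eigh is used instead of np.linalg.eig because the covariance matrix is symmetric, so the eigenvalues come back real and the eigenvectors orthonormal.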
WEKA Running Result
The result of running PCA on weather.nominal.arff is as follows:
Algorithm Application
Face Recognition
Image Compression
Signal Denoising
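For compression and denoising, the reduced vectors are mapped back to the original space with X' = E Y + m, and the components that were dropped account for the lost detail or the removed noise. The sketch below illustrates this round trip under stated assumptions: the names pca_compress and pca_reconstruct are hypothetical, and the principal directions are obtained from the SVD of the centered data rather than from an explicit covariance matrix (the two give the same directions).

```python
import numpy as np


def pca_compress(X, k):
    """Keep only the first k principal components of the rows of X."""
    X = np.asarray(X, dtype=float)
    m = X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing
    # singular value (that is, by decreasing variance).
    _, _, Vt = np.linalg.svd(X - m, full_matrices=False)
    E = Vt[:k].T           # n x k transformation matrix
    Y = (X - m) @ E        # compressed representation, N x k
    return Y, E, m


def pca_reconstruct(Y, E, m):
    """Map compressed records back to the original n-dimensional space."""
    return Y @ E.T + m


# Usage: records that lie near a line in 3-D, plus a little noise.
rng = np.random.default_rng(1)
t = rng.normal(size=(100, 1))
clean = t @ np.array([[1.0, 2.0, 3.0]]) + 5.0
noisy = clean + 0.01 * rng.normal(size=clean.shape)

Y, E, m = pca_compress(noisy, 1)
restored = pca_reconstruct(Y, E, m)
print(Y.shape, restored.shape)  # (100, 1) (100, 3)
```

Storing Y, E, and m takes less space than the original records whenever k is much smaller than n, which is the basis of the image compression application.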
This is an original article. Please credit the source when reposting. Thank you.