Using PCA in OpenCV

Source: Internet
Author: User

PCA had always been just a concept to me, never something I actually used. Today I finally put it to practical use and found it quite magical.


Using PCA in OpenCV is simple; only a few statements are needed.


1. Initialize the data

Each row represents one sample. (In the snippets below, numSamples stands for the total number of samples and sampleDim for the number of dimensions per sample.)

CvMat* pData = cvCreateMat(numSamples, sampleDim, CV_32FC1);

CvMat* pMean = cvCreateMat(1, sampleDim, CV_32FC1);

Each entry of pEigVals is one eigenvalue:

CvMat* pEigVals = cvCreateMat(1, MIN(numSamples, sampleDim), CV_32FC1);

Each row of pEigVecs is one eigenvector:

CvMat* pEigVecs = cvCreateMat(MIN(numSamples, sampleDim), sampleDim, CV_32FC1);


2. Run PCA: compute the mean vector pMean, the eigenvalues pEigVals, and the eigenvectors pEigVecs

cvCalcPCA(pData, pMean, pEigVals, pEigVecs, CV_PCA_DATA_AS_ROW);


3. Select the first p eigenvectors (the principal components) and project the data onto them. The result is stored in pResult; each row of pResult holds the p projection coefficients of one sample. Note that p is determined by the number of columns of pResult:

CvMat* pResult = cvCreateMat(numSamples, p, CV_32FC1);   // p = dimension after PCA, i.e. the number of principal components

cvProjectPCA(pData, pMean, pEigVecs, pResult);


4. Reconstruct the data from the coefficients; the result is stored in pRecon

CvMat* pRecon = cvCreateMat(numSamples, sampleDim, CV_32FC1);

cvBackProjectPCA(pResult, pMean, pEigVecs, pRecon);


5. Compute the reconstruction error

Compute the "difference" between pRecon and pData.
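For instance, a minimal sketch using cvNorm (the matrix names are the ones from the steps above; how the per-sample error is consumed is left open):

double err = cvNorm(pRecon, pData, CV_L2);   // overall L2 difference between the data and its reconstruction

for (int i = 0; i < pData->rows; i++)
{
    CvMat origRow, reconRow;
    cvGetRow(pData, &origRow, i);    // row i of the original data
    cvGetRow(pRecon, &reconRow, i);  // row i of the reconstruction
    double rowErr = cvNorm(&reconRow, &origRow, CV_L2);   // reconstruction error of sample i
    // ... use rowErr, e.g. compare it against a threshold ...
}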


If you want to use PCA for a yes/no decision, you can first compute the principal components from positive samples only. To judge a new sample, project it, then reconstruct it, and compute the difference between the reconstruction and the original data; if the difference is within a given range, the sample can be considered a "yes".
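A sketch of that check, assuming pMean and pEigVecs were computed from the positive samples as above; the helper name, the 1 x sampleDim query matrix pQuery, and the threshold maxErr are all illustrative:

// Hypothetical helper: does pQuery belong to the class whose principal
// components (pMean, pEigVecs) were learned from positive samples?
// p is the number of components kept; maxErr is an application-chosen threshold.
int isPositive(const CvMat* pQuery, const CvMat* pMean, const CvMat* pEigVecs, int p, double maxErr)
{
    CvMat* coeffs = cvCreateMat(1, p, CV_32FC1);
    CvMat* recon  = cvCreateMat(1, pQuery->cols, CV_32FC1);

    cvProjectPCA(pQuery, pMean, pEigVecs, coeffs);      // project onto the first p components
    cvBackProjectPCA(coeffs, pMean, pEigVecs, recon);   // reconstruct from the coefficients

    double err = cvNorm(recon, pQuery, CV_L2);          // reconstruction error

    cvReleaseMat(&coeffs);
    cvReleaseMat(&recon);
    return err < maxErr;                                // "yes" if the error is small enough
}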


If PCA is used for classification, for example classifying digits, then compute the principal components from all the data (all samples of 0-9), project the samples of each class, and average the resulting projection coefficients, giving one mean coefficient vector per class. To classify, project the sample in question to obtain its coefficients, compare them with the per-class averages computed earlier, and assign the sample to the nearest class. Of course, this is only the simplest way to use it.
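A sketch of that nearest-mean scheme; classMeans (one row of averaged coefficients per class) and the helper name are illustrative:

// Hypothetical helper: classify pQuery by the nearest class-mean coefficient vector.
// classMeans is a numClasses x p matrix; row i is the average projection
// coefficient vector of class i, computed beforehand from the training data.
int classify(const CvMat* pQuery, const CvMat* pMean, const CvMat* pEigVecs, const CvMat* classMeans)
{
    int p = classMeans->cols;
    CvMat* coeffs = cvCreateMat(1, p, CV_32FC1);
    cvProjectPCA(pQuery, pMean, pEigVecs, coeffs);   // coefficients of the query sample

    int best = -1;
    double bestDist = 1e30;   // effectively +infinity for this sketch
    for (int i = 0; i < classMeans->rows; i++)
    {
        CvMat meanRow;
        cvGetRow(classMeans, &meanRow, i);
        double d = cvNorm(coeffs, &meanRow, CV_L2);  // distance to class i's mean coefficients
        if (d < bestDist) { bestDist = d; best = i; }
    }

    cvReleaseMat(&coeffs);
    return best;   // index of the nearest class, e.g. the digit 0-9
}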


*********************************************************************************************


PCA is principal component analysis, used mainly for dimensionality reduction. Given a set of samples, each described by a multidimensional feature vector, some elements of that vector carry no discriminative information: if, say, an element is 1 (or very close to 1) in every sample, it cannot be used to tell the samples apart, and its contribution to any distinction between features will be very small. The goal, then, is to find the elements that change a lot, i.e. the dimensions with large variance, and to drop the dimensions that change little, so that the remaining features are "refined" and the subsequent computation is cheaper.


For a k-dimensional feature, each dimension is orthogonal to the others (the axes of the multidimensional coordinate system are mutually perpendicular). We can change this coordinate system so that the feature varies a lot along some dimensions and little along others. Take, for example, an ellipse tilted at 45 degrees. In the original coordinate system, points projected onto the x or y axis are hard to tell apart, because their coordinates have similar variance on both axes; the x coordinate of a point alone does not distinguish it. If the axes are rotated so that the ellipse's long axis becomes the x axis, the points are spread out along the long axis (large variance) and bunched together along the short axis (small variance). We can then consider keeping only each point's long-axis coordinate; this distinguishes the points of the ellipse better than the original x and y axes did.


So the approach is to obtain a projection matrix for the k-dimensional features, which reduces them from a high dimension to a low one. The projection matrix can also be called a transformation matrix. The new low-dimensional features must be mutually orthogonal across dimensions, and the eigenvectors of a symmetric matrix are orthogonal: by computing the covariance matrix of the sample matrix and then the eigenvectors of that covariance matrix, we obtain vectors that can form the projection matrix. Which eigenvectors to select is decided by the size of the corresponding eigenvalues of the covariance matrix.
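A sketch of that computation done by hand with the same legacy C API (cvCalcPCA wraps essentially these steps; data, dim, and p are illustrative names):

// data: numSamples x dim matrix of CV_32FC1, one sample per row.
CvMat* cov  = cvCreateMat(dim, dim, CV_32FC1);   // covariance matrix of the samples
CvMat* avg  = cvCreateMat(1, dim, CV_32FC1);     // mean sample
CvMat* vecs = cvCreateMat(dim, dim, CV_32FC1);   // eigenvectors, one per row
CvMat* vals = cvCreateMat(dim, 1, CV_32FC1);     // eigenvalues, in descending order

const CvArr* samples[] = { data };
cvCalcCovarMatrix(samples, 1, cov, avg, CV_COVAR_NORMAL | CV_COVAR_ROWS | CV_COVAR_SCALE);

// cvEigenVV handles symmetric matrices and uses its input as scratch space,
// so cov is no longer valid afterwards.
cvEigenVV(cov, vecs, vals, 0);

// The first p rows of vecs (largest eigenvalues) form the projection matrix.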


To give an example:


Take a training set of 100 samples, each with a 10-dimensional feature; we can build a 100*10 matrix of samples. Computing the covariance matrix of these samples gives a 10*10 matrix, for which we then compute eigenvalues and eigenvectors; there are 10 of each. Sorting by eigenvalue size and taking the eigenvectors corresponding to the first four eigenvalues gives a 10*4 matrix. This is the projection (feature) matrix we want. Multiplying the 100*10 sample matrix by this 10*4 matrix yields a new, dimension-reduced 100*4 matrix: the dimensionality of every sample has dropped.


Given a test feature, say a 1*10 vector, multiplying it by the same 10*4 projection matrix yields a 1*4 feature, which can then be used for classification.
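A runnable sketch of this exact 100*10 to 100*4 example, using the functions from the first half of the article (the random fill merely stands in for real features, and the header path may differ across OpenCV versions; very old code used <cv.h>):

#include <opencv2/core/core_c.h>   // legacy C API

int main(void)
{
    const int numSamples = 100, dim = 10, p = 4;

    CvMat* data = cvCreateMat(numSamples, dim, CV_32FC1);
    CvRNG rng = cvRNG(-1);
    cvRandArr(&rng, data, CV_RAND_UNI, cvScalarAll(0), cvScalarAll(1));   // stand-in for real features

    CvMat* mean = cvCreateMat(1, dim, CV_32FC1);
    CvMat* vals = cvCreateMat(1, MIN(numSamples, dim), CV_32FC1);
    CvMat* vecs = cvCreateMat(MIN(numSamples, dim), dim, CV_32FC1);
    cvCalcPCA(data, mean, vals, vecs, CV_PCA_DATA_AS_ROW);

    // Project the 100*10 training set down to 100*4; only the first p
    // eigenvectors are used because the result has p columns.
    CvMat* reduced = cvCreateMat(numSamples, p, CV_32FC1);
    cvProjectPCA(data, mean, vecs, reduced);

    // Project a single 1*10 test sample down to 1*4 the same way.
    CvMat* test = cvCreateMat(1, dim, CV_32FC1);
    cvRandArr(&rng, test, CV_RAND_UNI, cvScalarAll(0), cvScalarAll(1));
    CvMat* testReduced = cvCreateMat(1, p, CV_32FC1);
    cvProjectPCA(test, mean, vecs, testReduced);

    cvReleaseMat(&data);    cvReleaseMat(&mean);
    cvReleaseMat(&vals);    cvReleaseMat(&vecs);
    cvReleaseMat(&reduced); cvReleaseMat(&test);
    cvReleaseMat(&testReduced);
    return 0;
}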


So PCA in effect obtains a projection matrix; multiplying high-dimensional features by it reduces them to the specified dimension.


OpenCV provides a dedicated function for computing this projection matrix (the eigenvector matrix):


void cvCalcPCA(const CvArr* data, CvArr* avg, CvArr* eigenvalues, CvArr* eigenvectors, int flags);




*********************************************************************************************


float* features = new float[lenOfFeatures];

...

// Wrap the raw feature buffer in a matrix, one sample per row.
// cvCreateMatHeader is used rather than cvCreateMat because the data
// itself is supplied by cvSetData on the next line (cvCreateMat would
// allocate a buffer that is then never used).
CvMat* vector_feature = cvCreateMatHeader(m_pm.numOfSamples, m_pm.dim, CV_32FC1);

cvSetData(vector_feature, features, vector_feature->step);

CvMat* avgVector;

CvMat* eigenVector;

CvMat* eigenvalue_row;

CvMat* vector_pca = cvCreateMat(m_pm.numOfSamples, PCA_DIM, CV_32FC1);

avgVector = cvCreateMat(1, m_pm.dim, CV_32FC1);

eigenvalue_row = cvCreateMat(1, MIN(m_pm.dim, m_pm.numOfSamples), CV_32FC1);

eigenVector = cvCreateMat(MIN(m_pm.dim, m_pm.numOfSamples), m_pm.dim, CV_32FC1);

Compute the mean, eigenvalues, and eigenvectors, then project the original data to obtain the principal components:

cvCalcPCA(vector_feature, avgVector, eigenvalue_row, eigenVector, CV_PCA_DATA_AS_ROW);

cvProjectPCA(vector_feature, avgVector, eigenVector, vector_pca);

delete[] features;
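A cleanup sketch to go with the snippet (not in the original): once the coefficients in vector_pca have been used, the matrices can be released as well. Because vector_feature is only a header around the features buffer, releasing it frees just the header.

cvReleaseMat(&vector_feature);   // header only; its data was the features buffer freed above
cvReleaseMat(&avgVector);
cvReleaseMat(&eigenvalue_row);
cvReleaseMat(&eigenVector);
cvReleaseMat(&vector_pca);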
