[Pattern recognition] The K-Nearest Neighbors (KNN) classification algorithm


K-Nearest Neighbors (KNN) is a well-understood classification algorithm. Simply put, it finds the K training samples most similar to the sample being classified, and assigns that sample to whichever category occurs most often among those K neighbors.

KNN algorithm steps

  • Calculate the distance between the current point and every point in the known data set;
  • Select the K points with the smallest distance to the current point;
  • Count the frequency of each category among those K points;
  • Return the most frequent category among the K points as the predicted category of the current point (a minimal sketch of these steps appears below).
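
To make the steps concrete, here is a minimal, self-contained sketch of the four steps in plain C++ (no OpenCV; the 2-D sample data and value of k are made up for illustration, and k is assumed to be at most the number of training samples):

#include <algorithm>
#include <cmath>
#include <iostream>
#include <map>
#include <utility>
#include <vector>

struct Sample { float x, y; int label; };

// Predict the label of (qx, qy) by majority vote among the k nearest samples.
int knnPredict(const std::vector<Sample>& train, float qx, float qy, int k) {
    // Step 1: distance from the query point to every training point
    std::vector<std::pair<float,int> > distLabel; // (distance, label)
    for (const Sample& s : train) {
        float dx = s.x - qx, dy = s.y - qy;
        distLabel.push_back({std::sqrt(dx*dx + dy*dy), s.label});
    }
    // Step 2: select the k points with the smallest distance
    std::partial_sort(distLabel.begin(), distLabel.begin() + k, distLabel.end());
    // Step 3: count the frequency of each category among the k neighbors
    std::map<int,int> votes;
    for (int i = 0; i < k; ++i) votes[distLabel[i].second]++;
    // Step 4: return the most frequent category
    int best = -1, bestCount = 0;
    for (const auto& v : votes)
        if (v.second > bestCount) { best = v.first; bestCount = v.second; }
    return best;
}

int main() {
    std::vector<Sample> train = {
        {1,1,0}, {2,1,0}, {1,2,0},   // class 0, near the origin
        {8,8,1}, {9,8,1}, {8,9,1}    // class 1, far from the origin
    };
    std::cout << knnPredict(train, 2, 2, 3) << std::endl; // prints 0
    std::cout << knnPredict(train, 8, 7, 3) << std::endl; // prints 1
}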

The CvKNearest class in OpenCV implements simple KNN training and prediction:
#include <opencv2/opencv.hpp>
#include <ctime>
#include <iostream>
using namespace cv;
using namespace std;

int main()
{
    // Labels: the first five samples are class 0, the last five are class 1
    float labels[10] = {0,0,0,0,0,1,1,1,1,1};
    Mat labelsMat(10, 1, CV_32FC1, labels);
    cout << labelsMat << endl;

    // Random 2-D training data: class 0 falls in [1,255]^2, class 1 in [255,509]^2
    float trainingData[10][2];
    srand(time(0));
    for (int i = 0; i < 5; i++)
    {
        trainingData[i][0]   = rand() % 255 + 1;
        trainingData[i][1]   = rand() % 255 + 1;
        trainingData[i+5][0] = rand() % 255 + 255;
        trainingData[i+5][1] = rand() % 255 + 255;
    }
    Mat trainingDataMat(10, 2, CV_32FC1, trainingData);
    cout << trainingDataMat << endl;

    // Train the classifier (the last argument is maxK, the largest k usable later)
    CvKNearest knn;
    knn.train(trainingDataMat, labelsMat, Mat(), false, 2);

    // Data for visual representation: classify every pixel of a 512x512 image
    int width = 512, height = 512;
    Mat image = Mat::zeros(height, width, CV_8UC3);
    Vec3b green(0,255,0), blue(255,0,0);
    for (int i = 0; i < image.rows; ++i)
    {
        for (int j = 0; j < image.cols; ++j)
        {
            // The sample is (x, y) = (i, j) and the pixel is written at
            // row j, column i; this only works because the image is square
            Mat sampleMat = (Mat_<float>(1, 2) << i, j);
            float result = knn.find_nearest(sampleMat, 1);
            if (result != 0)
                image.at<Vec3b>(j, i) = green;
            else
                image.at<Vec3b>(j, i) = blue;
        }
    }

    // Show the training data: class 0 as black dots, class 1 as white dots
    for (int i = 0; i < 5; i++)
    {
        circle(image, Point(trainingData[i][0],   trainingData[i][1]),   5, Scalar(0, 0, 0),       -1, 8);
        circle(image, Point(trainingData[i+5][0], trainingData[i+5][1]), 5, Scalar(255, 255, 255), -1, 8);
    }
    imshow("KNN Simple Example", image); // show it to the user
    waitKey(10000);
    return 0;
}
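
Note that CvKNearest belongs to the legacy OpenCV 2.x ML API; in OpenCV 3 and later it was replaced by cv::ml::KNearest. As a rough sketch (not from the original post), the equivalent training and prediction, reusing trainingDataMat and labelsMat from the example above, would look like this:

#include <opencv2/ml.hpp>

// Sketch assuming OpenCV >= 3.0; trainingDataMat and labelsMat as above
cv::Ptr<cv::ml::KNearest> knn = cv::ml::KNearest::create();
knn->train(trainingDataMat, cv::ml::ROW_SAMPLE, labelsMat); // one sample per row

cv::Mat query = (cv::Mat_<float>(1, 2) << 100.f, 100.f);    // a made-up test point
cv::Mat result;
float predicted = knn->findNearest(query, 1, result);       // k = 1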

This reuses the sample data from the earlier post on BP neural networks. The classification result is as follows:
The prediction function find_nearest() takes several other parameters besides the samples to classify:
float CvKNearest::find_nearest(const Mat& samples, int k, Mat* results=0, const float** neighbors=0, Mat* neighborResponses=0, Mat* dist=0 )


That is: samples is a floating-point matrix of size (number of samples) × (number of features); k is the number of nearest neighbors to use; results receives the prediction for each sample; neighbors is a pointer array of size k × (number of samples) (the parameter is const, and I honestly don't know why); neighborResponses is a (number of samples) × k matrix holding the labels of each sample's k nearest neighbors; and dist is a (number of samples) × k matrix holding the distances to each sample's k nearest neighbors. A small sketch of such a call appears below.
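
For illustration, here is a hedged sketch (the variable names are mine, not from the original) of a call that requests all of the optional outputs, assuming a trained CvKNearest object knn whose maxK is at least k:

// Two query points, one per row, features along the columns
Mat queries = (Mat_<float>(2, 2) << 50.f, 60.f, 300.f, 310.f);
Mat results;            // 2 x 1: predicted label for each query
Mat neighborResponses;  // 2 x k: labels of each query's k nearest neighbors
Mat dists;              // 2 x k: distances to each query's k nearest neighbors
int k = 2;
float first = knn.find_nearest(queries, k, &results, 0, &neighborResponses, &dists);
// "first" is the prediction for the first query; results holds all of them

The OpenCV reference manual provides a similar example, using input parameters in CvMat format: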
#include <opencv2/opencv.hpp>

int main( int argc, char** argv )
{
    const int K = 10;
    int i, j, k, accuracy;
    float response;
    int train_sample_count = 100;
    CvRNG rng_state = cvRNG(-1);
    CvMat* trainData = cvCreateMat( train_sample_count, 2, CV_32FC1 );
    CvMat* trainClasses = cvCreateMat( train_sample_count, 1, CV_32FC1 );
    IplImage* img = cvCreateImage( cvSize( 500, 500 ), 8, 3 );
    float _sample[2];
    CvMat sample = cvMat( 1, 2, CV_32FC1, _sample );
    cvZero( img );

    CvMat trainData1, trainData2, trainClasses1, trainClasses2;

    // form the training samples: two Gaussian clouds of 50 points each
    cvGetRows( trainData, &trainData1, 0, train_sample_count/2 );
    cvRandArr( &rng_state, &trainData1, CV_RAND_NORMAL, cvScalar(200,200), cvScalar(50,50) );
    cvGetRows( trainData, &trainData2, train_sample_count/2, train_sample_count );
    cvRandArr( &rng_state, &trainData2, CV_RAND_NORMAL, cvScalar(300,300), cvScalar(50,50) );
    cvGetRows( trainClasses, &trainClasses1, 0, train_sample_count/2 );
    cvSet( &trainClasses1, cvScalar(1) );
    cvGetRows( trainClasses, &trainClasses2, train_sample_count/2, train_sample_count );
    cvSet( &trainClasses2, cvScalar(2) );

    // learn classifier
    CvKNearest knn( trainData, trainClasses, 0, false, K );
    CvMat* nearests = cvCreateMat( 1, K, CV_32FC1 );

    for( i = 0; i < img->height; i++ )
    {
        for( j = 0; j < img->width; j++ )
        {
            sample.data.fl[0] = (float)j;
            sample.data.fl[1] = (float)i;

            // estimate the response and get the neighbors' labels
            response = knn.find_nearest( &sample, K, 0, 0, nearests, 0 );

            // compute the number of neighbors representing the majority
            for( k = 0, accuracy = 0; k < K; k++ )
            {
                if( nearests->data.fl[k] == response )
                    accuracy++;
            }
            // highlight the pixel depending on the accuracy (or confidence)
            cvSet2D( img, i, j, response == 1 ?
                (accuracy > 5 ? CV_RGB(180,0,0) : CV_RGB(180,120,0)) :
                (accuracy > 5 ? CV_RGB(0,180,0) : CV_RGB(120,120,0)) );
        }
    }

    // display the original training samples
    for( i = 0; i < train_sample_count/2; i++ )
    {
        CvPoint pt;
        pt.x = cvRound(trainData1.data.fl[i*2]);
        pt.y = cvRound(trainData1.data.fl[i*2+1]);
        cvCircle( img, pt, 2, CV_RGB(255,0,0), CV_FILLED );
        pt.x = cvRound(trainData2.data.fl[i*2]);
        pt.y = cvRound(trainData2.data.fl[i*2+1]);
        cvCircle( img, pt, 2, CV_RGB(0,255,0), CV_FILLED );
    }
    cvNamedWindow( "classifier result", 1 );
    cvShowImage( "classifier result", img );
    cvWaitKey(0);

    cvReleaseMat( &nearests );
    cvReleaseMat( &trainClasses );
    cvReleaseMat( &trainData );
    return 0;
}
Classification Result:

KNN is easy to understand and implement, achieves good classification accuracy, and is not sensitive to outliers. However, its computational cost is high: a brute-force implementation must compare each query against every training sample, so it scales poorly to large data sets.

(When reprinting, please credit the author and source: http://blog.csdn.net/xiaowei_cqu. Commercial use is not permitted.)

