Learning OpenCV -- KNN algorithm


Transferred from: http://blog.csdn.net/lyflower/article/details/1728642

The idea of the KNN algorithm in text classification is simple and intuitive: if most of the K samples most similar to a given sample in feature space (that is, its nearest neighbors) belong to a certain category, then the sample belongs to that category as well. In making the classification decision, the method determines the category of the sample to be classified solely from the categories of its nearest one or few samples.
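To make the rule concrete, here is a minimal, self-contained sketch of brute-force KNN with majority voting on 2-D points (plain C++, independent of OpenCV; the toy points, the labels, and the knnClassify name are made up for illustration):

#include <algorithm>
#include <cstdio>
#include <utility>
#include <vector>

struct Sample { float x, y; int label; };

// Classify a query point by majority vote among its k nearest training samples
// (brute force: distances to all training samples are computed).
int knnClassify(const std::vector<Sample>& train, float qx, float qy, int k)
{
    std::vector<std::pair<float,int> > distLabel;   // (squared distance, label)
    for (size_t i = 0; i < train.size(); i++)
    {
        float dx = train[i].x - qx, dy = train[i].y - qy;
        distLabel.push_back(std::make_pair(dx*dx + dy*dy, train[i].label));
    }
    // Move the k smallest distances to the front (assumes k <= train.size()).
    std::partial_sort(distLabel.begin(), distLabel.begin() + k, distLabel.end());
    int votes1 = 0;                                 // votes for class 1 among the k neighbors
    for (int i = 0; i < k; i++)
        if (distLabel[i].second == 1)
            votes1++;
    return votes1 * 2 > k ? 1 : 2;                  // majority vote between classes 1 and 2
}

int main()
{
    // Toy training set: class 1 clustered near (1,1), class 2 near (5,5).
    Sample pts[] = { {1,1,1}, {1,2,1}, {2,1,1}, {5,5,2}, {5,6,2}, {6,5,2} };
    std::vector<Sample> train(pts, pts + 6);
    std::printf("query (1.5, 1.5) -> class %d\n", knnClassify(train, 1.5f, 1.5f, 3));
    return 0;
}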

Although the KNN method relies on the limit theorem in principle, the classification decision involves only a small number of neighboring samples, so the problem of sample imbalance can be better avoided. Moreover, because the method determines the category mainly from the finite set of nearby samples rather than from discriminant class domains, it is better suited than other methods to sample sets whose class domains overlap or intersect heavily.

The disadvantage of the method is its computational cost: for each text to be classified, the distance to every known sample must be computed in order to find its K nearest neighbors. A common remedy is to edit the known sample set in advance, removing samples that contribute little to classification. In addition, there is a reverse-KNN variant that reduces the computational complexity of the KNN algorithm and improves classification efficiency.

The algorithm is therefore best suited to automatic classification of class domains with large sample sizes; class domains with small sample sizes are more prone to misclassification.

The K-nearest-neighbor classifier works well for text classification. Statistical analysis of simulation results shows that, as a text classifier, K nearest neighbor is second only to the support vector machine, and clearly better than linear least-squares fitting, naive Bayes, and neural networks.

Focus:

1: Feature dimensionality reduction (commonly with the chi-square (CHI) statistic; see the sketch after this list)

2: Truncation algorithm (three truncation algorithms)

3: Reducing the amount of computation
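For point 1, here is a hedged sketch of how the standard chi-square (CHI) statistic scores a term for a class from a 2x2 contingency table; the document counts are hypothetical, and in practice the terms with the highest scores across classes would be kept as features:

#include <cstdio>

// Chi-square (CHI) score of a term for one class, from a 2x2 contingency table:
//   A = documents of the class that contain the term
//   B = documents of other classes that contain the term
//   C = documents of the class that do not contain the term
//   D = documents of other classes that do not contain the term
double chiSquare(double A, double B, double C, double D)
{
    double N = A + B + C + D;
    double diff = A * D - C * B;
    double den  = (A + C) * (B + D) * (A + B) * (C + D);
    return den > 0.0 ? N * diff * diff / den : 0.0;
}

int main()
{
    // Hypothetical counts: 1000 documents, 100 of them in the class;
    // the term occurs in 60 class documents and in 40 of the other 900.
    std::printf("chi2 = %.2f\n", chiSquare(60, 40, 40, 860));
    return 0;
}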

Demo Code:

#include "ml.h"
#include "highgui.h"

int main( int argc, char** argv )
{
    const int K = 10;
    int i, j, k, accuracy;
    float response;
    int train_sample_count = 100;
    CvRNG rng_state = cvRNG(-1);                 // initialize random number generator state
    CvMat* trainData = cvCreateMat( train_sample_count, 2, CV_32FC1 );
    CvMat* trainClasses = cvCreateMat( train_sample_count, 1, CV_32FC1 );
    IplImage* img = cvCreateImage( cvSize( 500, 500 ), 8, 3 );
    float _sample[2];
    CvMat sample = cvMat( 1, 2, CV_32FC1, _sample );
    cvZero( img );

    CvMat trainData1, trainData2, trainClasses1, trainClasses2;

    // form the training samples: two Gaussian clouds of 50 points each
    cvGetRows( trainData, &trainData1, 0, train_sample_count/2 );   // returns a span of rows of an array
    cvRandArr( &rng_state, &trainData1, CV_RAND_NORMAL,
               cvScalar(200,200), cvScalar(50,50) );                // fills the array with random numbers, updating the RNG state
    cvGetRows( trainData, &trainData2, train_sample_count/2, train_sample_count );
    cvRandArr( &rng_state, &trainData2, CV_RAND_NORMAL,
               cvScalar(300,300), cvScalar(50,50) );

    cvGetRows( trainClasses, &trainClasses1, 0, train_sample_count/2 );
    cvSet( &trainClasses1, cvScalar(1) );        // first half of the samples belongs to class 1
    cvGetRows( trainClasses, &trainClasses2, train_sample_count/2, train_sample_count );
    cvSet( &trainClasses2, cvScalar(2) );        // second half belongs to class 2

    // learn the classifier
    CvKNearest knn( trainData, trainClasses, 0, false, K );
    CvMat* nearests = cvCreateMat( 1, K, CV_32FC1 );

    for( i = 0; i < img->height; i++ )
    {
        for( j = 0; j < img->width; j++ )
        {
            sample.data.fl[0] = (float)j;
            sample.data.fl[1] = (float)i;

            // estimate the response and get the neighbors' labels
            response = knn.find_nearest( &sample, K, 0, 0, nearests, 0 );

            // count the neighbors that voted for the winning class
            for( k = 0, accuracy = 0; k < K; k++ )
            {
                if( nearests->data.fl[k] == response )
                    accuracy++;
            }
            // highlight the pixel depending on the accuracy (or confidence)
            cvSet2D( img, i, j, response == 1 ?
                (accuracy > 5 ? CV_RGB(180,0,0) : CV_RGB(180,120,0)) :
                (accuracy > 5 ? CV_RGB(0,180,0) : CV_RGB(120,120,0)) );
        }
    }

    // display the original training samples
    for( i = 0; i < train_sample_count/2; i++ )
    {
        CvPoint pt;
        pt.x = cvRound( trainData1.data.fl[i*2] );
        pt.y = cvRound( trainData1.data.fl[i*2+1] );
        cvCircle( img, pt, 2, CV_RGB(255,0,0), CV_FILLED );
        pt.x = cvRound( trainData2.data.fl[i*2] );
        pt.y = cvRound( trainData2.data.fl[i*2+1] );
        cvCircle( img, pt, 2, CV_RGB(0,255,0), CV_FILLED );
    }

    cvNamedWindow( "classifier result", 1 );
    cvShowImage( "classifier result", img );
    cvWaitKey(0);

    cvReleaseMat( &trainClasses );
    cvReleaseMat( &trainData );
    return 0;
}
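Note: the demo uses the legacy C API from OpenCV 1.x (CvKNearest in the old ml module); in OpenCV 3 and later the same functionality is provided by cv::ml::KNearest. Each pixel of the 500x500 image is classified, and the vote count among the K = 10 neighbors (more or fewer than 5) determines whether it is drawn in a strong or a muted shade of the class color.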

Detailed Description: http://www.cnblogs.com/xiangshancuizhu/archive/2011/08/06/2129355.html
The improved KNN: http://www.cnblogs.com/xiangshancuizhu/archive/2011/11/11/2245373.html

From: http://blog.csdn.net/yangtrees/article/details/7482890
