K Nearest Neighbor Classification algorithm implementation in Python

K Nearest Neighbor (KNN): Classification algorithm

* KNN is a non-parametric classifier: it makes no assumption about the form of the distribution and estimates the probability density directly from the data. It is a memory-based learning method.

* KNN is poorly suited to high-dimensional data (the curse of dimensionality).

* Python has many machine learning libraries, such as mlpy; the implementation here is only meant for learning the method.

* For calling KNN from Matlab, see "Matlab classifier Encyclopedia (SVM, KNN, random forest, etc.)".

* The KNN algorithm is computationally expensive. It can be optimized with a KD tree (in C/C++, libkdtree or ANN can be used).

* A smaller k overfits more easily, but a very large k reduces classification accuracy (consider the limits k = 1 and k = n, where n is the number of samples).
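The two limits in the last point can be made concrete with a small experiment. The following is a minimal sketch in pure Python; the toy data set, the mislabeled point, and the helper name `knn_predict` are assumptions made for illustration and are not part of knn.py.

```python
from collections import Counter

def knn_predict(train, labels, query, k):
    """Majority vote among the k training points closest to query."""
    order = sorted(
        range(len(train)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train[i], query)),
    )
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Hypothetical toy data: an 'A' cluster near the origin, a 'B' cluster near (5, 5),
# and one noisy point labeled 'B' sitting inside the 'A' cluster.
train = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),   # class A
         (5, 5), (5, 6), (6, 5),                        # class B
         (0.4, 0.4)]                                    # noise, labeled B
labels = ['A'] * 5 + ['B'] * 3 + ['B']
n = len(train)

inside_a = (0.42, 0.42)   # clearly inside the A cluster
inside_b = (5.2, 5.2)     # clearly inside the B cluster

print(knn_predict(train, labels, inside_a, 1))  # B: k = 1 follows the noisy point
print(knn_predict(train, labels, inside_a, 3))  # A: the noise is outvoted
print(knn_predict(train, labels, inside_b, 3))  # B
print(knn_predict(train, labels, inside_b, n))  # A: k = n always returns the majority class
```

With k = 1 a single mislabeled sample decides the result, while with k = n the query point no longer matters at all, so a moderate k is usually chosen by validation.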

This article does not cover the theory; see the comments in the code.

knn.py

    import numpy as np
    import operator

    class KNN:
        def createdataset(self):
            group = np.array([[1.0, 1.1], [1.0, 1.0], [0, 0], [0, 0.1]])
            labels = ['A', 'A', 'B', 'B']
            return group, labels

        def knnclassify(self, testx, trainx, labels, k):
            [n, m] = trainx.shape
            # calculate the distance between testx and every training sample
            # tile repeats testx n times (like repmat in Matlab)
            difference = np.tile(testx, (n, 1)) - trainx
            difference = difference ** 2   # square each coordinate difference
            distance = difference.sum(1)   # sum over all dimensions
            distance = distance ** 0.5
            sortdiffidx = distance.argsort()
            # find the k nearest neighbours
            vote = {}  # maps label -> number of votes
            for i in range(k):
                ith_label = labels[sortdiffidx[i]]
                # get(ith_label, 0): return vote[ith_label] if the key exists, else 0
                vote[ith_label] = vote.get(ith_label, 0) + 1
            # items() works in Python 2 and 3 (iteritems() exists only in Python 2)
            sortedvote = sorted(vote.items(), key=lambda x: x[1], reverse=True)
            # 'key=lambda x: x[1]' can be replaced by operator.itemgetter(1)
            return sortedvote[0][0]

    k = KNN()  # create KNN object
    group, labels = k.createdataset()
    cls = k.knnclassify([0, 0], group, labels, 3)
    print(cls)
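The classifier above measures the distance from the test point to every training sample, which is the cost the KD tree remark earlier refers to. The following is a minimal sketch of a KD tree search in pure Python, not the article's method: the function names `build_kdtree`, `knn_search`, and `kdtree_classify` are hypothetical, and a production system would more likely use an existing library.

```python
import heapq
from collections import Counter

def build_kdtree(points, labels, depth=0):
    """Recursively build a KD tree; a node is a dict, an empty subtree is None."""
    if not points:
        return None
    axis = depth % len(points[0])  # cycle through the dimensions
    order = sorted(range(len(points)), key=lambda i: points[i][axis])
    mid = len(order) // 2          # the median point becomes this node
    left, right = order[:mid], order[mid + 1:]
    return {
        "point": points[order[mid]],
        "label": labels[order[mid]],
        "axis": axis,
        "left": build_kdtree([points[i] for i in left],
                             [labels[i] for i in left], depth + 1),
        "right": build_kdtree([points[i] for i in right],
                              [labels[i] for i in right], depth + 1),
    }

def knn_search(node, query, k, heap=None):
    """Return a list of (-squared_distance, label) for the k nearest points."""
    if heap is None:
        heap = []
    if node is None:
        return heap
    dist2 = sum((a - b) ** 2 for a, b in zip(node["point"], query))
    if len(heap) < k:
        heapq.heappush(heap, (-dist2, node["label"]))
    elif dist2 < -heap[0][0]:      # closer than the current k-th neighbour
        heapq.heapreplace(heap, (-dist2, node["label"]))
    diff = query[node["axis"]] - node["point"][node["axis"]]
    if diff < 0:
        near, far = node["left"], node["right"]
    else:
        near, far = node["right"], node["left"]
    knn_search(near, query, k, heap)
    # Visit the far side only if the splitting plane is closer than
    # the current k-th nearest neighbour (or fewer than k were found).
    if len(heap) < k or diff ** 2 < -heap[0][0]:
        knn_search(far, query, k, heap)
    return heap

def kdtree_classify(tree, query, k):
    """Majority vote among the k nearest neighbours found in the tree."""
    votes = Counter(label for _, label in knn_search(tree, query, k))
    return votes.most_common(1)[0][0]

# Same data set as in knn.py
points = [[1.0, 1.1], [1.0, 1.0], [0, 0], [0, 0.1]]
labels = ['A', 'A', 'B', 'B']
tree = build_kdtree(points, labels)
print(kdtree_classify(tree, [0, 0], 3))  # B
```

The tree is built once and reused for every query. The pruning test lets a query skip whole subtrees, which pays off in low dimensions; in high dimensions most subtrees still have to be visited.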

-------------------
Run:

1. You can run knn.py in the Python Shell

>>> import os

>>> os.chdir("/USERS/MBA/DOCUMENTS/STUDY/MACHINE_LEARNING/PYTHON/KNN")

>>> execfile("knn.py")  # Python 2 only; in Python 3 use exec(open("knn.py").read())

Output: B

(B is the predicted category)

2. Or run directly in terminal

$ python knn.py

3. Alternatively, leave the output statement out of knn.py and get the result in the shell by importing the module:

>>> import knn

>>> knn.k.knnclassify([0, 0], knn.group, knn.labels, 3)
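A common way to support both option 2 and option 3 without deleting the output line is a `__main__` guard. The following is a sketch of a hypothetical variant of knn.py, not the original file: it uses a module-level function instead of the `KNN` class, NumPy broadcasting in place of `tile`, and `collections.Counter` for the vote.

```python
import numpy as np
from collections import Counter

def knnclassify(testx, trainx, labels, k):
    """Majority vote among the k training rows closest to testx (Euclidean)."""
    # Broadcasting subtracts testx from every row of trainx
    difference = np.asarray(trainx) - np.asarray(testx)
    distance = np.sqrt((difference ** 2).sum(axis=1))
    nearest = distance.argsort()[:k]
    return Counter(labels[i] for i in nearest).most_common(1)[0][0]

group = np.array([[1.0, 1.1], [1.0, 1.0], [0, 0], [0, 0.1]])
labels = ['A', 'A', 'B', 'B']

if __name__ == "__main__":
    # Executed by `python knn.py`, skipped by `import knn`
    print(knnclassify([0, 0], group, labels, 3))
```

With this layout, running the file in a terminal prints the category, while `import knn` in the shell prints nothing and leaves `knn.knnclassify`, `knn.group`, and `knn.labels` available for interactive calls.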

From: http://blog.csdn.net/abcjennifer/article/details/19757987
