Chapter 2 KNN Algorithm Notes: the classify0 function

Source: Internet
Author: User

"Machine Learning in Action" study notes: table of contents

The idea behind the k-nearest neighbor (KNN) algorithm:

1, compute the distance from the unknown sample to every known sample

2, sort the known samples by increasing distance and keep the first k of them (k < 20)

3, count how often each class occurs among those k samples; the most frequent class becomes the class of the unknown sample
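
The three steps above can be sketched in plain Python without NumPy. This is only an illustration: the name knn_predict and the use of collections.Counter are assumptions made for this example and are not the book's classify0.

```python
from collections import Counter


def knn_predict(query, samples, labels, k):
    """Classify `query` by majority vote among its k nearest samples."""
    # Step 1: Euclidean distance from the query to every known sample.
    distances = [
        sum((q - s) ** 2 for q, s in zip(query, sample)) ** 0.5
        for sample in samples
    ]
    # Step 2: sample indices ordered by increasing distance.
    order = sorted(range(len(samples)), key=lambda i: distances[i])
    # Step 3: count the labels of the k nearest samples and take the winner.
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


samples = [[1.0, 1.1], [1.0, 1.0], [0, 0], [0, 0.1]]
labels = ["A", "A", "B", "B"]
result = knn_predict([0, 0], samples, labels, 3)
print(result)
```

With k=3 the nearest neighbors of [0, 0] carry the labels B, B and A, so the vote goes to B.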

Although the function classify0 is only a few lines long, it touches on quite a few points of knowledge, which are organized below. The snippets assume from numpy import * and import operator, as in the complete test code later in these notes.

I. Notes on Listing 2-1
1, shape function
shape reports the dimensions of a matrix or array. NumPy provides it both as the function numpy.shape (defined in numpy.core.fromnumeric) and as the array attribute used here.
For example:
group = array([[1.0,1.1],[1.0,1.0],[0,0],[0,0.1]])
print("The dataset is as follows:")
print(group)
print(group.shape)
print("group.shape[0]=%d" % group.shape[0])
The results are as follows:

The dataset is as follows:
[[1.  1.1]
 [1.  1. ]
 [0.  0. ]
 [0.  0.1]]
(4, 2)
group.shape[0]=4
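
As a small illustration of why classify0 reads shape[0], the following sketch compares the shape of a 2-D array with that of a 1-D array. The variable name vector is hypothetical and introduced only for this example.

```python
import numpy as np

group = np.array([[1.0, 1.1], [1.0, 1.0], [0, 0], [0, 0.1]])
vector = np.array([0, 0])  # hypothetical 1-D query point

print(group.shape)     # (rows, columns) of the 2-D array
print(group.shape[0])  # number of rows, i.e. the number of samples
print(vector.shape)    # a 1-D array has a one-element shape tuple
print(len(group) == group.shape[0])  # len() also counts the rows
```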


2, tile function
tile(array, (number of repetitions along the rows, number of repetitions along the columns))
For example:
array1 = [1,2,3]
print(tile(array1, (2,1)))
print(tile(array1, (1,2)))
print(tile(array1, (2,2)))

The results are as follows:

[[1 2 3]
 [1 2 3]]
[[1 2 3 1 2 3]]
[[1 2 3 1 2 3]
 [1 2 3 1 2 3]]
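
In classify0, tile stacks the query point once per sample so that it can be subtracted from the data set row by row. The sketch below shows that step and, as an aside that is not part of the original code, checks that NumPy broadcasting gives the same difference matrix without tile. The names data_set and in_x are chosen for this example.

```python
import numpy as np

data_set = np.array([[1.0, 1.1], [1.0, 1.0], [0, 0], [0, 0.1]])
in_x = [0, 0]  # the unknown sample

# Repeat in_x once per row of data_set, then subtract element-wise.
tiled = np.tile(in_x, (data_set.shape[0], 1))
diff_tile = tiled - data_set

# Broadcasting stretches the 1-D query across the rows automatically.
diff_broadcast = np.asarray(in_x) - data_set

print(tiled.shape)
print(np.array_equal(diff_tile, diff_broadcast))
```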


3, sum function: sum(axis=1)
Called without an axis argument, NumPy's sum adds up every element of the array (the default is axis=None); axis=0 adds down each column.
With axis=1, the elements of each row of the matrix are added together, giving one value per row.
For example:
array2 = [[0,1,2],[0,3,4]]
print(sum(array2,axis=1))
print("\n")
The results are as follows:

[3 7]
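
To make the effect of the axis argument concrete, this short sketch sums the same array three ways. It uses the ndarray.sum method, which is also what classify0 calls; the variable names are chosen for this example.

```python
import numpy as np

array2 = np.array([[0, 1, 2], [0, 3, 4]])

total = array2.sum()           # no axis: every element is added
col_sums = array2.sum(axis=0)  # one sum per column
row_sums = array2.sum(axis=1)  # one sum per row

print(total)
print(col_sums)
print(row_sums)
```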

4, sort and argsort functions
The sort function returns the array values sorted from smallest to largest.
The argsort function returns the indices that would put the array values in order from smallest to largest.
For example:
array3 = [3,2,1]
print(argsort(array3))
print(sort(array3))
print("\n")

The results are as follows:

[2 1 0]
[1 2 3]
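
The reason classify0 uses argsort rather than sort is that the indices can be used to look up the labels of the nearest samples. A minimal sketch, with made-up distance values chosen only for illustration:

```python
import numpy as np

distances = np.array([1.5, 1.4, 0.0, 0.1])  # illustrative values
labels = ["A", "A", "B", "B"]

# Indices of the samples from nearest to farthest.
order = distances.argsort()

# The same indices reorder any parallel sequence, such as the labels.
nearest_labels = [labels[i] for i in order]

print(order)
print(nearest_labels)
```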


5, the meaning of the parameter k of the dictionary get method
dic.get(key, k) is dic.get(key, default)
k is the default value: if key is not present in the dictionary dic, get returns k; if key is present, get returns the value stored for it.
For example:

dic1 = {"A": 1, "B": 2, "C": 3}
print("dic test")
print(dic1.get("C", 0))
print(dic1.get("D", 0))
print(dic1.get("E", 1))

The results are as follows:

dic test
3
0
1
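
In classify0 the default value of 0 is what lets a label be counted the first time it is seen. The sketch below shows that counting pattern on a hypothetical list of votes and compares it with collections.Counter, which is mentioned here only as a standard-library alternative, not as something the original code uses.

```python
from collections import Counter

votes = ["B", "B", "A"]  # hypothetical labels of the k nearest samples

class_count = {}
for label in votes:
    # get returns 0 for a label that has not been counted yet.
    class_count[label] = class_count.get(label, 0) + 1

print(class_count)
print(dict(Counter(votes)) == class_count)
```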


6, the iteritems function of the dictionary:
Note: iteritems exists only in Python 2; a Python 3 dict does not have it.
You can use items instead.

dic1 = {"A": 1, "B": 2, "C": 3}
print("Testing dictionary items")
print(dic1.items())
# A Python 3 dict has no iteritems: 'dict' object has no attribute 'iteritems'
#print(dic1.iteritems())

The test results are as follows:

Testing dictionary items
dict_items([('A', 1), ('B', 2), ('C', 3)])

7, operator.itemgetter defines a function
operator.itemgetter(k) defines a function that returns the item at index k of the object passed to it.
For example:

print("Testing operator.itemgetter")
a=[1,2,3]
b=operator.itemgetter(2)    # define function b, which returns the item at index 2 of an object
print(b(a))
b=operator.itemgetter(1,0)  # define function b, which returns the items at indices 1 and 0 as a tuple
print(b(a))

Test results:


Testing operator.itemgetter
3
(2, 1)
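
In classify0, itemgetter(1) serves as the sort key that orders (label, count) pairs by their count. A minimal sketch with a hypothetical class_count dictionary; the lambda version is shown only to make clear what the key function does.

```python
import operator

class_count = {"A": 1, "B": 2}  # hypothetical vote counts

# Sort the (label, count) pairs by count, largest first.
by_count = sorted(class_count.items(), key=operator.itemgetter(1), reverse=True)

# An equivalent key written as a lambda.
by_count_lambda = sorted(class_count.items(), key=lambda pair: pair[1], reverse=True)

winner = by_count[0][0]
print(by_count)
print(winner)
```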

II. Complete test code:
from numpy import *
import operator

def createDataSet():
    group = array([[1.0,1.1],[1.0,1.0],[0,0],[0,0.1]])
    labels = ['A', 'A', 'B', 'B']
    return group,labels


## Test ##########################
group,labels = createDataSet()
print("The dataset is as follows:")
print(group)
print(group.shape)
print("group.shape[0]=%d" % group.shape[0])
print("\n")

print(labels)
print("\n")

array1 = [1,2,3]
print(tile(array1, (2,1)))
print(tile(array1, (1,2)))
print(tile(array1, (2,2)))
print("\n")

array2 = [[0,1,2],[0,3,4]]
print(sum(array2,axis=1))
print("\n")

array3 = [3,2,1]
print(argsort(array3))
print(sort(array3))
print("\n")

dic1 = {"A": 1, "B": 2, "C": 3}
print("dic test")
print(dic1.get("C", 0))
print(dic1.get("D", 0))
print(dic1.get("E", 1))

# Test dictionary items
print("Testing dictionary items")
print(dic1.items())
# A Python 3 dict has no iteritems: 'dict' object has no attribute 'iteritems'
#print(dic1.iteritems())
print("\n")

# Test operator.itemgetter
print("Testing operator.itemgetter")
a=[1,2,3]
b=operator.itemgetter(2)    # define function b, which returns the item at index 2 of an object
print(b(a))
b=operator.itemgetter(1,0)  # define function b, which returns the items at indices 1 and 0 as a tuple
print(b(a))

###### Function definition

def classify0(inX,dataSet,labels,k):
    # Number of known samples (rows of dataSet)
    dataSetSize = dataSet.shape[0]
    # Difference between inX and every known sample
    diffMat = tile(inX, (dataSetSize,1)) - dataSet
    print("diffMat")
    print(diffMat)
    print("\n")

    sqDiffMat = diffMat ** 2
    print("sqDiffMat")
    print(sqDiffMat)
    print("\n")

    # Sum the squared differences of each row
    sqDistances = sqDiffMat.sum(axis=1)
    print("sqDistances")
    print(sqDistances)
    print("\n")

    # Euclidean distances
    distances = sqDistances ** 0.5
    print("distances")
    print(distances)
    print("\n")


    # Sample indices ordered from nearest to farthest
    sortedDistIndicies = distances.argsort()
    print("sortedDistIndicies")
    print(sortedDistIndicies)
    print("\n")

    # Count how often each label appears among the first k samples
    classCount = {}
    for i in range(k):
        voteIlabel = labels[sortedDistIndicies[i]]
        print("i=%s sortedDistIndicies[i]=%s voteIlabel=%s" % (i,sortedDistIndicies[i],voteIlabel))
        classCount[voteIlabel] = classCount.get(voteIlabel,0) + 1
        print(classCount)

    print("\n")
    # Sort the (label, count) pairs by count, largest first
    sortedClassCount = sorted(classCount.items(), key = operator.itemgetter(1), reverse=True)
    print("sortedClassCount")
    print(sortedClassCount)
    print("\n")

    print("The returned category is: %s" % sortedClassCount[0][0])
    return sortedClassCount[0][0]


print("Start running the classification function .......")
classify0([0,0],group,labels,3)
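
For reuse outside these notes, the debugging prints can be dropped. The following is a sketch of such a variant, assuming the same Euclidean distance and majority vote as above. The name classify0_quiet is hypothetical, and broadcasting replaces tile, which is a substitution made for this example rather than the book's code.

```python
import operator

import numpy as np


def classify0_quiet(in_x, data_set, labels, k):
    """Return the majority label among the k samples nearest to in_x."""
    data_set = np.asarray(data_set, dtype=float)
    # Broadcasting subtracts every row of data_set from in_x.
    diff = np.asarray(in_x, dtype=float) - data_set
    distances = np.sqrt((diff ** 2).sum(axis=1))
    # Indices of the k nearest samples.
    nearest = distances.argsort()[:k]
    class_count = {}
    for index in nearest:
        label = labels[index]
        class_count[label] = class_count.get(label, 0) + 1
    ranked = sorted(class_count.items(), key=operator.itemgetter(1), reverse=True)
    return ranked[0][0]


group = np.array([[1.0, 1.1], [1.0, 1.0], [0, 0], [0, 0.1]])
labels = ["A", "A", "B", "B"]
print(classify0_quiet([0, 0], group, labels, 3))
print(classify0_quiet([1, 1.2], group, labels, 3))
```

Note that when two labels tie on count, sorted keeps them in the order they were first counted, so the label of the nearer sample wins; the original classify0 behaves the same way.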


III. The output of the run is as follows:
"D:\Program Files\python36\python.exe" e:/code/python/machinelearninginaction/chapter02_knn/knn.py
The dataset is as follows:
[[1.  1.1]
 [1.  1. ]
 [0.  0. ]
 [0.  0.1]]
(4, 2)
group.shape[0]=4


['A', 'A', 'B', 'B']


[[1 2 3]
 [1 2 3]]
[[1 2 3 1 2 3]]
[[1 2 3 1 2 3]
 [1 2 3 1 2 3]]


[3 7]


[2 1 0]
[1 2 3]


dic test
3
0
1
Testing dictionary items
dict_items([('A', 1), ('B', 2), ('C', 3)])


Testing operator.itemgetter
3
(2, 1)
Start running the classification function .......
diffMat
[[-1.  -1.1]
 [-1.  -1. ]
 [ 0.   0. ]
 [ 0.  -0.1]]


sqDiffMat
[[1.   1.21]
 [1.   1.  ]
 [0.   0.  ]
 [0.   0.01]]


sqDistances
[2.21 2.   0.   0.01]


distances
[1.48660687 1.41421356 0.         0.1       ]


sortedDistIndicies
[2 3 1 0]


i=0 sortedDistIndicies[i]=2 voteIlabel=B
{'B': 1}
i=1 sortedDistIndicies[i]=3 voteIlabel=B
{'B': 2}
i=2 sortedDistIndicies[i]=1 voteIlabel=A
{'B': 2, 'A': 1}


sortedClassCount
[('B', 2), ('A', 1)]


The returned category is: B

Process finished with exit code 0
