Machine Learning in Action, Listing 2-1: what does classify0 do in the k-nearest neighbors algorithm?

Source: Internet
Author: User
Tags: iterable

from numpy import array, tile
import operator


def start():
    group, labels = createDataSet()
    return classify0([3, 3], group, labels, 4)


def createDataSet():
    # Freely defined here; represents a data set whose classes are already known
    group = array([[1, 2], [2, 3], [1, 1], [4, 5]])
    labels = ['A', 'A', 'B', 'B']
    return group, labels


def classify0(inX, dataSet, labels, k):
    """
    inX     is the input test sample, in the form [x, y]
    dataSet is the training sample set
    labels  is the list of training sample labels
    k       is the number of nearest neighbors that take part in the vote
    """
    # The shape of a matrix is a tuple. dataSet.shape here is (4, 2),
    # i.e. (number of rows, number of columns).
    # shape[0] is the number of rows, which is the number of samples.
    # shape[1] is the number of columns.
    dataSetSize = dataSet.shape[0]

    ################### explanatory code ###################
    print("dataSet.shape[0] returns the number of rows in the matrix:")
    print(dataSetSize)
    cols = dataSet.shape[1]
    print("dataSet.shape[1] returns the number of columns in the matrix:")
    print(cols)
    print(dataSet.shape)
    print("dataSet.shape type:")
    print(type(dataSet.shape))
    ########################################################

    # "mat" is short for matrix; diffMat is the difference matrix, and the result is also a matrix.
    # For a description of the tile function, see http://www.cnblogs.com/Sabre/p/7976702.html
    # Simply put, tile copies inX (here [3, 3]) dataSetSize times along the row dimension
    # (here dataSetSize == 4) and once along the column dimension,
    # forming [[3, 3], [3, 3], [3, 3], [3, 3]] so that it can be combined with dataSet.
    # This is needed because the Euclidean distance formula is used to find the distance
    # between the input point and each known point.
    # Step 1: the difference between the input point and every known point, as a matrix.
    diffMat = tile(inX, (dataSetSize, 1)) - dataSet

    ################### explanatory code ###################
    print("diffMat:" + str(diffMat))
    ########################################################

    # Step 2: square every element of the difference matrix.
    sqDiffMat = diffMat ** 2

    ################### explanatory code ###################
    print("sqDiffMat:" + str(sqDiffMat))
    ########################################################

    # sum(axis=1) adds the values in each row, so [[4 1] [1 0] [4 4] [1 4]]
    # becomes [5 1 8 5], the sum of squares for each sample.
    # sum(axis=0) would add the values in each column instead.
    sqDistances = sqDiffMat.sum(axis=1)

    ################### explanatory code ###################
    print("sqDistances:" + str(sqDistances))
    ########################################################

    # Step 3: take the square root of each sum of squares to get the distances (an array).
    distances = sqDistances ** 0.5

    ################### explanatory code ###################
    print("the distance from the unknown point to each known point:", distances)
    ########################################################

    # argsort() returns the indices that would sort the array from smallest to largest.
    # Here distances is [2.236 1. 2.828 2.236] and argsort gives [1 0 3 2]:
    # the smallest distance is at index 1, the next at index 0,
    # then index 3, and the largest at index 2.
    # The original array is left unchanged, so each index still matches its label.
    sortedDistIndicies = distances.argsort()

    ################### explanatory code ###################
    print("index positions:", sortedDistIndicies)
    ########################################################

    # Create an empty dictionary to count the votes per label
    classCount = {}

    # k is the number of nearest samples that vote
    for i in range(k):
        # Look up the label of the i-th nearest sample.
        # In this example:
        # sortedDistIndicies[0] == 1, so labels[1] == 'A', voteIlabel == 'A'
        # sortedDistIndicies[1] == 0, so labels[0] == 'A', voteIlabel == 'A'
        # sortedDistIndicies[2] == 3, so labels[3] == 'B', voteIlabel == 'B'
        # sortedDistIndicies[3] == 2, so labels[2] == 'B', voteIlabel == 'B'
        voteIlabel = labels[sortedDistIndicies[i]]

        ################### explanatory code ###################
        print("label " + str(i) + ": " + voteIlabel)
        ########################################################

        # dict.get(key, default=None) returns the value stored for key, or default
        # if the dict does not contain key (note that default is None unless given).
        # The first time classCount.get is called, classCount is still empty, so 0 is used.
        classCount[voteIlabel] = classCount.get(voteIlabel, 0) + 1

        ################### explanatory code ###################
        print("visit " + str(i + 1) + ", classCount[" + voteIlabel + "] value is: " + str(classCount[voteIlabel]))
        print("the contents of classCount are:")
        print(classCount)
        ########################################################

    # sorted(iterable, key=None, reverse=False)
    # Function: return a new sorted list from the items in iterable.
    # The first parameter is an iterable; the return value is a list of its elements in sorted order.
    # Python 2 accepted three optional parameters: cmp, key, and reverse. Python 3 removed cmp.
    # 1) cmp (Python 2 only) specifies a custom comparison function that receives two elements
    #    and returns a negative number, 0, or a positive number when the first argument is
    #    less than, equal to, or greater than the second. The default value is None.
    # 2) key specifies a function of one argument that extracts a comparison key from each element.
    #    The default value is None.
    # 3) reverse is a Boolean value. If set to True, the elements are sorted in descending order.
    #    The default value is False.
    # operator.itemgetter(1) is easiest to understand through an example:
    # a = [11, 22, 33]
    # b = operator.itemgetter(2)
    # b(a)
    # output: 33
    # b = operator.itemgetter(2, 0, 1)
    # b(a)
    # output: (33, 11, 22)
    # operator.itemgetter does not return a value; it returns a function
    # that fetches the value(s) from the object it is applied to.
    # Here itemgetter(1) picks the count out of each (label, count) pair.
    sortedClassCount = sorted(classCount.items(), key=operator.itemgetter(1), reverse=True)
    print(sortedClassCount)

    # After sorting by count in descending order, the first entry is the label with the
    # most votes among the k nearest neighbors; it determines the category of the test sample.
    print("final result, test sample category: ", end="")
    print(sortedClassCount[0][0])
    return sortedClassCount[0][0]


if __name__ == "__main__":
    start()
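The three distance steps in classify0 (difference, square and sum, square root) can also be written with NumPy broadcasting instead of tile. The following is only a sketch under stated assumptions: the name knn_classify, the use of collections.Counter for the vote, and the stable sort are illustrative choices that do not appear in the original listing.

```python
from collections import Counter

import numpy as np


def knn_classify(in_x, data_set, labels, k):
    """Hypothetical compact variant of classify0 (name and style are assumptions)."""
    data_set = np.asarray(data_set, dtype=float)
    # Broadcasting subtracts in_x from every row, so tile() is not needed.
    diff = data_set - np.asarray(in_x, dtype=float)
    # Euclidean distance: square, sum along each row, take the square root.
    distances = np.sqrt((diff ** 2).sum(axis=1))
    # A stable sort keeps equally distant points in their original order.
    nearest = distances.argsort(kind="stable")[:k]
    # Count one vote per neighbor label.
    votes = Counter(labels[i] for i in nearest)
    # most_common orders by count; equal counts keep first-seen order.
    return votes.most_common(1)[0][0]


group = np.array([[1, 2], [2, 3], [1, 1], [4, 5]])
labels = ['A', 'A', 'B', 'B']
print(knn_classify([3, 3], group, labels, 3))
```

With k=3 the three nearest samples carry the labels A, A and B, so the vote is no longer tied.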

Output Result:

dataSet.shape[0] returns the number of rows in the matrix:
4
dataSet.shape[1] returns the number of columns in the matrix:
2
(4, 2)
dataSet.shape type:
<class 'tuple'>
diffMat:[[ 2  1]
 [ 1  0]
 [ 2  2]
 [-1 -2]]
sqDiffMat:[[4 1]
 [1 0]
 [4 4]
 [1 4]]
sqDistances:[5 1 8 5]
the distance from the unknown point to each known point: [2.23606798 1.         2.82842712 2.23606798]
index positions: [1 0 3 2]
label 0: A
visit 1, classCount[A] value is: 1
the contents of classCount are:
{'A': 1}
label 1: A
visit 2, classCount[A] value is: 2
the contents of classCount are:
{'A': 2}
label 2: B
visit 3, classCount[B] value is: 1
the contents of classCount are:
{'A': 2, 'B': 1}
label 3: B
visit 4, classCount[B] value is: 2
the contents of classCount are:
{'A': 2, 'B': 2}
[('A', 2), ('B', 2)]
final result, test sample category: A
[finished in 5.3s]
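Note that the run above ends in a 2-2 tie: with k=4 and only four samples, every sample votes. The result is A only because sorted is stable, so entries with equal counts keep the order in which they were inserted into the dictionary. A minimal sketch of that behavior follows; the variable names are illustrative, and it assumes Python 3.7 or later, where dicts preserve insertion order.

```python
import operator

# Final vote tally from the run above: A was inserted first.
class_count = {'A': 2, 'B': 2}
ranked = sorted(class_count.items(), key=operator.itemgetter(1), reverse=True)
print(ranked)  # [('A', 2), ('B', 2)]

# Same tally, but with B inserted first: the tie now resolves to B.
flipped = {'B': 2, 'A': 2}
ranked_flipped = sorted(flipped.items(), key=operator.itemgetter(1), reverse=True)
print(ranked_flipped)  # [('B', 2), ('A', 2)]
```

Choosing an odd k smaller than the number of samples, such as 3, is one common way to avoid ties like this in a two-class problem.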
