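from numpy import tile
import operator
def classify0(inx, data_set, labels, k):
    data_set_size = data_set.shape[0]
.shape[0] gives the number of rows in data_set; .shape[1] gives the number of columns.
Also note that shape is followed by square brackets, not parentheses: it is an attribute holding a tuple, not a method.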
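    diff_mat = tile(inx, (data_set_size, 1)) - data_set
tile repeats an array. The code above stacks data_set_size copies of inx as rows, so the result has the same shape as data_set and can be subtracted from it element by element. It is hard to explain in words, so look it up (for example on Baidu) if it is unclear. Using tile saves you from writing a for loop.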
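To make the tile step concrete, here is a small sketch with made-up numbers (the arrays are illustrative and not from the original code). It also shows that NumPy broadcasting gives the same differences without tile; that is an alternative added here for comparison, not what the function above uses.

```python
import numpy as np

# Hypothetical sample data: 3 training points with 2 features each.
data_set = np.array([[1.0, 1.0], [2.0, 2.0], [4.0, 0.0]])
inx = np.array([1.0, 0.0])

# Repeat inx 3 times along the rows and once along the columns,
# so that it has the same shape as data_set.
tiled = np.tile(inx, (data_set.shape[0], 1))
diff_mat = tiled - data_set

# Broadcasting performs the same row-wise subtraction implicitly.
diff_broadcast = inx - data_set

print(tiled)
print(diff_mat)
```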
    sq_diff_mat = diff_mat ** 2
    sq_distances = sq_diff_mat.sum(axis=1)
axis=1 adds up the elements in each row.
axis=0 adds up the elements in each column. (If no axis is given, sum adds up every element of the array.)
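A quick way to see the difference between the axis values is to try them on a tiny array. The numbers below are made up for illustration.

```python
import numpy as np

# Hypothetical squared differences: 3 points, 2 features.
sq_diff_mat = np.array([[0.0, 1.0], [1.0, 4.0], [9.0, 0.0]])

row_sums = sq_diff_mat.sum(axis=1)  # one value per row
col_sums = sq_diff_mat.sum(axis=0)  # one value per column
total = sq_diff_mat.sum()           # no axis: a single scalar

print(row_sums, col_sums, total)
```

For the distance calculation we need one value per training point, which is why the function uses axis=1.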
    distance = sq_distances ** 0.5
    sorted_dist_indicies = distance.argsort()
argsort orders the distances from smallest to largest, but what it stores in sorted_dist_indicies is the index of each element, not the distance itself.
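A minimal sketch of what argsort returns, using made-up distances:

```python
import numpy as np

# Hypothetical distances from the query point to 3 training points.
distance = np.array([2.5, 0.5, 1.5])

# Indices of the elements in ascending order of distance.
sorted_dist_indicies = distance.argsort()

# Indexing with the result gives the distances in sorted order.
nearest_first = distance[sorted_dist_indicies]

print(sorted_dist_indicies)
print(nearest_first)
```

Because the result holds indices, it can be used to look up the matching entries in labels, which is what the loop in the function does.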
    class_count = {}
Create a dictionary that counts how many times each label appears among the k nearest neighbors.
    for i in range(k):
        vote_label = labels[sorted_dist_indicies[i]]
        class_count[vote_label] = class_count.get(vote_label, 0) + 1
The get method returns the value stored under the key vote_label, or 0 if the key is not in the dictionary. get itself does not add the key; the assignment on the same line does, so a label seen for the first time ends up with a count of 1.
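The counting pattern can be tried on its own. This sketch uses a made-up list of labels standing in for the labels of the k nearest neighbors.

```python
# Hypothetical labels of the 3 nearest neighbors.
neighbor_labels = ['A', 'B', 'A']

class_count = {}
for vote_label in neighbor_labels:
    # get returns 0 for a label not seen yet, so the first vote stores 1.
    class_count[vote_label] = class_count.get(vote_label, 0) + 1

# get alone does not insert a missing key.
missing = class_count.get('C', 0)

print(class_count)
```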
    sorted_class_count = sorted(class_count.items(), key=operator.itemgetter(1), reverse=True)
I am using Python 3, so items() is used here; in Python 2 it would be iteritems(). key=operator.itemgetter(1) sorts by the value (the count), reverse=True sorts in descending order, and the code above returns a list whose elements are tuples.
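Here is a small sketch of the sorting step on a made-up dictionary of vote counts, to show the list of tuples it produces.

```python
import operator

# Hypothetical vote counts per label.
class_count = {'A': 1, 'B': 3, 'C': 2}

# Sort the (label, count) pairs by count, largest first.
sorted_class_count = sorted(class_count.items(),
                            key=operator.itemgetter(1),
                            reverse=True)

print(sorted_class_count)
```

The first tuple holds the label with the most votes, so indexing with [0][0] picks out that label.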
    return sorted_class_count[0][0]
Returns the predicted label, that is, the label with the most votes.
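Written out in one piece, the function might look like the sketch below. The sample points and labels at the bottom are made up for illustration and are not part of the original code.

```python
import operator
from numpy import array, tile


def classify0(inx, data_set, labels, k):
    # Euclidean distance from inx to every row of data_set.
    data_set_size = data_set.shape[0]
    diff_mat = tile(inx, (data_set_size, 1)) - data_set
    sq_distances = (diff_mat ** 2).sum(axis=1)
    distance = sq_distances ** 0.5
    # Indices of the training points, nearest first.
    sorted_dist_indicies = distance.argsort()
    # Count the labels of the k nearest points.
    class_count = {}
    for i in range(k):
        vote_label = labels[sorted_dist_indicies[i]]
        class_count[vote_label] = class_count.get(vote_label, 0) + 1
    # Label with the most votes comes first.
    sorted_class_count = sorted(class_count.items(),
                                key=operator.itemgetter(1),
                                reverse=True)
    return sorted_class_count[0][0]


# Hypothetical training data: two points near (1, 1), two near (0, 0).
group = array([[1.0, 1.1], [1.0, 1.0], [0.0, 0.0], [0.0, 0.1]])
labels = ['A', 'A', 'B', 'B']

near_origin = classify0([0.0, 0.0], group, labels, 3)
near_one = classify0([1.0, 1.2], group, labels, 3)
print(near_origin, near_one)
```

With k=3, the point at the origin has two 'B' neighbors and one 'A' neighbor, and the point at (1.0, 1.2) has two 'A' neighbors and one 'B' neighbor, so the majority vote decides each case.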
This was the first algorithm in today's study: the k-nearest neighbor (kNN) algorithm.