I have read the book's first algorithm, the k-nearest neighbor (kNN) algorithm. Its overall structure is relatively simple; in ACM (competitive programming) terms, it corresponds to the KD-tree structure. First we are given a training set, then the test data. We find the k training points closest to the test data and determine the label of the test data from the labels of those k points.
The example in the book gives four points whose labels are A, A, B, B. Given a test point, we need to decide from these four training points whether its label should be A or B.
from numpy import *
import operator

def createDataSet():
    group = array([[1.0, 1.1], [1.0, 1.0], [0, 0], [0, 0.1]])
    labels = ['A', 'A', 'B', 'B']
    return group, labels

def classify0(inX, dataSet, labels, k):
    dataSetSize = dataSet.shape[0]
    diffMat = tile(inX, (dataSetSize, 1)) - dataSet  # repeat inX to the shape of dataSet so the two can be subtracted
    sqDiffMat = diffMat ** 2
    sqDistances = sqDiffMat.sum(axis=1)  # sum; axis=0 sums down each column, axis=1 sums across each row
    distances = sqDistances ** 0.5  # square root
    sortedDistIndicies = distances.argsort()  # sort in ascending order, return the original indices
    classCount = {}
    for i in range(k):
        voteIlabel = labels[sortedDistIndicies[i]]
        # get is a dict method: the first argument is the key to look up,
        # the second is the default returned if the key does not exist
        classCount[voteIlabel] = classCount.get(voteIlabel, 0) + 1
    # Python 3 has no iteritems, so items() is used; key sorts by the second
    # element of each (label, count) pair, in descending order
    sortedClassCount = sorted(classCount.items(), key=operator.itemgetter(1), reverse=True)
    return sortedClassCount[0][0]  # the label in the first (most frequent) pair

if __name__ == "__main__":
    dataSet, labels = createDataSet()
    inX = [0.2, 0.2]
    print(classify0(inX, dataSet, labels, 2))
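The same procedure can also be sketched with NumPy broadcasting and collections.Counter in place of tile and the hand-built dictionary. This is only an illustrative variant, not the book's code: the function name knn_classify and the snake_case variable names are made up here.

```python
from collections import Counter

import numpy as np


def knn_classify(in_x, data_set, labels, k):
    """Hypothetical variant of classify0 (name and style are assumptions)."""
    data_set = np.asarray(data_set, dtype=float)
    # Broadcasting subtracts in_x from every row, so tile() is not needed.
    diff = data_set - np.asarray(in_x, dtype=float)
    distances = np.sqrt((diff ** 2).sum(axis=1))
    # Indices of the k smallest distances.
    nearest = distances.argsort()[:k]
    # Count how often each label occurs among the k nearest points.
    votes = Counter(labels[i] for i in nearest)
    # most_common(1) returns [(label, count)] for the most frequent label.
    return votes.most_common(1)[0][0]


group = np.array([[1.0, 1.1], [1.0, 1.0], [0.0, 0.0], [0.0, 0.1]])
labels = ['A', 'A', 'B', 'B']
result = knn_classify([0.2, 0.2], group, labels, 2)
print(result)
```

For the test point [0.2, 0.2] the two nearest training points are [0, 0.1] and [0, 0], both labeled B, so the vote is unanimous.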
Code Explanation:
①array
array comes from the NumPy library and is used to construct a matrix:
from numpy import array

a = array([[1, 2],  # illustrative first row
           [2, 3]])
b = array([[0, 1],
           [3, 2]])
c = a - b
print(c)
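②tile() function
tile(a, reps) returns an array built by repeating a the number of times given by reps along each axis. With reps = (2, 3), a is repeated 2 times along the rows and 3 times along the columns.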
from numpy import tile

a = [0, 1]  # illustrative values
b = tile(a, (2, 3))
print(b)
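To connect tile back to classify0, here is a small sketch of the subtraction step on the four training points. It also compares the result with plain NumPy broadcasting, which is an alternative not used in the book's code. The variable names are chosen here for illustration.

```python
import numpy as np

data_set = np.array([[1.0, 1.1], [1.0, 1.0], [0.0, 0.0], [0.0, 0.1]])
in_x = [0.2, 0.2]

# tile repeats in_x once per training row, giving an array of shape (4, 2).
tiled = np.tile(in_x, (data_set.shape[0], 1))
diff_tile = tiled - data_set

# Broadcasting produces the same differences without building the copy.
diff_broadcast = np.asarray(in_x) - data_set

print(tiled.shape)
print(np.array_equal(diff_tile, diff_broadcast))
```

Both approaches yield one row of coordinate differences per training point; tile simply makes the repeated copy explicit.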
③argsort()
Sorts in ascending order and returns the original indices of the elements in their sorted order, rather than the sorted values themselves.
from numpy import argsort

a = [4, 3, 1, 2]
b = argsort(a)
print(b)
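In classify0, the indices returned by argsort are what link each distance back to its label. A minimal sketch of that step, using made-up distance values rather than ones computed from the training set:

```python
import numpy as np

distances = np.array([1.2, 1.1, 0.3, 0.2])  # illustrative values only
labels = ['A', 'A', 'B', 'B']

# Indices ordered from the smallest distance to the largest.
order = distances.argsort()

k = 2
# The first k indices pick out the labels of the k nearest points.
nearest_labels = [labels[i] for i in order[:k]]

print(order.tolist())
print(nearest_labels)
```

Because argsort returns positions instead of values, the labels list never has to be reordered.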
The k-nearest neighbor algorithm from Machine Learning in Action