Principle: for the current point (unlabeled, typically from the test set), compute its distance to every other point (labeled, typically from the training set), sort the distances in ascending order, and take the K points with the smallest distances. Those K points then vote by category, and the category with the most votes is the prediction for the current point.
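The principle above can be sketched as a minimal, self-contained example. Everything here (the toy points, labels, and the helper name `knn_predict`) is made up for illustration; the full Iris implementation follows below.

```python
from collections import Counter
import numpy as np

def knn_predict(query, points, labels, k=3):
    # Step 1: Euclidean distance from the query to every labelled point
    dists = np.sqrt(((points - query) ** 2).sum(axis=1))
    # Step 2: indices of the k smallest distances
    nearest = np.argsort(dists)[:k]
    # Step 3: majority vote among the k nearest labels
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]

points = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [1.1, 0.9]])
labels = ['a', 'a', 'b', 'b']
print(knn_predict(np.array([0.2, 0.1]), points, labels, k=3))  # the two 'a' points are closest, so 'a' wins the vote
```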
Code implementation (using the Iris dataset):
```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn import neighbors
from sklearn.metrics import accuracy_score

def get_iris():
    iris_data = load_iris()
    x_train, x_test, y_train, y_test = train_test_split(
        iris_data.data, iris_data.target, test_size=0.4, random_state=0)
    return x_train, x_test, y_train, y_test

def knn_classify(self_point, dataset, labels, k):
    # Euclidean distance from the query point to every training point
    distance = [np.sqrt(sum((self_point - d) ** 2)) for d in dataset]
    # Keep the k nearest (distance, label) pairs
    train_data = sorted(zip(distance, labels), key=lambda x: x[0])[:k]
    # Vote: count how often each label appears among the k neighbors
    self_label = {}
    for i in train_data:
        self_label[i[1]] = self_label.setdefault(i[1], 0) + 1
    # Return the label with the most votes
    return sorted(self_label, key=self_label.get, reverse=True)[0]

x_train, x_test, y_train, y_test = get_iris()
size = len(y_test)
count = 0
for t in range(len(x_test)):
    y_pre = knn_classify(x_test[t], x_train, y_train, 5)
    if y_pre == y_test[t]:
        count += 1
print('accuracy of custom KNN:', count / size)

# Using the scikit-learn built-in KNN
knn = neighbors.KNeighborsClassifier(n_neighbors=5)
knn.fit(x_train, y_train)
pre = knn.predict(x_test)
print('accuracy of sklearn KNN:', accuracy_score(y_test, pre))
```
Comparison results:
Accuracy of custom KNN: 0.95
Accuracy of sklearn KNN: 0.95