Code:
# -*- coding: utf-8 -*-
"""
Created on Thu Jul 09:36:49 2018

@author: Zhen
"""
"""
The influence of the size of n_neighbors on the predictive precision and generalization ability of K-nearest neighbor algorithm
"""
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
import matplotlib.pyplot as plt

cancer = load_breast_cancer()

# Stratified split so both sets keep the original class proportions
X_train, X_test, y_train, y_test = train_test_split(
    cancer.data, cancer.target, stratify=cancer.target, random_state=66)

training_accuracy = []
test_accuracy = []

# n_neighbors values from 1 to 10
neighbors_settings = range(1, 11)

for n_neighbors in neighbors_settings:
    # Build the model
    clf = KNeighborsClassifier(n_neighbors=n_neighbors)
    clf.fit(X_train, y_train)
    # Record training set accuracy
    training_accuracy.append(clf.score(X_train, y_train))
    # Record generalization ability (test set accuracy)
    test_accuracy.append(clf.score(X_test, y_test))

plt.plot(neighbors_settings, training_accuracy, label="Training Accuracy")
plt.plot(neighbors_settings, test_accuracy, label="Test Accuracy")
plt.xlabel("n_neighbors")
plt.ylabel("accuracy")
plt.legend()
Results:
Summary: With a single neighbor, predictions on the training set are essentially perfect (close to 100%), because the model follows the training data very closely. As the number of neighbors increases, the model becomes simpler: training set accuracy decreases, while generalization ability tends to improve. For this split, the best balance between prediction accuracy and generalization ability is at about n_neighbors = 6.
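To see why a single neighbor gives near-perfect training accuracy, it may help to look at a minimal pure-Python sketch of k-nearest-neighbor voting. This is not scikit-learn's implementation; the helper names knn_predict and accuracy, and the toy points, are hypothetical and only for illustration. With k=1, every training point is its own nearest neighbor (distance 0), so it is always assigned its own label, as long as no two identical points carry different labels.

```python
from collections import Counter
import math


def knn_predict(X_train, y_train, x, k):
    """Predict the label of x by majority vote among its k nearest training points."""
    # Sort training indices by Euclidean distance to x
    order = sorted(range(len(X_train)), key=lambda i: math.dist(X_train[i], x))
    votes = Counter(y_train[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def accuracy(X_train, y_train, X, y, k):
    """Fraction of points in (X, y) that are classified correctly."""
    hits = sum(knn_predict(X_train, y_train, x, k) == label
               for x, label in zip(X, y))
    return hits / len(X)


# Hypothetical toy data (NOT the breast cancer dataset): two overlapping classes
X_toy = [(0.0, 0.0), (0.2, 0.1), (0.9, 1.0), (1.0, 0.8), (0.5, 0.5), (0.45, 0.6)]
y_toy = [0, 0, 1, 1, 0, 1]

# Score the training set against itself for two values of k
train_acc_k1 = accuracy(X_toy, y_toy, X_toy, y_toy, k=1)
train_acc_k3 = accuracy(X_toy, y_toy, X_toy, y_toy, k=3)
print(train_acc_k1, train_acc_k3)
```

On this toy data the k=1 model reproduces every training label, while k=3 smooths over the point that sits inside the other class, which is the same trade-off the curves above show.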
The effect of n_neighbors on the prediction accuracy and generalization ability of the k-nearest neighbors classification algorithm