Machine learning path: using Python to learn the K-nearest neighbor classifier API (Iris classification prediction)
Welcome to my GitHub. Source code: https://github.com/linyi0604/kaggle
from sklearn.datasets import load_iris
from sklearn.cross_validation import train_test_split   # in newer scikit-learn versions: sklearn.model_selection
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import classification_report

"""
K-nearest neighbor classifier:
makes a decision for a prediction sample based on the distribution of the training data;
a kind of non-parametric estimation;
very high computational complexity and memory consumption.
"""

"""
1 Preparing the data
"""
# read the Iris data set
iris = load_iris()
# check the data size
# print(iris.data.shape)    # (150, 4)
# view the data description
# print(iris.DESCR)
"""
Iris Plants Database
====================

Notes
-----
Data Set Characteristics:
    :Number of Instances: 150 (50 in each of three classes)
    :Number of Attributes: 4 numeric, predictive attributes and the class
    :Attribute Information:
        - sepal length in cm
        - sepal width in cm
        - petal length in cm
        - petal width in cm
        - class:
                - Iris-Setosa
                - Iris-Versicolour
                - Iris-Virginica
    :Summary Statistics:

    ============== ==== ==== ======= ===== ====================
                    Min  Max   Mean    SD   Class Correlation
    ============== ==== ==== ======= ===== ====================
    sepal length:   4.3  7.9   5.84   0.83    0.7826
    sepal width:    2.0  4.4   3.05   0.43   -0.4194
    petal length:   1.0  6.9   3.76   1.76    0.9490  (high!)
    petal width:    0.1  2.5   1.20   0.76    0.9565  (high!)
    ============== ==== ==== ======= ===== ====================

    :Missing Attribute Values: None
    :Class Distribution: 33.3% for each of 3 classes.
    :Creator: R.A. Fisher
    :Donor: Michael Marshall (MARSHALL%[email protected])
    :Date: July, 1988

This is a copy of the UCI ML Iris dataset.
http://archive.ics.uci.edu/ml/datasets/Iris

The famous Iris database, first used by Sir R.A. Fisher

This is perhaps the best known database to be found in the
pattern recognition literature. Fisher's paper is a classic in the field and
is referenced frequently to this day. (See Duda & Hart, for example.) The
data set contains 3 classes of 50 instances each, where each class refers to a
type of iris plant. One class is linearly separable from the other 2; the
latter are NOT linearly separable from each other.

References
----------
   - Fisher, R.A. "The use of multiple measurements in taxonomic problems"
     Annual Eugenics, 7, Part II, 179-188 (1936); also in "Contributions to
     Mathematical Statistics" (John Wiley, NY, 1950).
   - Duda, R.O., & Hart, P.E. (1973) Pattern Classification and Scene Analysis.
     (Q327.D83) John Wiley & Sons. ISBN 0-471-22361-1. See page 218.
   - Dasarathy, B.V. (1980) "Nosing Around the Neighborhood: A New System
     Structure and Classification Rule for Recognition in Partially Exposed
     Environments". IEEE Transactions on Pattern Analysis and Machine
     Intelligence, Vol. PAMI-2, No. 1, 67-71.
   - Gates, G.W. (1972) "The Reduced Nearest Neighbor Rule". IEEE Transactions
     on Information Theory, May 1972, 431-433.
   - See also: 1988 MLC Proceedings, 54-64. Cheeseman et al's AUTOCLASS II
     conceptual clustering system finds 3 classes in the data.
   - Many, many more ...

In total: 150 data samples,
evenly distributed over the 3 subspecies,
each sample described by 4 features for petal and sepal shape.
"""
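# Aside: a minimal brute-force sketch of the K-nearest-neighbor idea described above,
# to make the note on computational and memory cost concrete. The helper name
# knn_predict_one, the Euclidean distance, and k=5 are assumptions for this sketch,
# not scikit-learn's internal implementation.
import numpy as np

def knn_predict_one(x, X, y, k=5):
    # nothing is "learned": the raw training data is stored, and at prediction time
    # the distance from x to every stored sample is computed (this is where the cost comes from)
    distances = np.linalg.norm(X - x, axis=1)
    # take the labels of the k closest samples and return the majority vote
    nearest_labels = y[np.argsort(distances)[:k]]
    return np.bincount(nearest_labels).argmax()

# example: classify the last sample using all the other samples as the "model"
# print(knn_predict_one(iris.data[-1], iris.data[:-1], iris.target[:-1]))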
"""
2 Dividing the data into a training set and a test set
"""
X_train, X_test, y_train, y_test = train_test_split(iris.data,
                                                    iris.target,
                                                    test_size=0.25,
                                                    random_state=33)

"""
3 Learning the K-nearest neighbor classifier model and predicting
"""
# standardize the training data and the test data
ss = StandardScaler()
X_train = ss.fit_transform(X_train)
X_test = ss.transform(X_test)

# create a K-nearest neighbor model object
knc = KNeighborsClassifier()
# feed in the training data to learn the model
knc.fit(X_train, y_train)
# predict on the test data
y_predict = knc.predict(X_test)

"""
4 Model evaluation
"""
print("Accuracy rate:", knc.score(X_test, y_test))
print("Other indicators:\n", classification_report(y_test, y_predict, target_names=iris.target_names))
"""
Accuracy rate: 0.8947368421052632
Other indicators:
              precision    recall  f1-score   support

     setosa        1.00      1.00      1.00         8
 versicolor        0.73      1.00      0.85
  virginica        1.00      0.79      0.88

avg / total        0.92      0.89      0.90
"""
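The accuracy above is obtained with KNeighborsClassifier's default of 5 neighbors. As a rough sketch of how the same API could be used to tune that parameter (the candidate values for n_neighbors and the 5-fold cross-validation below are arbitrary assumptions; in newer scikit-learn versions train_test_split and GridSearchCV both live in sklearn.model_selection):

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.25, random_state=33)

# standardize as above: fit the scaler on the training split only
ss = StandardScaler()
X_train = ss.fit_transform(X_train)
X_test = ss.transform(X_test)

# search a handful of neighbor counts with 5-fold cross-validation on the training data
grid = GridSearchCV(KNeighborsClassifier(),
                    param_grid={"n_neighbors": [1, 3, 5, 7, 9, 11]},
                    cv=5)
grid.fit(X_train, y_train)
print("best n_neighbors:", grid.best_params_["n_neighbors"])
print("test accuracy with the tuned model:", grid.score(X_test, y_test))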