Today I discovered that the barrier to applying machine learning has dropped so low that it is almost easy, and I really can't find a reason not to dig into it. Up front, I want to thank Google for its contribution to the development of this technology. Perhaps someone else did 99% of the work and Google only did 1%, but I have to say: what a beautiful 1%.
Cutting to the chase: following a tutorial video by a Google engineer on YouTube [1], I completed my first machine learning program today, a "Hello World" for learning this skill.
Recorded here for future reference.
```
 1 from scipy.spatial import distance
 2 def euc(a, b):
 3     return distance.euclidean(a, b)
 4
 5 class KNNClassifier():
 6     def fit(self, X_train, y_train):
 7         self.X_train = X_train
 8         self.y_train = y_train
 9
10     def predict(self, X_test):
11         predictions = []
12         for row in X_test:
13             label = self.closest(row)
14             predictions.append(label)
15         return predictions
16
17     def closest(self, row):
18         best_dist = euc(row, self.X_train[0])
19         best_index = 0
20         for i in range(1, len(self.X_train)):
21             dist = euc(row, self.X_train[i])
22             if dist < best_dist:
23                 best_dist = dist
24                 best_index = i
25         return self.y_train[best_index]
26
27 from sklearn import datasets
28 iris = datasets.load_iris()
29 X = iris.data
30 y = iris.target
31
32 from sklearn.model_selection import train_test_split
33 X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.5)
34 print(X_train)
35 print(y_train)
36
37 my_classifier = KNNClassifier()
38 my_classifier.fit(X_train, y_train)
39 predictions = my_classifier.predict(X_test)
40
41 from sklearn.metrics import accuracy_score
42 print(accuracy_score(y_test, predictions))
```
A simple explanation of the above code:
1. Lines 1-3 reference the Euclidean distance function from SciPy's spatial distance module and wrap it in a simple helper. (Euclidean distance: the straight-line distance between two points in n-dimensional space.)
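As a quick sanity check (my own example, not from the tutorial), the classic 3-4-5 right triangle shows exactly what euc computes:

```python
from scipy.spatial import distance

# Straight-line distance between (0, 0) and (3, 4): sqrt(3**2 + 4**2)
print(distance.euclidean([0, 0], [3, 4]))  # 5.0
```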
2. Lines 5-25 define the classifier class; its key methods are fit and predict. fit simply stores the incoming training data in instance variables, and predict returns the expected label for each row passed in. The classifier here is hand-coded, not trained, so strictly speaking this is not yet real machine learning, but it illustrates the internal principle well: in actual machine learning, the closest function we wrote by hand is the part that would be learned from data, i.e., the model.
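For comparison (my addition, not part of the tutorial), scikit-learn ships a real nearest-neighbor classifier with the same fit/predict interface; with n_neighbors=1 it should behave almost identically to the hand-rolled class, reusing the X_train and y_train from the listing above:

```python
from sklearn.neighbors import KNeighborsClassifier

# Built-in equivalent of the hand-written class: one nearest neighbor
clf = KNeighborsClassifier(n_neighbors=1)
clf.fit(X_train, y_train)          # store/index the training data
predictions = clf.predict(X_test)  # label each row by its closest neighbor
```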
3. Lines 27-30 load the iris flower dataset bundled with the sklearn library, which serves as the data source for the rest of the experiment. iris.data is the raw data: a two-dimensional array in which each row holds four measurements of one flower (sepal length, sepal width, petal length, and petal width). iris.target is the variety corresponding to each row of data: 0, 1, and 2 stand for the three iris species (setosa, versicolor, and virginica).
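A few lines of inspection (my own addition) make that structure concrete:

```python
from sklearn import datasets

iris = datasets.load_iris()
print(iris.data.shape)     # (150, 4): 150 flowers, 4 measurements each
print(iris.feature_names)  # sepal length/width and petal length/width, in cm
print(iris.target_names)   # ['setosa' 'versicolor' 'virginica']
print(iris.data[0], iris.target[0])  # first flower's measurements and label
```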
4. Lines 32-35 split the loaded flower data into two groups: one group used to train, serving as the basis for prediction, and another group used to test the classifier's accuracy. Because the true labels of this validation group are also known, we can compare what the classifier produces against the real values and judge whether the classifier is reasonable. With the code above, the measured success rate is over 90%, so when the classifier is used on genuinely unknown new data, the results are quite credible.
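With test_size=.5 the 150 flowers are divided evenly; a quick check (my own addition) confirms the group sizes:

```python
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.5)
print(len(X_train), len(X_test))  # 75 75: half for training, half for testing
```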
5. Lines 37-39 apply the classifier defined in step 2 to the data from step 4: X_train and y_train are fed to the classifier via fit, and then the classifier predicts, from the flower measurements in X_test, the variety of each flower, producing the corresponding prediction array predictions.
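The fitted classifier can also label a single new flower; the measurements below are just a made-up example row (my addition):

```python
# Four measurements: sepal length, sepal width, petal length, petal width (cm)
new_flower = [[5.1, 3.5, 1.4, 0.2]]
print(my_classifier.predict(new_flower))  # e.g. [0], i.e. setosa
```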
6. Lines 41-42 compare the actual flower varieties in y_test with the predicted results in predictions to see how well they agree. You can see it is not 100%: some information is always missing, and even the human eye would make the same mistakes.
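accuracy_score is just a convenience; the same comparison written out by hand (my own sketch) shows exactly what it measures:

```python
# Fraction of test flowers whose predicted label equals the true label
correct = sum(1 for p, t in zip(predictions, y_test) if p == t)
print(correct / len(y_test))  # matches the accuracy_score output
```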
Because train_test_split shuffles the data randomly, the train group and the test group differ from run to run, so the reported prediction accuracy will vary slightly each time.
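If a repeatable number is wanted, the split can be pinned down with the random_state parameter (my addition, not in the tutorial code):

```python
# Fixing random_state makes the shuffle, and hence the accuracy, reproducible
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=.5, random_state=42)
```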
Although the classifier here reaches a high accuracy, the computation cannot be avoided: every prediction compares the new sample against all of the training data, so the computational cost is large. At the same time, it is only because of the nature of our data's attributes that we can predict well simply by finding the closest sample; in other applications, some attributes are not so linearly distributed, and the pattern is not one the human eye can find. That is when real training is needed.
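As a first taste of a genuinely trained model (my own sketch, reusing the split from the listing above), scikit-learn's decision tree learns its decision rules during fit instead of memorizing every training point:

```python
from sklearn import tree
from sklearn.metrics import accuracy_score

clf = tree.DecisionTreeClassifier()
clf.fit(X_train, y_train)  # the actual "training" happens here
print(accuracy_score(y_test, clf.predict(X_test)))
```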
Learn machine learning with Google [1]