Machine Learning Notes (VI): Performance Metrics

Last Update:2018-07-26 Source: Internet

Author: User

This note covers how the generalization performance of a learner is evaluated.

To evaluate a model we need an evaluation criterion, a performance metric, that measures its generalization ability.

Performance metrics reflect task requirements. Comparing models under different performance metrics often leads to different conclusions, so whether a model is "good" is relative: it depends not only on the algorithm and the data but also on the task requirements.

I. Error rate and accuracy

The error rate is the proportion of misclassified samples to the total number of samples, and the accuracy is the proportion of correctly classified samples to the total number of samples, so accuracy = 1 - error rate.

II. Precision, recall and F1

High precision means that the samples picked out as positive are as accurate as possible. High recall means that as many of the truly positive samples as possible are picked out, that is, it is better to flag a sample wrongly than to let a positive one slip through.

For a binary classification problem, each sample falls into one of four groups according to the combination of its real class and the class predicted by the learner: true positive (TP), false positive (FP), true negative (TN) and false negative (FN). Clearly TP + FP + TN + FN = total number of samples. The confusion matrix for the classification results is as follows:

Real class      Predicted class
                Positive    Negative
Positive        TP          FN
Negative        FP          TN
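
As a minimal sketch of how the four cells of the confusion matrix turn into metrics, the Python snippet below counts TP, FP, TN and FN from two label lists and derives the error rate, accuracy, precision P = TP / (TP + FP), recall R = TP / (TP + FN) and F1 from them. The function names, the label encoding (1 for positive, 0 for negative) and the sample labels are assumptions made up for illustration, not part of the original notes.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count (TP, FP, TN, FN) for a binary classification task."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1          # predicted positive, actually negative
        elif truth == positive:
            fn += 1          # predicted negative, actually positive
        else:
            tn += 1
    return tp, fp, tn, fn


def basic_metrics(tp, fp, tn, fn):
    """Derive the metrics from the confusion-matrix cells."""
    total = tp + fp + tn + fn
    return {
        "error_rate": (fp + fn) / total,
        "accuracy": (tp + tn) / total,
        # Guard against empty denominators (no predicted / real positives).
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        # Same value as 2*P*R / (P + R), written in terms of the counts.
        "f1": 2 * tp / (total + tp - tn) if total + tp - tn else 0.0,
    }


# Hypothetical labels, chosen only to make the arithmetic easy to follow.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

counts = confusion_counts(y_true, y_pred)
metrics = basic_metrics(*counts)
print(counts)
print(metrics)
```

With these made-up labels there are 3 true positives, 1 false positive, 5 true negatives and 1 false negative, so accuracy and error rate always sum to 1, and F1 lies between precision and recall.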

Precision P and recall R are defined as

P = TP / (TP + FP)
R = TP / (TP + FN)

Precision and recall are conflicting measures: when one is high, the other tends to be low. Plotting precision on the vertical axis against recall on the horizontal axis gives the P-R curve.

Properties of the P-R curve: if one learner's curve is completely enclosed by another learner's curve, the latter performs better. If the two curves cross, the learner with the larger area under its curve is generally considered better. The break-even point is the point on the curve where recall equals precision, and it can also be used to compare learners.

The F1 score is another way to compare learners:

F1 = (2 * P * R) / (P + R) = 2 * TP / (total number of samples + TP - TN)

In more complex cases one also has to consider the relative importance of recall and precision, as well as how to average over multiple confusion matrices.

III. ROC and AUC

Many learners actually produce a real-valued score for each test sample and compare it with a threshold: a sample whose score is above the threshold is classified as positive, otherwise as negative. The threshold can also be called a cut point. Different tasks place different weight on precision and recall, and the ROC curve lets us study this trade-off.

We sort the samples by the learner's predicted score and, going through them in this order, treat the samples up to each position as positive, each time computing TPR and FPR, the true positive rate and the false positive rate. TPR is plotted on the vertical axis and FPR on the horizontal axis:

TPR = TP / (TP + FN)
FPR = FP / (TN + FP)

The ROC curve is read in a similar way to the P-R curve. AUC is the area under the ROC curve. Formally, AUC measures the ranking quality of the predictions, so it is closely related to ranking error.

IV. Cost-sensitive error rate and cost curve

In practical applications the cost of each kind of mistake must also be considered. In an access control system, for example, letting the wrong person in and keeping the right person out both hurt, but the two mistakes need not be equally costly.
The usual approach is to assign a weight (cost) to each type of error. When computing the cost-sensitive error rate these weights must be taken into account, rather than simply counting the number of errors.
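
The weighting idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the label encoding (1 for a legitimate user, 0 for an intruder) and the cost values are hypothetical, and the total weighted cost is simply averaged over all samples.

```python
def cost_sensitive_error_rate(y_true, y_pred, cost_fn, cost_fp, positive=1):
    """Average misclassification cost per sample.

    cost_fn: cost of predicting negative for a truly positive sample.
    cost_fp: cost of predicting positive for a truly negative sample.
    """
    total_cost = 0.0
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            continue  # correct predictions cost nothing
        total_cost += cost_fn if truth == positive else cost_fp
    return total_cost / len(y_true)


# Hypothetical access-control data: 1 = legitimate user, 0 = intruder.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

# Assumed costs: admitting an intruder is five times worse than
# rejecting a legitimate user.
weighted = cost_sensitive_error_rate(y_true, y_pred, cost_fn=1.0, cost_fp=5.0)

# With equal unit costs the measure reduces to the plain error rate.
plain = cost_sensitive_error_rate(y_true, y_pred, cost_fn=1.0, cost_fp=1.0)

print(weighted, plain)
```

Both runs see the same two mistakes, but the weighted version charges the admitted intruder more heavily, so two models with the same plain error rate can rank differently once costs are attached.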
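
Going back to the ROC and AUC section above, the sort-and-sweep construction of the curve can also be sketched in Python. The helper names, the sample scores and the use of the trapezoid rule for the area are assumptions for illustration; the sketch also assumes the data contains at least one positive and one negative sample.

```python
def roc_points(y_true, scores, positive=1):
    """Return the ROC curve as a list of (FPR, TPR) points.

    Samples are sorted by score, highest first, and the cut point is
    moved down one distinct score at a time.
    """
    n_pos = sum(1 for y in y_true if y == positive)
    n_neg = len(y_true) - n_pos
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)

    points = [(0.0, 0.0)]  # threshold above every score: nothing is positive
    tp = fp = 0
    for i, (score, truth) in enumerate(ranked):
        if truth == positive:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last sample of a group of tied scores.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the curve by the trapezoid rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical scores: one negative (0.7) is ranked above one positive (0.6).
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

points = roc_points(y_true, scores)
area = auc(points)
print(points)
print(area)
```

In this made-up example 8 of the 9 positive-negative pairs are ranked in the right order, and the computed area is 8/9, which illustrates why AUC can be read as a measure of ranking quality.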

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
