ROC, AUC, and PRC in Detail (Machine Learning)

Source: Internet
Author: User
ROC (receiver operating characteristic curve)

The receiver operating characteristic (ROC) curve is shown in the following figure.

It plots the true positive rate against the false positive rate. (With all this talk of "true" and "false", it feels like reading Dream of the Red Chamber.)
TPR: true positive rate, also called sensitivity (this is the recall, R, introduced earlier)
TPR = TP / (TP + FN) = number of positive samples predicted as positive / actual number of positive samples

TNR: true negative rate, also called specificity
TNR = TN / (TN + FP) = number of negative samples predicted as negative / actual number of negative samples

FPR: false positive rate
FPR = FP / (FP + TN) = number of negative samples predicted as positive / actual number of negative samples

FNR: false negative rate
FNR = FN / (TP + FN) = number of positive samples predicted as negative / actual number of positive samples
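
The four rates above can be computed directly from the confusion-matrix counts. The following is a minimal Python sketch; the function name rates and the example counts are made up for illustration and do not come from the original article.

```python
def rates(tp, fn, tn, fp):
    """Return the four rates from raw confusion-matrix counts."""
    positives = tp + fn  # actual number of positive samples
    negatives = tn + fp  # actual number of negative samples
    return {
        "TPR": tp / positives,  # positives predicted as positive
        "FNR": fn / positives,  # positives predicted as negative
        "TNR": tn / negatives,  # negatives predicted as negative
        "FPR": fp / negatives,  # negatives predicted as positive
    }

# Hypothetical counts, chosen only for illustration.
r = rates(tp=8, fn=2, tn=85, fp=5)
print(r)
```

Because each pair shares a denominator, TPR + FNR = 1 and TNR + FPR = 1, so knowing TPR and FPR is enough to recover all four.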

Ideal goal: TPR = 1 and FPR = 0, which is the point (0, 1) on the graph. At this point every positive sample is assigned to the positive class and every negative sample to the negative class. Conversely, at (1, 0), where FPR = 1 and TPR = 0, a similar analysis shows the worst possible classifier, because it gets every answer exactly wrong.
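
To make the two corner points concrete, here is a small Python sketch that builds the (FPR, TPR) points of an ROC curve by sweeping a threshold over the scores. The helper roc_points and the toy labels and scores are assumptions for illustration only.

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) pairs, one per threshold, from strictest to loosest."""
    pos = sum(labels)
    neg = len(labels) - pos
    # A threshold above every score predicts nothing as positive.
    points = [(0.0, 0.0)]
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= t and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= t and y == 0)
        points.append((fp / neg, tp / pos))
    return points

labels = [1, 1, 1, 0, 0, 0]
perfect = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]   # every positive outranks every negative
inverted = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]  # every negative outranks every positive

perfect_curve = roc_points(labels, perfect)
inverted_curve = roc_points(labels, inverted)
print(perfect_curve)
print(inverted_curve)
```

With these toy scores the perfect ranking passes through (0, 1), while the inverted ranking passes through (1, 0).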
In other words, the closer the ROC curve lies to the upper-left corner, the better the model.

AUC

AUC (Area Under Curve) is the area under the ROC curve, obtained by integrating the ROC curve. The larger the area, the better the model is considered. AP (average precision) plays the same role for the precision-recall curve, summarizing the area under it; AP often appears in image processing.

PRC (precision-recall curve)

Generally speaking, a curve that lies above another is better (in the figure, the green line is better than the red one). In other words, the closer the curve lies to the upper-right corner, the better the model.
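
As a rough illustration of how a precision-recall curve is built, the Python sketch below sweeps a threshold over the scores and then sums the step-wise area under the resulting curve, which is one common way to define average precision. The helper names and the toy data are assumptions, not part of the original article.

```python
def pr_points(labels, scores):
    """Return (recall, precision) pairs, one per threshold, strictest first."""
    pos = sum(labels)
    points = []
    for t in sorted(set(scores), reverse=True):
        predicted_pos = [y for y, s in zip(labels, scores) if s >= t]
        tp = sum(predicted_pos)
        points.append((tp / pos, tp / len(predicted_pos)))
    return points

def average_precision(points):
    """Step-wise area under the PR curve: sum of precision * recall increase."""
    ap, prev_recall = 0.0, 0.0
    for recall, precision in points:
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap

# Toy data for illustration: one negative is ranked above two positives.
labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

curve = pr_points(labels, scores)
ap = average_precision(curve)
print(curve)
print(ap)
```

A ranking that places every positive above every negative keeps precision at 1 until recall reaches 1, which gives an average precision of 1 and a curve that hugs the upper-right corner.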

Summary:
1 ROC, PRC, and AUC are usually examined together. The TPR of the ROC curve is the recall of the PRC; this link brings other advantages, see below.
2 For both curves, a smoother curve indicates a better model; in essence, each point on a curve corresponds to a threshold setting.

Now the question is: which curve is better, ROC or PRC? Obviously PRC, because PRC is also the People's Republic of China.
To give a direct conclusion:
When the numbers of positive and negative samples are similar, the ROC and PR curves follow similar trends. When negative samples greatly outnumber positive ones, the two differ sharply: the ROC curve still looks very good, while the PR curve shows only mediocre performance. The explanation is simple. Suppose there are 1 positive case and 100 negative cases; then TPR basically stays at around 100% and then suddenly drops to 0. In the figure, panels (a) and (b) show the ROC curve and the PR curve with positive and negative samples at 1:1, and the two are relatively close. Panels (c) and (d) show the case where negative samples heavily outnumber positive ones: the ROC curve still looks very good, but the PR curve performs relatively poorly. This indicates that the PR curve reflects classification performance better when the positive and negative classes are heavily imbalanced.
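
The effect of class imbalance can be reproduced with a small Python sketch. It scores the same hypothetical model twice, once at a 1:1 ratio and once with the negative set repeated 100 times, and compares the area under the ROC curve with average precision. The score values, the helper names, and the choice of simply repeating the negatives are all assumptions made for illustration.

```python
def roc_auc(labels, scores):
    """AUC as the chance that a random positive outranks a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def average_precision(labels, scores):
    """Step-wise area under the PR curve, sweeping thresholds high to low."""
    total_pos = sum(labels)
    ap, prev_recall = 0.0, 0.0
    for t in sorted(set(scores), reverse=True):
        picked = [y for y, s in zip(labels, scores) if s >= t]
        tp = sum(picked)
        recall, precision = tp / total_pos, tp / len(picked)
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap

# Hypothetical, partly overlapping score ranges (not real model output).
pos_scores = [(2 * i + 60) / 200 for i in range(50)]  # 0.300 .. 0.790
neg_scores = [(2 * j + 1) / 200 for j in range(50)]   # 0.005 .. 0.495

def evaluate(neg_copies):
    """Score the same model with the negative set repeated neg_copies times."""
    negs = neg_scores * neg_copies
    labels = [1] * len(pos_scores) + [0] * len(negs)
    scores = pos_scores + negs
    return roc_auc(labels, scores), average_precision(labels, scores)

auc_balanced, ap_balanced = evaluate(1)   # positives : negatives = 1:1
auc_skewed, ap_skewed = evaluate(100)     # positives : negatives = 1:100
print(f"1:1    AUC={auc_balanced:.3f}  AP={ap_balanced:.3f}")
print(f"1:100  AUC={auc_skewed:.3f}  AP={ap_skewed:.3f}")
```

With these scores the AUC is 0.916 at both ratios, because TPR and FPR are each normalized within their own class. Average precision is lower in the 1:100 case, because every false positive is multiplied by 100 while the true positives stay the same, which pulls precision down.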
