"Turn" ROC and AUC Introduction and how to calculate AUC

Source: Internet
Author: User

Transferred from: https://www.douban.com/note/284051363/

The ROC (Receiver Operating Characteristic) curve and AUC are often used to evaluate how good a binary classifier is; a brief introduction to both can be found [here](http://bubblexc.com/y2011/148/). This post briefly describes the properties of the ROC curve and AUC, and then discusses in more depth how to draw an ROC curve and how to calculate AUC.

# ROC Curve
To be clear up front, we only discuss binary classifiers here. For classifiers (classification algorithms), the main evaluation metrics are precision, recall, and F-score [^1], as well as the ROC curve and AUC that we discuss today. The figure below shows an example of an ROC curve [^2].


As the example shows, the horizontal axis of the ROC curve is the false positive rate (FPR) and the vertical axis is the true positive rate (TPR). They are defined as FPR = FP / (FP + TN), the fraction of negative samples that are wrongly predicted as positive, and TPR = TP / (TP + FN), the fraction of positive samples that are correctly predicted as positive.
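
As a small sketch (not from the original post), the two rates can be computed directly from the four confusion-matrix counts; the counts below are made up for illustration:

```python
# A minimal sketch: FPR and TPR from the four confusion-matrix counts.
def fpr_tpr(tp, fp, tn, fn):
    fpr = fp / (fp + tn)  # fraction of negatives wrongly predicted positive
    tpr = tp / (tp + fn)  # fraction of positives correctly predicted positive
    return fpr, tpr

print(fpr_tpr(tp=8, fp=2, tn=8, fn=2))  # hypothetical counts -> (0.2, 0.8)
```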


Next, consider four points and one line in the ROC plot. The first point, (0,1), i.e. FPR=0 and TPR=1, means FN (false negative) = 0 and FP (false positive) = 0. This is a perfect classifier: it classifies every sample correctly. The second point, (1,0), i.e. FPR=1 and TPR=0, is by a similar analysis the worst possible classifier, since it successfully avoids every correct answer. The third point, (0,0), i.e. FPR=TPR=0, means FP (false positive) = TP (true positive) = 0: the classifier predicts every sample as negative. Similarly, at the fourth point, (1,1), the classifier predicts every sample as positive. From this analysis we can conclude that the closer the ROC curve is to the upper-left corner, the better the classifier performs.

Now consider the points on the dashed line y=x in the ROC plot. A point on this diagonal corresponds to a classifier that uses a random-guessing strategy. For example, (0.5, 0.5) means the classifier randomly guesses half of the samples to be positive and the other half to be negative.

# How to Draw ROC Curves
For a particular classifier and test set, we obviously obtain only one classification result, i.e. a single pair of FPR and TPR values. To get a curve, we actually need a whole series of FPR and TPR values. How do we get them? Let's first look at the definition of the ROC curve on [Wikipedia](http://en.wikipedia.org/wiki/Receiver_operating_characteristic):

> In signal detection theory, a receiver operating characteristic (ROC), or simply ROC curve, is a graphical plot which illustrates the performance of a binary classifier system as its discrimination threshold is varied.

The key phrase is "as its discrimination threshold is varied". How should we understand this "discrimination threshold"? So far we have ignored an important capability of classifiers: probability output, i.e. the classifier reports how likely it thinks a sample is to be positive (or negative). By looking more closely into the internal mechanism of each classifier, we can usually find a way to obtain such a probabilistic output; typically a real-valued score is mapped into the (0,1) interval by some transform [^3].
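
For instance, a common transform of this kind is the logistic (sigmoid) function; the sketch below, with made-up raw scores, is just an illustration of such a mapping and not the method of any particular classifier:

```python
import math

def sigmoid(x):
    # Map a real-valued decision score into the (0, 1) interval.
    return 1.0 / (1.0 + math.exp(-x))

for raw_score in (-2.0, 0.0, 3.0):  # hypothetical raw classifier scores
    print(raw_score, round(sigmoid(raw_score), 3))
```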

Suppose we have obtained the probability of being a positive sample for every test sample. The question now is how to vary this "discrimination threshold". We sort the test samples by their predicted probability of being positive, from largest to smallest. The table below shows an example with 20 test samples: the "Class" column is the true label of each sample (p for positive, n for negative), and the "Score" column is the predicted probability of being positive [^4].


Next, going from the highest score to the lowest, we use each "Score" value in turn as the threshold: a test sample is predicted positive if its probability of being positive is greater than or equal to the threshold, and negative otherwise. For example, for the 4th sample in the table, whose "Score" is 0.6, samples 1 through 4 are predicted positive because their scores are greater than or equal to 0.6, while the remaining samples are predicted negative. Each choice of threshold yields one pair of FPR and TPR values, i.e. one point on the ROC curve. In this way we obtain 20 pairs of FPR and TPR values, which trace out the ROC curve shown below:
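
Since the 20-sample table itself is not reproduced in this repost, here is a minimal sketch of the same sweep on hypothetical labels and scores; each distinct score is used as the threshold and yields one (FPR, TPR) point:

```python
# Hypothetical data standing in for the 20-sample table (1 = p, 0 = n).
labels = [1, 1, 0, 1, 0, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.5, 0.4, 0.3, 0.2, 0.1]

P = sum(labels)        # number of positive samples
N = len(labels) - P    # number of negative samples

points = []
for threshold in sorted(set(scores), reverse=True):
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    points.append((fp / N, tp / P))  # one (FPR, TPR) point per threshold

print(points)
```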


When the threshold is set to 1 and to 0, we obtain the points (0,0) and (1,1) on the ROC curve, respectively. Connecting all of these (FPR, TPR) pairs gives the ROC curve. The more threshold values we use, the smoother the ROC curve becomes.

In fact, we do not necessarily need the probability of each test sample being positive; any "scoring value" produced by the classifier will do (the score does not have to lie in the (0,1) interval). The higher the score, the more confident the classifier is that the sample is positive, and each score value can be used as a threshold in turn. I just find it easier to interpret the scores as probabilities.
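
For what it's worth, scikit-learn's `roc_curve` works exactly this way: it accepts arbitrary scoring values, not only probabilities. The labels and scores below are made up for illustration:

```python
from sklearn.metrics import roc_curve

# Made-up labels and raw scores that are not probabilities (not in (0, 1)).
y_true = [1, 1, 0, 1, 0, 0, 1, 0]
y_score = [2.3, 1.1, 0.4, -0.2, -0.5, -1.7, 0.9, -2.0]

fpr, tpr, thresholds = roc_curve(y_true, y_score)
print(fpr, tpr, thresholds)
```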

# The calculation of the AUC value
AUC (Area Under the Curve) is defined as the area under the ROC curve, and clearly this area is no greater than 1. Since the ROC curve generally lies above the line y=x, the AUC typically ranges between 0.5 and 1. The AUC is used as an evaluation criterion because the ROC curve itself often does not clearly show which of two classifiers works better, whereas the AUC is a single number: the classifier with the larger AUC is the better one.
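
Given the (FPR, TPR) points from the sweep above, the AUC can be approximated with the trapezoidal rule; the sketch below uses hypothetical points that already include the (0,0) and (1,1) endpoints:

```python
def trapezoid_auc(points):
    # Area under the ROC curve by the trapezoidal rule; `points` are
    # (FPR, TPR) pairs sorted by FPR, including the (0,0) and (1,1) endpoints.
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area

pts = [(0.0, 0.0), (0.2, 0.4), (0.4, 0.8), (1.0, 1.0)]  # hypothetical points
print(trapezoid_auc(pts))  # -> 0.7
```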

After understanding how the ROC curve is constructed, writing the code yourself is not difficult, although reading other people's code is sometimes more painful than writing your own. Here I recommend [scikit-learn](http://scikit-learn.org/stable/)'s [code for calculating AUC](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/metrics/metrics.py#l479).
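
In everyday use, the public entry point is `sklearn.metrics.roc_auc_score`; a minimal usage example with made-up labels and scores:

```python
from sklearn.metrics import roc_auc_score

y_true = [1, 1, 0, 1, 0, 0, 1, 0]                    # made-up labels
y_score = [0.9, 0.8, 0.7, 0.6, 0.55, 0.5, 0.4, 0.3]  # made-up scores

print(roc_auc_score(y_true, y_score))
```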

# Why use ROC curves
Since there are so many evaluation criteria, why use ROC and AUC? Because the ROC curve has a nice property: when the distribution of positive and negative samples in the test set changes, the ROC curve stays essentially unchanged. Class imbalance is common in real data sets, where negative samples greatly outnumber positive samples (or vice versa), and the class distribution in the test data may also change over time. The figure below compares ROC curves with precision-recall curves [^5]:

 


In the figure, (a) and (c) are ROC curves, while (b) and (d) are precision-recall curves. Panels (a) and (b) show the classifiers' results on the original test set (with a balanced distribution of positive and negative samples); panels (c) and (d) show the results after the number of negative samples in the test set is increased to 10 times the original. It is clear that the ROC curves remain essentially unchanged, while the precision-recall curves change considerably.
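
This robustness is easy to check with a small simulation (my own illustration, not from the paper): duplicating the negative samples 10 times leaves the ROC AUC essentially unchanged, while the precision-recall summary (average precision) drops noticeably.

```python
import numpy as np
from sklearn.metrics import roc_auc_score, average_precision_score

rng = np.random.default_rng(0)
pos = rng.normal(1.0, 1.0, 500)   # scores of positive samples
neg = rng.normal(0.0, 1.0, 500)   # scores of negative samples

def evaluate(n_neg_copies):
    # Duplicate the negatives n_neg_copies times to simulate class imbalance.
    scores = np.concatenate([pos] + [neg] * n_neg_copies)
    labels = np.concatenate([np.ones_like(pos)] + [np.zeros_like(neg)] * n_neg_copies)
    return roc_auc_score(labels, scores), average_precision_score(labels, scores)

print("balanced      (AUC, AP):", evaluate(1))
print("10x negatives (AUC, AP):", evaluate(10))
```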

Note that apart from the first figure, which is from Wikipedia, the other figures are taken from the paper (Fawcett, 2006) [^6].
[^1]: I deliberately avoid translating evaluation metrics such as precision and recall into Chinese, because each may correspond to several Chinese renderings, which easily causes confusion.
[^2]: Image source: http://en.wikipedia.org/wiki/File:Roccurves.png
[^3]: This mapping is not necessarily well calibrated; that is, the transformed value is not necessarily the true probability that a sample is positive.
[^4]: Note that the table uses "Score" rather than probability; for our purposes we can treat the "Score" value as the probability of being a positive sample.
[^5]: Davis, J., & Goadrich, M. (2006, June). The relationship between Precision-Recall and ROC curves. In Proceedings of the 23rd International Conference on Machine Learning (pp. 233-240). ACM.
[^6]: Fawcett, T. (2006). An introduction to ROC analysis. Pattern Recognition Letters, 27(8), 861-874.


References and other Links:
* Introduction to ROC on Wikipedia: http://en.wikipedia.org/wiki/Receiver_operating_characteristic
* ROC curve and AUC as evaluation metrics: http://bubblexc.com/y2011/148/

"Turn" ROC and AUC Introduction and how to calculate AUC

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.