The difference between the classification metrics precision (accuracy rate) and accuracy (correct rate)


Http://www.cnblogs.com/fengfenggirl/p/classification_evaluate.html

I. Introduction

There are many classification algorithms, and each has many different variants. Different classifiers have different characteristics and perform differently on different data sets, so we need to choose the algorithm according to the specific task. How should we choose a classifier, and how should we evaluate one? In the earlier introduction to decision trees, we mainly used the correct rate (accuracy) to evaluate the classification algorithm.

Accuracy is indeed a very good and intuitive evaluation metric, but a high accuracy does not always mean an algorithm is good. Take earthquake prediction for a certain region as an example: suppose we have a set of features as attributes for an earthquake classifier, and there are only two classes, 0 (no earthquake) and 1 (earthquake). A brainless classifier that assigns class 0 to every test case can easily achieve 99% accuracy, but when a real earthquake comes the classifier is completely unaware, and the loss caused by such a classification is enormous. Why is a classifier with 99% accuracy not what we want? Because the data distribution is imbalanced: there are too few samples of class 1, so getting every class-1 sample wrong still yields high accuracy while ignoring exactly the thing we care about. Below, the evaluation metrics for classification algorithms are introduced in detail.

II. Evaluation Metrics

1. A few common terms

First let us introduce a few common model-evaluation terms. Assume that our classification target has only two classes, positive and negative:

1) True Positives (TP): the number of instances correctly classified as positive, i.e., instances that are actually positive and are classified as positive by the classifier;

2) False Positives (FP): the number of instances incorrectly classified as positive, i.e., instances that are actually negative but are classified as positive by the classifier;

3) False Negatives (FN): the number of instances incorrectly classified as negative, i.e., instances that are actually positive but are classified as negative by the classifier;

4) True Negatives (TN): the number of instances correctly classified as negative, i.e., instances that are actually negative and are classified as negative by the classifier.

                     Predicted: Yes           Predicted: No           Total
Actual: Yes          TP                       FN                      P  (actually yes)
Actual: No           FP                       TN                      N  (actually no)
Total                P' (classified as yes)   N' (classified as no)   P + N

These are the four terms of the confusion matrix (of the four, FP is the one I know as the "false positive"; I am less sure how the others are usually referred to). Note that P = TP + FN is the number of samples that are actually positive; I used to mistakenly think the number of actual positives should be TP + FP. Just remember that True/False describes whether the classifier's decision is correct, while Positive/Negative is the label the classifier assigns. If we encode positive as 1 and negative as -1, and likewise true as 1 and false as -1, then the actual class label = TF × PN, where TF is true or false and PN is positive or negative. For example, for True Positives (TP) the actual label = 1 × 1 = 1, a positive example; for False Positives (FP) the actual label = (-1) × 1 = -1, a negative example; for False Negatives (FN) the actual label = (-1) × (-1) = 1, a positive example; and for True Negatives (TN) the actual label = 1 × (-1) = -1, a negative example.
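To make the bookkeeping concrete, here is a minimal Python sketch (my own illustration, not code from the article; the helper name confusion_counts and the sample labels are made up) that counts TP, FP, FN and TN from a list of actual labels and a list of predicted labels, with 1 for positive and 0 for negative:

    def confusion_counts(y_true, y_pred):
        """Return (TP, FP, FN, TN) for binary labels encoded as 1 (positive) / 0 (negative)."""
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
        tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
        return tp, fp, fn, tn

    # Example: 10 samples, 3 of them actually positive.
    y_true = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
    y_pred = [1, 0, 1, 0, 0, 0, 0, 1, 0, 0]
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    print(tp, fp, fn, tn)   # 2 1 1 6
    print(tp + fn)          # P  = number of actual positives  = 3
    print(tp + fp)          # P' = number of predicted positives = 3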

2. Evaluation metrics

1) Correct rate (accuracy)

Accuracy is the evaluation metric we use most often: accuracy = (TP + TN)/(P + N). It is easy to understand: the number of correctly classified samples divided by the total number of samples. Usually, the higher the accuracy, the better the classifier;

2) Error rate

The error rate is the opposite of the accuracy; it describes the proportion of samples the classifier gets wrong: error rate = (FP + FN)/(P + N). For a single instance, being classified correctly and being classified incorrectly are mutually exclusive events, so accuracy = 1 - error rate;

3) Sensitivity (sensitive)

Sensitivity = TP/P. It represents the proportion of all positive examples that are classified correctly, and measures the classifier's ability to recognize positive examples;

4) Specificity

Specificity = TN/N. It represents the proportion of all negative examples that are classified correctly, and measures the classifier's ability to recognize negative examples;

5) Precision

Precision is a measure of exactness; it represents the proportion of examples classified as positive that are actually positive: precision = TP/(TP + FP);

6) Recall rate (recall)

Recall is a measure of coverage: it is the proportion of positive examples that are classified as positive, recall = TP/(TP + FN) = TP/P = sensitivity. As you can see, recall and sensitivity are the same.
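The six metrics above are all simple ratios of the four confusion-matrix counts. A small sketch (again my own illustration, not the article's code; it assumes both classes are present and at least one sample is predicted positive, so no denominator is zero):

    def basic_metrics(tp, fp, fn, tn):
        p, n = tp + fn, fp + tn                  # actual positives / actual negatives
        return {
            "accuracy":    (tp + tn) / (p + n),
            "error_rate":  (fp + fn) / (p + n),
            "sensitivity": tp / p,               # = recall
            "specificity": tn / n,
            "precision":   tp / (tp + fp),
            "recall":      tp / (tp + fn),
        }

    m = basic_metrics(tp=2, fp=1, fn=1, tn=6)    # counts from the sketch above
    print(m["accuracy"], m["error_rate"])        # 0.8 0.2
    print(m["sensitivity"] == m["recall"])       # True: recall is sensitivity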

7) Other evaluation criteria

    • Computational speed: the time required for the classifier to train and to predict;
    • Robustness: the ability to handle missing values and outliers;
    • Scalability: the ability to handle large data sets;
    • Interpretability: how understandable the classifier's decision criteria are. The rules produced by a decision tree, for example, are easy to understand, while a pile of neural-network weights is hard to interpret and we have to treat the model as a black box.

For a specific classifier, it is impossible to improve all of the metrics above at the same time. Of course, if a classifier correctly classified every instance, all the metrics would be optimal at once, but such a classifier usually does not exist. Take the earthquake prediction example from the beginning: no one can predict earthquakes with complete precision, but we can tolerate a certain number of false alarms. Suppose that out of 1000 predictions the classifier raises 5 earthquake alarms, of which one is a real earthquake and the other 4 are false alarms. The accuracy drops from the original 999/1000 = 99.9% to 996/1000 = 99.6%, but the recall rises from 0/1 = 0% to 1/1 = 100%. So although the classifier cries earthquake more often, when the real earthquake comes we do not miss it. This is the classifier we want: given an acceptable accuracy, we want the recall to be as high as possible.
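The arithmetic can be checked with a quick sketch (my own, using the numbers quoted above: 1000 samples, exactly one real earthquake as the positive class):

    def accuracy(tp, fp, fn, tn):
        return (tp + tn) / (tp + fp + fn + tn)

    def recall(tp, fn):
        return tp / (tp + fn)

    # "Always predict no earthquake": misses the single real quake.
    print(accuracy(tp=0, fp=0, fn=1, tn=999), recall(tp=0, fn=1))   # 0.999 0.0

    # Raise 5 alarms: 1 real quake caught, 4 false alarms.
    print(accuracy(tp=1, fp=4, fn=0, tn=995), recall(tp=1, fn=0))   # 0.996 1.0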

Http://blog.sciencenet.cn/blog-460603-785098.html

Classification is an important data-mining task. The purpose of classification is to construct a classification function or classification model (i.e., a classifier) that maps data objects to one of the given categories. The main evaluation metrics for a classifier are precision, recall, the Fβ-score, ROC, AUC, and so on. In many studies, accuracy (the correct rate) is used to evaluate the classifier, but the two concepts of precision and accuracy are often confused. If you do not have the patience to read what follows, please skip to the conclusion at the end.

Precision and recall are the two most basic metrics in the field of information retrieval. Precision is also known as the precision ratio, and recall is also known as the recall ratio. They are defined as follows:

precision = number of relevant documents retrieved by the system / total number of documents retrieved by the system

recall = number of relevant documents retrieved by the system / total number of relevant documents in the system

The Fβ-score is the weighted harmonic mean of precision and recall: Fβ = [(1 + β²) · P · R] / (β² · P + R). The most commonly used version is F1 (β = 1).
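A small sketch of this formula (my own illustration, not code from the article):

    def f_beta(precision, recall, beta=1.0):
        """Weighted harmonic mean of precision and recall: F_beta = (1 + b^2)*P*R / (b^2*P + R)."""
        b2 = beta ** 2
        return (1 + b2) * precision * recall / (b2 * precision + recall)

    print(f_beta(0.5, 0.8))            # F1 ≈ 0.615
    print(f_beta(0.5, 0.8, beta=2.0))  # F2 weights recall more heavily: ≈ 0.714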

In information retrieval, precision and recall influence each other. Having both high at the same time is the ideal situation, but in practice high precision usually comes with low recall and high recall with low precision. So in practice one often has to make a trade-off according to the situation: for general web search, the usual approach is to improve precision while guaranteeing recall, whereas for disease monitoring, anti-spam filtering and the like, recall is raised under the condition that precision is guaranteed. Sometimes, however, both need to be balanced, in which case the F-score can be used.

ROC and AUC are also metrics for evaluating classifiers. ROC is short for receiver operating characteristic curve, also known as the sensitivity curve. The name comes from the fact that the points on the curve all reflect the same sensitivity: they are all responses to the same signal stimulus, just obtained under different decision criteria [1]. The ROC curve is a comprehensive indicator of the continuous relationship between sensitivity and specificity. It is constructed by setting a series of different cut-off values for a continuous variable, computing a sensitivity and specificity for each, and then plotting the curve with sensitivity on the vertical axis and (1 - specificity) on the horizontal axis. AUC is short for area under the ROC curve; as the name implies, the AUC value is the size of the area under the ROC curve. In general, the AUC lies between 0.5 and 1.0, and the larger the AUC, the higher the diagnostic accuracy. On the ROC curve, the point closest to the top-left corner of the plot is the cut-off value that gives both high sensitivity and high specificity.
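A minimal sketch (my own, not from the article) of how an ROC curve and its AUC can be computed by hand: sweep a decision threshold over the classifier's scores, record the (FPR, TPR) point at each threshold, and take the area under those points with the trapezoidal rule. It assumes both classes are present in y_true:

    def roc_points(y_true, scores):
        p = sum(y_true)                       # number of actual positives (labels are 1/0)
        n = len(y_true) - p                   # number of actual negatives
        points = [(0.0, 0.0)]                 # threshold above every score: nothing predicted positive
        for t in sorted(set(scores), reverse=True):
            tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= t)
            fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= t)
            points.append((fp / n, tp / p))   # (FPR, TPR) = (1 - specificity, sensitivity)
        return points

    def auc(points):
        # Trapezoidal area under the (FPR, TPR) points.
        return sum((x1 - x0) * (y0 + y1) / 2
                   for (x0, y0), (x1, y1) in zip(points, points[1:]))

    y_true = [1, 1, 0, 1, 0, 0]
    scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]   # classifier's estimated probability of the positive class
    pts = roc_points(y_true, scores)
    print(pts)          # starts at (0, 0) and ends at (1, 1)
    print(auc(pts))     # about 0.889 for this toy data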

To explain the ROC concept, consider a binary classification problem in which each instance belongs either to the positive class (positive) or to the negative class (negative). For a binary problem there are four possible outcomes. If an instance is positive and is predicted to be positive, it is a true positive; if an instance is negative but is predicted to be positive, it is a false positive. Correspondingly, if an instance is negative and is predicted to be negative, it is a true negative; a positive instance predicted to be negative is a false negative. The contingency table (confusion matrix) is shown below, where 1 represents the positive class and 0 the negative class.

                 Actual: 1               Actual: 0
Predicted: 1     True Positive (TP)      False Positive (FP)
Predicted: 0     False Negative (FN)     True Negative (TN)

Based on this table, the sensitivity metric is defined as sensitivity = TP/(TP + FN). Sensitivity, also known as the true positive rate (TPR), describes the proportion of all positive instances that the classifier identifies as positive.

In addition, the false positive rate (FPR) is defined as FPR = FP/(FP + TN). The false positive rate is the proportion of all negative instances that the classifier wrongly identifies as positive.

The specificity metric is defined as specificity = TN/(FP + TN) = 1 - FPR. Specificity is also called the true negative rate (TNR).

We can see that the sensitivity metric is in fact the recall, and that specificity = 1 - FPR.

The ROC curve is drawn from these two variables: the horizontal axis is 1 - specificity, i.e., the false positive rate (FPR), and the vertical axis is sensitivity, i.e., the true positive rate (TPR).

On this basis, we can also define the correct rate (accuracy) and the error rate (error): accuracy = (TP + TN)/(TP + FP + TN + FN), error = (FP + FN)/(TP + FP + TN + FN). If "predicted as 1" is regarded as "retrieved", then the precision is precision = TP/(TP + FP).
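A quick numeric check of these identities (my own sketch, using an arbitrary confusion matrix TP = 40, FP = 10, FN = 20, TN = 30):

    tp, fp, fn, tn = 40, 10, 20, 30

    sensitivity = tp / (tp + fn)            # TPR, which is also the recall
    fpr         = fp / (fp + tn)            # false positive rate
    specificity = tn / (fp + tn)            # TNR
    accuracy    = (tp + tn) / (tp + fp + tn + fn)
    error       = (fp + fn) / (tp + fp + tn + fn)
    precision   = tp / (tp + fp)            # treating "predicted 1" as "retrieved"

    print(sensitivity, fpr, specificity)    # 0.666... 0.25 0.75
    print(specificity == 1 - fpr)           # True: specificity = 1 - FPR
    print(accuracy, error, precision)       # 0.7 0.3 0.8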

Conclusion:

Classification accuracy counts, regardless of category, every correctly predicted sample in the numerator, with the total number of samples in the denominator; in other words, accuracy is a judgment over all of the data. Precision, by contrast, corresponds to one particular category: the numerator is the number of correct predictions for that category, and the denominator is the number of all samples predicted to be in that category. Put another way, accuracy evaluates the classifier's overall correctness, while precision evaluates how often the classifier is right when it predicts a given category.

Https://argcv.com/articles/1036.c

In the fields of machine learning (ML), natural language processing (NLP), and information retrieval (IR), evaluation is a necessary task, and its evaluation metrics usually include the following: accuracy, precision, recall, and F1-measure.

This article briefly describes these concepts. The Chinese translations of these metrics vary, so in general it is recommended to use the English terms.

Now I'm going to assume a specific scenario as an example.

Suppose a class has 80 boys and 20 girls, 100 people in total. The goal is to find all the girls.
Now someone picks out 50 people, of whom 20 are girls, and also wrongly selects 30 boys as girls.
As the evaluator, you need to evaluate (evaluation) his work.

First we can compute the accuracy, which is defined as: for a given test data set, the ratio of the number of samples the classifier classifies correctly to the total number of samples. In other words, it is the accuracy on the test data set when the loss function is the 0-1 loss [1].

This sounds a bit abstract. Simply put, in the scenario above, the actual situation is that the class contains two categories, boys and girls, and someone (that is, the classifier, by definition) divides the class into those two categories. What accuracy gives us is the proportion of people he judged correctly out of the total. It is easy to see that he judged 70 people correctly (20 girls classified as girls + 50 boys classified as boys) out of a total of 100, so the accuracy is 70% (70/100).

With accuracy we can indeed tell, in some sense, whether a classifier is effective, but it is not always enough to evaluate a classifier's work. For example, suppose Google has crawled 100 ARGCV pages, and its index contains 10,000,000 pages in total; draw a page at random and classify it: is this an ARGCV page or not? If my work were judged by accuracy alone, I could simply classify every page as "not an ARGCV page". I would be extremely efficient (just return false, one word), and my accuracy would reach 99.999% (9,999,900/10,000,000), beating plenty of classifiers that worked hard to compute a real answer, yet my algorithm obviously does not meet the requirement. How do we fix this? This is where precision, recall, and F1-measure come into play.

Before discussing precision, recall, and F1-measure, we first need to define the four classification outcomes TP, FN, FP, and TN.
Following the earlier example, we need to find all the girls in the class. If we treat this task as a classifier, then the girls are what we want and the boys are not, so we call the girls the "positive class" and the boys the "negative class".

                 Relevant (positive class)       Irrelevant (negative class)
Retrieved        True positives (TP)             False positives (FP)
Not retrieved    False negatives (FN)            True negatives (TN)

    • TP: a positive instance judged as positive; in the example, correctly deciding "this is a girl".
    • FP: a negative instance judged as positive; in the example, a boy mistakenly selected as a girl.
    • FN: a positive instance judged as negative; in the example, a girl mistakenly judged to be a boy.
    • TN: a negative instance judged as negative; in the example, a boy correctly judged to be a boy.

With this table, we can easily get these values:
TP = 20
FP = 30
FN = 0
TN = 50

The formula for precision is P = TP/(TP + FP); it measures, among all retrieved items, the proportion that "should have been retrieved".

In the example, we want to know what proportion of the people he picked out are correct (i.e., girls). So his precision is 40% (20 girls / (20 girls + 30 boys)).

The formula for recall is R = TP/(TP + FN); it measures, among all items that "should be retrieved", the proportion that were actually retrieved.

In the example, we want to know what proportion of all the girls in the class he managed to pick out, so his recall is 100% (20 girls / (20 girls + 0 girls missed)).

The F1 value is the harmonic mean of precision and recall, i.e.

2/F1 = 1/P + 1/R

which can be rearranged as

F1 = 2·P·R / (P + R) = 2·TP / (2·TP + FP + FN)

In the example, the F1-measure is approximately 57.143% (2 × 0.4 × 1 / (0.4 + 1)).
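Verifying the worked example with a short sketch (my own, using the counts above):

    tp, fp, fn, tn = 20, 30, 0, 50

    accuracy  = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall    = tp / (tp + fn)
    f1        = 2 * precision * recall / (precision + recall)

    print(accuracy)   # 0.7       -> the 70% accuracy computed earlier
    print(precision)  # 0.4       -> 40%
    print(recall)     # 1.0       -> 100%
    print(f1)         # 0.571...  -> about 57.143%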

It should be noted that some authors [2] list the formula

Fa = (a² + 1)·P·R / (a²·(P + R))

as a generalization of the F-measure.

F1-measure assumes that precision and recall are weighted equally, but in some scenarios we may consider precision to be more important; adjusting the parameter a and using the Fa-measure then helps us evaluate the results better.

Although this takes many words to explain, it is actually very easy to implement; a simple implementation of mine is linked from the original post.

References

[1] Li Hang. Statistical Learning Methods [M]. Beijing: Tsinghua University Press, 2012.
[2] Precision, Recall, and the Comprehensive Evaluation Metric (F1-Measure).

