Classification and regression evaluation metrics in detail, with Python implementations


This article walks through the common classification and regression evaluation metrics and their Python implementations. It should be a useful reference for anyone who needs these measures.

1. Concept

Performance measures (evaluation metrics) fall into two broad categories:
1) Classification metrics, which apply when the predictions are discrete class labels. They include accuracy, precision, recall, the F-score, the P-R curve, the ROC curve, and AUC.
2) Regression metrics, which apply when the predictions are real-valued. They include the explained variance score (explained_variance_score), mean absolute error MAE (mean_absolute_error), mean squared error MSE (mean_squared_error), root mean squared error RMSE, log loss (cross-entropy loss), and the R-squared value (coefficient of determination, r2_score).

1.1. Prerequisites

Assume there are only two classes: the positive class and the negative class. Conventionally, the class of interest is the positive class and every other class is negative (so many multi-class problems can be reduced to this two-class setting).
The confusion matrix is as follows:

Actual \ Predicted    Predicted positive    Predicted negative    Total
Actually positive     TP                    FN                    P (actual positives)
Actually negative     FP                    TN                    N (actual negatives)

The cell names follow an AB pattern: the first letter says whether the prediction was correct (T for true, F for false), and the second letter is the predicted class (P or N). For example, TP (true positive) means a sample correctly predicted as positive; FN (false negative) means an actually positive sample wrongly predicted as negative.
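As a quick check of these definitions, the following minimal sketch (with invented labels, purely for illustration) reads the four cells out of scikit-learn's confusion_matrix:

from sklearn.metrics import confusion_matrix

# Invented labels for illustration: 1 = positive class, 0 = negative class
y_true = [1, 1, 1, 0, 0]
y_pred = [1, 0, 1, 0, 1]

# For binary labels, scikit-learn lays the matrix out as [[TN, FP], [FN, TP]],
# so ravel() yields the four counts in that order
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print("TP=%d FN=%d FP=%d TN=%d" % (tp, fn, fp, tn))  # TP=2 FN=1 FP=1 TN=1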

2. Evaluation Metrics (Performance Measures)

2.1. Classification Evaluation Metrics

2.1.1 Numeric metrics: accuracy, precision, recall, and the F-score

Accuracy: the proportion of all samples that are classified correctly.
    accuracy = (TP + TN) / (TP + FN + FP + TN)

Precision: among the samples predicted positive, the proportion that are truly positive (e.g., of the messages flagged as spam, how many really are spam).
    precision = TP / (TP + FP)

Recall: among the actually positive samples, the proportion that are predicted positive (e.g., of all real spam messages, how many were found).
    recall = TP / (TP + FN)

F-score: the harmonic mean of precision and recall; the general form is
    F_beta = (1 + beta^2) * precision * recall / (beta^2 * precision + recall)

Notes:
1. Precision is also known as the positive predictive value, and recall as the true positive rate (sensitivity).
2. The most commonly used F-score is F1 (beta = 1), which weights precision and recall equally:
    F1 = 2 * precision * recall / (precision + recall)

Python 3.6 code implementation:

# Compute the metrics with scikit-learn
from sklearn import metrics
from sklearn.metrics import accuracy_score

# Given classification results
y_pred = [0, 1, 0, 0]
y_true = [0, 1, 1, 1]

print("accuracy_score:", accuracy_score(y_true, y_pred))
print("precision_score:", metrics.precision_score(y_true, y_pred))
print("recall_score:", metrics.recall_score(y_true, y_pred))
print("f1_score:", metrics.f1_score(y_true, y_pred))
print("f0.5_score:", metrics.fbeta_score(y_true, y_pred, beta=0.5))
print("f2_score:", metrics.fbeta_score(y_true, y_pred, beta=2.0))
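As a worked check against the formulas above: with the positive class being 1, these vectors give TP = 1, FP = 0, FN = 2, TN = 1, so accuracy = (1 + 1) / 4 = 0.5, precision = 1 / 1 = 1.0, recall = 1 / 3 ≈ 0.33, and F1 = 2 * 1.0 * (1/3) / (1.0 + 1/3) = 0.5.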

2.1.2 Curve metrics: the P-R curve, the ROC curve, and AUC

1) P-R curve
Steps:
1. Sort the test samples by their "score" values from high to low, and use each value in turn as the threshold.
2. For each threshold, samples whose score is greater than or equal to the threshold are predicted positive, and the rest are predicted negative. This yields one set of predictions per threshold.
For example, with 0.9 as the threshold, the 1st test sample is predicted positive and samples 2 through 5 are predicted negative. Treating each sample's score as its probability of truly being positive gives the following expected counts:

                                     Actually positive (expected)    Actually negative (expected)    Total
Predicted positive (score >= 0.9)    0.9                             0.1                             1
Predicted negative (score < 0.9)     0.2+0.3+0.3+0.35 = 1.15         0.8+0.7+0.7+0.65 = 2.85         4
precision = TP / (TP + FP) = 0.9 / (0.9 + 0.1) = 0.9

recall = TP / (TP + FN) = 0.9 / (0.9 + 1.15) ≈ 0.44

That is, each sample contributes its score to the actually-positive column of its row and 1 - score to the actually-negative column: a sample above the threshold adds its score to TP, while a sample below the threshold adds its score to FN (the remainders go to FP and TN respectively).
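Steps 1 and 2 can also be checked with scikit-learn's precision_recall_curve, which implements the hard-label version of this procedure: it sorts the scores from high to low and uses each distinct value as a threshold. A minimal sketch, reusing the five scores above with true labels invented purely for illustration:

from sklearn.metrics import precision_recall_curve

# The five scores from the example; the true labels are assumed here
y_score = [0.9, 0.2, 0.3, 0.3, 0.35]
y_true = [1, 0, 0, 1, 0]

# One (precision, recall) pair per threshold, thresholds taken from the scores
precision, recall, thresholds = precision_recall_curve(y_true, y_score)
print(thresholds)
print(precision, recall)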
Python implementation (the iris loading and training steps, which the original post covered separately, are filled in here following the standard scikit-learn example):

# Precision and recall are computed as above; matplotlib is used for
# plotting and NumPy for the matrix operations
import matplotlib.pyplot as plt
import numpy as np
from sklearn import datasets, svm
from sklearn.metrics import precision_recall_curve
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import label_binarize

# Load the iris data and binarize the labels (one column per class)
iris = datasets.load_iris()
X, y = iris.data, iris.target
n_samples, n_features = X.shape
n_classes = 3
y = label_binarize(y, classes=[0, 1, 2])

# Add 800 noise features (200 * n_features) to make the task harder:
# stack the 150x800 noise matrix next to the 150x4 iris matrix
X = np.c_[X, np.random.RandomState(0).randn(n_samples, 200 * n_features)]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=0)

# Train a one-vs-rest linear SVM and get continuous decision scores
classifier = OneVsRestClassifier(svm.SVC(kernel="linear", random_state=0))
y_score = classifier.fit(X_train, y_train).decision_function(X_test)

# Compute the precision-recall pairs for each of the three iris classes
precision, recall = {}, {}
for i in range(n_classes):
    precision[i], recall[i], _ = precision_recall_curve(y_test[:, i], y_score[:, i])

# Plot one P-R curve per class
plt.clf()
for i in range(n_classes):
    plt.plot(recall[i], precision[i], label="class %d" % i)
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.05])
plt.xlabel("Recall")
plt.ylabel("Precision")
plt.legend()
plt.show()

Running the complete code produces the P-R curves of the iris data set (figure omitted here).

2) ROC curve
Horizontal axis: false positive rate, FPR = FP / N (N = number of actual negatives)
Vertical axis: true positive rate, TPR = TP / P (P = number of actual positives)
Steps:
1. Sort the test samples by their "score" values from high to low, and use each value in turn as the threshold.
2. For each threshold, samples whose score is greater than or equal to the threshold are predicted positive, and the rest are predicted negative. This yields one set of predictions per threshold.

The calculation is similar to that of the P-R curve and is not repeated here.
The resulting ROC curve of the iris data set (figure omitted here).

AUC (Area Under Curve) is defined as the area under the ROC curve.
The AUC summarizes the classifier in a single number in [0, 1]; in general, the larger the AUC, the better the classifier.
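A minimal sketch of computing the ROC curve and its AUC with scikit-learn (the labels and scores below are invented for illustration):

from sklearn.metrics import roc_curve, roc_auc_score

# Invented labels and scores for illustration
y_true = [1, 0, 0, 1, 0, 1]
y_score = [0.9, 0.4, 0.35, 0.8, 0.1, 0.6]

# roc_curve returns FPR = FP/N and TPR = TP/P at each threshold
fpr, tpr, thresholds = roc_curve(y_true, y_score)
# Here every positive outranks every negative, so the AUC is 1.0
print("AUC:", roc_auc_score(y_true, y_score))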

2.2. Regression Evaluation Metrics

1) Explained variance score
    explained_variance(y, y_pred) = 1 - Var(y - y_pred) / Var(y)

2) Mean absolute error MAE (mean absolute error)
    MAE = (1/n) * sum_i |y_i - y_pred_i|

3) Mean squared error MSE (mean squared error)
    MSE = (1/n) * sum_i (y_i - y_pred_i)^2

4) Logistic regression loss (log loss, cross-entropy loss)
    For binary labels: log_loss = -(1/n) * sum_i [y_i * log(p_i) + (1 - y_i) * log(1 - p_i)]

5) Consistency evaluation: the Pearson correlation coefficient (and Cohen's kappa)

Python code implementation:

# Log loss expects the true labels and predicted probabilities
from sklearn.metrics import log_loss

y_true = [0, 1, 1, 0]          # example labels
y_prob = [0.1, 0.8, 0.7, 0.3]  # example predicted probabilities
print("log_loss:", log_loss(y_true, y_prob))

# Consistency between two raters: Pearson correlation and Cohen's kappa
from scipy.stats import pearsonr
from sklearn.metrics import cohen_kappa_score

rater1 = [1, 2, 3, 4, 4]  # example ratings
rater2 = [1, 2, 4, 4, 3]
print("pearson r:", pearsonr(rater1, rater2)[0])
print("cohen kappa:", cohen_kappa_score(rater1, rater2))
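The overview in section 1 also lists the explained variance score, MAE, MSE, RMSE, and the R-squared value; a minimal sketch of those with scikit-learn (the targets and predictions below are invented for illustration):

import numpy as np
from sklearn.metrics import (explained_variance_score, mean_absolute_error,
                             mean_squared_error, r2_score)

# Invented regression targets and predictions
y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]

print("explained_variance:", explained_variance_score(y_true, y_pred))
print("MAE:", mean_absolute_error(y_true, y_pred))
print("MSE:", mean_squared_error(y_true, y_pred))
print("RMSE:", np.sqrt(mean_squared_error(y_true, y_pred)))
print("R2:", r2_score(y_true, y_pred))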
