I. Evaluation Methodology
1. Hold-out method
The hold-out method directly partitions the dataset D into two mutually exclusive sets, one used as the training set S and the other as the test set T. The model is trained on S, and its test error on T is used as an estimate of the generalization error.
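As a concrete illustration, here is a minimal sketch of the hold-out method using scikit-learn; the Iris dataset, the logistic regression model, and the 80/20 split ratio are illustrative assumptions, not part of the original text.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# Partition D into mutually exclusive training set S and test set T;
# stratify=y keeps the class distribution consistent across the split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
# The test error on T serves as an estimate of the generalization error.
print("test accuracy:", model.score(X_test, y_test))
```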
2. Cross-validation method
"Cross-validation" first divides the dataset D into K-sized mutually exclusive subsets, each of which maintains the consistency of the data distribution as far as possible, i.e., from D through stratified sampling, and then every time the k-1 subset is used as the training set, the remaining subset is used as the test set, and the K Group training/test set is obtained. This allows for K-training and testing, which ultimately returns the mean value of the K test results.
II. Performance Metrics
| Real Situation | Predicted Positive | Predicted Negative |
| --- | --- | --- |
| Positive | TP (true positive) | FN (false negative) |
| Negative | FP (false positive) | TN (true negative) |
1. Precision P and recall R are defined, respectively, as:
$$P = \frac{TP}{TP + FP}, \qquad R = \frac{TP}{TP + FN}$$
2. F1: $\frac{1}{F1} = \frac{1}{2}\left(\frac{1}{P} + \frac{1}{R}\right)$
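A minimal sketch that computes P, R, and F1 directly from the confusion-matrix counts defined above; the toy labels and predictions are invented for illustration:

```python
# Toy ground-truth labels and classifier predictions (1 = positive).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]

# Count confusion-matrix cells from the definitions in the table above.
tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

precision = tp / (tp + fp)                          # P = TP / (TP + FP)
recall = tp / (tp + fn)                             # R = TP / (TP + FN)
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of P and R

print(f"P={precision:.3f}  R={recall:.3f}  F1={f1:.3f}")
```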
3. ROC and AUC
The horizontal axis is the false positive rate (FPR) and the vertical axis is the true positive rate (TPR); the resulting curve is the ROC curve, and AUC is the area under it. A larger area indicates better classifier performance.
Why use ROC and AUC to evaluate classifiers?
Given the many available metrics, why use ROC and AUC? Because the ROC curve has a useful property: when the distribution of positive and negative samples in the test set changes, the ROC curve remains essentially unchanged. Class imbalance is common in real datasets, i.e., the ratio of positive to negative samples can be highly skewed, and that ratio in the test data may also change over time.
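A minimal sketch computing the ROC curve points and AUC with scikit-learn's roc_curve and roc_auc_score; the labels and scores below are invented classifier outputs for illustration:

```python
from sklearn.metrics import roc_curve, roc_auc_score

# Toy ground-truth labels and predicted positive-class scores.
y_true = [0, 0, 1, 1, 0, 1, 1, 0]
y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.6, 0.5]

# fpr is the false positive rate (horizontal axis), tpr the true
# positive rate (vertical axis); AUC is the area under the ROC curve.
fpr, tpr, thresholds = roc_curve(y_true, y_score)
print("FPR:", fpr)
print("TPR:", tpr)
print("AUC:", roc_auc_score(y_true, y_score))
```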