The previous article discussed the importance of error analysis and of setting an error metric, that is, a single real number that evaluates the learning algorithm and measures its performance. One important thing to note about evaluation and error metrics is that there is a case where choosing an appropriate error metric is particularly subtle: the problem of skewed classes. What does that mean? Take cancer classification: we have features of medical patients and want to know whether they have cancer, much like the classification of malignant and benign tumors. Let y=1 mean the patient has cancer and y=0 mean no cancer, and train a logistic regression model. Suppose we evaluate the classification model on a test set and find that it has only 1% error, so we make the correct diagnosis 99% of the time. That looks like a very good result. But suppose we then find that only 0.5% of the patients in the test set actually have cancer, that is, only 0.5% of the patients in our screening program have the disease. In that case, the 1% error rate no longer looks so good.
For example, consider a line of code that ignores the input x and always sets y=0, so it always predicts that no one has cancer. This algorithm actually has an error rate of only 0.5%, which is even better than the 1% error rate we got before. Yet it is not a machine learning algorithm at all: it just predicts y=0 every time.
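A minimal sketch of such a baseline, assuming a hypothetical test set of 1,000 patients of whom 5 (0.5%) have cancer. The names `predict_cancer` and `error_rate` and the synthetic labels are illustrative assumptions, not code from the course.

```python
def predict_cancer(x):
    """Non-learning baseline: ignore the features x and always predict y = 0."""
    return 0


def error_rate(y_true, y_pred):
    """Fraction of samples whose predicted label differs from the actual label."""
    wrong = sum(1 for t, p in zip(y_true, y_pred) if t != p)
    return wrong / len(y_true)


# Hypothetical test set: 1000 patients, 5 of whom (0.5%) actually have cancer.
y_true = [1] * 5 + [0] * 995
# Feature values are placeholders; the baseline never looks at them.
features = [None] * len(y_true)

y_pred = [predict_cancer(x) for x in features]
baseline_error = error_rate(y_true, y_pred)
print(f"baseline error rate: {baseline_error:.1%}")  # baseline error rate: 0.5%
```

The baseline misses every cancer patient, yet its error rate is lower than the 1% of the trained model in the example.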
This situation arises when the number of positive samples is very small compared to the number of negative samples. Because y=1 is very rare, we call this a case of skewed classes: one class has far more samples than the other, and an algorithm can appear to perform very well by always predicting y=0, or always predicting y=1. Using classification error or classification accuracy as the evaluation measure can therefore cause the following problem. Suppose an algorithm has 99.2% accuracy, so only 0.8% error, and after a small change it reaches 99.5% accuracy, with only 0.5% error. Is this an improvement? One of the benefits of using a single real number as the evaluation measure is that it helps us decide quickly whether a change to the algorithm is worth keeping. Here the accuracy went from 99.2% to 99.5%, but did our change really help, or did we just replace the code with something like always predicting y=0? With skewed classes, classification accuracy is not a good way to measure an algorithm, because a high accuracy, or a very low error rate, does not tell us whether we really improved the quality of the classification model. Always predicting y=0 is not a good classification model, yet it brings the error down to, say, 0.5%. When we encounter skewed classes, we want a different error measure or evaluation measure. One such pair of evaluation measures is precision and recall.
Suppose we are using a test set to evaluate a classification model. Assuming this is a binary classification problem, the actual class of each sample in the test set is either 0 or 1. Our learning algorithm makes a prediction for each sample in the test set, and the predicted class is also either 0 or 1. From the actual classes and the predicted classes we can build a 2x2 table. If a sample actually belongs to class 1 and the predicted class is also 1, we call it a true positive: the learning algorithm predicted positive, and the sample really is positive. If the learning algorithm predicts negative (0) and the actual class is indeed 0, we call it a true negative: the value we predicted as 0 really is 0. That leaves two more cells. If the learning algorithm predicts 1 but the actual class is 0, we call it a false positive. For example, the algorithm predicts that some patients have cancer, but in fact they do not. Finally, if the algorithm predicts 0 but the actual class is 1, we call it a false negative. This 2x2 table of actual and predicted classes gives us a way to evaluate the performance of an algorithm.
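The four cells of the table can be counted directly from the actual and predicted labels. The following is a minimal sketch; the function name `confusion_counts` and the eight sample labels are hypothetical, and it assumes every label is exactly 0 or 1.

```python
def confusion_counts(y_true, y_pred):
    """Count the four cells of the 2x2 table (1 = positive, 0 = negative)."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for actual, predicted in zip(y_true, y_pred):
        if predicted == 1 and actual == 1:
            counts["tp"] += 1  # true positive: predicted 1, actually 1
        elif predicted == 1 and actual == 0:
            counts["fp"] += 1  # false positive: predicted 1, actually 0
        elif predicted == 0 and actual == 1:
            counts["fn"] += 1  # false negative: predicted 0, actually 1
        else:
            counts["tn"] += 1  # true negative: predicted 0, actually 0
    return counts


# Hypothetical labels for eight test samples.
y_true = [1, 1, 0, 0, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 0, 0]

counts = confusion_counts(y_true, y_pred)
print(counts)  # {'tp': 2, 'fp': 1, 'fn': 1, 'tn': 4}
```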
Two measures for evaluating skewed classes were mentioned earlier. The first is precision: of all the patients we predicted to have cancer, what fraction actually have cancer? That is, the precision of a classification model equals the number of true positives divided by the number of samples we predicted as positive. These are the patients we told, "You have cancer"; the fraction of them who really do have cancer is the precision. Put another way, the numerator is the number of true positives and the denominator is the number of predicted positives, which is the sum of the values in the first row of the table (true positives plus false positives). The higher the precision, the better. High precision means that when we predict a patient has cancer, the prediction is usually correct.
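Written as a formula, with TP and FP as shorthand for the numbers of true positives and false positives (the abbreviations are our notation, not the article's):

```latex
\[
\text{precision}
  = \frac{\text{true positives}}{\text{predicted positives}}
  = \frac{TP}{TP + FP}
\]
```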
The other measure we want to calculate is recall. Recall asks: of all the patients in the test set or cross-validation set who actually have cancer, what fraction did we correctly predict as having cancer? Recall is defined as the number of true positives divided by the number of actual positives. Written another way, it is true positives divided by true positives plus false negatives. As with precision, the higher the recall, the better.
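Written as a formula, with TP and FN as shorthand for the numbers of true positives and false negatives (the abbreviations are our notation, not the article's):

```latex
\[
\text{recall}
  = \frac{\text{true positives}}{\text{actual positives}}
  = \frac{TP}{TP + FN}
\]
```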
By computing precision and recall, we get a better sense of whether the classification model is any good. Specifically, if an algorithm always predicts y=0, that is, that no one has cancer, its recall is 0 because it has no true positives. So we can quickly see that a classification model that always predicts y=0 is not a good model. In general, even with very skewed classes, an algorithm cannot "deceive" us just by always predicting y=0 or always predicting y=1: it has no way of getting both high precision and high recall. So we can be more confident that a model with high precision and high recall is a good classification model, which gives us a better evaluation value and a more direct way to judge how good the model is. One last thing to keep in mind: in the definitions of precision and recall, we habitually use y=1 for the class that appears rarely. So if we are trying to detect a rare condition such as cancer (hopefully a rare condition), precision and recall are defined with y=1, not y=0, as the rare class we want to detect. In general, even with very skewed classes, if both precision and recall are high, that indicates the learning algorithm is performing well.
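A minimal sketch of this check, assuming a hypothetical test set of 1,000 patients with 5 actual cancer cases. The function name `precision_recall` is illustrative, and returning `None` when a denominator is zero is one possible convention chosen here, not something the course prescribes.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall), treating y = 1 as the rare positive class.

    A value is None when its denominator is zero and the ratio is undefined.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) > 0 else None
    recall = tp / (tp + fn) if (tp + fn) > 0 else None
    return precision, recall


# Hypothetical skewed test set: 5 positives, 995 negatives.
y_true = [1] * 5 + [0] * 995
always_zero = [0] * len(y_true)
always_one = [1] * len(y_true)

zero_scores = precision_recall(y_true, always_zero)
one_scores = precision_recall(y_true, always_one)
print(zero_scores)  # (None, 0.0)
print(one_scores)   # (0.005, 1.0)
```

Always predicting y=0 gives a recall of 0, and always predicting y=1 gives perfect recall but a precision of only 0.5%, so neither trivial predictor scores well on both measures.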
Stanford University open course, Machine Learning: Machine Learning System Design | Error metrics for skewed classes (definition of the skewed-class problem and its evaluation measures: precision and recall)