Trade-offs between precision and recall
Continuing with the cancer-prediction example: y = 1 means cancer. In logistic regression we normally predict y = 1 when hθ(x) >= 0.5, and y = 0 otherwise.
When we want to be more confident before predicting cancer (telling a patient they have cancer has a significant effect on them, e.g. sending them to therapy, so we want to be more certain before making that prediction): we can raise the threshold to 0.7. We then have high precision (we only predict cancer when very confident) and low recall. If the threshold is raised to 0.9, precision becomes even higher and recall even lower.
When we want to avoid missing patients who actually have cancer (avoid false negatives, i.e. failing to tell a patient they have cancer and thereby delaying treatment): we can lower the threshold to 0.3. We then have low precision (many of the positive predictions are actually wrong) and high recall (most of the actual cancer cases are caught).
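The effect of moving the threshold can be sketched directly from a classifier's predicted probabilities. The probabilities and labels below are made-up illustrative data, not from any real model:

```python
# Sketch: how the decision threshold trades precision against recall.
# y_true and probs are invented illustrative data.

def precision_recall(y_true, probs, threshold):
    preds = [1 if p >= threshold else 0 for p in probs]
    tp = sum(1 for t, p in zip(y_true, preds) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, preds) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, preds) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

y_true = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
probs  = [0.95, 0.80, 0.75, 0.60, 0.55, 0.45, 0.40, 0.35, 0.20, 0.10]

# Raising the threshold increases precision and decreases recall.
for threshold in (0.3, 0.5, 0.7, 0.9):
    p, r = precision_recall(y_true, probs, threshold)
    print(f"threshold={threshold}: precision={p:.2f}, recall={r:.2f}")
```

Running this shows precision rising and recall falling as the threshold moves from 0.3 up to 0.9, matching the intuition above.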
So for most classifiers of this kind, we need to weigh precision against recall.
The precision-recall curve (traced out as the threshold changes), as shown on the right, can take many different shapes depending on the specific algorithm.
So can we automatically pick the right threshold?
How to choose the right threshold
The three algorithms above have different thresholds, i.e. different precision and recall values, so which of the three models should we choose? We need a single evaluation metric to compare them.
Precision and recall themselves cannot serve as the evaluation metric, because they are two different numbers, so that option is ruled out.
If we use the average (P + R)/2 as the evaluation metric: the average of algorithm 3 is the largest, but algorithm 3 is not a good algorithm, because we could predict y = 1 for everything (by pushing the threshold down) to achieve high recall and low precision. That is obviously not a good algorithm, yet it has a very good average, so we cannot use the average as the evaluation metric.
F score (or F1 score): an evaluation metric used in machine learning to combine precision and recall (e.g. for selecting a threshold): F1 = 2PR/(P + R). When either precision or recall is small, the F value given by this formula is also small, which avoids the problem with the simple average described above. In other words, a large F value means that both precision and recall are reasonably large.
If either precision or recall is 0, then F = 0; for a perfect model, i.e. precision and recall both 1, F = 1. So in practice the F value ranges between 0 and 1.
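A small sketch makes the contrast with the average concrete. The three precision/recall pairs below are illustrative placeholders for the three algorithms discussed above, with "algorithm 3" standing in for the degenerate predict-everything-as-y=1 model:

```python
# Sketch: F1 = 2PR/(P+R) penalizes the degenerate "predict all y=1"
# model, while the simple average (P+R)/2 rewards it.

def f1(precision, recall):
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Illustrative precision/recall values for three hypothetical models.
models = {
    "Algorithm 1": (0.5, 0.4),
    "Algorithm 2": (0.7, 0.1),
    "Algorithm 3 (predict all y=1)": (0.02, 1.0),
}

for name, (p, r) in models.items():
    avg = (p + r) / 2
    print(f"{name}: average={avg:.3f}, F1={f1(p, r):.3f}")
```

Algorithm 3 wins on the average yet has by far the lowest F1, which is exactly why F1 is the better model-selection metric here. Note also the boundary behavior: f1(0, r) is 0 and f1(1, 1) is 1, matching the 0-to-1 range stated above.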
Summary
- Trade-offs between precision and recall (change their values by changing the threshold)
- Different thresholds correspond to different precision and recall values; to choose an appropriate threshold and get a good model, perform model selection using the F value on the cross-validation set
- If you want to select the threshold automatically, try a series of different thresholds and pick the best one on the cross-validation set
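The automatic selection in the last bullet can be sketched as a simple sweep: evaluate F1 on the cross-validation set at each candidate threshold and keep the best. The labels and probabilities below are invented for illustration:

```python
# Sketch: pick the threshold that maximizes F1 on a cross-validation set.
# y_cv and probs_cv are made-up illustrative data.

def f1_at_threshold(y_cv, probs_cv, threshold):
    preds = [1 if p >= threshold else 0 for p in probs_cv]
    tp = sum(1 for t, p in zip(y_cv, preds) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_cv, preds) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_cv, preds) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

y_cv = [1, 0, 1, 1, 0, 0, 1, 0]
probs_cv = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]

# Try a series of candidate thresholds; keep the one with the best F1.
candidates = [i / 100 for i in range(5, 100, 5)]
best_threshold = max(candidates, key=lambda t: f1_at_threshold(y_cv, probs_cv, t))
print("best threshold:", best_threshold,
      "F1:", round(f1_at_threshold(y_cv, probs_cv, best_threshold), 3))
```

In a real pipeline the probabilities would come from the trained classifier's output on held-out cross-validation examples, but the selection logic is the same.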
Handling skewed data---trading off precision and recall