Evaluation and selection of machine learning models

Source: Internet
Author: User

2.1 Empirical error and overfitting

Basic concepts:

Error rate: the number of misclassified samples divided by the total number of samples

Training error (empirical error): the error of the learner on the training set

Generalization error: the error of the learner on new, unseen samples; in practice it is estimated by the error on a test set
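As a minimal sketch of these definitions, the function below computes an error rate from two label lists. The function name and the label values are hypothetical and chosen only for illustration.

```python
def error_rate(y_true, y_pred):
    """Fraction of samples whose predicted label differs from the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    wrong = sum(1 for t, p in zip(y_true, y_pred) if t != p)
    return wrong / len(y_true)

# Hypothetical labels, for illustration only.
# Error on the data the learner was trained on (training / empirical error).
train_error = error_rate([1, 0, 1, 1], [1, 0, 1, 1])
# Error on held-out data, used as an estimate of the generalization error.
test_error = error_rate([1, 0, 0, 1], [1, 1, 0, 0])
```

A training error far below the test error, as in this toy case, is the typical symptom of overfitting.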

2.2 Evaluation methods

In practice there are many learning algorithms to choose from, and for a given problem we must decide which algorithm and which parameter configuration to use; this is the model selection problem in machine learning. The generalization error cannot be obtained directly, and the training error is not a suitable criterion because of overfitting, so how do we evaluate and select a model? The usual answer is to measure the error on a separate test set and use it as an approximation of the generalization error.

The test set is drawn at random from the available data and must be mutually exclusive with the training set, so that no test sample is used for training. A few common practices are described below; each processes the dataset D appropriately to produce a training set S and a test set T.

2.2.1 Hold-out method (2/3 to 4/5 of the samples used for training)

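Note: The split should keep the data distribution as consistent as possible, so that the partitioning process itself does not introduce additional bias that affects the final result.

Method: Stratified sampling (sample separately within each category so that class proportions are preserved)

A single split is not reliable, so the random split is usually repeated several times and the evaluation results are averaged.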
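The stratified split described above can be sketched as follows. This is an illustrative sketch rather than a reference implementation: the function name `stratified_holdout`, the `train_fraction` parameter, and the toy labels are all assumptions made for the example.

```python
import random
from collections import defaultdict

def stratified_holdout(labels, train_fraction=0.7, seed=0):
    """Return (train_idx, test_idx) with class proportions roughly preserved."""
    rng = random.Random(seed)
    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train_idx, test_idx = [], []
    # Split each class separately so both sets keep similar class ratios.
    for indices in by_class.values():
        rng.shuffle(indices)
        cut = round(len(indices) * train_fraction)
        train_idx.extend(indices[:cut])
        test_idx.extend(indices[cut:])
    return sorted(train_idx), sorted(test_idx)

# Hypothetical dataset: 10 samples of class 0 and 10 of class 1.
labels = [0] * 10 + [1] * 10
train_idx, test_idx = stratified_holdout(labels, train_fraction=0.7, seed=0)
```

To follow the advice about repeated splits, one could call the function with several different `seed` values, evaluate the model on each split, and average the resulting scores.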

2.2.2 Cross-validation method (for example, 10 times 10-fold)

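Method: Divide the dataset D into k mutually exclusive subsets of similar size. In each round, use the union of k-1 subsets as the training set and the remaining subset as the test set; repeat for all k choices of test subset and average the k results.

Note: To reduce the effect of one particular partition, repeat the k-fold procedure with different random partitions and average the results, for example 10 repetitions of 10-fold cross-validation.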
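A minimal sketch of the k-fold partitioning described above, using only the standard library. The generator name `k_fold_splits` and its parameters are hypothetical; the sketch shuffles indices once and does not stratify by class.

```python
import random

def k_fold_splits(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # k disjoint folds whose sizes differ by at most one sample.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        # Training set is the union of the other k-1 folds.
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train_idx, test_idx

# Hypothetical example: 25 samples, 5 folds.
splits = list(k_fold_splits(n_samples=25, k=5, seed=0))
```

Repeating the call with different `seed` values and averaging over all rounds gives the repeated variant, such as 10 times 10-fold.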

2.2.3 Bootstrap method

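Method: Given a dataset D with m samples, draw m samples from D with replacement to form the training set; the samples that are never drawn can be used as the test set.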

Note: The dataset generated by the bootstrap method has a different distribution from the initial dataset, which introduces estimation bias, so the hold-out and cross-validation methods are more commonly used when the initial data volume is sufficient.
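The sampling-with-replacement procedure can be sketched as below. The probability that a given sample is never drawn in m draws is (1 - 1/m)^m, which approaches 1/e (about 0.368) as m grows, so roughly a third of the data ends up outside the training set. The function name `bootstrap_split` and the sample count are assumptions made for this illustration.

```python
import random

def bootstrap_split(n_samples, seed=0):
    """Draw n_samples indices with replacement; undrawn indices form the test set."""
    rng = random.Random(seed)
    train_idx = [rng.randrange(n_samples) for _ in range(n_samples)]
    drawn = set(train_idx)
    # Samples that were never drawn ("out-of-bag") are held out for testing.
    test_idx = [i for i in range(n_samples) if i not in drawn]
    return train_idx, test_idx

# Hypothetical dataset size, large enough for the 1/e limit to be visible.
train_idx, test_idx = bootstrap_split(n_samples=10000, seed=0)
oob_fraction = len(test_idx) / 10000  # expected to be close to 0.368
```

Because the training set contains duplicates, its distribution differs from that of the original data, which is the source of the estimation bias mentioned in the note.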

2.3 Performance Metrics

A performance metric measures the generalization ability of a model. When comparing different models, different performance metrics often lead to different conclusions. Whether a model is good or bad is therefore relative: it depends not only on the algorithm and the data but also on the requirements of the task.

Error rate: the proportion of misclassified samples among all samples.

  Recall: the proportion of all actual positive cases that are predicted as positive

  Precision: the proportion of cases predicted as positive that are actually positive

  P-R curve (and the area under it):

Sort the samples by the learner's prediction scores, so that the samples the learner considers "most likely" to be positive come first and those it considers "least likely" come last. Going down this ranking one sample at a time, treat all samples up to the current position as predicted positives and compute the precision and recall at that point. Plotting precision on the vertical axis against recall on the horizontal axis gives the P-R curve.

  The break-even point (BEP) is the point where precision equals recall, that is, the intersection of the line y=x with the P-R curve; learners can be compared by their BEP values.
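The ranking procedure above can be sketched as follows. This is a simplified illustration: the function names, the toy labels, and the scores are hypothetical, labels are assumed to be 1 for positive and 0 for negative, and at least one positive sample is assumed. The break-even point is approximated by the curve point where precision and recall are closest.

```python
def pr_curve(y_true, scores):
    """Return (recall, precision) points, one per cutoff position in the ranking."""
    # Rank samples from most to least likely positive.
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    total_pos = sum(y_true)
    points = []
    tp = 0
    for k, (_, label) in enumerate(ranked, start=1):
        tp += label  # the top-k samples are treated as predicted positives
        points.append((tp / total_pos, tp / k))
    return points

def break_even_point(points):
    """Return the curve point where precision and recall are closest."""
    return min(points, key=lambda rp: abs(rp[0] - rp[1]))

# Hypothetical labels and scores, for illustration only.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]
points = pr_curve(y_true, scores)
bep = break_even_point(points)
```

In this toy ranking, precision and recall coincide when the top three samples are predicted positive, which is where the curve crosses the line y=x.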

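  F1: the harmonic mean of precision (P) and recall (R):

  $$F_1 = \frac{2 \times P \times R}{P + R}$$

  Fβ: a weighted harmonic mean that expresses different preferences for recall and precision:

  $$F_\beta = \frac{(1 + \beta^2) \times P \times R}{(\beta^2 \times P) + R}$$

  With β = 1 this reduces to F1; β > 1 gives recall more weight, and β < 1 gives precision more weight.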
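A short sketch of the F1 and F-beta measures using their standard definitions, F-beta = (1 + beta^2) * P * R / (beta^2 * P + R), with F1 as the special case beta = 1. The function name `f_beta` and the example values are assumptions chosen for illustration.

```python
def f_beta(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall; beta=1 gives F1."""
    if precision == 0.0 and recall == 0.0:
        return 0.0  # avoid division by zero when both are zero
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Hypothetical learner with low precision and high recall.
f1 = f_beta(0.5, 1.0)                 # balanced
f2 = f_beta(0.5, 1.0, beta=2.0)       # beta > 1 favors recall
f_half = f_beta(0.5, 1.0, beta=0.5)   # beta < 1 favors precision
```

Because recall is the stronger of the two values in this example, the recall-weighted score is the highest and the precision-weighted score is the lowest.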

With the evaluation methods above (hold-out, cross-validation), we obtain multiple confusion matrices. There are usually two ways to combine them. "Macro-F1" computes precision and recall on each confusion matrix, averages them, and then computes F1 from the averages. "Micro-F1" first averages the corresponding elements of the confusion matrices and then computes precision, recall, and F1 from the averaged matrix.
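The two averaging strategies can be compared with a small sketch. Each confusion matrix is represented here by a hypothetical (TP, FP, FN) tuple; the function names and counts are illustrative. The macro variant shown averages precision and recall before computing F1, which is an assumption about the intended definition (another common variant averages the per-matrix F1 values instead).

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall and F1 from true-positive, false-positive, false-negative counts."""
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1

def macro_f1(matrices):
    """Average precision and recall over the matrices, then compute F1."""
    stats = [precision_recall_f1(tp, fp, fn) for tp, fp, fn in matrices]
    macro_p = sum(s[0] for s in stats) / len(stats)
    macro_r = sum(s[1] for s in stats) / len(stats)
    return 2 * macro_p * macro_r / (macro_p + macro_r) if macro_p + macro_r else 0.0

def micro_f1(matrices):
    """Pool the counts over the matrices, then compute F1 once."""
    # Summing gives the same F1 as averaging, since the scale factor cancels.
    tp = sum(m[0] for m in matrices)
    fp = sum(m[1] for m in matrices)
    fn = sum(m[2] for m in matrices)
    return precision_recall_f1(tp, fp, fn)[2]

# Two hypothetical confusion matrices as (TP, FP, FN).
matrices = [(8, 2, 2), (1, 1, 3)]
macro = macro_f1(matrices)
micro = micro_f1(matrices)
```

Macro averaging treats every matrix equally, while micro averaging lets matrices with more samples dominate, so the two values generally differ.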

ROC and AUC

ROC stands for "Receiver Operating Characteristic".
