Classification and evaluation metrics of machine learning algorithms

Last Update:2017-04-21 Source: Internet

Author: User

To get started with machine learning, we need a few basic concepts:

Definition of machine learning

Tom M. Mitchell's definition of machine learning is:

For a class of tasks T and a performance measure P, a computer program is said to learn from experience E if its performance on tasks in T, as measured by P, improves with experience E.

Algorithm classification

Two pictures summarize the classification of machine learning algorithms well:

Evaluation metrics

Classification algorithm metrics:

Accuracy
Precision
Recall
F1 Score

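The results of a classification problem can be expressed in the following table (note: true or false indicates whether the prediction is correct; positive or negative indicates the class predicted by the program):

                     Predicted positive      Predicted negative
Actual positive      True positive (TP)      False negative (FN)
Actual negative      False positive (FP)     True negative (TN)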
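As a minimal sketch, the four outcomes in such a table can be counted from paired true and predicted labels. The function name confusion_counts, the encoding 1 = positive and 0 = negative, and the sample labels are assumptions made for illustration only.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives/negatives for a binary problem."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1  # predicted positive, and it really is positive
            else:
                fp += 1  # predicted positive, but it is actually negative
        else:
            if actual == positive:
                fn += 1  # predicted negative, but it is actually positive
            else:
                tn += 1  # predicted negative, and it really is negative
    return tp, fp, fn, tn


# Made-up labels, used only to exercise the function.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
tp, fp, fn, tn = confusion_counts(y_true, y_pred)
print(tp, fp, fn, tn)  # 3 1 1 3
```

The four counts always add up to the number of samples, which is a quick sanity check when building the table by hand.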

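Accuracy

Accuracy is defined as the ratio of the number of samples correctly classified by the classifier to the total number of samples in a given test data set. The formula is:

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

where TP, TN, FP and FN are the numbers of true positives, true negatives, false positives and false negatives.

Accuracy is subject to the accuracy paradox: on a heavily imbalanced data set, a classifier that always predicts the majority class can reach a high accuracy while being useless in practice. See the explanation linked in the original article for details.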
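The accuracy paradox is easiest to see with numbers. The sketch below assumes a made-up data set of 95 negative and 5 positive samples and a hypothetical classifier that always answers "negative"; the function name accuracy is chosen for illustration.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(1 for actual, predicted in zip(y_true, y_pred) if actual == predicted)
    return correct / len(y_true)


# Hypothetical imbalanced data: 95 negatives (0) and 5 positives (1).
y_true = [0] * 95 + [1] * 5
# A degenerate "classifier" that never predicts the positive class.
always_negative = [0] * 100

print(accuracy(y_true, always_negative))  # 0.95

# None of the 5 actual positives is found, despite the high accuracy.
found_positives = sum(1 for a, p in zip(y_true, always_negative) if a == 1 and p == 1)
print(found_positives)  # 0
```

This is why accuracy alone is usually not enough on imbalanced data, and why precision and recall are reported alongside it.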

Precision

Precision is the proportion of samples predicted as positive that are actually positive. It can be understood as the ability to avoid "false positives". The formula is:

$$\text{Precision} = \frac{TP}{TP + FP}$$

where TP is the number of true positives and FP is the number of false positives.
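A minimal sketch of the precision calculation, assuming binary labels with 1 as the positive class. The function name and the sample labels are illustrative, and returning 0.0 when nothing is predicted positive is a convention chosen here, not something the article specifies.

```python
def precision(y_true, y_pred, positive=1):
    """TP / (TP + FP): how many predicted positives are really positive."""
    tp = sum(1 for a, p in zip(y_true, y_pred) if p == positive and a == positive)
    fp = sum(1 for a, p in zip(y_true, y_pred) if p == positive and a != positive)
    if tp + fp == 0:
        return 0.0  # assumed convention: no positive predictions at all
    return tp / (tp + fp)


# Made-up labels: three positive predictions, two of them correct.
y_true = [1, 0, 1, 1, 0]
y_pred = [1, 1, 1, 0, 0]
print(precision(y_true, y_pred))  # 2 / 3
```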

Recall

Recall is the proportion of samples that should have been classified as positive (those matching the target label) that the classifier actually classified as positive. It is the counterpart of precision and can be understood as the ability to avoid "false negatives". The formula is:

$$\text{Recall} = \frac{TP}{TP + FN}$$

where TP is the number of true positives and FN is the number of false negatives.
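A minimal sketch of the recall calculation, again assuming binary labels with 1 as the positive class. The function name and sample labels are illustrative, and returning 0.0 when there are no actual positives is an assumed convention.

```python
def recall(y_true, y_pred, positive=1):
    """TP / (TP + FN): how many actual positives were found."""
    tp = sum(1 for a, p in zip(y_true, y_pred) if a == positive and p == positive)
    fn = sum(1 for a, p in zip(y_true, y_pred) if a == positive and p != positive)
    if tp + fn == 0:
        return 0.0  # assumed convention: no actual positives in the data
    return tp / (tp + fn)


# Made-up labels: four actual positives, three of them found.
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1]
print(recall(y_true, y_pred))  # 0.75
```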

F1 Score

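The F1 value is the harmonic mean of precision and recall, defined as:

$$\frac{2}{F_1} = \frac{1}{P} + \frac{1}{R}$$

That is,

$$F_1 = \frac{2PR}{P + R}$$

where P is the precision and R is the recall.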
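To illustrate why a harmonic mean is used, the sketch below compares it with the arithmetic mean for an unbalanced pair of values. The function name f1 and the example numbers are assumptions for illustration; returning 0.0 when both inputs are zero is an assumed convention.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall: 2PR / (P + R)."""
    if precision + recall == 0:
        return 0.0  # assumed convention when both values are zero
    return 2 * precision * recall / (precision + recall)


p, r = 0.5, 1.0
print((p + r) / 2)  # arithmetic mean: 0.75
print(f1(p, r))     # harmonic mean: about 0.667, pulled toward the lower value

# A very low precision drags F1 down even when recall is perfect.
print(f1(0.01, 1.0))  # about 0.0198
```

Because the harmonic mean is dominated by the smaller of the two values, F1 is high only when precision and recall are both high.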

Application Scenarios:

Precision and recall affect each other. Ideally both would be high, but in general high precision comes with low recall, and high recall comes with low precision. If both are low, something is wrong with the model. F1 is high only when both precision and recall are high, so when both are required, F1 can be used as the measure.

Earthquake prediction
In earthquake prediction we want recall to be very high; that is, we want to predict every earthquake. We can sacrifice precision here: we would rather issue 1,000 alarms and correctly predict all 10 earthquakes than issue 100 alarms, predict 8 earthquakes correctly and miss 2.
Convicting suspects
Following the principle of never wronging an innocent person, we want the conviction of suspects to be very precise. Even if some criminals go free as a result (low recall), it is worth it.
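The earthquake numbers above can be turned into precision and recall directly. This sketch assumes one reading of the example: the first strategy raises 1,000 alarms and catches all 10 earthquakes, while the second raises 100 alarms, catches 8 and misses 2. The function name is illustrative.

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall) from raw counts."""
    return tp / (tp + fp), tp / (tp + fn)


# Strategy A: 1000 alarms, all 10 earthquakes predicted (990 false alarms).
p_a, r_a = precision_recall(tp=10, fp=990, fn=0)
# Strategy B: 100 alarms, 8 earthquakes predicted, 2 missed (92 false alarms).
p_b, r_b = precision_recall(tp=8, fp=92, fn=2)

print(f"A: precision={p_a:.2f}, recall={r_a:.2f}")  # A: precision=0.01, recall=1.00
print(f"B: precision={p_b:.2f}, recall={r_b:.2f}")  # B: precision=0.08, recall=0.80
```

Strategy B is eight times more precise, but strategy A is preferred for earthquakes because it misses nothing. The conviction scenario makes the opposite trade.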

Regression algorithm metrics:

Mean Absolute Error
Mean Squared Error
R2 Score
Explained Variance Score

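Mean Absolute Error

Formula:

$$\text{MAE} = \frac{1}{N}\sum_{i=1}^{N} \left| y_i - \hat{y}_i \right|$$

where y_i is the true value, \hat{y}_i is the predicted value and N is the number of samples.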
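A minimal sketch of the mean absolute error, taken as the average of the absolute differences between true and predicted values. The function name and the sample values are made up for illustration.

```python
def mean_absolute_error(y_true, y_pred):
    """Average of |true - predicted| over all samples."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up values: absolute errors are 0.5, 0.5, 0.0 and 1.0.
y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]
print(mean_absolute_error(y_true, y_pred))  # 0.5
```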

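Mean Squared Error

Formula:

$$\text{MSE} = \frac{1}{N}\sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^2$$

where y_i is the true value, \hat{y}_i is the predicted value and N is the number of samples.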
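A minimal sketch of the mean squared error, taken as the average of the squared differences between true and predicted values. The function name and the sample values are made up for illustration.

```python
def mean_squared_error(y_true, y_pred):
    """Average of (true - predicted)^2 over all samples."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up values: squared errors are 0.25, 0.25, 0.0 and 1.0.
y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]
print(mean_squared_error(y_true, y_pred))  # 0.375
```

Squaring penalizes large errors more heavily than small ones: the single error of 1.0 contributes two thirds of the total here.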

R2 Score

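That is, the coefficient of determination. It measures how well the model's predictions fit the true data; the best value is 1, and the value can be negative.

$$R^2 = 1 - \frac{\sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^2}{\sum_{i=1}^{N} \left( y_i - \bar{y} \right)^2}, \qquad \bar{y} = \frac{1}{N}\sum_{i=1}^{N} y_i$$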
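A minimal sketch of the coefficient of determination, assuming the usual definition of one minus the ratio of the residual sum of squares to the total sum of squares around the mean of the true values. The function name r2 and the sample values are made up for illustration.

```python
def r2(y_true, y_pred):
    """1 - SS_res / SS_tot, with SS_tot taken around the mean of y_true."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


y_true = [1.0, 2.0, 3.0]
print(r2(y_true, [1.0, 2.0, 3.0]))  # 1.0  (perfect fit)
print(r2(y_true, [2.0, 2.0, 2.0]))  # 0.0  (no better than predicting the mean)
print(r2(y_true, [3.0, 2.0, 1.0]))  # -3.0 (worse than predicting the mean)
```

The third case shows how the score becomes negative: the model's squared error is larger than that of simply predicting the mean. Note that this sketch divides by zero when all true values are equal.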

Explained Variance Score
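No formula is given for this score here. Assuming the definition used by scikit-learn's explained_variance_score, which is 1 - Var(y - ŷ) / Var(y), a minimal pure-Python sketch could look like the following. The function names and sample values are made up for illustration.

```python
def variance(values):
    """Population variance: mean squared deviation from the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def explained_variance(y_true, y_pred):
    """1 - Var(residuals) / Var(y_true), under the assumed definition."""
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    return 1.0 - variance(residuals) / variance(y_true)


# Made-up values.
y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]
print(round(explained_variance(y_true, y_pred), 4))  # 0.9572

# Predictions that are off by a constant still score 1.0, because
# the residuals have zero variance.
print(explained_variance([1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 4.0, 5.0]))  # 1.0
```

Under this definition the score ignores a constant bias in the predictions, which is the main way it differs from the R2 score.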
