Comparison of several boosting algorithms (discrete AdaBoost, real AdaBoost, LogitBoost, gentle AdaBoost) in machine learning

Source: Internet
Author: User
About boosting algorithms


Boosting algorithms are a family of ensemble learning methods grounded in PAC (probably approximately correct) learning theory. The fundamental idea is to combine several simple weak classifiers into a strong classifier with high accuracy, and PAC learning theory confirms that this approach is feasible. The comparison of boosting algorithms below is organized from the article "Additive Logistic Regression: A Statistical View of Boosting".

  Steps of the boosting algorithms



The most commonly used variant is probably discrete AdaBoost, mainly because it is simple yet performs decently. The steps of the discrete AdaBoost algorithm are as follows:

As can be seen, each weak classifier in discrete AdaBoost outputs only 1 or -1, with no probability of belonging to a class, which is somewhat coarse.
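The discrete AdaBoost procedure can be sketched in Python. This is a minimal illustration using exhaustive threshold stumps on 1-D data; the function names, the stump search, and the round count `M` are choices of this sketch, not the paper's notation:

```python
import numpy as np

def fit_stump(X, y, w):
    """Weighted decision stump on 1-D data: best threshold and orientation."""
    best = None
    for thr in np.unique(X):
        for sign in (1, -1):
            pred = np.where(X >= thr, sign, -sign)
            err = np.sum(w * (pred != y))   # weighted 0-1 error
            if best is None or err < best[0]:
                best = (err, thr, sign)
    return best

def discrete_adaboost(X, y, M=10):
    n = len(X)
    w = np.full(n, 1.0 / n)                 # start with uniform weights
    ensemble = []
    for _ in range(M):
        err, thr, sign = fit_stump(X, y, w)
        err = max(err, 1e-10)               # guard against log(0)
        c = np.log((1 - err) / err)         # weight of this weak classifier
        pred = np.where(X >= thr, sign, -sign)
        w *= np.exp(c * (pred != y))        # upweight the misclassified points
        w /= w.sum()
        ensemble.append((c, thr, sign))
    return ensemble

def predict(ensemble, X):
    # Final classifier: sign of the weighted vote of all stumps.
    F = sum(c * np.where(X >= thr, sign, -sign) for c, thr, sign in ensemble)
    return np.sign(F)
```

Note that each stump contributes only its vote c * (+1 or -1), which is exactly the coarse hard output described above.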
If each weak classifier instead outputs the probability that a sample belongs to a class, we obtain the real AdaBoost algorithm, whose steps are as follows:

In real AdaBoost, each weak classifier outputs the probability of belonging to a certain class; this probability value in (0, 1) is mapped to the real line by a logarithmic (half log-odds) function, and the final classifier is the sum of all the mapped functions.
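A minimal sketch of this procedure, assuming 1-D data and threshold stumps; the helper names and the `eps` probability clipping are choices of the sketch:

```python
import numpy as np

def real_stump(X, y, w, eps=1e-6):
    """On each side of a threshold, estimate p = P_w(y=1) and output the
    half log-odds 0.5*log(p/(1-p)); keep the threshold with the smallest
    weighted exponential loss."""
    best = None
    for thr in np.unique(X):
        f = np.zeros(len(X))
        for side in (X >= thr, X < thr):
            if side.any():
                p = np.clip(w[side & (y == 1)].sum() / w[side].sum(), eps, 1 - eps)
                f[side] = 0.5 * np.log(p / (1 - p))   # probability -> real line
        loss = np.sum(w * np.exp(-y * f))
        if best is None or loss < best[0]:
            best = (loss, thr, f)
    return best[1], best[2]

def real_adaboost(X, y, M=5):
    n = len(X)
    w = np.full(n, 1.0 / n)
    stumps = []
    for _ in range(M):
        thr, f = real_stump(X, y, w)
        w *= np.exp(-y * f)                  # reweight by the exponential criterion
        w /= w.sum()
        f_hi = f[X >= thr][0]                # real-valued output, right side
        f_lo = f[X < thr][0] if (X < thr).any() else 0.0
        stumps.append((thr, f_lo, f_hi))
    return stumps

def predict_real(stumps, X):
    # Final classifier: sign of the sum of all mapped functions.
    F = sum(np.where(X >= thr, f_hi, f_lo) for thr, f_lo, f_hi in stumps)
    return np.sign(F)
```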



By merging the two steps of each real AdaBoost iteration and directly producing a function that maps into the reals, we obtain gentle AdaBoost, whose steps are as follows:

In each iteration, gentle AdaBoost performs a weighted least-squares regression, and the final classifier is the sum of all the regression functions.
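That iteration can be sketched as follows, assuming 1-D data and a regression stump fitted by weighted least squares (the helper names are assumptions of the sketch):

```python
import numpy as np

def gentle_stump(X, y, w):
    """Weighted least-squares regression stump: each side of the split
    predicts the weighted mean of y there (a value in [-1, 1])."""
    best = None
    for thr in np.unique(X):
        f = np.zeros(len(X))
        for side in (X >= thr, X < thr):
            if side.any():
                f[side] = np.sum(w[side] * y[side]) / np.sum(w[side])
        sse = np.sum(w * (y - f) ** 2)      # weighted squared error
        if best is None or sse < best[0]:
            best = (sse, thr, f)
    return best[1], best[2]

def gentle_adaboost(X, y, M=10):
    n = len(X)
    w = np.full(n, 1.0 / n)
    stumps = []
    for _ in range(M):
        thr, f = gentle_stump(X, y, w)
        w *= np.exp(-y * f)                 # exponential reweighting
        w /= w.sum()
        f_hi = f[X >= thr][0]
        f_lo = f[X < thr][0] if (X < thr).any() else 0.0
        stumps.append((thr, f_lo, f_hi))
    return stumps

def predict_gentle(stumps, X):
    # Final classifier: sign of the sum of all regression functions.
    F = sum(np.where(X >= thr, f_hi, f_lo) for thr, f_lo, f_hi in stumps)
    return np.sign(F)
```

The regression output is bounded in [-1, 1] by construction, which is why this variant is called "gentle": no single weak learner can dominate the sum.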



The LogitBoost algorithm resembles gentle AdaBoost, but the working response z that the regression is fitted to is updated at every iteration, whereas gentle AdaBoost fits y directly. The LogitBoost steps are as follows:
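A sketch of those steps, again assuming 1-D data and a weighted least-squares stump; for brevity it returns only in-sample predictions, and the `zmax` clipping of the working response is a standard numerical safeguard:

```python
import numpy as np

def ls_stump(X, z, w):
    """Weighted least-squares stump fitted to the working response z."""
    best = None
    for thr in np.unique(X):
        f = np.zeros(len(X))
        for side in (X >= thr, X < thr):
            if side.any():
                f[side] = np.sum(w[side] * z[side]) / np.sum(w[side])
        sse = np.sum(w * (z - f) ** 2)
        if best is None or sse < best[0]:
            best = (sse, thr, f)
    return best[2]

def logitboost(X, y, M=10, zmax=4.0):
    n = len(X)
    ystar = (y + 1) / 2.0                   # map {-1, 1} labels to {0, 1}
    F = np.zeros(n)
    p = np.full(n, 0.5)                     # start from p(x) = 1/2, F(x) = 0
    for _ in range(M):
        w = np.clip(p * (1 - p), 1e-10, None)
        z = np.clip((ystar - p) / w, -zmax, zmax)   # Newton working response
        F += 0.5 * ls_stump(X, z, w)        # half-step update of F
        p = 1.0 / (1.0 + np.exp(-2.0 * F))  # update class probabilities
    return np.sign(F)                       # in-sample predictions
```

The key difference from gentle AdaBoost is visible in the loop: the regression target z and the weights w are recomputed from the current probabilities p at every iteration.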

Differences in principle among the 4 boosting algorithms



Although the overall structure of the 4 boosting algorithms above is quite similar, how is the specific form of each derived?
The first ingredient is the loss function (or cost function); the ones seen most often are the squared error and the likelihood. Among the algorithms above, discrete AdaBoost, real AdaBoost, and gentle AdaBoost all minimize the exponential loss J(F) = E[exp(-y F(x))].

This criterion behaves essentially like the number of classification errors: exp(-y F(x)) >= 1 exactly when a sample is misclassified, so the loss upper-bounds the error count.
The LogitBoost algorithm, by contrast, is derived by maximizing the binomial log-likelihood.
The second ingredient is the optimization method: discrete AdaBoost and real AdaBoost are optimized mainly by a method akin to gradient descent, while gentle AdaBoost and LogitBoost are optimized by Newton-style iterations.
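The relation between the exponential loss and the error count can be checked numerically: since exp(-y F(x)) >= 1 whenever y F(x) < 0, the average exponential loss upper-bounds the error rate. The margin values below are arbitrary illustration data:

```python
import numpy as np

# Margins m = y * F(x); a negative margin means a misclassified sample.
margins = np.array([2.0, 0.5, -0.3, 1.2, -1.5])
exp_loss = np.exp(-margins).mean()      # average exponential loss
error_rate = (margins < 0).mean()       # fraction misclassified
assert exp_loss >= error_rate           # the bound holds pointwise, so also on average
```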


  Differences in performance among the algorithms



The reference article mentioned above compares the performance of the algorithms extensively. Overall, the ordering from best to worst is LogitBoost, gentle AdaBoost, real AdaBoost, discrete AdaBoost. If the weak classifier is a stump (a decision tree with only 2 leaf nodes), discrete AdaBoost performs much worse than the other 3 algorithms, probably because its systematic bias is too large, which inflates the generalization error. If the weak classifier is a deeper decision tree (4 or 8 leaf nodes), the results of discrete AdaBoost improve greatly, while the other 3 algorithms change little.



The AdaBoost variant we usually use is mostly discrete AdaBoost. From the results above, its model is comparatively simple and therefore needs somewhat more accurate weak classifiers, so in practice it is best to give each weak classifier 4 or 8 leaf nodes.
The paper above contains many more interesting conclusions about boosting, which will not be repeated here; please refer to it directly.

