Ensemble Methods

Discussions of machine learning usually focus on specific algorithms, such as decision trees or KNN. In real-world work and in Kaggle competitions, however, ensemble methods often give the best results, at the cost of longer training time.

So-called ensemble methods combine several machine learning algorithms, or several differently parameterized instances of one algorithm, into a single model. They basically fall into the following two categories:

Averaging methods: train several algorithms (or several parameterizations of one algorithm) on all or part of the training data, and take the average of their outputs as the final prediction. Examples include bagging and forests of randomized trees (random forests).

Averaging is relatively simple. The main design work lies in choosing the training data for each member: whether to sample randomly, whether to sample with replacement, how many subsets to draw, and how large each subset should be. After that, each model is trained independently and the results are averaged. The base learners here are usually strong, complex algorithms; a single strong model easily overfits, and aggregating many of them largely cancels the overfitting out.
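To make this concrete, here is a minimal from-scratch sketch of the bagging pattern in Python (the language, the synthetic dataset, and all names are my choices for illustration; the article itself shows no code):

```python
# Minimal from-scratch sketch of bagging: bootstrap-sample the data,
# train one tree per sample, average the predictions.
# (Dataset and all names here are illustrative assumptions.)
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=1)
rng = np.random.default_rng(1)

models = []
for _ in range(25):
    # Random sampling WITH replacement (the bootstrap), same size as X.
    idx = rng.integers(0, len(X), size=len(X))
    models.append(DecisionTreeClassifier(random_state=0).fit(X[idx], y[idx]))

# The averaging step: mean of per-model class probabilities, then argmax.
avg_proba = np.mean([m.predict_proba(X) for m in models], axis=0)
print("ensemble training accuracy:", (avg_proba.argmax(axis=1) == y).mean())
```

scikit-learn packages this same pattern as BaggingClassifier, and RandomForestClassifier adds per-split feature randomization on top of it.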

Boosting methods: start with a basic algorithm, then have each subsequent algorithm use the results of the previous ones, focusing on the data they got wrong, so that the error rate keeps dropping. The motivation is to combine several simple, weak algorithms into one very powerful combined algorithm. "Boosting" means promoting a "weak learning algorithm" into a "strong learning algorithm"; it is a step-by-step improvement process and, in some ways, resembles how a neural network is trained. Classic algorithms include AdaBoost (adaptive boosting) and gradient tree boosting (GBDT).
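For a quick feel of the two algorithms named above, here is a small scikit-learn sketch (the library and the synthetic dataset are my assumptions, not the article's):

```python
# Both boosting algorithms build an additive model out of many shallow
# ("weak") trees; scikit-learn provides ready-made implementations.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for model in (AdaBoostClassifier(n_estimators=100, random_state=0),
              GradientBoostingClassifier(n_estimators=100, random_state=0)):
    model.fit(X_train, y_train)
    print(type(model).__name__, model.score(X_test, y_test))
```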

This method generally chooses a very simple, weak algorithm as the base learner; because the model improves round by round, the final combination can be very strong. Boosting is comparatively complex, and several aspects matter (a from-scratch sketch follows this list):

1) The overall process: weak learners are trained one after another, and each round re-weights the training samples so that the next learner concentrates on the examples the previous ones misclassified.

2) How to start: choose a weak classifier that is just slightly better than random guessing, that is, one with weight a > 0 (equivalently, error rate e < 0.5).

3) How earlier rounds influence later ones, i.e. how the model improves: the sample weights w from one round are modified by some rule to become the weights for the next round, so how w is updated is critical. The choice of loss function (the gap between the algorithm's predicted values and the actual values) and the way it is minimized determine how the weights w are updated and, ultimately, how well the boosting works; the common boosting variants differ mainly in which loss function they pair with this scheme (AdaBoost, for instance, corresponds to the exponential loss).

4) The final model: a weighted combination of the weak learners. For binary classification with weak learners G_m(x) and weights a_m, the combined prediction is

F(x) = sign( a_1*G_1(x) + a_2*G_2(x) + ... + a_M*G_M(x) ),  with  a = (1/2) * ln((1 - e) / e),

where e is the error rate of a weak learner. If e < 0.5, then a > 0; and the smaller e is (the fewer the errors), the larger a becomes (the higher the weight), i.e. the more say that learner has in the final result.
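The steps above map directly onto code. Below is a minimal from-scratch AdaBoost sketch for labels in {-1, +1} (function names and the dataset are illustrative, not from the article); note how a = (1/2) ln((1 - e)/e) and the update of the sample weights w implement points 2) through 4):

```python
# From-scratch AdaBoost sketch for labels in {-1, +1}.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

def adaboost_fit(X, y, n_rounds=50):
    n = len(X)
    w = np.full(n, 1.0 / n)                 # start with uniform sample weights
    stumps, alphas = [], []
    for _ in range(n_rounds):
        stump = DecisionTreeClassifier(max_depth=1)   # weak learner: a stump
        stump.fit(X, y, sample_weight=w)
        pred = stump.predict(X)
        e = w[pred != y].sum()              # weighted error rate
        if e >= 0.5:                        # no better than random: stop
            break
        a = 0.5 * np.log((1 - e) / max(e, 1e-12))     # a > 0 iff e < 0.5
        w *= np.exp(-a * y * pred)          # up-weight misclassified samples
        w /= w.sum()                        # renormalize the weights
        stumps.append(stump)
        alphas.append(a)
    return stumps, alphas

def adaboost_predict(stumps, alphas, X):
    # Final form: sign of the a-weighted vote of all weak learners.
    return np.sign(sum(a * s.predict(X) for a, s in zip(alphas, stumps)))

X, y = make_classification(n_samples=400, random_state=2)
y = 2 * y - 1                               # map {0, 1} -> {-1, +1}
stumps, alphas = adaboost_fit(X, y)
print("training accuracy:", (adaboost_predict(stumps, alphas, X) == y).mean())
```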



"NOTE" In this article, the algorithm itself is not important, it is important, what is ensemble Methods, and how to do the boost process. Once the loss function is determined, we can use the partial derivative + find the gradient direction to handle the boost direction.

Reference articles:
http://blog.csdn.net/dark_scope/article/details/24863289
http://blog.csdn.net/dark_scope/article/details/14103983
The Elements of Statistical Learning
