Easy-to-Learn Machine Learning Algorithms: Ensemble Methods

Source: Internet
Author: User

I. The idea behind ensemble learning

This series has introduced a number of algorithms, each with its own scope of application: some handle linearly separable problems, others handle linearly inseparable ones. In real life, a problem often becomes easier to solve thanks to "collective wisdom". This raises a question for machine learning: for a complex task, can many machine learning algorithms be combined so that the result performs better than any single algorithm does? That idea is the basis of ensemble learning.

Ensemble learning means combining multiple models to achieve better results and to give the combined model stronger generalization ability. There are several ways to combine multiple models:

    1. Select the model that performs best on the validation data set and use it as the final predictive model;
    2. Combine the predictions of multiple models by voting or by averaging;
    3. Combine the predictions of multiple models by weighted averaging.
These ideas correspond to the main learning frameworks in ensemble learning.
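
The three combination strategies listed above can be sketched in a few lines of Python. This is only a minimal illustration: the function names (`select_best`, `majority_vote`, `simple_average`, `weighted_average`) and the sample values are hypothetical and do not come from any particular library.

```python
from collections import Counter

def select_best(models, val_scores):
    """Strategy 1: keep the single model with the highest validation score."""
    best_index = max(range(len(models)), key=lambda i: val_scores[i])
    return models[best_index]

def majority_vote(labels):
    """Strategy 2 (classification): return the most frequently predicted label."""
    return Counter(labels).most_common(1)[0][0]

def simple_average(values):
    """Strategy 2 (regression): return the unweighted mean of the predictions."""
    return sum(values) / len(values)

def weighted_average(values, weights):
    """Strategy 3: return the mean of the predictions, weighted per model."""
    total = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total

# Hypothetical predictions from three models for one input.
print(select_best(["model_a", "model_b", "model_c"], [0.7, 0.9, 0.8]))
print(majority_vote(["cat", "dog", "cat"]))
print(simple_average([1.0, 2.0, 3.0]))
print(weighted_average([1.0, 3.0], [1.0, 3.0]))
```

In practice the weights in the third strategy are usually learned rather than fixed by hand, which is what boosting does.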

II. The main methods of ensemble learning

1. Strongly learnable and weakly learnable

In ensemble learning, multiple weak models are combined into a strong model by some combination method. The book "Statistical Learning Methods" introduces the concepts of "strongly learnable" and "weakly learnable". In the probably approximately correct (PAC) learning framework, a concept (a class) is called strongly learnable if there is a polynomial-time learning algorithm that can learn it with high accuracy. A concept is called weakly learnable if there is a polynomial-time learning algorithm that can learn it with an accuracy only slightly better than random guessing. Schapire showed that, in the PAC learning framework, a concept is strongly learnable if and only if it is weakly learnable. So for a learning problem, if we can find a "weak learning algorithm", we can turn it into a "strong learning algorithm".

2. Selecting the best-performing model on the validation set

The idea behind this approach is similar to that of a decision tree: under different conditions, the algorithm that satisfies the condition is chosen.

3. Voting or averaging over multiple models

Several models are trained on the data set. For classification problems, voting can be used, and the category with the most votes is chosen as the final category. For regression problems, averaging can be used, and the mean of the predictions is taken as the final result. The best-known method of this kind is Bagging. Bagging stands for Bootstrap aggregating, where Bootstrap is a sampling method with replacement whose sampling strategy is simple random sampling.
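
Sampling with replacement can be illustrated with a short sketch. The helper name `bootstrap_sample`, the seed, and the ten-item data set are assumptions made for this example only.

```python
import random

def bootstrap_sample(data, rng):
    """Draw len(data) items uniformly at random, with replacement."""
    n = len(data)
    return [data[rng.randrange(n)] for _ in range(n)]

rng = random.Random(0)      # fixed seed so the run is repeatable
data = list(range(10))      # hypothetical training set of ten items
sample = bootstrap_sample(data, rng)

# Items that were never drawn; some items usually appear more than once instead.
left_out = sorted(set(data) - set(sample))
print("bootstrap sample:", sample)
print("never drawn:     ", left_out)
```

Because each draw is independent, the sample has the same size as the original data set but typically repeats some items and omits others.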
In the Bagging method, the learning algorithm is trained several times. Each training set consists of samples drawn at random from the initial training set, so an initial training sample may appear several times in a given training set or not at all. Training yields a sequence of prediction functions, which are combined into the final prediction function in one of two ways, depending on whether the problem is classification or regression:
    1. Classification problem: voting, where the category with the most votes is the final category
    2. Regression problem: simple averaging
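
The Bagging procedure for classification might look like the sketch below. The base learner is a deliberately simple one-dimensional threshold classifier chosen for brevity; it, the function names, and the toy data are assumptions for illustration, not part of the original description.

```python
import random
from collections import Counter

def fit_stump(points):
    """Fit a 1-D threshold classifier that predicts 1 when x >= threshold.

    Tries every observed x as the threshold and keeps the one with the
    fewest training errors (the smallest threshold wins ties)."""
    best_threshold, best_errors = None, None
    for threshold in sorted({x for x, _ in points}):
        errors = sum((1 if x >= threshold else 0) != y for x, y in points)
        if best_errors is None or errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold

def bagging_fit(points, n_models, rng):
    """Train n_models stumps, each on its own bootstrap sample."""
    n = len(points)
    models = []
    for _ in range(n_models):
        sample = [points[rng.randrange(n)] for _ in range(n)]
        models.append(fit_stump(sample))
    return models

def bagging_predict(models, x):
    """Classification: every model votes, and the most common label wins."""
    votes = [1 if x >= threshold else 0 for threshold in models]
    return Counter(votes).most_common(1)[0][0]

# Hypothetical toy data: (feature, label) pairs.
points = [(0.0, 0), (1.0, 0), (2.0, 0), (3.0, 1), (4.0, 1), (5.0, 1)]
models = bagging_fit(points, n_models=25, rng=random.Random(0))
print(bagging_predict(models, 0.0), bagging_predict(models, 5.0))
```

For a regression problem the voting step would be replaced by the mean of the individual predictions.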
(Figure: image from reference article 2)

The random forest algorithm is a learning algorithm based on the Bagging idea.

4. Weighted averaging of the predictions of multiple models

The Bagging method described above is characterized by random sampling: new models are trained on repeated samples, and the final result averages over these models. Weighted averaging of the predictions of multiple models promotes multiple weak learning models to a strong learning model, which is the core idea of boosting.

In a boosting algorithm, each training sample is given an equal weight at initialization, and the learning algorithm is then run on the training set for a number of rounds. After each round, the training samples that were handled incorrectly are given greater weight, so that in later rounds the learning algorithm concentrates on the samples that are harder to learn. This produces a sequence of prediction functions, each with its own weight; a prediction function that predicts well receives a larger weight. The final prediction function is obtained in one of two ways, depending on whether the problem is classification or regression:
    1. Classification problem: weighted voting
    2. Regression problem: weighted averaging
(Figure: image from reference article 2)

AdaBoost and GBDT (Gradient Boosting Decision Tree) are the two best-known algorithms based on the boosting idea.
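
The reweighting loop can be made concrete with an AdaBoost-style sketch for binary classification with labels +1 and -1. The article does not give formulas, so the specific model weight (half the log-odds of the weighted error) and the exponential sample-weight update used here are the standard AdaBoost choices, brought in as an assumption. The threshold base learner, the function names, and the ten-point data set are likewise illustrative.

```python
import math

def stump_predict(x, threshold, polarity):
    """Predict `polarity` when x >= threshold, and -polarity otherwise."""
    return polarity if x >= threshold else -polarity

def best_stump(xs, ys, weights):
    """Return (weighted_error, threshold, polarity) with the lowest weighted error."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            error = sum(w for x, y, w in zip(xs, ys, weights)
                        if stump_predict(x, threshold, polarity) != y)
            if best is None or error < best[0]:
                best = (error, threshold, polarity)
    return best

def adaboost_fit(xs, ys, n_rounds):
    n = len(xs)
    weights = [1.0 / n] * n            # equal sample weights at initialization
    ensemble = []                      # (model_weight, threshold, polarity)
    for _ in range(n_rounds):
        error, threshold, polarity = best_stump(xs, ys, weights)
        error = min(max(error, 1e-10), 1 - 1e-10)    # guard against log(0)
        alpha = 0.5 * math.log((1 - error) / error)  # better stump, larger weight
        ensemble.append((alpha, threshold, polarity))
        # Misclassified samples are scaled up, correct ones scaled down.
        weights = [w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def adaboost_predict(ensemble, x):
    """Weighted vote: the sign of the weighted sum of stump predictions."""
    score = sum(alpha * stump_predict(x, threshold, polarity)
                for alpha, threshold, polarity in ensemble)
    return 1 if score >= 0 else -1

# Toy 1-D data that no single threshold can separate.
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
ensemble = adaboost_fit(xs, ys, n_rounds=3)
print([adaboost_predict(ensemble, x) for x in xs])
```

On this data set every single stump misclassifies at least three points, while the weighted vote of three stumps classifies all ten training points correctly.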


Reference articles

1. "Statistical Learning Methods"

2. "Statistical Learning Methods: CART, Bagging, Random Forest, Boosting"
