Machine Learning School Recruitment Notes 3: Ensemble Learning (AdaBoost)

The boosting method starts from a weak learning algorithm: it learns repeatedly to obtain a series of weak classifiers (base classifiers), and then combines these weak classifiers to build a strong classifier. Most boosting methods work by changing the probability distribution (weight distribution) of the training data and calling the weak learning algorithm on each different training distribution, thereby learning a series of weak classifiers.

Ensemble learning can be divided into two classes according to whether there is a dependency between the individual learners: in the first class there is a strong dependency between the individual learners, and in the second class there is not. The representative algorithms of the first class are the boosting family, and among them AdaBoost is one of the best-known algorithms. AdaBoost can be used for either classification or regression.

First, weak learner 1 is trained on the training set with the initial sample weights, and the sample weights are then updated according to this weak learner's error rate: the training samples misclassified by weak learner 1 receive higher weights, so that these high-error points get more attention from weak learner 2. Weak learner 2 is then trained on the reweighted training set. This is repeated until the number of weak learners reaches a predetermined number T, and finally the T weak learners are combined through the chosen combination strategy to obtain the final strong learner.
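To make the loop concrete, here is a minimal Python sketch of the procedure just described, assuming binary labels in {-1, +1} and scikit-learn decision stumps as the weak learners; the error rate, coefficient, and weight-update formulas are the standard AdaBoost ones worked out later in this note, and the function names are only illustrative.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def boost(X, y, T=10):
    """Train T weak learners sequentially, reweighting samples each round."""
    n = len(y)
    w = np.full(n, 1.0 / n)                  # initial uniform sample weights
    learners, alphas = [], []
    for _ in range(T):
        stump = DecisionTreeClassifier(max_depth=1)
        stump.fit(X, y, sample_weight=w)     # weak learner on current weights
        pred = stump.predict(X)
        err = np.sum(w[pred != y])           # weighted error rate e_m
        if err >= 0.5:                       # no better than chance: stop
            break
        alpha = 0.5 * np.log((1.0 - err) / max(err, 1e-12))  # learner coefficient
        w *= np.exp(-alpha * y * pred)       # raise weights of misclassified samples
        w /= w.sum()                         # renormalize to a distribution
        learners.append(stump)
        alphas.append(alpha)
    return learners, alphas

def predict(learners, alphas, X):
    """Combine the weak learners by a weighted majority vote."""
    score = sum(a * clf.predict(X) for a, clf in zip(alphas, learners))
    return np.sign(score)
```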

However, there are a few specific questions that the boosting framework above does not answer in detail.

1) How to calculate the learning error rate e?

2) How to determine the weight coefficient α of each weak learner?

3) How to update the sample weight distribution D?

4) Which combination strategy to use?

Any algorithm in the boosting family must answer these four questions. So how does AdaBoost answer them?

The basic idea of the AdaBoost algorithm

For the boosting method, two questions must be answered:

1) How to change the weights or probability distribution of the training data in each round

AdaBoost's approach: increase the weights of the samples that were misclassified by the previous round's weak classifier, and decrease the weights of the samples that were classified correctly.
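Written out, this is the standard AdaBoost weight update for labels y_i in {-1, +1} (the same formula the worked example below relies on):

w_{m+1,i} = \frac{w_{m,i}}{Z_m} \exp\bigl(-\alpha_m \, y_i \, G_m(x_i)\bigr),
\qquad
Z_m = \sum_{i=1}^{N} w_{m,i} \exp\bigl(-\alpha_m \, y_i \, G_m(x_i)\bigr)

where G_m is the m-th weak classifier, α_m its coefficient, and Z_m a normalization factor that keeps the weights summing to 1.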

2) How to combine the weak classifiers into a strong classifier

AdaBoost's approach: use a weighted majority vote. Increase the weight of weak classifiers with a small classification error rate so that they play a larger role in the vote, and decrease the weight of weak classifiers with a large classification error rate.

The classification problem of AdaBoost
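For binary classification with labels y_i in {-1, +1}, the standard quantities are the weighted error rate, the classifier coefficient, and the final sign-of-weighted-sum classifier:

e_m = \sum_{i=1}^{N} w_{m,i} \, I\bigl(G_m(x_i) \neq y_i\bigr),
\qquad
\alpha_m = \frac{1}{2} \ln \frac{1 - e_m}{e_m},
\qquad
G(x) = \operatorname{sign}\Bigl(\sum_{m=1}^{M} \alpha_m G_m(x)\Bigr)

so a weak classifier with error rate below 0.5 gets a positive coefficient, and the smaller its error rate, the larger its vote.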

The regression problem of AdaBoost

Examples of AdaBoost

See also: http://blog.csdn.net/tiandijun/article/details/48036025

In the diagram, "+" and "-" represent the two classes, and horizontal or vertical lines are used as the weak classifiers.

1. Step one:

The algorithm first assigns a uniform distribution D1, so the weight of each point when training h1 is 0.1. With this split, three points are misclassified, so according to the algorithm's error expression:

ε1 = 0.1 + 0.1 + 0.1 = 0.3

α1 = 0.42
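The value 0.42 follows directly from the coefficient formula given above:

\alpha_1 = \frac{1}{2} \ln \frac{1 - \varepsilon_1}{\varepsilon_1}
         = \frac{1}{2} \ln \frac{0.7}{0.3}
         \approx 0.4236 \approx 0.42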

According to the classification result, a new sample distribution D2 and the sub-classifier h1 are obtained.

The circled samples are the ones that were misclassified; the larger "+" marks in the right-hand image indicate the samples whose weights have been increased.

Updating the weight distribution:

Whether each sample's new weight becomes larger or smaller depends on whether it was misclassified or classified correctly.

That is, if a sample is misclassified, then yi·Gm(xi) is negative, so the negated exponent -αm·yi·Gm(xi) is positive and the exponential factor exceeds 1, which makes the whole expression (the sample's weight) larger; otherwise it becomes smaller.

The weights of misclassified samples keep growing, so the weak classifier in the next round pays more attention to them.

After the first iteration we obtain the new weight distribution of the data, D2 = (0.0715, 0.0715, 0.0715, 0.0715, 0.0715, 0.0715, 0.1666, 0.1666, 0.1666, 0.0715). As can be seen, because samples 6, 7, and 8 were misclassified by G1(x), their weights increased from the previous 0.1 to 0.1666, while all other samples were classified correctly, so their weights decreased from 0.1 to 0.0715.
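These numbers can be checked with the update formula (using the more precise α1 ≈ 0.4236):

Z_1 = 7 \times 0.1\, e^{-0.4236} + 3 \times 0.1\, e^{0.4236} \approx 0.4583 + 0.4583 = 0.9165

w_{2,i} = \frac{0.1\, e^{-0.4236}}{0.9165} \approx 0.0715 \ \text{(correctly classified)},
\qquad
w_{2,i} = \frac{0.1\, e^{0.4236}}{0.9165} \approx 0.1666 \ \text{(misclassified)}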

The classification function is f1(x) = α1·G1(x) = 0.42·G1(x).

At this point, the first base classifier sign(f1(x)) still has 3 misclassified points (samples 6, 7, 8) on the training data set.

From the whole first-round iteration it can be seen that the weighted error rate of a base classifier determines its influence in the final classifier: the smaller the error rate, the greater its effect in the final model.

2. Step two:

According to the classification result, a new sample distribution D3 and the sub-classifier h2 are obtained.
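The original figure for this round is not reproduced here, but the arithmetic is the same as in step one. For instance, if h2 misclassifies three of the samples that currently carry weight 0.0715 (as in the classic version of this example), then:

\varepsilon_2 = 3 \times 0.0715 \approx 0.21,
\qquad
\alpha_2 = \frac{1}{2} \ln \frac{1 - 0.21}{0.21} \approx 0.65

and D3 is obtained from D2 by the same exponential reweighting and renormalization.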

3. Step three:

Finally, combine all the sub-classifiers:
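The combination is the sign of the weighted sum of the sub-classifiers; in this example the first coefficient is the 0.42 computed in step one, and the remaining coefficients come from their own rounds' error rates in the same way:

H(x) = \operatorname{sign}\bigl(\alpha_1 h_1(x) + \alpha_2 h_2(x) + \alpha_3 h_3(x)\bigr)
     = \operatorname{sign}\bigl(0.42\, h_1(x) + \alpha_2 h_2(x) + \alpha_3 h_3(x)\bigr)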

The exponential loss function (AdaBoost)

The weight coefficient formula and the sample weight update formula for classification AdaBoost

AdaBoost:

The model is an additive model.

The additive model is easy to understand: our final strong classifier is obtained as a weighted average of several weak classifiers.

The learning algorithm is the forward stagewise learning algorithm.

The algorithm learns the weak learners round by round, using the result of the previous round's weak learner to update the sample weights for the next weak learner's training set. In other words, the strong learner at round k-1 is

f_{k-1}(x) = \sum_{i=1}^{k-1} \alpha_i G_i(x)

and the strong learner at round k is:

f_k(x) = \sum_{i=1}^{k} \alpha_i G_i(x)

Comparing the two expressions above, we obtain:

f_k(x) = f_{k-1}(x) + \alpha_k G_k(x)

The loss function is the exponential loss function; that is, AdaBoost solves the classification problem under an exponential loss.
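In outline (the full derivation is in the post referenced at the end of this note), at round k the pair (α_k, G_k) is chosen to minimize the exponential loss of the partially built additive model:

L(y, f(x)) = \exp(-y\, f(x))

(\alpha_k, G_k) = \arg\min_{\alpha,\, G} \sum_{i=1}^{N} \exp\bigl(-y_i \, (f_{k-1}(x_i) + \alpha\, G(x_i))\bigr)

Solving this gives \alpha_k = \frac{1}{2} \ln \frac{1 - e_k}{e_k}, with e_k the weighted error rate of G_k, and the factors \exp(-y_i f_{k-1}(x_i)) play exactly the role of the sample weights, which is where the weight update formula quoted earlier comes from.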

AdaBoost Advantages and Disadvantages

On the type of weak learner: in theory, any learner can be used with AdaBoost, but in practice the most widely used weak learners are decision trees and neural networks. For decision trees, AdaBoost classification uses CART classification trees and AdaBoost regression uses CART regression trees.
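As a quick usage illustration of a tree-based weak learner, here is a minimal scikit-learn sketch; the toy dataset and parameter values are assumptions for illustration only, not part of the original note:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

# Hypothetical toy dataset, used only to make the example runnable.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# scikit-learn's AdaBoostClassifier boosts shallow decision trees by default,
# matching the "decision tree as weak learner" choice described above.
clf = AdaBoostClassifier(n_estimators=50, learning_rate=1.0, random_state=0)
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
```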

The main advantages of AdaBoost are:

One of its biggest advantages is that it combines weak classifiers automatically.

1) Used as a classifier, AdaBoost achieves high classification accuracy.

2) Within the AdaBoost framework, a wide variety of classification and regression models can be used as the weak learner, which makes it very flexible.

3) As a simple binary classifier, its structure is simple and its results are easy to interpret.

4) It is not very prone to overfitting.

The main disadvantages of AdaBoost are:

1) It is sensitive to outliers: anomalous samples may receive very high weights during the iterations, which hurts the prediction accuracy of the final strong learner.

See also: http://www.cnblogs.com/pinard/p/6133937.html
