Introduction to the boosting algorithm


Reposted from: http://blog.csdn.net/baiduforum/article/details/6721749

I. The development history of the boosting algorithm

Boosting is a method for combining several classifiers into a single classifier. Before boosting appeared, two other important methods for combining multiple classifiers into one already existed: the bootstrapping method and the bagging method. Let us briefly introduce both.

1) Main process of the bootstrapping method

Main steps:

i) Draw n samples, with replacement, from the sample set D

ii) Perform statistical learning on each sampled subset to obtain a hypothesis Hi

iii) Combine the hypotheses to form the final hypothesis Hfinal

iv) Use the final hypothesis for the specific classification task
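As a rough illustration of step i), here is a minimal Python sketch of drawing bootstrap resamples from a set D (the function name and toy data are purely illustrative):

```python
import numpy as np

# Minimal sketch of bootstrap resampling: each resample draws n indices
# from D with replacement, so some samples repeat and others are left out.
def bootstrap_samples(D, n_resamples, seed=0):
    rng = np.random.default_rng(seed)
    n = len(D)
    for _ in range(n_resamples):
        idx = rng.integers(0, n, size=n)   # n indices drawn with replacement
        yield D[idx]

D = np.arange(10)                          # toy "sample set"
for subset in bootstrap_samples(D, 3):
    print(subset)                          # one bootstrapped training set per line
```

Each such subset would then be fed to the learner to produce one hypothesis Hi.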

2) Main process of the bagging method

Main idea:

i) Train the classifiers

Sample N < n examples from the overall sample set D and train a classifier Ci on each sampled set

ii) Let the classifiers vote; the final result is whichever class wins the classifier vote
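The bagging procedure can be sketched in a few lines of Python. This is only an illustrative version: the base learner, the sample sizes, and the helper names are assumptions, and class labels are assumed to be non-negative integers so that a simple bincount-based vote works.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Illustrative bagging: train one classifier Ci per bootstrap sample,
# then predict by majority vote over all classifiers.
def bagging_fit(X, y, n_classifiers=10, seed=0):
    rng = np.random.default_rng(seed)
    n = len(X)
    classifiers = []
    for _ in range(n_classifiers):
        idx = rng.integers(0, n, size=n)   # bootstrap sample of the training set
        classifiers.append(DecisionTreeClassifier(max_depth=3).fit(X[idx], y[idx]))
    return classifiers

def bagging_predict(classifiers, X):
    votes = np.stack([clf.predict(X) for clf in classifiers])       # one row per classifier
    # majority vote per sample (labels assumed to be non-negative integers)
    return np.apply_along_axis(lambda v: np.bincount(v.astype(int)).argmax(), 0, votes)
```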

However, both of these methods simply combine classifiers and do not really exploit the power of classifier combination. It was not until 1989 that Yoav Freund and Robert Schapire proposed a practical method for combining weak classifiers into a strong classifier, work for which they won the 2003 Gödel Prize.

Schapire also proposed an early boosting algorithm, whose main process is as follows:

i) From the overall sample set D, randomly sample n1 < n samples without replacement to obtain the set D1

Train weak classifier C1 on D1

ii) Extract n2 < n samples from the overall sample set D, half of which are samples that were misclassified by C1, to obtain the sample set D2

Train weak classifier C2 on D2

iii) Take the samples in D on which C1 and C2 disagree to form D3

Train weak classifier C3 on D3

iv) Vote with the three classifiers to obtain the final classification result
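A rough Python sketch of this three-classifier scheme follows. It is written under simplifying assumptions: C1 is assumed to make at least some mistakes, the constructed sets D2 and D3 are assumed to be non-empty, class labels are non-negative integers, and the set sizes are illustrative.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def early_boosting(X, y, n1, seed=0):
    rng = np.random.default_rng(seed)

    # i) train C1 on n1 samples drawn without replacement
    idx1 = rng.choice(len(X), size=n1, replace=False)
    C1 = DecisionTreeClassifier(max_depth=1).fit(X[idx1], y[idx1])

    # ii) build D2: half samples that C1 misclassifies, half that it gets right; train C2
    pred1 = C1.predict(X)
    wrong, right = np.where(pred1 != y)[0], np.where(pred1 == y)[0]
    m = min(len(wrong), len(right))
    idx2 = np.concatenate([rng.choice(wrong, m, replace=False),
                           rng.choice(right, m, replace=False)])
    C2 = DecisionTreeClassifier(max_depth=1).fit(X[idx2], y[idx2])

    # iii) build D3 from the samples on which C1 and C2 disagree; train C3
    idx3 = np.where(C1.predict(X) != C2.predict(X))[0]
    C3 = DecisionTreeClassifier(max_depth=1).fit(X[idx3], y[idx3])

    # iv) final prediction is the majority vote of the three classifiers
    def predict(Xq):
        votes = np.stack([C1.predict(Xq), C2.predict(Xq), C3.predict(Xq)])
        return np.apply_along_axis(lambda v: np.bincount(v.astype(int)).argmax(), 0, votes)
    return predict
```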

By 1995, Freund and Schapire had proposed the AdaBoost algorithm, whose main framework can be described as follows:

i) Loop over multiple iterations; in each iteration:

Update the sample distribution

Find the optimal weak classifier under the current distribution

Calculate the error rate of that weak classifier

ii) Aggregate the weak classifiers obtained from the multiple training rounds

The complete AdaBoost algorithm can be seen in Figure 1.1.
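Since the figure itself is not reproduced here, the following is a minimal Python sketch of that framework for the binary case. It assumes labels in {-1, +1} and uses decision stumps as the weak classifiers; it is an illustration of the framework above, not necessarily the exact procedure shown in the figure.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost_fit(X, y, n_rounds=20):
    """Binary AdaBoost sketch; y must contain labels +1 / -1."""
    n = len(X)
    D = np.full(n, 1.0 / n)                      # initial (uniform) sample distribution
    stumps, alphas = [], []
    for _ in range(n_rounds):
        # find the best weak classifier under the current distribution D
        stump = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=D)
        pred = stump.predict(X)
        # weighted error rate of this weak classifier
        eps = np.clip(np.sum(D[pred != y]), 1e-10, 1 - 1e-10)
        alpha = 0.5 * np.log((1 - eps) / eps)    # weight of this weak classifier
        # update the sample distribution: misclassified samples gain weight
        D = D * np.exp(-alpha * y * pred)
        D = D / D.sum()
        stumps.append(stump)
        alphas.append(alpha)
    return stumps, np.array(alphas)

def adaboost_predict(stumps, alphas, X):
    # final result: sign of the weighted vote of all weak classifiers
    scores = sum(a * s.predict(X) for a, s in zip(alphas, stumps))
    return np.sign(scores)
```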

Nowadays boosting has developed considerably and there are many other boosting algorithms, such as the LogitBoost algorithm, the GentleBoost algorithm, and so on. In this report, we will highlight the process and characteristics of the AdaBoost algorithm.

II. The AdaBoost algorithm and its analysis

Figure 1.1 shows the detailed procedure of the AdaBoost algorithm. AdaBoost is a rather distinctive algorithm, whose characteristics can be summarized as follows:

1) Each iteration changes the distribution of the samples rather than re-sampling them (re-weighting)

2) The change in the sample distribution depends on whether each sample is classified correctly

Samples that are consistently classified correctly receive lower weights

Samples that are consistently misclassified receive higher weights (these are usually samples near the decision boundary)

3) The final result is a weighted combination of the weak classifiers

The weight of each weak classifier reflects its performance (the standard formulas are given below)
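For reference, the formulas usually associated with these three points in binary AdaBoost (labels $y_i \in \{-1, +1\}$, weak classifier $h_t$ with weighted error $\varepsilon_t$, sample distribution $D_t$, normalizer $Z_t$) are:

$$\alpha_t = \frac{1}{2}\ln\frac{1-\varepsilon_t}{\varepsilon_t}, \qquad D_{t+1}(i) = \frac{D_t(i)\,\exp\!\big(-\alpha_t\, y_i\, h_t(x_i)\big)}{Z_t}, \qquad H_{\mathrm{final}}(x) = \operatorname{sign}\!\Big(\sum_t \alpha_t\, h_t(x)\Big)$$

A misclassified sample has $y_i h_t(x_i) = -1$, so its weight is multiplied by $e^{\alpha_t} > 1$, which is exactly the re-weighting behavior described in point 2.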

In simple terms, AdaBoost has many advantages:

1) AdaBoost is a classifier with high accuracy

2) Sub-classifiers can be constructed by a variety of methods; the AdaBoost algorithm only provides the framework

3) When simple classifiers are used, the resulting model is easy to interpret, and the weak classifiers are extremely simple to construct

4) Simple, with no need for feature screening

5) Don't worry about overfitting!

In short: AdaBoost is simple and effective.

Let's take a simple example to see the implementation of AdaBoost:

In the diagram, "+" and "-" represent the two categories, and we use horizontal or vertical lines as the weak classifiers to separate them.

Step One:

Based on the classification accuracy, we obtain a new sample distribution D2 and a sub-classifier H1

The circled samples are the ones that were misclassified. In the figure on the right, the larger "+" signs indicate that those samples have been given higher weights.

Step Two:

Based on the classification accuracy, we obtain a new sample distribution D3 and a sub-classifier H2

Step Three:

Get a sub-classifier H3

Combining all the sub-classifiers:

This yields the integrated result. As the result shows, even very simple classifiers, when combined in this way, can achieve a good classification effect, which is the whole point of this example.
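As a usage note, the same kind of toy experiment can be reproduced with an off-the-shelf implementation. The following scikit-learn sketch uses depth-1 decision trees (stumps), which play the role of the horizontal/vertical line classifiers; the data are made up for illustration, and depending on your scikit-learn version the base-learner argument is named estimator (newer releases) or base_estimator (older ones).

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

# Toy 2-D "+" / "-" data; each boosted stump is a horizontal or vertical split.
X = np.array([[1, 5], [2, 3], [3, 6], [5, 1], [4, 7],
              [6, 5], [7, 8], [8, 2], [6, 9], [9, 4]], dtype=float)
y = np.array([+1, +1, +1, +1, -1, -1, -1, -1, -1, -1])

clf = AdaBoostClassifier(estimator=DecisionTreeClassifier(max_depth=1), n_estimators=10)
clf.fit(X, y)
print(clf.predict(X))   # predictions of the combined (boosted) classifier
print(clf.score(X, y))  # training accuracy of the ensemble
```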

The AdaBoost algorithm has several very good properties, and in this report we mainly introduce two of them. The first is that the upper bound on the training error rate decreases steadily as the number of iterations increases; the second is that AdaBoost does not tend to overfit even after a very large number of training rounds.

These two properties are illustrated below, mainly through proofs and diagrams:

1) The decreasing upper bound on the training error rate

As the corresponding figure shows, the upper bound on the error rate decreases as the number of iterations increases.
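The bound usually cited for this property (for binary AdaBoost, writing the weighted error of the t-th weak classifier as $\varepsilon_t = 1/2 - \gamma_t$) is:

$$\frac{1}{n}\sum_{i=1}^{n}\mathbf{1}\big[H_{\mathrm{final}}(x_i) \neq y_i\big] \;\le\; \prod_t 2\sqrt{\varepsilon_t(1-\varepsilon_t)} \;=\; \prod_t \sqrt{1-4\gamma_t^2} \;\le\; \exp\!\Big(-2\sum_t \gamma_t^2\Big)$$

so as long as each weak classifier is even slightly better than random guessing ($\gamma_t > 0$), the bound shrinks exponentially with the number of iterations.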

2) No overfitting phenomenon occurs

In general, overfitting refers to the phenomenon in which, as a model's training error decreases, its generalization error (test error) actually rises. In the corresponding plot, the horizontal axis represents the number of iterations and the vertical axis represents the error value.

In practice, this is not observed with AdaBoost: even after the training error has become very small, continuing to train does not cause the generalization error to rise.

This phenomenon can be explained with the help of the concept of margin, which can be expressed as follows:
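The original formula is not reproduced here; the standard (binary) definition of the normalized margin of a sample $(x, y)$ under the combined classifier is:

$$\operatorname{margin}(x, y) \;=\; \frac{y \sum_t \alpha_t h_t(x)}{\sum_t \alpha_t} \;\in\; [-1, 1]$$

A positive margin means the sample is classified correctly, and a larger margin means the weighted vote is more confident about that sample.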

By introducing the concept of margin, we can interpret the phenomenon observed above:

The left sub-figure shows that the test error rate does not increase as the number of training rounds grows, while the corresponding right sub-figure shows that the margin keeps increasing with further training. In other words, after the training error has dropped to a certain level, additional training continues to increase the classification margin of the classifier, and this process also prevents the test error from rising.

III. Multi-class AdaBoost

In everyday tasks we usually need to solve multi-class problems, whereas the AdaBoost algorithm introduced above can only handle binary classification. Therefore, in this section we focus on how to adapt AdaBoost into a method suitable for multi-class tasks.

There are currently three commonly used methods for extending binary AdaBoost to the multi-class setting.

1. The AdaBoost.M1 method

Main idea: the weak classifiers combined by AdaBoost are themselves multi-class classifiers.

During training, the sample weights are still updated in much the same way as before (a standard form of the update is sketched below):

When decoding, the most likely category is selected:
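The original formulas are not reproduced here; a standard statement of the AdaBoost.M1 weight update and decoding rule, with $\beta_t = \varepsilon_t / (1 - \varepsilon_t)$, is:

$$D_{t+1}(i) \;=\; \frac{D_t(i)}{Z_t} \times \begin{cases} \beta_t & \text{if } h_t(x_i) = y_i \\ 1 & \text{otherwise} \end{cases} \qquad\qquad H_{\mathrm{final}}(x) \;=\; \arg\max_{y} \sum_{t:\, h_t(x) = y} \log\frac{1}{\beta_t}$$

Since $\beta_t < 1$ whenever $\varepsilon_t < 1/2$, correctly classified samples are down-weighted, and the final decoding picks the label with the largest weighted vote.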

2. The AdaBoost.MH method

Main idea: the weak classifiers combined are still binary classifiers. Each class label is paired with each training sample, generating an expanded set of samples, and the binary classifiers are trained in this new sample space.

Its principle can be illustrated as follows:
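The original illustration is not reproduced here; the usual way to state the AdaBoost.MH reduction, assuming K possible labels, is that each training sample $(x_i, y_i)$ is expanded into K binary samples $\big((x_i, \ell),\, z_{i\ell}\big)$ for $\ell = 1, \dots, K$, where

$$z_{i\ell} = \begin{cases} +1 & \text{if } \ell = y_i \\ -1 & \text{otherwise} \end{cases}$$

and binary AdaBoost is then run on this expanded set of $n \times K$ samples.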

3. Binary encoding for multi-class output

Main idea: encode the class labels in binary, for example representing each label with an M-bit binary code. Then M binary classifiers are trained; at decoding time they produce an M-bit binary string, which is mapped back to a label.
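A minimal Python sketch of this output-coding idea follows; the codebook, base learners, and helper names are illustrative (in practice each bit could itself be learned with a boosted classifier rather than a single tree).

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def fit_output_code(X, y, codebook):
    """codebook: (K, M) array of +/-1 codewords, one row per class; y holds class indices 0..K-1."""
    M = codebook.shape[1]
    # train one binary classifier per code bit
    return [DecisionTreeClassifier(max_depth=2).fit(X, codebook[y, m]) for m in range(M)]

def predict_output_code(classifiers, X, codebook):
    bits = np.stack([c.predict(X) for c in classifiers], axis=1)  # predicted M-bit string per sample
    # decode: choose the label whose codeword agrees most with the predicted bits
    return np.argmax(bits @ codebook.T, axis=1)
```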

IV. Summary

Finally, we can summarize some of the scenarios in which the AdaBoost algorithm can be used in practice:

1) Binary or multi-class classification scenarios

2) Baseline for classification tasks

It requires little thought, is simple, is not prone to overfitting, and needs no classifier tuning

3) For feature selection

4) Using the boosting framework to fix bad cases

Only a new classifier needs to be added; the original classifiers do not need to be changed

Because the AdaBoost algorithm is simple to implement and to apply, obtains a strong classifier by combining weak classifiers, has an upper bound on the classification error rate that decreases steadily as training proceeds, and shows no tendency to overfit, it can fairly be called an algorithm suitable for a wide variety of classification scenarios.
