A Rough Understanding of Bagging, AdaBoost, and Random Forest

Source: Internet
Author: User

All three are ensemble methods: they combine multiple classifiers by voting, and the training set for each classifier is obtained by sampling with replacement.

1, Bagging: draw K bootstrap samples, train K times to obtain K models (classifiers), then combine them by equal-weight voting.
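The bagging recipe above can be sketched end-to-end in plain Python. Everything below is illustrative, not library code: the one-feature threshold "stump" base learner and all function names are invented for the example.

```python
import random
from collections import Counter

def bootstrap_sample(data, rng):
    """Draw len(data) examples with replacement (some repeat, some are left out)."""
    return [rng.choice(data) for _ in data]

def train_stump(sample):
    """Hypothetical weak learner: threshold on the single feature at the sample mean,
    each side predicting its majority class."""
    thresh = sum(x for x, _ in sample) / len(sample)
    left = Counter(y for x, y in sample if x <= thresh)
    right = Counter(y for x, y in sample if x > thresh)
    left_label = left.most_common(1)[0][0] if left else 0
    right_label = right.most_common(1)[0][0] if right else left_label
    return lambda x: left_label if x <= thresh else right_label

def bagging_fit(data, k, seed=0):
    """K bootstrap samples -> K independently trained models."""
    rng = random.Random(seed)
    return [train_stump(bootstrap_sample(data, rng)) for _ in range(k)]

def bagging_predict(models, x):
    """Equal-weight majority vote over the K models."""
    return Counter(m(x) for m in models).most_common(1)[0][0]

# Toy data: (feature, label); classes separate around x = 0.5.
data = [(0.1, 0), (0.2, 0), (0.3, 0), (0.7, 1), (0.8, 1), (0.9, 1)]
models = bagging_fit(data, k=15)
print(bagging_predict(models, 0.15), bagging_predict(models, 0.85))
```

Note that each bootstrap sample leaves out roughly a third of the data, which is what makes the K models differ from one another.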

2, AdaBoost (boosting): samples are drawn according to their weights, and the models vote according to their weights.

1) Sample K times: each sample's chance of being drawn is determined by its weight (initially all sample weights are equal).

2) Train K times: after each round, use the resulting model's error rate (which must be < 0.5) to update the sample weights: increase the weights of the samples that the previous round's weak classifier misclassified, and decrease the weights of the correctly classified samples.

3) Obtain K models: each model's voting weight is determined by its error rate.

4) Weighted voting: the weighted votes are accumulated.
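The four steps can be sketched as follows. This is a minimal illustration of the classic AdaBoost updates, with labels in {-1, +1}, model weight alpha = 0.5 * ln((1 - err) / err), and a hypothetical brute-force threshold stump as the weak learner; all names here are made up for the example.

```python
import math

def weighted_stump(data, w):
    """Hypothetical weak learner: pick the threshold/direction with the lowest
    weighted error among midpoints of the sorted feature values."""
    best = None
    xs = sorted(x for x, _ in data)
    for t in [(a + b) / 2 for a, b in zip(xs, xs[1:])]:
        for sign in (1, -1):  # which side predicts +1
            pred = lambda x, t=t, s=sign: s if x > t else -s
            err = sum(wi for (x, y), wi in zip(data, w) if pred(x) != y)
            if best is None or err < best[0]:
                best = (err, pred)
    return best  # (weighted error rate, classifier)

def adaboost_fit(data, k):
    n = len(data)
    w = [1 / n] * n                              # 1) initial weights are equal
    models = []
    for _ in range(k):
        err, h = weighted_stump(data, w)          # 2) train on weighted samples
        err = max(err, 1e-10)                     # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)   # 3) model weight from its error rate
        # Raise weights of misclassified samples, lower the rest, renormalize.
        w = [wi * math.exp(-alpha * y * h(x)) for (x, y), wi in zip(data, w)]
        z = sum(w)
        w = [wi / z for wi in w]
        models.append((alpha, h))
    return models

def adaboost_predict(models, x):
    """4) Weighted vote: accumulate alpha-weighted predictions, take the sign."""
    return 1 if sum(alpha * h(x) for alpha, h in models) >= 0 else -1

# Toy data: the point at x=5 is "noise" that boosting keeps chasing.
data = [(1, -1), (2, -1), (3, 1), (4, 1), (5, -1)]
models = adaboost_fit(data, k=3)
print([round(a, 3) for a, _ in models])  # the three model weights
```

Notice how the noisy point's weight grows each round; this is exactly the behavior behind the overfitting concern discussed in the comparison below.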

3. RF (random forest) "the trick is in the split attributes":

1) Sample K times (with replacement).

2) Train K times (the trainings are independent, so they can run in parallel, e.g. on a distributed cluster).

Each training run produces one decision tree. Any decision-tree algorithm (ID3/C4.5/CART) can be used, without pruning, but the candidate features at each split node are no longer all of the features.

Forest-RI: randomly select F features as the candidate split features at each node.

Forest-RC: randomly select L features and combine them linearly with coefficients drawn uniformly from [-1, 1] to form one new feature; repeating this yields F features, which are then split using CART.

3) Obtain K models: K decision trees.

4) Equal-weight voting.
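The steps above can be sketched with both kinds of randomness, row sampling and column sampling. This is a simplified illustration of the Forest-RI idea: for brevity each tree here is a one-level tree and the candidate-feature subset is drawn once per tree, whereas a real random forest re-draws the F candidates at every split node and grows full unpruned trees. All names are invented for the example.

```python
import random
from collections import Counter

def best_stump(rows, feat_ids):
    """Best (feature, threshold) split, searching only the candidate features --
    restricting the search to a random subset is the Forest-RI idea."""
    best = None
    for f in feat_ids:
        vals = sorted({r[0][f] for r in rows})
        for t in [(a + b) / 2 for a, b in zip(vals, vals[1:])]:
            left = [y for x, y in rows if x[f] <= t]
            right = [y for x, y in rows if x[f] > t]
            # Misclassifications if each side predicts its majority class.
            err = sum(len(side) - max(Counter(side).values())
                      for side in (left, right) if side)
            if best is None or err < best[0]:
                l_lab = Counter(left).most_common(1)[0][0]
                r_lab = Counter(right).most_common(1)[0][0]
                best = (err, f, t, l_lab, r_lab)
    if best is None:  # degenerate bootstrap sample: fall back to majority class
        maj = Counter(y for _, y in rows).most_common(1)[0][0]
        return lambda x: maj
    _, f, t, l_lab, r_lab = best
    return lambda x: l_lab if x[f] <= t else r_lab

def rf_fit(rows, k, n_candidates, seed=0):
    rng = random.Random(seed)
    d = len(rows[0][0])
    trees = []
    for _ in range(k):
        boot = [rng.choice(rows) for _ in rows]        # row sampling (with replacement)
        feats = rng.sample(range(d), n_candidates)     # column sampling (F candidates)
        trees.append(best_stump(boot, feats))
    return trees

def rf_predict(trees, x):
    """Equal-weight majority vote over the K trees."""
    return Counter(t(x) for t in trees).most_common(1)[0][0]

# Toy data: two features, both separating the classes around 0.5.
rows = [((0.1, 0.2), 0), ((0.2, 0.1), 0), ((0.3, 0.3), 0),
        ((0.7, 0.8), 1), ((0.8, 0.9), 1), ((0.9, 0.7), 1)]
trees = rf_fit(rows, k=15, n_candidates=1)
print(rf_predict(trees, (0.15, 0.15)), rf_predict(trees, (0.85, 0.85)))
```

Because both the rows and the candidate columns differ per tree, the K trees are decorrelated, which is what the comparison below credits for the method's robustness.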

Comparison:

1, Overfitting

1) Bagging: not prone to overfitting, because the classifiers are independent and each sees different training data (row sampling).

2) AdaBoost: prone to overfitting, because it keeps concentrating on the misclassified samples.

3) RF: not prone to overfitting, because the classifiers are independent and differ not only in their training data (row sampling) but also in the features they see (column sampling, making each tree something like a domain expert).

2. Performance comparison

1) AdaBoost's accuracy is higher than Bagging's, but it is prone to overfitting.

2) RF's accuracy is similar to AdaBoost's, but it is more robust and not prone to overfitting. It is sensitive to the value of F: the usual choice is F = log2(d) + 1, and F = 1 often works well in practice.
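The F heuristic just mentioned is cheap to compute. As a worked example (the rounding-down convention here is one common reading of the rule of thumb):

```python
import math

# F = log2(d) + 1 candidate features per split, where d is the
# total number of features. For d = 100:
d = 100
f = int(math.log2(d)) + 1   # log2(100) ~= 6.64, truncated to 6, plus 1
print(f)                    # 7
```

So even with 100 features, each split only ever considers 7 of them, which is why the trees end up so decorrelated.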

