All three are ensemble ("combination") methods: several classifiers are trained and combined by voting, and the training set for each classifier is built by sampling with replacement.
1. Bagging: sample with replacement K times, train K times to obtain K models (classifiers), then combine them by equal-weight voting (see the sketch below).
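A minimal bagging sketch, assuming numpy arrays X, y with labels in {-1, +1} and scikit-learn's DecisionTreeClassifier as the base learner; the function names (bagging_fit, bagging_predict) and the default K are illustrative choices, not from the original notes.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def bagging_fit(X, y, K=25, seed=0):
    """Train K trees, each on a bootstrap sample (drawn with replacement)."""
    rng = np.random.default_rng(seed)
    n = len(X)
    models = []
    for _ in range(K):
        idx = rng.integers(0, n, size=n)   # sampling with replacement
        models.append(DecisionTreeClassifier().fit(X[idx], y[idx]))
    return models

def bagging_predict(models, X):
    """Equal-weight vote: sign of the summed +/-1 predictions."""
    votes = np.sum([m.predict(X) for m in models], axis=0)
    return np.sign(votes)
```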
2. AdaBoost (boosting): "sample by weight, vote by weight" (a minimal sketch follows this list)
1) Sample K times: each sample's chance of being drawn is determined by its weight (initially all sample weights are equal)
2) Train K times: after each round, use the resulting model's error rate (which must be < 0.5) to update the sample weights: increase the weights of the samples the previous round's weak classifier misclassified, and decrease the weights of the correctly classified ones.
3) Obtain K models: each model's voting weight is determined by its error rate (the lower the error rate, the larger the weight)
4) Weighted voting: the final prediction "accumulates" the K weighted votes.
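A minimal AdaBoost sketch following steps 1)-4) above: samples are drawn by weight, each round's weak learner (a depth-1 stump here, an assumption) receives a voting weight alpha = 0.5 * ln((1 - err) / err) from its weighted error rate, and misclassified samples get their weights raised for the next round. Labels are assumed to be in {-1, +1}; names and defaults are illustrative.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost_fit(X, y, K=50, seed=0):
    rng = np.random.default_rng(seed)
    n = len(X)
    w = np.full(n, 1.0 / n)                 # initial sample weights are equal
    models, alphas = [], []
    for _ in range(K):
        idx = rng.choice(n, size=n, p=w)    # 1) sample by weight
        m = DecisionTreeClassifier(max_depth=1).fit(X[idx], y[idx])
        pred = m.predict(X)
        err = w[pred != y].sum()            # weighted error rate on all samples
        if err >= 0.5:                      # 2) weak learner must beat chance
            break
        err = max(err, 1e-10)               # avoid log(0) on a perfect round
        alpha = 0.5 * np.log((1 - err) / err)  # 3) model weight from error rate
        w *= np.exp(-alpha * y * pred)      # raise misclassified, lower correct
        w /= w.sum()
        models.append(m)
        alphas.append(alpha)
    return models, alphas

def adaboost_predict(models, alphas, X):
    # 4) weighted vote: accumulate alpha-weighted +/-1 predictions, take the sign
    score = sum(a * m.predict(X) for m, a in zip(models, alphas))
    return np.sign(score)
```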
3. RF (random forest): "the trick is in the split attributes" (a minimal sketch follows this list)
1) Sample K times (with replacement)
2) Train K times (each tree is trained independently, so training parallelizes well, e.g. over a distributed cluster)
Each round of training produces one decision tree. Any decision tree algorithm (ID3/C4.5/CART) may be used, without pruning, but the candidate features at each split node are no longer all of the features:
Forest-RI: randomly select F of the original features as the candidate features for splitting each node.
Forest-RC: build a new feature as a linear combination of L randomly selected features, with coefficients drawn uniformly from [-1, 1]; repeating this yields F new features, on which the tree is grown using CART.
3) Obtain K models: K decision trees
4) Equal-weight voting.
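A minimal random-forest sketch in the Forest-RI style: rows are bootstrapped, and each split considers only a random subset of features (scikit-learn's max_features handles the per-split sampling; trees are left unpruned). A small helper also shows how Forest-RC style features could be built as random linear combinations. Labels in {-1, +1}, and all names and defaults, are assumptions for illustration.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def rf_fit(X, y, K=100, F="log2", seed=0):
    """Forest-RI style: bootstrap rows; each split sees only F random features."""
    rng = np.random.default_rng(seed)
    n = len(X)
    models = []
    for _ in range(K):                          # trees are independent: parallelizable
        idx = rng.integers(0, n, size=n)        # row sampling with replacement
        m = DecisionTreeClassifier(max_features=F)  # column sampling at each split
        models.append(m.fit(X[idx], y[idx]))    # unpruned by default
    return models

def rf_predict(models, X):
    votes = np.sum([m.predict(X) for m in models], axis=0)
    return np.sign(votes)                       # equal-weight vote

def forest_rc_features(X, L=3, F=10, seed=0):
    """Forest-RC style: F new features, each a random linear combination
    of L original features with coefficients uniform in [-1, 1]."""
    rng = np.random.default_rng(seed)
    cols = []
    for _ in range(F):
        idx = rng.choice(X.shape[1], size=L, replace=False)
        coef = rng.uniform(-1.0, 1.0, size=L)
        cols.append(X[:, idx] @ coef)
    return np.stack(cols, axis=1)
```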
Comparison:
1. Overfitting
1) Bagging: does not overfit, because each classifier is independent and is trained on different data (row sampling)
2) AdaBoost: can overfit, because it keeps concentrating on the misclassified samples
3) RF: does not overfit, because each classifier is independent and differs not only in its training data (row sampling) but also in its features (column sampling, somewhat like consulting different domain experts)
2. Performance
1) AdaBoost is generally more accurate than bagging, but it can overfit
2) RF's accuracy is similar to AdaBoost's, but RF is more robust, does not overfit, and is not sensitive to the choice of F "usually F = log2(d) + 1 is used; in general even F = 1 works well" (see the comparison sketch below)
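A quick empirical check of these claims, using scikit-learn's built-in implementations on a synthetic dataset; the dataset, seeds, and parameters are arbitrary choices and the exact accuracies will vary, so treat this as a sanity check rather than a benchmark.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import (BaggingClassifier, AdaBoostClassifier,
                              RandomForestClassifier)
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for name, clf in [("Bagging", BaggingClassifier(random_state=0)),
                  ("AdaBoost", AdaBoostClassifier(random_state=0)),
                  ("RF", RandomForestClassifier(random_state=0))]:
    clf.fit(X_tr, y_tr)
    print(name, accuracy_score(y_te, clf.predict(X_te)))
```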
A rough understanding of Bagging, AdaBoost, and RF.