from sklearn import ensemble
Ensemble classifiers (sklearn.ensemble):
1. Bagging (ensemble.BaggingClassifier)
Train a base classifier on each randomly drawn sub-sample set, then combine their predictions by majority vote to produce the final classification.
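As a minimal sketch of the idea above (the dataset and hyperparameters here are illustrative, not from the original text; by default `BaggingClassifier` uses a decision tree as the base classifier):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split

# Synthetic data just for demonstration
X, y = make_classification(n_samples=500, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Each of the 10 base classifiers is trained on a bootstrap sub-sample;
# predictions are combined by majority vote.
clf = BaggingClassifier(n_estimators=10, random_state=0)
clf.fit(X_train, y_train)
print(clf.score(X_test, y_test))
```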
2. Random forest (ensemble.RandomForestClassifier)
Train M CART (Classification and Regression Tree) models on randomly drawn sub-sample sets, then vote to determine the final classification.
What "random" means here:
1) Bootstrap: each tree is trained on a randomly drawn sub-sample set.
2) Random subspace: at each tree node, K attributes are randomly selected from the full attribute set, and the split is chosen as the best among these K attributes.
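A minimal sketch of both sources of randomness, assuming an illustrative synthetic dataset; `max_features` corresponds to K, the size of the random attribute subset examined at each split:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data just for demonstration
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# bootstrap=True: each tree sees a random sub-sample set (randomness 1).
# max_features="sqrt": each split considers K = sqrt(20) randomly chosen
# attributes and picks the best among them (randomness 2).
clf = RandomForestClassifier(
    n_estimators=100,
    bootstrap=True,
    max_features="sqrt",
    random_state=0,
)
clf.fit(X, y)
print(len(clf.estimators_))  # number of CART trees in the forest
```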
3. Boosting (ensemble.weight_boosting)
A weight is attached to each sample when training each classifier, so that the loss function pays as much attention as possible to the hard samples (i.e. misclassified samples receive larger weights).
- Boosting does not resample the data itself; it reweights the distribution over the samples. The final classification result is a weighted linear combination of several weak classifiers, all of which belong to the same base-classifier type.
- Differences from bagging: the training sets in bagging are drawn at random and independently of one another, while in boosting the choice of each training set is not independent: each round depends on the result of the previous round of learning.
- The predictive functions (weak hypotheses) in bagging carry no weights, while boosting assigns each predictive function a weight according to its training error.
- The predictive functions in bagging can be generated in parallel, while those in boosting can only be generated sequentially. For time-consuming learners such as neural networks, bagging can save substantial time through parallel training.
Representative algorithms: AdaBoost and RealBoost. Overall, AdaBoost is simpler and easier to use, while RealBoost is more accurate.
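A minimal AdaBoost sketch (dataset and hyperparameters are illustrative): sample weights are updated after each round so later weak classifiers focus on previously misclassified samples, and the final prediction is a weighted vote whose per-classifier weights are exposed as `estimator_weights_`:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier

# Synthetic data just for demonstration
X, y = make_classification(n_samples=500, random_state=0)

# 50 weak classifiers trained sequentially; each round reweights the
# sample distribution rather than resampling the data.
clf = AdaBoostClassifier(n_estimators=50, random_state=0)
clf.fit(X, y)

# Weight of each weak classifier in the final linear combination
print(clf.estimator_weights_[:3])
```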
Reference article: http://blog.csdn.net/abcjennifer/article/details/8164315
Several ensemble classifiers in Python