Algorithm principle
Compared with a single learner, the idea of ensemble learning is to combine several classifiers into a better (combined) model for prediction. Depending on the implementation, an ensemble algorithm can take several forms:
- Combining different algorithms
- Combining different parameter settings of the same algorithm
- Training on different parts of a dataset
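The first form above — combining different classifiers — can be sketched with a simple majority vote. The three toy rule-based "classifiers" below are hypothetical stand-ins for trained models, used only to illustrate the combination step:

```python
from collections import Counter

def majority_vote(classifiers, x):
    # Combine heterogeneous classifiers by taking the most common prediction
    votes = [clf(x) for clf in classifiers]
    return Counter(votes).most_common(1)[0][0]

# Toy "classifiers" (hypothetical rules standing in for trained models)
clf_a = lambda x: 1 if x > 3 else -1
clf_b = lambda x: 1 if x % 2 == 0 else -1
clf_c = lambda x: 1 if x > 5 else -1

# All three agree on x = 6, so the combined prediction is +1
print(majority_vote([clf_a, clf_b, clf_c], 6))
```

Even when the individual rules disagree (e.g. at x = 2), the vote settles on the majority class, which is the basic intuition behind combining models.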
Ensemble algorithms fall mainly into two families: bagging and boosting. AdaBoost, short for "Adaptive Boosting," is the main representative of the boosting family; it was introduced by Yoav Freund and Robert Schapire in 1995. It is adaptive in the sense that samples misclassified by the previous base classifier have their weights increased, and the reweighted samples are used to train the next base classifier. A new weak classifier is added in each round until the error rate falls below a predetermined threshold or a predetermined maximum number of iterations is reached.
The AdaBoost iterative algorithm has three steps:
- Initialize the weight distribution of the training data. If there are N samples, each training sample is given the same weight at the start: 1/N.
- Train weak classifiers. During training, if a sample is classified correctly, its weight is reduced when constructing the next training set; conversely, if a sample is misclassified, its weight is increased. The weight-updated sample set is then used to train the next classifier, and the whole training process proceeds iteratively in this way.
- Combine the weak classifiers from each round into a strong classifier. After each weak classifier is trained, weak classifiers with a small classification error rate are given a larger weight, so they play a larger role in the final classification function, while weak classifiers with a large error rate are given a smaller weight and play a smaller role. In other words, a weak classifier with a low error rate carries more weight in the final classifier, and one with a high error rate carries less.
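The three steps above can be sketched in pure Python using decision stumps on 1-D data as the weak classifiers. This is a minimal illustration, not a production implementation; the function and variable names are my own:

```python
import math

def stump_predict(x, threshold, polarity):
    # Weak classifier: predict +1 or -1 depending on which side of the threshold x falls
    return polarity if x > threshold else -polarity

def train_stump(X, y, w):
    # Fit the stump that minimizes the weighted classification error
    best = None
    for threshold in sorted(set(X)):
        for polarity in (+1, -1):
            err = sum(wi for xi, yi, wi in zip(X, y, w)
                      if stump_predict(xi, threshold, polarity) != yi)
            if best is None or err < best[0]:
                best = (err, threshold, polarity)
    return best  # (weighted error, threshold, polarity)

def adaboost(X, y, n_rounds=10):
    n = len(X)
    w = [1.0 / n] * n                      # Step 1: uniform weights 1/N
    ensemble = []                          # list of (alpha, threshold, polarity)
    for _ in range(n_rounds):
        err, threshold, polarity = train_stump(X, y, w)
        err = max(err, 1e-10)              # guard against division by zero
        if err >= 0.5:
            break                          # weak learner no better than chance
        # Step 3: low-error classifiers get a larger weight alpha
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, threshold, polarity))
        # Step 2: raise weights of misclassified samples, lower the rest
        w = [wi * math.exp(-alpha * yi * stump_predict(xi, threshold, polarity))
             for xi, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]       # renormalize to a distribution
    return ensemble

def predict(ensemble, x):
    # Final strong classifier: sign of the alpha-weighted vote of the stumps
    score = sum(alpha * stump_predict(x, threshold, polarity)
                for alpha, threshold, polarity in ensemble)
    return 1 if score >= 0 else -1
```

On the classic toy dataset x = 0..9 with labels [1,1,1,-1,-1,-1,1,1,1,-1], no single stump is correct everywhere, yet three boosting rounds yield a combined classifier with zero training error — showing how reweighting lets later rounds focus on the hard samples.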
AdaBoost Algorithm Process