1. The idea of the algorithm is simple: AdaBoost is an iterative algorithm. Its core idea is to train different classifiers (weak classifiers) on the same training set, and then combine these weak classifiers into a stronger final classifier ("three cobblers with their wits combined equal Zhuge Liang").
Its adaptivity lies in the fact that samples misclassified by the previous base classifier have their weights increased, and the re-weighted samples are used to train the next base classifier. At the same time, a new weak classifier is added in each round, until a predetermined, sufficiently small error rate is reached or a predetermined maximum number of iterations is reached.
2. Specifically, the AdaBoost iterative algorithm consists of three steps (a minimal code sketch follows the list):
- Initialize the weight distribution of the training data. If there are N samples, each training sample is assigned the same weight at the start: 1/N.
- Train weak classifiers. During training, if a sample point is classified correctly, its weight is decreased when constructing the next training set; conversely, if a sample point is misclassified, its weight is increased. The sample set with updated weights is then used to train the next classifier, and the whole training process continues iteratively in this way.
- Combine the weak classifiers from each round of training into a strong classifier. After each weak classifier is trained, a weak classifier with a small classification error rate is given a larger weight, so it plays a larger role in the final classification function, while a weak classifier with a large classification error rate is given a smaller weight, so it plays a smaller role. In other words, a weak classifier with a low error rate carries a larger weight in the final classifier, and one with a high error rate carries a smaller weight.
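The following is a minimal sketch of these three steps, assuming binary labels in {-1, +1} and scikit-learn's DecisionTreeClassifier (a depth-1 stump) as the weak learner. The names `adaboost_fit` and `adaboost_predict` are only illustrative, not a standard API.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost_fit(X, y, n_rounds=10):
    n = len(y)
    w = np.full(n, 1.0 / n)                        # step 1: equal initial weights 1/N
    stumps, alphas = [], []
    for _ in range(n_rounds):
        stump = DecisionTreeClassifier(max_depth=1)
        stump.fit(X, y, sample_weight=w)           # step 2: train on weighted samples
        pred = stump.predict(X)
        err = np.sum(w * (pred != y)) / np.sum(w)  # weighted error rate
        err = np.clip(err, 1e-10, 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)      # step 3: low error rate -> large classifier weight
        w *= np.exp(-alpha * y * pred)             # increase weights of misclassified samples
        w /= w.sum()
        stumps.append(stump)
        alphas.append(alpha)
    return stumps, alphas

def adaboost_predict(stumps, alphas, X):
    # the strong classifier is the sign of the weighted vote of all weak classifiers
    agg = sum(a * s.predict(X) for s, a in zip(stumps, alphas))
    return np.sign(agg)
```

The weight update inside the loop is exactly the mechanism described above: correctly classified samples (where y and the prediction agree) have their weights shrunk, misclassified ones have them enlarged, and each weak classifier's vote is scaled by its alpha.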
3. We can summarize some scenarios where the AdaBoost algorithm is applicable in practice (a short scikit-learn example follows the list):
1) Binary or multi-class classification application scenarios
2) A baseline for classification tasks
Simple, requires essentially no classifier tuning, and in practice rarely overfits
3) Feature selection
4) Using the boosting framework to fix bad cases
Only a new classifier needs to be added; the original classifiers do not need to change
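As a short sketch of scenarios 2) and 3), assuming scikit-learn is available, AdaBoost can be used as a near tuning-free baseline and its feature importances can serve as a rough feature-selection signal. The dataset and parameter values here are only illustrative.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)
clf = AdaBoostClassifier(n_estimators=100)

# baseline: cross-validated accuracy with essentially no tuning
print(cross_val_score(clf, X, y, cv=5).mean())

# feature selection: rank features by their importance in the fitted ensemble
clf.fit(X, y)
ranking = sorted(enumerate(clf.feature_importances_), key=lambda t: -t[1])
print(ranking[:5])
```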
Because AdaBoost is simple to implement, it is also a very simple algorithm to apply. AdaBoost obtains a strong classifier by combining weak classifiers; the upper bound on its training error decreases steadily as training proceeds, it is not prone to overfitting, and so on. It should therefore be regarded as an algorithm suitable for application in a wide range of classification scenarios.