In simple terms, the algorithms commonly used to combine multiple classifiers are voting, bagging, and boosting. In terms of performance, boosting is usually slightly better, and AdaBoostM1 can be regarded as the "classic" boosting algorithm.
The idea of voting is to combine multiple classifiers by letting them vote and to decide the final classification by majority rule. The disadvantage is that a simple majority rule can only avoid the worst cases; it is also hard for it to reach the best result.
The idea of bagging is to train multiple classifiers on random samples of the data and then make the final decision by voting. A classic algorithm is RandomForest (analyzed in an earlier post on this blog). The disadvantage is that the base classifiers are all homogeneous. In terms of accuracy, bagging is not so much an algorithm that significantly improves accuracy as one that prevents overfitting.
Boosting trains classifiers in a cascade, so that the next-level classifier pays more attention to the instances that the previous level misclassified; the final decision is a weighted combination of the classifiers' results. The disadvantage is that the cascaded training makes the algorithm hard to parallelize. Typical algorithms include AdaBoostM1 and GBDT.
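To make the three styles concrete, here is a minimal sketch of how each could be set up in WEKA. The dataset path "iris.arff", the choice of base classifiers, and the parameter values are illustrative assumptions, not taken from the original post.

import weka.classifiers.Classifier;
import weka.classifiers.meta.AdaBoostM1;
import weka.classifiers.meta.Bagging;
import weka.classifiers.meta.Vote;
import weka.classifiers.bayes.NaiveBayes;
import weka.classifiers.trees.DecisionStump;
import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class EnsembleStyles {
  public static void main(String[] args) throws Exception {
    Instances data = DataSource.read("iris.arff");   // placeholder dataset
    data.setClassIndex(data.numAttributes() - 1);

    // Voting: heterogeneous classifiers, their predictions combined by voting/averaging.
    Vote vote = new Vote();
    vote.setClassifiers(new Classifier[] { new J48(), new NaiveBayes(), new DecisionStump() });
    vote.buildClassifier(data);

    // Bagging: homogeneous base classifiers trained on bootstrap samples.
    Bagging bagging = new Bagging();
    bagging.setClassifier(new J48());
    bagging.setNumIterations(10);
    bagging.buildClassifier(data);

    // Boosting: base classifiers trained in sequence, later ones focusing on
    // the instances the earlier ones got wrong.
    AdaBoostM1 boost = new AdaBoostM1();
    boost.setClassifier(new DecisionStump());
    boost.setNumIterations(10);
    boost.buildClassifier(data);
  }
}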
I. Algorithms
This post does not give a detailed formal description of the algorithm or a proof of the correctness of the underlying theory; the reference is the Wikipedia entry:
http://zh.wikipedia.org/wiki/AdaBoost
Algorithm process:
(1) Initialize the weight of each instance in the training set to 1/K (assuming K instances in total).
(2) For i = 1; i <= m; i++ (assuming the total number of base classifiers is m):
(3) Re-sample the training set according to the weights to obtain a new training set.
(4) Train base classifier i on the new training set.
(5) Evaluate the accuracy r of base classifier i.
(6) If r is less than 50%, exit directly and the training fails (this refers to binary classification; multi-class classification is a similar extension).
(7) Set the weight of base classifier i to log2(r / (1 - r)).
(8) Let W = r / (1 - r) (because r > 0.5, W > 1). For the misclassified instances, multiply their weights in the original training set by W (that is, increase their weights).
(9) Normalize the weights of the training set (scale them all by a constant so that they sum to 1).
(10) Return to step (2).
We can see that, according to this algorithm, a base classifier is obtained after the training in step (4), and each base classifier gets a weight in step (7); finally, weighted voting over the results of the base classifiers gives the final result.
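To make the loop above concrete, here is a minimal illustrative sketch in Java using WEKA types. It is not WEKA's actual AdaBoostM1 code; the class name AdaBoostSketch, the DecisionStump base learner, and field names such as betas are assumptions.

import weka.classifiers.Classifier;
import weka.classifiers.trees.DecisionStump;
import weka.core.Instance;
import weka.core.Instances;
import java.util.Random;

public class AdaBoostSketch {
  Classifier[] baseClassifiers;
  double[] betas;                       // per-classifier voting weight, step (7)

  void train(Instances data, int m) throws Exception {
    baseClassifiers = new Classifier[m];
    betas = new double[m];
    // (1) initialize every instance weight to 1/K
    for (int j = 0; j < data.numInstances(); j++) {
      data.instance(j).setWeight(1.0 / data.numInstances());
    }
    Random rnd = new Random(1);
    for (int i = 0; i < m; i++) {                       // (2)
      // (3) resample the training set according to the current weights
      Instances sample = data.resampleWithWeights(rnd);
      // (4) train base classifier i on the new sample
      baseClassifiers[i] = new DecisionStump();
      baseClassifiers[i].buildClassifier(sample);
      // (5) evaluate the weighted accuracy r on the original training set
      double correct = 0, total = 0;
      for (int j = 0; j < data.numInstances(); j++) {
        Instance inst = data.instance(j);
        total += inst.weight();
        if (baseClassifiers[i].classifyInstance(inst) == inst.classValue()) {
          correct += inst.weight();
        }
      }
      double r = correct / total;
      if (r <= 0.5) {
        // (6) no better than chance: stop (a full implementation would also
        // remember how many classifiers were actually built)
        break;
      }
      double w = r / (1 - r);                           // (8) W > 1 because r > 0.5
      betas[i] = Math.log(w) / Math.log(2);             // (7) log2(r / (1 - r))
      double sum = 0;
      for (int j = 0; j < data.numInstances(); j++) {
        Instance inst = data.instance(j);
        if (baseClassifiers[i].classifyInstance(inst) != inst.classValue()) {
          inst.setWeight(inst.weight() * w);            // increase misclassified weights
        }
        sum += inst.weight();
      }
      // (9) renormalize so the weights sum to 1
      for (int j = 0; j < data.numInstances(); j++) {
        Instance inst = data.instance(j);
        inst.setWeight(inst.weight() / sum);
      }
    }
  }
}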
II. Implementation
As when analyzing any classifier, we start with buildClassifier; this time is no exception.
public void buildClassifier(Instances data) throws Exception {
    super.buildClassifier(data);
    // A WEKA utility in the superclass creates the classifiers by deep copy from the
    // template classifier. Since AdaBoostM1 is a combination of multiple classifiers,
    // this deep copy must be made once per base classifier; it is implemented in the superclass.

    // Check whether the data can be handled by AdaBoost. From the code, only nominal
    // (enumeration) class values can be processed, but I think this is really determined
    // by the capabilities of the base classifier.
    getCapabilities().testWithFail(data);

    // Pre-process the data.
    data = new Instances(data);
    data.deleteWithMissingClass();

    // If there is only one attribute column, i.e. the class attribute itself, no real
    // classification method can be used and the ZeroR model is used instead; ZeroR simply
    // returns the most frequent of the class values.
    if (data.numAttributes() == 1) {
        System.err.println("Cannot build model (only class attribute present in data!), "
            + "using ZeroR model instead!");
        m_ZeroR = new weka.classifiers.rules.ZeroR();
        m_ZeroR.buildClassifier(data);
        return;
    } else {
        m_ZeroR = null;
    }

    m_NumClasses = data.numClasses();
    if ((!m_UseResampling) && (m_Classifier instanceof WeightedInstancesHandler)) {
        // If the base classifier is weight-sensitive (implements WeightedInstancesHandler),
        // train it with instance weights. Many common classifiers implement this interface,
        // e.g. J48, RandomTree, RandomForest, and Bagging.
        buildClassifierWithWeights(data);
    } else {
        // Otherwise, fall back to the resampling method.
        buildClassifierUsingResampling(data);
    }
}
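As a small follow-up (illustrative, not part of the WEKA source), the sketch below shows how one might check which of the two branches a given base classifier would take, and how the resampling path can be forced via AdaBoostM1's setUseResampling option.

import weka.classifiers.Classifier;
import weka.classifiers.meta.AdaBoostM1;
import weka.classifiers.trees.J48;
import weka.core.WeightedInstancesHandler;

public class BoostingBranchDemo {
  public static void main(String[] args) throws Exception {
    Classifier base = new J48();
    // J48 implements WeightedInstancesHandler, so by default AdaBoostM1 would take
    // the buildClassifierWithWeights branch for it.
    System.out.println("weight-aware base classifier: "
        + (base instanceof WeightedInstancesHandler));

    AdaBoostM1 booster = new AdaBoostM1();
    booster.setClassifier(base);
    booster.setUseResampling(true); // force the buildClassifierUsingResampling branch
  }
}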
(To be continued)