This time I'm going to talk about ensemble methods; a reader asked me to cover AdaBoost. Ensembles can look scary, so I recommend the paper "Ensemble Based Systems in Decision Making". I'm not going to go through all the theory here.
As usual, let's first look at the buildClassifier function (I have removed all the unimportant code from it):
super.buildClassifier(data);
if ((!m_UseResampling) && (m_Classifier instanceof WeightedInstancesHandler)) {
    buildClassifierWithWeights(data);
} else {
    buildClassifierUsingResampling(data);
}
Back in the source, super.buildClassifier(data) goes up to AdaBoost's superclass RandomizableIteratedSingleClassifierEnhancer, whose buildClassifier contains:
m_Classifiers = Classifier.makeCopies(m_Classifier, m_NumIterations);
This line produces m_NumIterations copies of the base classifier.
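Conceptually, makeCopies just clones the configured base classifier once per boosting iteration. A rough standalone sketch of the idea (my own illustration, not the Weka implementation; in recent Weka versions these static helpers live on AbstractClassifier instead of Classifier):

import weka.classifiers.Classifier;
import weka.classifiers.trees.DecisionStump;

public class MakeCopiesSketch {
    public static void main(String[] args) throws Exception {
        Classifier template = new DecisionStump(); // stands in for m_Classifier
        int numIterations = 10;                    // stands in for m_NumIterations

        // What makeCopies boils down to: one independent, untrained deep copy
        // of the template per boosting iteration, so training one copy never
        // affects the others.
        Classifier[] copies = new Classifier[numIterations];
        for (int i = 0; i < numIterations; i++) {
            copies[i] = Classifier.makeCopy(template);
        }
        System.out.println("created " + copies.length + " copies");
    }
}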
The next step is to decide whether to use resampling. Let's leave that aside for now and look at buildClassifierWithWeights first; the code is long, so I'll go through it piece by piece:
// Select instances to train the classifier on
if (m_WeightThreshold < 100) {
    trainData = selectWeightQuantile(training, (double) m_WeightThreshold / 100);
} else {
    trainData = new Instances(training, 0, numInstances);
}
This sits inside a loop that runs m_Classifiers.length times; the loop itself is nothing special. This piece just decides how many samples to train on. The default value of m_WeightThreshold is 100; when it is not 100, selectWeightQuantile is used, which selects samples according to the ratio of their weights: it first sorts the samples by weight and then picks the heaviest ones until the chosen fraction of the total weight is covered. With the default value, of course, all samples are selected.
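To make the idea concrete, here is a rough sketch of what such a weight-quantile selection amounts to (my own simplified version on plain arrays; the real selectWeightQuantile works on Instances and may differ in details such as tie handling):

import java.util.Arrays;
import java.util.Comparator;

public class WeightQuantileSketch {
    // Return the indices of the highest-weight samples whose cumulative
    // weight just reaches quantile * (total weight).
    static int[] selectByWeightQuantile(double[] weights, double quantile) {
        Integer[] order = new Integer[weights.length];
        for (int i = 0; i < order.length; i++) order[i] = i;
        // Sort indices by weight, largest first.
        Arrays.sort(order, Comparator.comparingDouble((Integer i) -> weights[i]).reversed());

        double total = 0;
        for (double w : weights) total += w;

        double cumulative = 0;
        int count = 0;
        while (count < weights.length && cumulative < quantile * total) {
            cumulative += weights[order[count]];
            count++;
        }
        return Arrays.stream(order, 0, count).mapToInt(Integer::intValue).toArray();
    }

    public static void main(String[] args) {
        double[] w = {0.1, 0.4, 0.2, 0.3};
        // With quantile 0.7 this picks the two heaviest samples (indices 1 and 3).
        System.out.println(Arrays.toString(selectByWeightQuantile(w, 0.7)));
    }
}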
// Build the classifier
if (m_Classifiers[m_NumIterationsPerformed] instanceof Randomizable)
    ((Randomizable) m_Classifiers[m_NumIterationsPerformed])
        .setSeed(randomInstance.nextInt());
m_Classifiers[m_NumIterationsPerformed].buildClassifier(trainData);
If the classifier is a Randomizable instance it is given a seed first, and then the classifier is trained.
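As a side note of my own: a randomized learner such as RandomTree implements Randomizable (as far as I recall) and therefore gets a fresh seed here, while a deterministic learner like DecisionStump does not, so the branch is simply skipped for it. The same check, pulled out of AdaBoostM1 as a tiny standalone example:

import java.util.Random;
import weka.classifiers.Classifier;
import weka.classifiers.trees.DecisionStump;
import weka.classifiers.trees.RandomTree;
import weka.core.Randomizable;

public class SeedingSketch {
    public static void main(String[] args) {
        Random randomInstance = new Random(1);
        Classifier[] bases = { new RandomTree(), new DecisionStump() };
        for (Classifier c : bases) {
            // Only randomized learners get a fresh seed before training.
            if (c instanceof Randomizable) {
                ((Randomizable) c).setSeed(randomInstance.nextInt());
                System.out.println(c.getClass().getSimpleName() + " seeded");
            } else {
                System.out.println(c.getClass().getSimpleName() + " is deterministic, no seed");
            }
        }
    }
}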
// Evaluate the classifier
evaluation = new Evaluation(data);
evaluation.evaluateModel(m_Classifiers[m_NumIterationsPerformed], training);
epsilon = evaluation.errorRate();

// Stop if error too small or error too big and ignore this model
if (Utils.grOrEq(epsilon, 0.5) || Utils.eq(epsilon, 0)) {
    if (m_NumIterationsPerformed == 0) {
        // If we're the first we have to use it
        m_NumIterationsPerformed = 1;
    }
    break;
}
Anyone who has read the paper should be clear about what this means, so I won't dwell on it: boosting stops when the weak learner's error epsilon reaches 0.5 (no better than random guessing) or drops to 0 (a perfect classifier), and if that happens on the very first iteration, that single model is kept.
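As a quick sanity check on the two boundaries (my own illustration, not Weka code):

public class EpsilonBoundaries {
    public static void main(String[] args) {
        double eps = 0.5;
        // At epsilon = 0.5 the model's vote log((1-eps)/eps) is 0, so it is useless.
        System.out.println(Math.log((1 - eps) / eps)); // prints 0.0
        eps = 0.0;
        // At epsilon = 0 the reweighting factor (1-eps)/eps blows up.
        System.out.println((1 - eps) / eps);           // prints Infinity
    }
}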
// Determine the weight to assign to this model
m_Betas[m_NumIterationsPerformed] = Math.log((1 - epsilon) / epsilon);
reweight = (1 - epsilon) / epsilon;

// Update instance weights
setWeights(training, reweight);
The first line corresponds to the model-weight formula I showed in Figure 5: the weight this classifier gets in the final vote, log((1 - epsilon)/epsilon). The second line computes the reweighting factor (1 - epsilon)/epsilon for the training instances, and the last line applies it via setWeights. For example, with epsilon = 0.25 the model weight is log(3) ≈ 1.10 and reweight is 3. Let's look at setWeights in detail:
oldSumOfWeights = training.sumOfWeights();
Enumeration enu = training.enumerateInstances();
while (enu.hasMoreElements()) {
    Instance instance = (Instance) enu.nextElement();
    if (!Utils.eq(m_Classifiers[m_NumIterationsPerformed]
            .classifyInstance(instance), instance.classValue()))
        instance.setWeight(instance.weight() * reweight);
}
// Renormalize weights
newSumOfWeights = training.sumOfWeights();
enu = training.enumerateInstances();
while (enu.hasMoreElements()) {
    Instance instance = (Instance) enu.nextElement();
    instance.setWeight(instance.weight() * oldSumOfWeights / newSumOfWeights);
}
This is AdaBoost's weight update: for each instance it checks whether the current classifier got it wrong, and if so multiplies its weight by reweight. (I was a little unclear here at first, because the original paper instead multiplies the correctly classified instances by epsilon/(1 - epsilon); but since the weights are renormalized right afterwards, the two schemes produce the same weight distribution.)
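Here is a small self-contained check of that equivalence claim (my own toy example, not Weka code): four instances, one of them misclassified, so epsilon = 0.25; the Weka-style update and the paper-style update give the same normalized weights.

import java.util.Arrays;

public class ReweightCheck {
    public static void main(String[] args) {
        double epsilon = 0.25;
        double[] weka  = {0.25, 0.25, 0.25, 0.25}; // multiply wrong ones by (1-e)/e
        double[] paper = {0.25, 0.25, 0.25, 0.25}; // multiply correct ones by e/(1-e)
        boolean[] wrong = {true, false, false, false};
        for (int i = 0; i < wrong.length; i++) {
            if (wrong[i]) weka[i]  *= (1 - epsilon) / epsilon;
            else          paper[i] *= epsilon / (1 - epsilon);
        }
        normalize(weka);
        normalize(paper);
        // Both print the same distribution: 0.5 for the misclassified instance,
        // about 0.1667 for each of the others.
        System.out.println(Arrays.toString(weka));
        System.out.println(Arrays.toString(paper));
    }

    static void normalize(double[] w) {
        double sum = 0;
        for (double v : w) sum += v;
        for (int i = 0; i < w.length; i++) w[i] /= sum;
    }
}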
The renormalization loop above just scales the weights back to the original total, nothing special. As for the other function, buildClassifierUsingResampling, I won't go through it; compared with this function there is nothing special about it.
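For completeness, the core idea behind buildClassifierUsingResampling is to draw a bootstrap sample with probability proportional to the instance weights and train the base classifier on that sample. A minimal sketch of that idea using Weka's Instances.resampleWithWeights (my own simplified version; the real method loops, handles the zero-weight case, and differs in details):

import java.util.Random;
import weka.classifiers.Classifier;
import weka.core.Instances;

public class ResamplingSketch {
    // Train one boosting round by weighted resampling instead of passing
    // the weights to the base classifier (rough idea only, not the Weka code).
    static void trainOneRound(Classifier base, Instances training, Random rand) throws Exception {
        // Each instance is drawn with probability proportional to its current weight.
        Instances sample = training.resampleWithWeights(rand);
        base.buildClassifier(sample);
    }
}

Resampling is the fallback for base classifiers that cannot handle instance weights directly, i.e. that are not WeightedInstancesHandler, which matches the dispatch we saw at the very beginning of buildClassifier.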