Machine Learning in Python: AdaBoost Implementation


AdaBoost is one of the most popular boosting methods: it builds multiple weak classifiers and combines their weighted outputs to obtain the final classification. The way these classifiers are built also matters: each new classifier focuses on the samples that the previously built classifiers got wrong. An ensemble built this way tends to converge quickly during training.

This article builds the weak classifier from a single-layer decision tree; in the same vein, the weak classifier can be constructed from other classification algorithms.

The boosting family of algorithms originates from PAC learnability (Probably Approximately Correct learning). This body of theory studies when a problem can be learned and, of course, explores specific learning algorithms that can be used on learnable problems.

Within the PAC learning model, Valiant and Kearns first raised the question of the equivalence of weak and strong learning algorithms: can any weak learning algorithm that is only slightly better than random guessing be boosted into a strong learning algorithm? If the two are equivalent, then instead of searching for strong learning algorithms, which are difficult to obtain, it suffices to find a weak learning algorithm that is slightly better than random guessing and boost it into a strong one.

PAC defines the strength of a learning algorithm:

Weak learning algorithm --- the error rate is only slightly below 1/2 (that is, the accuracy is only slightly better than random guessing)

Strong learning algorithm --- high accuracy, and the learning can be completed in polynomial time


Before introducing the boosting algorithm, let's first introduce the bootstrapping and bagging algorithms.

1) Main process of the bootstrapping method

Main steps (a short Python sketch follows this list):

i) Repeatedly sample n samples, with replacement, from a sample set D

ii) Run statistical learning on each sampled sub-set to obtain a hypothesis Hi

iii) Combine the hypotheses to form the final hypothesis Hfinal

iv) Use the final hypothesis for the specific classification task
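As an illustration of steps i) through iv), here is a minimal NumPy sketch of the bootstrapping procedure; the "hypothesis" learned on each sub-sample is simply its mean, chosen only to keep the example short, and the combination step is plain averaging.

import numpy as np

def bootstrap_hypotheses(D, n_rounds=10, seed=None):
    """Steps i)-iii): resample D with replacement, learn one hypothesis per
    sub-sample (here simply the mean), then combine them by averaging."""
    rng = np.random.default_rng(seed)
    n = len(D)
    hypotheses = []
    for _ in range(n_rounds):
        idx = rng.integers(0, n, size=n)      # i) draw n samples with replacement
        hypotheses.append(D[idx].mean())      # ii) hypothesis Hi on the sub-sample
    h_final = np.mean(hypotheses)             # iii) combine into Hfinal
    return h_final, hypotheses

# iv) use Hfinal for the actual task
h_final, _ = bootstrap_hypotheses(np.array([1.0, 2.0, 3.0, 4.0]), n_rounds=5, seed=0)
print(h_final)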

2) Main process of the bagging method --- bagging can use a variety of sampling schemes

Main ideas (a minimal sketch follows this list):

i) Train the classifiers: sample N < n samples from the overall sample set and train a classifier Ci on each sampled set

ii) Let the classifiers vote; the final result is the winner of the vote
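Under the same assumptions, here is a minimal sketch of the bagging idea: each classifier Ci is trained on its own sampled set of N < n samples, and the final label is decided by vote. The train_classifier argument is a placeholder for whatever base learner is used, and labels are assumed to be -1/+1 as in the rest of this article.

import numpy as np

def bagging_train(X, y, train_classifier, n_classifiers=5, sample_size=None, seed=None):
    """Train n_classifiers base learners, each on N < n samples drawn from (X, y)."""
    rng = np.random.default_rng(seed)
    n = len(X)
    sample_size = sample_size if sample_size is not None else max(1, n // 2)  # N < n
    classifiers = []
    for _ in range(n_classifiers):
        idx = rng.choice(n, size=sample_size, replace=True)
        classifiers.append(train_classifier(X[idx], y[idx]))
    return classifiers

def bagging_predict(classifiers, x):
    """The classifiers vote; the majority (sign of the vote sum) wins."""
    votes = [c(x) for c in classifiers]
    return 1.0 if sum(votes) >= 0 else -1.0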

However, both of these methods simply combine classifiers and, in practice, do not fully exploit the power of classifier combination.



The AdaBoost algorithm can be built on top of any weak classifier. The example here uses a single-layer decision tree (a decision stump). Compared with the full decision trees covered earlier, the stump is much simpler: it does not select features by computing information gain or similar criteria, but instead searches directly with a triple loop (over features, thresholds, and inequality directions).
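To make the idea concrete before the full implementation, here is a minimal sketch of what a decision stump does: it labels a sample by comparing one feature against a threshold. The feature index and threshold below are arbitrary illustration values.

import numpy as np

def stump_predict(x, feature_index, threshold, direction='lt'):
    """Decision stump: threshold a single feature to produce a -1/+1 label."""
    if direction == 'lt':
        return -1.0 if x[feature_index] <= threshold else 1.0
    return -1.0 if x[feature_index] > threshold else 1.0

print(stump_predict(np.array([1.0, 2.1]), feature_index=0, threshold=1.5))  # -> -1.0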

AdaBoost stands for adaptive boosting. First, each training sample is assigned a weight; these weights form the vector D and are all initialized to the same value. The first classifier is trained with these equal weights, just like ordinary training. After training, the weights are adjusted according to the training error rate: samples that were classified correctly have their weight decreased, while misclassified samples have their weight increased, and the second classifier is then trained on the reweighted data. In addition, each classifier receives its own weight alpha; alpha applies to the classifier, whereas D applies to the samples. In the end a series of weak classifiers has been trained, and the final classification result is obtained by multiplying each classifier's output by its alpha value and summing. The "adaptive" part lies exactly here: by re-optimizing D round after round, the result often converges quickly.

The error rate ε is defined as follows:

ε = (number of misclassified samples) / (total number of samples)

(in later rounds the misclassified samples are counted with their weights in D, as in the code below).

Alpha is defined as follows:

α = (1/2) · ln((1 − ε) / ε)
The update rule for the weight vector D has two cases:

1. The sample is correctly classified:

D_i^(t+1) = D_i^(t) · e^(−α) / Sum(D)

2. The sample is misclassified:

D_i^(t+1) = D_i^(t) · e^(α) / Sum(D)

Here i indexes the sample and t indexes the boosting round; Sum(D) renormalizes the weights so that they again sum to 1.
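Putting the three formulas together, here is a minimal NumPy sketch of one boosting round; the labels and stump predictions are made-up values used only to show the arithmetic.

import numpy as np

y    = np.array([ 1.0,  1.0, -1.0, -1.0,  1.0])   # true labels
pred = np.array([ 1.0,  1.0, -1.0,  1.0,  1.0])   # stump predictions (one mistake)
D    = np.full(5, 1.0 / 5)                        # initial, equal sample weights

# error rate: share of the weight on misclassified samples
# (with equal weights this matches the definition above)
epsilon = D[pred != y].sum()                                   # 0.2

# alpha = (1/2) * ln((1 - epsilon) / epsilon)
alpha = 0.5 * np.log((1.0 - epsilon) / max(epsilon, 1e-16))    # about 0.693

# weight update: correct samples get e^(-alpha), wrong samples get e^(alpha)
D = D * np.exp(-alpha * y * pred)
D = D / D.sum()                                                # renormalize
print(D)   # the misclassified sample now carries half of the total weight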

The complete AdaBoost training procedure is therefore: for each of numIt rounds, build the best stump under the current weights D, compute its alpha from the weighted error, update D, and add alpha times the stump's predictions to a running score; stop early if the training error of the combined classifier reaches zero.

Here is an example of a Python implementation:

# -*- coding: cp936 -*-
'''
Created on Nov, 2010
Adaboost is short for Adaptive Boosting
@author: Peter
'''
from numpy import *

def loadSimpData():
    datMat = matrix([[1. , 2.1],
                     [2. , 1.1],
                     [1.3, 1. ],
                     [1. , 1. ],
                     [2. , 1. ]])
    classLabels = [1.0, 1.0, -1.0, -1.0, 1.0]
    return datMat, classLabels

def loadDataSet(fileName):
    # general function to parse tab-delimited floats
    numFeat = len(open(fileName).readline().split('\t'))  # get number of fields
    dataMat = []; labelMat = []
    fr = open(fileName)
    for line in fr.readlines():
        lineArr = []
        curLine = line.strip().split('\t')
        for i in range(numFeat - 1):
            lineArr.append(float(curLine[i]))
        dataMat.append(lineArr)
        labelMat.append(float(curLine[-1]))
    return dataMat, labelMat

# Classify on feature dimen: the threshold is threshVal and threshIneq
# selects which side of the split is labelled -1.
def stumpClassify(dataMatrix, dimen, threshVal, threshIneq):
    retArray = ones((shape(dataMatrix)[0], 1))
    if threshIneq == 'lt':
        retArray[dataMatrix[:, dimen] <= threshVal] = -1.0
    else:
        retArray[dataMatrix[:, dimen] > threshVal] = -1.0
    return retArray

# Build a simple single-layer decision tree (stump) as the weak classifier.
# D holds the per-sample weights and enters the weighted-error calculation.
# Three nested loops:
#   outer  - iterate over the features to pick the splitting feature
#   middle - iterate over the threshold steps
#   inner  - switch between 'less than' and 'greater than'
def buildStump(dataArr, classLabels, D):
    dataMatrix = mat(dataArr); labelMat = mat(classLabels).T
    m, n = shape(dataMatrix)
    numSteps = 10.0       # number of threshold steps to try per feature
    bestStump = {}; bestClasEst = mat(zeros((m, 1)))
    minError = inf        # init error sum to +infinity
    for i in range(n):    # loop over all dimensions
        rangeMin = dataMatrix[:, i].min(); rangeMax = dataMatrix[:, i].max()  # min/max of the i-th feature
        stepSize = (rangeMax - rangeMin) / numSteps
        for j in range(-1, int(numSteps) + 1):  # loop over all thresholds in this dimension
            for inequal in ['lt', 'gt']:        # go over less than and greater than
                threshVal = rangeMin + float(j) * stepSize
                predictedVals = stumpClassify(dataMatrix, i, threshVal, inequal)
                errArr = mat(ones((m, 1)))
                errArr[predictedVals == labelMat] = 0
                weightedError = D.T * errArr    # total error weighted by D
                # print("split: dim %d, thresh %.2f, thresh ineqal: %s, the weighted error is %.3f"
                #       % (i, threshVal, inequal, weightedError))
                if weightedError < minError:
                    minError = weightedError
                    bestClasEst = predictedVals.copy()
                    bestStump['dim'] = i
                    bestStump['thresh'] = threshVal
                    bestStump['ineq'] = inequal
    return bestStump, minError, bestClasEst

# AdaBoost training based on single-layer decision trees.
# numIt is the number of boosting rounds, i.e. at most 40 stumps by default.
def adaBoostTrainDS(dataArr, classLabels, numIt=40):
    weakClassArr = []
    m = shape(dataArr)[0]
    D = mat(ones((m, 1)) / m)            # init D so that all weights are equal
    aggClassEst = mat(zeros((m, 1)))
    for i in range(numIt):
        bestStump, error, classEst = buildStump(dataArr, classLabels, D)  # build stump
        # print("D:", D.T)
        alpha = float(0.5 * log((1.0 - error) / max(error, 1e-16)))  # calc alpha; max(error, eps) guards against error = 0
        bestStump['alpha'] = alpha
        weakClassArr.append(bestStump)   # store stump params in array
        # print("classEst: ", classEst.T)
        expon = multiply(-1 * alpha * mat(classLabels).T, classEst)  # exponent for the D update
        D = multiply(D, exp(expon))      # calc new D for the next iteration
        D = D / D.sum()
        # calc training error of all classifiers; if this is 0, quit the loop early
        aggClassEst += alpha * classEst
        # print("aggClassEst: ", aggClassEst.T)
        aggErrors = multiply(sign(aggClassEst) != mat(classLabels).T, ones((m, 1)))  # sign() maps the aggregated score to a -1/+1 label
        errorRate = aggErrors.sum() / m
        print("total error: ", errorRate)
        if errorRate == 0.0:
            break
    return weakClassArr, aggClassEst

def adaClassify(datToClass, classifierArr):
    dataMatrix = mat(datToClass)         # do the same aggregation as in adaBoostTrainDS
    m = shape(dataMatrix)[0]
    aggClassEst = mat(zeros((m, 1)))
    for i in range(len(classifierArr)):
        classEst = stumpClassify(dataMatrix, classifierArr[i]['dim'],
                                 classifierArr[i]['thresh'],
                                 classifierArr[i]['ineq'])   # call stump classify
        aggClassEst += classifierArr[i]['alpha'] * classEst
        print(aggClassEst)
    return sign(aggClassEst)

def plotROC(predStrengths, classLabels):
    import matplotlib.pyplot as plt
    cur = (1.0, 1.0)    # cursor
    ySum = 0.0          # variable to calculate AUC
    numPosClas = sum(array(classLabels) == 1.0)
    yStep = 1 / float(numPosClas)
    xStep = 1 / float(len(classLabels) - numPosClas)
    sortedIndicies = predStrengths.argsort()   # get sorted index, it's reversed
    fig = plt.figure()
    fig.clf()
    ax = plt.subplot(111)
    # loop through all the values, drawing a line segment at each point
    for index in sortedIndicies.tolist()[0]:
        if classLabels[index] == 1.0:
            delX = 0; delY = yStep
        else:
            delX = xStep; delY = 0
            ySum += cur[1]
        # draw a line from cur to (cur[0]-delX, cur[1]-delY)
        ax.plot([cur[0], cur[0] - delX], [cur[1], cur[1] - delY], c='b')
        cur = (cur[0] - delX, cur[1] - delY)
    ax.plot([0, 1], [0, 1], 'b--')
    plt.xlabel('False positive rate'); plt.ylabel('True positive rate')
    plt.title('ROC curve for AdaBoost horse colic detection system')
    ax.axis([0, 1, 0, 1])
    plt.show()
    print("the Area Under the Curve is: ", ySum * xStep)
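A quick way to exercise the functions above is the small data set from loadSimpData; the two query points below are arbitrary test inputs.

datMat, classLabels = loadSimpData()
classifierArr, aggClassEst = adaBoostTrainDS(datMat, classLabels, numIt=9)
# classify two new points; the first should come out as +1 and the second as -1
print(adaClassify([[5.0, 5.0], [0.0, 0.0]], classifierArr))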

