Machine Learning: Python Implementation of AdaBoost

AdaBoost is the most popular of the boosting methods. It works by constructing multiple weak classifiers; the final classification result is obtained by taking a weighted combination of the results of the individual classifiers. The way these classifiers are built also matters: each new classifier focuses on the samples that the previously built classifiers got wrong.

Such a combined classifier usually converges quickly during training.

This article mainly shows how to construct the weak classifier with a single-layer decision tree (a decision stump). Other classification algorithms could be used to construct the weak classifiers in the same way.

The boosting family of algorithms originates from PAC learnability (PAC learning). This theory focuses on when a problem is learnable and, for problems that are learnable, also explores concrete learning algorithms.


At the same time, Valiant and Kearns first raised the question of the equivalence of weak and strong learning algorithms in the PAC learning model: given a weak learning algorithm that is only slightly better than random guessing, can it be boosted into a strong learning algorithm? If the two are equivalent, then it is enough to find a weak learning algorithm that is slightly better than random and boost it, instead of searching for a strong learning algorithm that is very hard to obtain directly.

PAC defines the strength of a learning algorithm as follows:

Weak learning algorithm --- the recognition error rate is less than 1/2 (that is, the accuracy is only slightly better than random guessing).

Strong learning algorithm --- a learning algorithm with high recognition accuracy that can be completed in polynomial time.


Before introducing the boosting algorithm, let us first introduce the bootstrapping and bagging algorithms.

1) Main process of the bootstrapping method

Main steps:

i) Repeatedly draw n samples, with replacement, from a sample set D (see the sketch after this list).

ii) For each sampled sub-set, run statistical learning to obtain a hypothesis Hi.

iii) Combine the individual hypotheses into a final hypothesis Hfinal.

iv) Use the final hypothesis for the actual classification task.
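As a rough illustration (not part of the original text), the sampling step can be written with NumPy as follows; the array D here is just a toy sample set:

    import numpy as np

    D = np.arange(100)                        # a toy sample set with n = 100 samples
    n = len(D)
    idx = np.random.randint(0, n, size=n)     # n indices drawn with replacement
    bootstrap_sample = D[idx]                 # some samples repeat, others are left out
    print(bootstrap_sample[:10])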

2) Main process of the bagging method (bagging can use a variety of sampling strategies)

Main ideas:

i) Train the classifiers.

From the overall sample set, draw N < n samples to form a training set and train a classifier Ci on it; repeat this to obtain several classifiers (a short sketch follows this list).

ii) Let the classifiers vote; the final result is the class that wins the vote.
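As a hedged sketch of the bagging idea (not code from this article), scikit-learn's BaggingClassifier trains several base classifiers on random subsamples and combines them by voting; the toy arrays X and y below are made up for the example:

    import numpy as np
    from sklearn.ensemble import BaggingClassifier
    from sklearn.tree import DecisionTreeClassifier

    # toy data, assumed only for this illustration
    X = np.random.rand(200, 2)
    y = (X[:, 0] + X[:, 1] > 1.0).astype(int)

    # 10 stumps, each trained on a random subsample of 60% of the data;
    # the ensemble prediction is the majority vote of the individual trees
    bag = BaggingClassifier(DecisionTreeClassifier(max_depth=1),
                            n_estimators=10, max_samples=0.6)
    bag.fit(X, y)
    print(bag.predict(X[:5]))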

However, both of these methods simply combine classifiers uniformly; they do not really bring out the power of combining classifiers.



The AdaBoost algorithm can use any weak classifier as its base learner. Here the weak classifier is implemented as a single-layer decision tree (a decision stump). Compared with the decision trees seen earlier, it is much simpler: instead of selecting features by computing information gain or similar measures, it simply searches with three nested loops (over the features, over the threshold values, and over the direction of the inequality).

AdaBoost's full name is adaptive boosting. First, each sample in the training data is assigned a weight; these weights form the vector D. Initially all weights are set to the same value, so during the first round every sample counts equally, just as in ordinary training. After training finishes, the error rate of that classifier is computed and the weights are reassigned: the weights of correctly classified samples are reduced and the weights of misclassified samples are increased, and a second classifier is then trained on the reweighted data. In addition, each classifier gets a weight value alpha; alpha is a weight for the classifier, while D holds the weights of the samples. Finally, a series of weak classifiers is trained, and the result of each classifier is multiplied by its weight alpha and summed; the sign of that sum is the final classification result.

This is where the "adaptive" part comes in: by continually adjusting D, the final result converges quickly.

Here the definition of the error rate is as follows:

Error rate ε = (number of samples not correctly classified) / (total number of samples)

The alpha value is defined in terms of the error rate ε as follows:

alpha = (1/2) * ln((1 - ε) / ε)
The update rule for the weight vector D has two cases:

1. The sample is correctly classified:

D_i^(t+1) = D_i^(t) * e^(-alpha) / Sum(D)

2. The sample is not correctly classified:

D_i^(t+1) = D_i^(t) * e^(alpha) / Sum(D)

Here i denotes the i-th sample and t denotes the t-th round of training; Sum(D) renormalizes the weights so that they sum to 1.
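As a small numerical sketch (not from the original text), one round of these updates can be computed with NumPy; the five toy samples and predictions below are made up, with one sample misclassified:

    import numpy as np

    D = np.ones(5) / 5                        # initial weights, all equal
    y = np.array([1, 1, -1, -1, 1])           # true labels (toy example)
    pred = np.array([1, 1, -1, 1, 1])         # stump predictions: sample 4 is wrong

    error = D[pred != y].sum()                # weighted error rate = 0.2
    alpha = 0.5 * np.log((1 - error) / error) # alpha = 0.5 * ln(4) ≈ 0.693

    D = D * np.exp(-alpha * y * pred)         # shrink correct samples, grow the wrong one
    D = D / D.sum()                           # renormalize so the weights sum to 1
    print(D)                                  # the misclassified sample now has weight 0.5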

The complete AdaBoost algorithm simply repeats these steps: build the best stump on the currently weighted data, compute its error and alpha, update the sample weights, and accumulate the weighted stump outputs, stopping when the training error reaches zero or the iteration limit is reached.

Here is an example of a Python implementation:

"""
Created on Nov, 2010
Adaboost is short for Adaptive Boosting
@author: Peter
"""
from numpy import *


def loadSimpData():
    datMat = matrix([[1. , 2.1],
                     [2. , 1.1],
                     [1.3, 1. ],
                     [1. , 1. ],
                     [2. , 1. ]])
    classLabels = [1.0, 1.0, -1.0, -1.0, 1.0]
    return datMat, classLabels


def loadDataSet(fileName):  # general function to parse tab-delimited floats
    numFeat = len(open(fileName).readline().split('\t'))  # get number of fields
    dataMat = []; labelMat = []
    fr = open(fileName)
    for line in fr.readlines():
        lineArr = []
        curLine = line.strip().split('\t')
        for i in range(numFeat - 1):
            lineArr.append(float(curLine[i]))
        dataMat.append(lineArr)
        labelMat.append(float(curLine[-1]))
    return dataMat, labelMat


# Classify on feature dimen: the split threshold is threshVal and the direction
# of the inequality is threshIneq ('lt' or 'gt')
def stumpClassify(dataMatrix, dimen, threshVal, threshIneq):
    retArray = ones((shape(dataMatrix)[0], 1))
    if threshIneq == 'lt':
        retArray[dataMatrix[:, dimen] <= threshVal] = -1.0
    else:
        retArray[dataMatrix[:, dimen] > threshVal] = -1.0
    return retArray


# Build a simple single-layer decision tree (decision stump) as the weak classifier.
# D holds the weight of each sample and is used when computing the weighted error.
# Three nested loops:
#   outer loop  - over every feature, to select the splitting feature of the stump
#   middle loop - over the threshold values, in steps of size stepSize
#   inner loop  - over the two directions of the inequality ('lt' / 'gt')
def buildStump(dataArr, classLabels, D):
    dataMatrix = mat(dataArr); labelMat = mat(classLabels).T
    m, n = shape(dataMatrix)
    numSteps = 10.0; bestStump = {}; bestClasEst = mat(zeros((m, 1)))  # numSteps controls how finely the thresholds are searched
    minError = inf  # init error sum to +infinity
    for i in range(n):  # loop over all dimensions
        rangeMin = dataMatrix[:, i].min(); rangeMax = dataMatrix[:, i].max()  # min and max of the i-th feature
        stepSize = (rangeMax - rangeMin) / numSteps
        for j in range(-1, int(numSteps) + 1):  # loop over all values in this dimension
            for inequal in ['lt', 'gt']:  # go over less than and greater than
                threshVal = rangeMin + float(j) * stepSize
                predictedVals = stumpClassify(dataMatrix, i, threshVal, inequal)  # call stump classify with i, threshVal, inequal
                errArr = mat(ones((m, 1)))
                errArr[predictedVals == labelMat] = 0
                weightedError = D.T * errArr  # total error multiplied by the weights D
                # print("split: dim %d, thresh %.2f, thresh ineqal: %s, the weighted error is %.3f" % (i, threshVal, inequal, weightedError))
                if weightedError < minError:
                    minError = weightedError
                    bestClasEst = predictedVals.copy()
                    bestStump['dim'] = i
                    bestStump['thresh'] = threshVal
                    bestStump['ineq'] = inequal
    return bestStump, minError, bestClasEst


# AdaBoost training based on decision stumps.
# numIt is the number of iterations, i.e. at most numIt (40 by default) stumps are built.
def adaBoostTrainDS(dataArr, classLabels, numIt=40):
    weakClassArr = []
    m = shape(dataArr)[0]
    D = mat(ones((m, 1)) / m)  # init D so that all sample weights are equal
    aggClassEst = mat(zeros((m, 1)))
    for i in range(numIt):
        bestStump, error, classEst = buildStump(dataArr, classLabels, D)  # build stump
        # print("D:", D.T)
        alpha = float(0.5 * log((1.0 - error) / max(error, 1e-16)))  # calc alpha; max(error, eps) guards against error = 0
        bestStump['alpha'] = alpha
        weakClassArr.append(bestStump)  # store stump params in array
        # print("classEst:", classEst.T)
        expon = multiply(-1 * alpha * mat(classLabels).T, classEst)  # exponent for the D update
        D = multiply(D, exp(expon))  # calc new D for the next iteration
        D = D / D.sum()
        # calc training error of the combined classifier; if it is 0, quit the for loop early
        aggClassEst += alpha * classEst
        # print("aggClassEst:", aggClassEst.T)
        # sign() maps the aggregated score to the class labels -1 and +1
        aggErrors = multiply(sign(aggClassEst) != mat(classLabels).T, ones((m, 1)))
        errorRate = aggErrors.sum() / m
        print("total error:", errorRate)
        if errorRate == 0.0:
            break
    return weakClassArr, aggClassEst


def adaClassify(datToClass, classifierArr):
    dataMatrix = mat(datToClass)  # do stuff similar to the last aggClassEst in adaBoostTrainDS
    m = shape(dataMatrix)[0]
    aggClassEst = mat(zeros((m, 1)))
    for i in range(len(classifierArr)):
        classEst = stumpClassify(dataMatrix, classifierArr[i]['dim'],
                                 classifierArr[i]['thresh'],
                                 classifierArr[i]['ineq'])  # call stump classify
        aggClassEst += classifierArr[i]['alpha'] * classEst
        print(aggClassEst)
    return sign(aggClassEst)


def plotROC(predStrengths, classLabels):
    import matplotlib.pyplot as plt
    cur = (1.0, 1.0)  # cursor
    ySum = 0.0  # variable to calculate AUC
    numPosClas = sum(array(classLabels) == 1.0)
    yStep = 1 / float(numPosClas)
    xStep = 1 / float(len(classLabels) - numPosClas)
    sortedIndicies = predStrengths.argsort()  # get sorted index, it's reversed
    fig = plt.figure()
    fig.clf()
    ax = plt.subplot(111)
    # loop through all the values, drawing a line segment at each point
    for index in sortedIndicies.tolist()[0]:
        if classLabels[index] == 1.0:
            delX = 0; delY = yStep
        else:
            delX = xStep; delY = 0
            ySum += cur[1]
        # draw line from cur to (cur[0]-delX, cur[1]-delY)
        ax.plot([cur[0], cur[0] - delX], [cur[1], cur[1] - delY], c='b')
        cur = (cur[0] - delX, cur[1] - delY)
    ax.plot([0, 1], [0, 1], 'b--')
    plt.xlabel('False positive rate'); plt.ylabel('True positive rate')
    plt.title('ROC curve for AdaBoost horse colic detection system')
    ax.axis([0, 1, 0, 1])
    plt.show()
    print("the Area Under the Curve is:", ySum * xStep)
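For example, the simple data set built into the code above can be used for training and prediction like this (a quick usage sketch; the two test points and the choice of 30 iterations are arbitrary):

    datMat, classLabels = loadSimpData()
    classifierArr, aggClassEst = adaBoostTrainDS(datMat, classLabels, 30)
    print(adaClassify([[5, 5], [0, 0]], classifierArr))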

