The idea behind a meta-algorithm is to combine several other algorithms.
```python
from numpy import *

def loadSimpData():
    datMat = matrix([[1., 2.1],
                     [2., 1.1],
                     [1.3, 1.],
                     [1., 1.],
                     [2., 1.]])
    classLabels = [1.0, 1.0, -1.0, -1.0, 1.0]
    return datMat, classLabels

def loadDataSet(fileName):
    # general function to parse tab-delimited floats
    numFeat = len(open(fileName).readline().split('\t'))  # get number of fields
    dataMat = []; labelMat = []
    fr = open(fileName)
    for line in fr.readlines():
        lineArr = []
        curLine = line.strip().split('\t')
        for i in range(numFeat - 1):
            lineArr.append(float(curLine[i]))
        dataMat.append(lineArr)
        labelMat.append(float(curLine[-1]))
    return dataMat, labelMat

def stumpClassify(dataMatrix, dimen, threshVal, threshIneq):
    # just classify the data on one feature against a threshold
    retArray = ones((shape(dataMatrix)[0], 1))
    if threshIneq == 'lt':
        retArray[dataMatrix[:, dimen] <= threshVal] = -1.0
    else:
        retArray[dataMatrix[:, dimen] > threshVal] = -1.0
    return retArray

def buildStump(dataArr, classLabels, D):
    dataMatrix = mat(dataArr); labelMat = mat(classLabels).T
    m, n = shape(dataMatrix)
    numSteps = 10.0; bestStump = {}; bestClasEst = mat(zeros((m, 1)))
    minError = inf  # init error sum to +infinity
    for i in range(n):  # loop over all dimensions
        rangeMin = dataMatrix[:, i].min(); rangeMax = dataMatrix[:, i].max()
        stepSize = (rangeMax - rangeMin) / numSteps
        for j in range(-1, int(numSteps) + 1):  # loop over the range of the current dimension
            for inequal in ['lt', 'gt']:  # go over less-than and greater-than
                threshVal = rangeMin + float(j) * stepSize
                predictedVals = stumpClassify(dataMatrix, i, threshVal, inequal)
                errArr = mat(ones((m, 1)))
                errArr[predictedVals == labelMat] = 0
                weightedError = D.T * errArr  # total error weighted by D
                if weightedError < minError:
                    minError = weightedError
                    bestClasEst = predictedVals.copy()
                    bestStump['dim'] = i
                    bestStump['thresh'] = threshVal
                    bestStump['ineq'] = inequal
    return bestStump, minError, bestClasEst

def adaBoostTrainDS(dataArr, classLabels, numIt=40):
    weakClassArr = []
    m = shape(dataArr)[0]
    D = mat(ones((m, 1)) / m)  # init D to all equal
    aggClassEst = mat(zeros((m, 1)))
    for i in range(numIt):
        bestStump, error, classEst = buildStump(dataArr, classLabels, D)  # build stump
        # calc alpha; max(error, eps) accounts for error = 0
        alpha = float(0.5 * log((1.0 - error) / max(error, 1e-16)))
        bestStump['alpha'] = alpha
        weakClassArr.append(bestStump)  # store stump params in array
        expon = multiply(-1 * alpha * mat(classLabels).T, classEst)  # exponent for D calc
        D = multiply(D, exp(expon))  # calc new D for next iteration
        D = D / D.sum()
        # calc training error of all classifiers; if it is 0, quit the loop early
        aggClassEst += alpha * classEst
        aggErrors = multiply(sign(aggClassEst) != mat(classLabels).T, ones((m, 1)))
        errorRate = aggErrors.sum() / m
        print("total error:", errorRate)
        if errorRate == 0.0:
            break
    return weakClassArr, aggClassEst

def adaClassify(datToClass, classifierArr):
    # do stuff similar to the last aggClassEst in adaBoostTrainDS
    dataMatrix = mat(datToClass)
    m = shape(dataMatrix)[0]
    aggClassEst = mat(zeros((m, 1)))
    for i in range(len(classifierArr)):
        classEst = stumpClassify(dataMatrix, classifierArr[i]['dim'],
                                 classifierArr[i]['thresh'],
                                 classifierArr[i]['ineq'])  # call stump classify
        aggClassEst += classifierArr[i]['alpha'] * classEst
        print(aggClassEst)
    return sign(aggClassEst)

def plotROC(predStrengths, classLabels):
    import matplotlib.pyplot as plt
    cur = (1.0, 1.0)  # cursor
    ySum = 0.0  # variable to calculate AUC
    numPosClas = sum(array(classLabels) == 1.0)
    yStep = 1 / float(numPosClas)
    xStep = 1 / float(len(classLabels) - numPosClas)
    sortedIndicies = predStrengths.argsort()  # get sorted index, it's reversed
    fig = plt.figure()
    fig.clf()
    ax = plt.subplot(111)
    # loop through all the values, drawing a line segment at each point
    for index in sortedIndicies.tolist()[0]:
        if classLabels[index] == 1.0:
            delX = 0; delY = yStep
        else:
            delX = xStep; delY = 0
            ySum += cur[1]
        # draw line from cur to (cur[0]-delX, cur[1]-delY)
        ax.plot([cur[0], cur[0] - delX], [cur[1], cur[1] - delY], c='b')
        cur = (cur[0] - delX, cur[1] - delY)
    ax.plot([0, 1], [0, 1], 'b--')
    plt.xlabel('False positive rate'); plt.ylabel('True positive rate')
    plt.title('ROC curve for AdaBoost horse colic detection system')
    ax.axis([0, 1, 0, 1])
    plt.show()
    print("the area under the curve is:", ySum * xStep)
```
AdaBoost is the most popular meta-algorithm and one of the most powerful tools in machine learning.
The combination can be of different algorithms, of the same algorithm under different settings, or of classifiers trained on different parts of the dataset.
Advantages: low generalization error, easy to code, works with most classifiers, and has no parameters to tune.
Disadvantages: sensitive to outliers.
Works with: numeric values and nominal values.
Bagging is the technique of building S new datasets from the original dataset, each the same size as the original. Each new dataset is formed by sampling from the original with replacement, so a sample may be selected multiple times while other samples may not appear at all.
After the S datasets are built, a learning algorithm is applied to each one to obtain S classifiers. To classify new data, we apply all S classifiers and take the class with the most votes as the final result.
A more advanced bagging method is the random forest.
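The bootstrap sampling and majority vote described above can be sketched as follows (a minimal illustration; the helper names are hypothetical, not from the book's listing):

```python
import random
from collections import Counter

def bootstrap_sample(dataset):
    # draw len(dataset) samples with replacement: duplicates can appear,
    # and some original samples may be missing from the new dataset
    return [random.choice(dataset) for _ in dataset]

def bagging_predict(classifiers, x):
    # each of the S classifiers votes; the most common label wins
    votes = Counter(clf(x) for clf in classifiers)
    return votes.most_common(1)[0][0]
```

Each classifier here is any callable that maps a sample to a class label; bagging places no restriction on which base learner is used.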
Boosting is a technique similar to bagging, but while bagging trains its classifiers independently, boosting trains them sequentially: each new classifier concentrates on the data that the previous classifiers misclassified.
The output of boosting is the weighted sum of all classifiers' results. In bagging the weights are equal; in boosting they differ, with each weight reflecting how well its classifier performed in the previous round.
AdaBoost is one such boosting algorithm.
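The weighted-sum combination can be sketched like this (an illustrative sketch; the function name and the (alpha, h) pair representation are assumptions, not part of the book's listing):

```python
def boosted_predict(weak_learners, x):
    # weak_learners: list of (alpha, h) pairs, where h maps x to +1 or -1;
    # the strong classifier is the sign of the alpha-weighted vote
    total = sum(alpha * h(x) for alpha, h in weak_learners)
    return 1 if total >= 0 else -1
```

A single high-alpha learner can outvote several low-alpha learners, which is exactly how boosting differs from bagging's equal-weight vote.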
The AdaBoost algorithm can be described in three steps:
(1) First, initialize the weight distribution D1 over the training data. With N training samples, each sample is given the same weight at the very beginning: w1 = 1/N.
(2) Then, train a weak classifier hi. If a training sample is classified correctly by hi, its weight is decreased when constructing the next training set; conversely, if it is classified incorrectly, its weight is increased. The updated weights are used to train the next classifier, and the whole training process iterates in this way.
(3) Finally, combine the weak classifiers from each round into a strong classifier. After training, a weak classifier with a small classification error rate is given a larger weight, so it plays a larger role in the final classification function, while a weak classifier with a large error rate is given a smaller weight and plays a smaller role.
In other words, a weak classifier with a low error rate receives a large weight in the final classifier, and one with a high error rate receives a small weight.
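The steps above can be sketched numerically. The formulas alpha = 0.5 * ln((1 - error) / error) and the exponential weight update mirror the ones used in adaBoostTrainDS; the helper names here are hypothetical:

```python
import math

def classifier_weight(error, eps=1e-16):
    # alpha = 0.5 * ln((1 - error) / error); eps guards against error = 0
    return 0.5 * math.log((1.0 - error) / max(error, eps))

def update_weights(D, alpha, correct):
    # correctly classified samples are scaled by e^-alpha (weight shrinks),
    # misclassified ones by e^+alpha (weight grows); then renormalize
    newD = [d * math.exp(-alpha if ok else alpha) for d, ok in zip(D, correct)]
    total = sum(newD)
    return [d / total for d in newD]
```

For example, with five samples (D = [0.2] * 5) and one mistake (error = 0.2), alpha = 0.5 * ln(4) ≈ 0.693, and after the update the misclassified sample's weight rises from 0.2 to 0.5 while each correct sample's weight drops to 0.125, so the next stump concentrates on the hard sample.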
Machine learning (using the AdaBoost meta-algorithm to improve classification performance)