Implementation of bagging and adaboost packages in R


The adabag package in R provides functions that implement bagging and AdaBoost classification models (in addition, the bagging() function in the ipred package can perform bagging regression; a sketch of that is given below). The first task is to build bagging and AdaBoost models with the adabag package and to select the better model based on their predictions.
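The ipred route is only mentioned in passing; as a minimal, illustrative sketch of bagging regression with ipred::bagging() (the mtcars example and the nbagg/coob settings are assumptions, not part of the original):

library(ipred)
reg <- bagging(mpg ~ ., data = mtcars, nbagg = 25, coob = TRUE)  # 25 bootstrap trees, keep out-of-bag predictions
print(reg)                   # prints the out-of-bag estimate of the root mean squared error
head(predict(reg, mtcars))   # fitted values on the training data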

a) To compare the two approaches, first build each model using all of the data.

Use boosting() (formerly the adaboost.M1() function) to build the AdaBoost classification model:

library(adabag)

a <- boosting(Species ~ ., data = iris)  # build the AdaBoost classification model

(z0 <- table(iris[, 5], predict(a, iris)$class))  # view the model's predictions

The predictions for the model are all correct.

(e0 <- (sum(z0) - sum(diag(z0))) / sum(z0))  # compute the error rate

[1] 0

From the output, the prediction error rate is 0.

barplot(a$importance)  # plot variable importance


The variable importance ranking is: Petal.Length > Petal.Width > Sepal.Length > Sepal.Width.
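To read this ranking off numerically rather than from the plot, the importance vector of the fitted model can be sorted (a small check, continuing the session above):

sort(a$importance, decreasing = TRUE)  # importance scores from largest to smallest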

b <- errorevol(a, iris)  # compute the error evolution over the full data set

plot(b$error, type = "l", main = "AdaBoost error vs Number of trees")  # plot the error evolution


The error rate drops to 0 after the seventh iteration; from that point on the model predicts the training data without error.
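This can be checked directly from the error-evolution vector computed above (a small verification, not in the original):

which(b$error == 0)[1]  # smallest number of trees at which the training error reaches 0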

Next, build the bagging classification model with the bagging() function:

b <- bagging(Species ~ ., data = iris)  # build the bagging classification model

(z0 <- table(iris[, 5], predict(b, iris)$class))  # view the model's predictions

The confusion matrix shows that the bagging classifier misclassifies 3 versicolor as virginica and 4 virginica as versicolor.

(e0 <- (sum(z0) - sum(diag(z0))) / sum(z0))  # compute the error rate

[1] 0.04666667

The error rate is 0.047.

barplot(b$importance)  # plot variable importance


Again the variable importance ranking is: Petal.Length > Petal.Width > Sepal.Length > Sepal.Width.

When the models are built on the full data set, the AdaBoost classifier reaches 100% accuracy on the training data, which is clearly better than the bagging classifier.

b) Next, split the data 50/50 into a training set and a test set (half of each species sampled for training), and report the classification error rate on each:

Use boosting() (formerly the adaboost.M1() function) to build the AdaBoost classification model on the training set:

set.seed(1044)  # set the random seed
samp <- c(sample(1:50, 25), sample(51:100, 25), sample(101:150, 25))  # stratified sample: 25 rows from each species
a <- boosting(Species ~ ., data = iris[samp, ])  # build the AdaBoost classification model on the training set



From the results, the training-set predictions are 100% correct, while the test-set error rate is 0.04: 3 observations that are actually versicolor are classified as virginica.
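The prediction step itself is not shown above; a minimal sketch of how these error rates can be obtained from the fitted model a (the names train.pred and test.pred are illustrative):

train.pred <- predict(a, newdata = iris[samp, ])$class   # predicted classes on the training set
test.pred <- predict(a, newdata = iris[-samp, ])$class   # predicted classes on the test set
mean(iris[samp, 5] != train.pred)                        # training-set error rate (reported as 0 above)
mean(iris[-samp, 5] != test.pred)                        # test-set error rate (reported as 0.04 above)
table(iris[-samp, 5], test.pred)                         # test-set confusion matrix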

Next, build the bagging classification model with the bagging() function on the training set:

b <- bagging(Species ~ ., data = iris[samp, ])  # build the bagging classification model on the training set


For the bagging model, the training-set predictions misclassify 2 observations that are actually virginica as versicolor, giving an error rate of 0.027. On the test set, 2 actual versicolor are classified as virginica and 2 actual virginica as versicolor, giving an error rate of 0.053.
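The same pattern gives the bagging error rates (again a sketch, continuing the session above):

mean(iris[samp, 5] != predict(b, newdata = iris[samp, ])$class)    # training-set error rate (reported as 0.027 above)
mean(iris[-samp, 5] != predict(b, newdata = iris[-samp, ])$class)  # test-set error rate (reported as 0.053 above)
table(iris[-samp, 5], predict(b, newdata = iris[-samp, ])$class)   # test-set confusion matrix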

Conclusion: Based on the prediction results above, the AdaBoost classifier outperforms the bagging classifier on the iris data set.

