Use AdaBoost to classify your own data

Last Update:2018-07-26 Source: Internet

Author: User


I wanted to use an AdaBoost classifier on my own data, but almost everything I found online was about AdaBoost + Haar face detection, so I had to write my own. Following http://blog.csdn.net/zhaocj/article/details/50536385, I used the AdaBoost algorithm to classify my own data set. The features and labels are stored in CSV files: the feature matrix is 224x1000, and there are 2 classes, so this is a binary classification problem.

int main(int argc, char** argv)
{
    // kuangvec[] and stonevec[] hold the shuffled row indices of the two classes.
    const int uselesssample = 134, usefulsample = 90, allsample = 224, featurecol = 1000;
    // Before shuffling, the first 90 rows are ore samples and the next 134 are waste-rock samples.
    int kuangvec[usefulsample], stonevec[uselesssample];
    for (int i = 0; i < usefulsample; i++)
        kuangvec[i] = i;
    for (int i = 0; i < uselesssample; i++)
        stonevec[i] = usefulsample + i;
    random_shuffle(kuangvec, kuangvec + usefulsample);
    random_shuffle(stonevec, stonevec + uselesssample);
    // Print the shuffled order
    for (int i = 0; i < usefulsample; i++)
        cout << kuangvec[i] << endl;
    cout << endl;
    for (int i = 0; i < uselesssample; i++)
        cout << stonevec[i] << endl;
    // Read the features
    CvMLData cvmlprimer;
    cvmlprimer.read_csv("Features.csv");
    cv::Mat cvml = cv::Mat(cvmlprimer.get_values(), true);
    // Read the labels
    CvMLData resprimer;
    resprimer.read_csv("Labels.csv");
    cv::Mat res = cv::Mat(resprimer.get_values(), true);
    // Separate into the training set
    const float rate = 0.6f;
    int trainnum = int(usefulsample * rate) + int(uselesssample * rate), testnum = allsample - trainnum;
    Mat traindata = Mat::zeros(trainnum, featurecol, cvml.type()),
        testdata = Mat::zeros(testnum, featurecol, cvml.type()),
        trainlabel = Mat::zeros(trainnum, 1, res.type()),
        testlabel = Mat::zeros(testnum, 1, res.type());
    // First 60% of the shuffled ore samples
    for (int i = 0; i < int(usefulsample * rate); i++)
    {
        float* newrow = traindata.ptr<float>(i);
        int currentrow = kuangvec[i];
        float* primerow = cvml.ptr<float>(currentrow);
        for (int j = 0; j < featurecol; j++)
            newrow[j] = primerow[j];
        float* newlabelrow = trainlabel.ptr<float>(i);
        float* primerlabelrow = res.ptr<float>(currentrow);
        newlabelrow[0] = primerlabelrow[0];
    }
    // First 60% of the shuffled waste-rock samples
    for (int i = int(usefulsample * rate); i < trainnum; i++)
    {
        float* newrow = traindata.ptr<float>(i);
        int ii = i - int(usefulsample * rate);
        int currentrow = stonevec[ii];
        float* primerow = cvml.ptr<float>(currentrow);
        for (int j = 0; j < featurecol; j++)
            newrow[j] = primerow[j];
        float* newlabelrow = trainlabel.ptr<float>(i);
        float* primerlabelrow = res.ptr<float>(currentrow);
        newlabelrow[0] = primerlabelrow[0];
    }
    // Separate into the test set
    // Remaining 40% of the shuffled ore samples
    for (int i = 0; i < usefulsample - int(usefulsample * rate); i++)
    {
        float* newrow = testdata.ptr<float>(i);
        int iii = i + int(usefulsample * rate);
        int currentrow = kuangvec[iii];
        float* primerow = cvml.ptr<float>(currentrow);
        for (int j = 0; j < featurecol; j++)
            newrow[j] = primerow[j];
        float* newlabelrow = testlabel.ptr<float>(i);
        float* primerlabelrow = res.ptr<float>(currentrow);
        newlabelrow[0] = primerlabelrow[0];
    }
    // Remaining 40% of the shuffled waste-rock samples
    for (int i = usefulsample - int(usefulsample * rate); i < testnum; i++)
    {
        int ii = int(uselesssample * rate) + i - (usefulsample - int(usefulsample * rate));
        float* newrow = testdata.ptr<float>(i);
        int currentrow = stonevec[ii];
        float* primerow = cvml.ptr<float>(currentrow);
        for (int j = 0; j < featurecol; j++)
            newrow[j] = primerow[j];
        float* newlabelrow = testlabel.ptr<float>(i);
        float* primerlabelrow = res.ptr<float>(currentrow);
        newlabelrow[0] = primerlabelrow[0];
    }
    // The training and test sets are ready /////////////////////////////////////////////////////////////////
    // Check trainlabel and testlabel
    for (int i = 0; i < trainnum; i++)
    {
        float* row = trainlabel.ptr<float>(i);
        cout << row[0] << endl;
    }
    cout << "trainlabel=" << trainnum << endl;
    for (int i = 0; i < testnum; i++)
    {
        float* row = testlabel.ptr<float>(i);
        cout << row[0] << endl;
    }
    CvMat traindata2 = traindata, trainlabel2 = trainlabel;
    // Note: cvCreateMat() only allocates a matrix of the same size; no values are copied into it here.
    const CvMat* traindata3 = cvCreateMat(traindata2.rows, traindata2.cols, traindata2.type);
    const CvMat* trainlabel3 = cvCreateMat(trainlabel2.rows, trainlabel2.cols, trainlabel2.type);
    printf("Ready for Training ... ");
    float priors[1000] = {1, 1, 1,};
    CvBoostParams params(CvBoost::GENTLE, 10, 0.95, 1, false, priors);
    CvBoost boost;
    bool update = false;
    const CvMat* varidx = 0;
    const CvMat* sampleidx = 0;
    const CvMat* vartype = 0;
    const CvMat* missingdatamask = 0;
    boost.train(traindata3, CV_ROW_SAMPLE, trainlabel3, varidx, sampleidx, vartype, missingdatamask, params, update);
    cout << "Training done!" << endl;
    // 1. Declare a couple of vectors to save the predictions of each sample
    std::vector<float> train_responses, test_responses;
    // 2. Calculate the training error (calc_error() takes a CvMLData*, so the loader object is passed)
    float fl1 = boost.calc_error(&cvmlprimer, CV_TRAIN_ERROR, &train_responses);
    // 3. Calculate the test error
    float fl2 = boost.calc_error(&cvmlprimer, CV_TEST_ERROR, &test_responses);
    printf("Error train %f \n", fl1);
    printf("Error test %f \n", fl2);
    // Save the trained classifier
    boost.save("./trained_boost.xml", "boost");
    return EXIT_SUCCESS;
}
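The per-class 60/40 split that the listing performs with two index arrays can also be written more compactly. The following is only a sketch under stated assumptions: `splitClass` is a hypothetical helper (it is not part of OpenCV or of the original code), and it uses `std::shuffle` with an explicit generator because `random_shuffle` was removed in C++17.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <random>
#include <utility>
#include <vector>

// Hypothetical helper: takes the row indices [first, first + count) of one
// class, shuffles them, and splits them into a training and a test part.
std::pair<std::vector<int>, std::vector<int>>
splitClass(int first, int count, double rate, std::mt19937& rng)
{
    assert(count >= 0 && rate >= 0.0 && rate <= 1.0);
    std::vector<int> idx(count);
    std::iota(idx.begin(), idx.end(), first);    // first, first + 1, ...
    std::shuffle(idx.begin(), idx.end(), rng);   // random order within the class
    // Truncate like int(count * rate) in the listing.
    const int ntrain = static_cast<int>(count * rate);
    std::vector<int> train(idx.begin(), idx.begin() + ntrain);
    std::vector<int> test(idx.begin() + ntrain, idx.end());
    return {train, test};
}
```

Calling it once as `splitClass(0, 90, 0.6, rng)` and once as `splitClass(90, 134, 0.6, rng)` yields 54 + 80 = 134 training rows and 36 + 54 = 90 test rows, the same counts as `trainnum` and `testnum` in the listing.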

The program failed at the train step: AdaBoost training reported that it can only be used for two-class classification. But I checked my labels, and they do describe a two-class problem. The training labels are 1 and 2; 60% of each class is taken as the training set and the remaining 40% of each class as the test set. So why did train report this error?

I finally figured out why. When going from traindata to CvMat*, the data was not copied, only the header information, so the matrix passed to train contained no data... That was stupid.
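The difference between "same shape" and "same contents" can be illustrated without OpenCV. This is a simplified stand-in, not the library's implementation: `SimpleMat`, `createLike` and `createAndCopy` are hypothetical names, and the stand-in zero-fills new storage for determinism, whereas a freshly created CvMat holds unspecified values.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified stand-in for a matrix (hypothetical, not OpenCV's CvMat).
struct SimpleMat {
    int rows;
    int cols;
    std::vector<float> data;
};

// Mirrors the mistake: a new matrix with the same shape, but no values copied.
SimpleMat createLike(const SimpleMat& src)
{
    const std::size_t n = static_cast<std::size_t>(src.rows) * src.cols;
    return SimpleMat{src.rows, src.cols, std::vector<float>(n, 0.0f)};
}

// Mirrors the fix: allocate first, then copy the element values as well.
SimpleMat createAndCopy(const SimpleMat& src)
{
    assert(src.data.size() == static_cast<std::size_t>(src.rows) * src.cols);
    SimpleMat dst = createLike(src);
    std::copy(src.data.begin(), src.data.end(), dst.data.begin());
    return dst;
}
```

A label matrix built with `createLike` contains a single value everywhere, which is consistent with a trainer complaining that it does not see two classes.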

I changed the code to this:

#include "opencv2/core/core.hpp"
#include "opencv2/highgui/highgui.hpp"
#include "opencv2/imgproc/imgproc.hpp"
#include "opencv2/ml/ml.hpp"
#include <iostream>
#include <algorithm>
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <opencv2/opencv.hpp>
using namespace cv;
using namespace std;
int main(int argc, char** argv)
{
    // kuangvec[] and stonevec[] hold the shuffled row indices of the two classes.
    const int uselesssample = 134, usefulsample = 90, allsample = 224, featurecol = 1000;
    // Before shuffling, the first 90 rows are ore samples and the next 134 are waste-rock samples.
    int kuangvec[usefulsample], stonevec[uselesssample];
    for (int i = 0; i < usefulsample; i++)
        kuangvec[i] = i;
    for (int i = 0; i < uselesssample; i++)
        stonevec[i] = usefulsample + i;
    random_shuffle(kuangvec, kuangvec + usefulsample);
    random_shuffle(stonevec, stonevec + uselesssample);
    // Print the shuffled order
    for (int i = 0; i < usefulsample; i++)
        cout << kuangvec[i] << endl;
    cout << endl;
    for (int i = 0; i < uselesssample; i++)
        cout << stonevec[i] << endl;
    // Read the features
    CvMLData cvmlprimer;
    cvmlprimer.read_csv("Features.csv");
    cv::Mat cvml = cv::Mat(cvmlprimer.get_values(), true);
    // Read the labels
    CvMLData resprimer;
    resprimer.read_csv("Labels.csv");
    cv::Mat res = cv::Mat(resprimer.get_values(), true);
    // Separate into the training set randomly
    const float rate = 0.6f;
    int trainnum = int(usefulsample * rate) + int(uselesssample * rate), testnum = allsample - trainnum;
    Mat traindata = Mat::zeros(trainnum, featurecol, cvml.type()),
        testdata = Mat::zeros(testnum, featurecol, cvml.type()),
        trainlabel = Mat::zeros(trainnum, 1, res.type()),
        testlabel = Mat::zeros(testnum, 1, res.type());
    // First 60% of the shuffled ore samples
    for (int i = 0; i < int(usefulsample * rate); i++)
    {
        float* newrow = traindata.ptr<float>(i);
        int currentrow = kuangvec[i];
        float* primerow = cvml.ptr<float>(currentrow);
        for (int j = 0; j < featurecol; j++)
            newrow[j] = primerow[j];
        float* newlabelrow = trainlabel.ptr<float>(i);
        float* primerlabelrow = res.ptr<float>(currentrow);
        newlabelrow[0] = primerlabelrow[0];
    }
    // First 60% of the shuffled waste-rock samples
    for (int i = int(usefulsample * rate); i < trainnum; i++)
    {
        float* newrow = traindata.ptr<float>(i);
        int ii = i - int(usefulsample * rate);
        int currentrow = stonevec[ii];
        float* primerow = cvml.ptr<float>(currentrow);
        for (int j = 0; j < featurecol; j++)
            newrow[j] = primerow[j];
        float* newlabelrow = trainlabel.ptr<float>(i);
        float* primerlabelrow = res.ptr<float>(currentrow);
        newlabelrow[0] = primerlabelrow[0];
    }
    // Separate into the test set
    // Remaining 40% of the shuffled ore samples
    for (int i = 0; i < usefulsample - int(usefulsample * rate); i++)
    {
        float* newrow = testdata.ptr<float>(i);
        int iii = i + int(usefulsample * rate);
        int currentrow = kuangvec[iii];
        float* primerow = cvml.ptr<float>(currentrow);
        for (int j = 0; j < featurecol; j++)
            newrow[j] = primerow[j];
        float* newlabelrow = testlabel.ptr<float>(i);
        float* primerlabelrow = res.ptr<float>(currentrow);
        newlabelrow[0] = primerlabelrow[0];
    }
    // Remaining 40% of the shuffled waste-rock samples
    for (int i = usefulsample - int(usefulsample * rate); i < testnum; i++)
    {
        int ii = int(uselesssample * rate) + i - (usefulsample - int(usefulsample * rate));
        float* newrow = testdata.ptr<float>(i);
        int currentrow = stonevec[ii];
        float* primerow = cvml.ptr<float>(currentrow);
        for (int j = 0; j < featurecol; j++)
            newrow[j] = primerow[j];
        float* newlabelrow = testlabel.ptr<float>(i);
        float* primerlabelrow = res.ptr<float>(currentrow);
        newlabelrow[0] = primerlabelrow[0];
    }
    // The training and test sets are ready /////////////////////////////////////////////////////////////////
    // Check trainlabel and testlabel
    for (int i = 0; i < trainnum; i++)
    {
        float* row = trainlabel.ptr<float>(i);
        cout << row[0] << endl;
    }
    cout << "trainlabel=" << trainnum << endl;
    for (int i = 0; i < testnum; i++)
    {
        float* row = testlabel.ptr<float>(i);
        cout << row[0] << endl;
    }
    cout << "testlabel=" << testnum << endl;
    CvMat traindata2 = traindata, trainlabel2 = trainlabel;
    // Allocate matrices of the same size, then copy the values into them.
    CvMat* traindata3 = cvCreateMat(traindata2.rows, traindata2.cols, traindata2.type);
    cvCopy(&traindata2, traindata3);
    CvMat* trainlabel3 = cvCreateMat(trainlabel2.rows, trainlabel2.cols, trainlabel2.type);
    cvCopy(&trainlabel2, trainlabel3);
    cout << "Ready for Training ... " << endl;
    float priors[1000];
    for (int i = 0; i < 1000; i++)
        priors[i] = 1;
    CvBoostParams params(CvBoost::GENTLE, 10, 0.95, 1, false, priors);
    CvBoost boost;
    bool update = false;
    const CvMat* varidx = 0;
    const CvMat* sampleidx = 0;
    const CvMat* vartype = 0;
    const CvMat* missingdatamask = 0;
    boost.train(traindata3, CV_ROW_SAMPLE, trainlabel3, varidx, sampleidx, vartype, missingdatamask, params, update);
    cout << "Training done!!! Prepare for testing ... " << endl << endl;
    // Begin testing
    CvMat testdata2 = testdata, testlabel2 = testlabel;
    CvMat* testdata3 = cvCreateMat(testdata2.rows, testdata2.cols, testdata2.type);
    cvCopy(&testdata2, testdata3);
    CvMat* testlabel3 = cvCreateMat(testlabel2.rows, testlabel2.cols, testlabel2.type);
    cvCopy(&testlabel2, testlabel3);
    const CvMat* missing = 0;
    CvMat* weak_responses = 0;
    const int numfortest = testdata.rows;
    float outputs;
    outputs = boost.predict(testdata3, missing, weak_responses, CV_WHOLE_SEQ, false);
    // 1. Declare a couple of vectors to save the predictions of each sample
    std::vector<float> train_responses, test_responses;
    // 2. Calculate the training error (calc_error() takes a CvMLData*, so the loader object is passed)
    float fl1 = boost.calc_error(&cvmlprimer, CV_TRAIN_ERROR, &train_responses);
    // 3. Calculate the test error
    float fl2 = boost.calc_error(&cvmlprimer, CV_TEST_ERROR, &test_responses);
    printf("Error train %f \n", fl1);
    printf("Error test %f \n", fl2);
    // Save the trained classifier
    boost.save("./trained_boost.xml", "boost");
    return EXIT_SUCCESS;
}

In this version I also split the data and labels randomly into training and test sets and fixed the earlier error. The result showed that training now works, but testing went wrong. I asked an expert at my company, who told me that predict() takes a single sample as its test input. Unlike MATLAB, it cannot take many samples at once and return a vector of labels. It can only test one sample per call, so I loop over the test set.
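The per-sample loop can be sketched as follows. This is an illustration only: `predictAll` is a hypothetical helper, and `predictOne` is a placeholder for whatever single-sample call the classifier offers (for example a wrapper around boost.predict() on one row); neither name comes from the original code.

```cpp
#include <cassert>
#include <functional>
#include <vector>

using Sample = std::vector<float>;

// Calls a single-sample classifier once per row and collects the labels.
// `predictOne` stands in for the real single-sample prediction call.
std::vector<float> predictAll(const std::vector<Sample>& rows,
                              const std::function<float(const Sample&)>& predictOne)
{
    assert(static_cast<bool>(predictOne));   // a callable must be supplied
    std::vector<float> labels;
    labels.reserve(rows.size());
    for (const Sample& row : rows)
        labels.push_back(predictOne(row));   // one prediction per test sample
    return labels;
}
```

The returned vector plays the role of the label vector that MATLAB would produce in a single call.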

Now everything works, haha... The only remaining difference is how the accuracy is calculated. I used a dynamic array, outputlabels, to store each predicted label. A dynamic array avoids having to hard-code the size, as you must every time you define an ordinary array, but you have to remember to delete it when you are done. With that, AdaBoost classifies my own data. I re-tested it with new data.

The results are almost identical to the ones I got with AdaBoost in MATLAB, which shows that the code contains no mistake. It can be reused directly later.
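The accuracy calculation mentioned above can be sketched with std::vector instead of a manually allocated array, so that no delete is needed. This is an assumption about how such a calculation might look, not the original code; the function name `accuracy` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Fraction of predicted labels that match the true labels.
double accuracy(const std::vector<float>& predicted, const std::vector<float>& truth)
{
    assert(predicted.size() == truth.size() && !truth.empty());
    std::size_t correct = 0;
    for (std::size_t i = 0; i < truth.size(); ++i)
        if (predicted[i] == truth[i])   // labels are exact class values such as 1 or 2
            ++correct;
    return static_cast<double>(correct) / truth.size();
}
```

The vectors release their memory automatically when they go out of scope, which removes the delete step that a dynamic array requires.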

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
