Use AdaBoost to classify your own data

Source: Internet
Author: User

I wanted to use an AdaBoost classifier on my own data, but almost everything I found online was about AdaBoost + Haar face detection, so I had to write it myself, using http://blog.csdn.net/zhaocj/article/details/50536385 as a reference. The program below uses the AdaBoost algorithm to classify my own data set. The features and labels are stored in CSV files: the feature matrix is 224x1000 (224 samples with 1000 features each), and there are 2 classes, so this is binary classification.

#include "opencv2/core/core.hpp"
#include "opencv2/ml/ml.hpp"
#include <algorithm>
#include <cstdio>
#include <cstdlib>
#include <iostream>
#include <vector>
using namespace cv;
using namespace std;

int main(int argc, char** argv)
{
    // kuangVec[] and stoneVec[] hold the row indices of the two classes in shuffled order.
    const int uselessSample = 134, usefulSample = 90, allSample = 224, featureCol = 1000;
    // Rows 0-89 are ore ("kuang") samples; the following 134 rows are waste rock ("stone").
    int kuangVec[usefulSample], stoneVec[uselessSample];
    for (int i = 0; i < usefulSample; i++)
        kuangVec[i] = i;
    for (int i = 0; i < uselessSample; i++)
        stoneVec[i] = usefulSample + i;
    random_shuffle(kuangVec, kuangVec + usefulSample);
    random_shuffle(stoneVec, stoneVec + uselessSample);
    // Print the shuffled order.
    for (int i = 0; i < usefulSample; i++)
        cout << kuangVec[i] << endl;
    cout << endl;
    for (int i = 0; i < uselessSample; i++)
        cout << stoneVec[i] << endl;
    // Read the features.
    CvMLData cvmlPrimer;
    cvmlPrimer.read_csv("Features.csv");
    cv::Mat cvml = cv::Mat(cvmlPrimer.get_values(), true);
    // Read the labels.
    CvMLData resPrimer;
    resPrimer.read_csv("Labels.csv");
    cv::Mat res = cv::Mat(resPrimer.get_values(), true);
    // Separate into the training set: the first 60% of each shuffled class.
    const float rate = 0.6;
    int trainNum = int(usefulSample * rate) + int(uselessSample * rate), testNum = allSample - trainNum;
    Mat trainData = Mat::zeros(trainNum, featureCol, cvml.type()),
        testData = Mat::zeros(testNum, featureCol, cvml.type()),
        trainLabel = Mat::zeros(trainNum, 1, res.type()),
        testLabel = Mat::zeros(testNum, 1, res.type());
    for (int i = 0; i < int(usefulSample * rate); i++)
    {
        float* newRow = trainData.ptr<float>(i);
        int currentRow = kuangVec[i];
        float* primeRow = cvml.ptr<float>(currentRow);
        for (int j = 0; j < featureCol; j++)
            newRow[j] = primeRow[j];
        float* newLabelRow = trainLabel.ptr<float>(i);
        float* primerLabelRow = res.ptr<float>(currentRow);
        newLabelRow[0] = primerLabelRow[0];
    }
    for (int i = int(usefulSample * rate); i < trainNum; i++)
    {
        float* newRow = trainData.ptr<float>(i);
        int ii = i - int(usefulSample * rate);
        int currentRow = stoneVec[ii];
        float* primeRow = cvml.ptr<float>(currentRow);
        for (int j = 0; j < featureCol; j++)
            newRow[j] = primeRow[j];
        float* newLabelRow = trainLabel.ptr<float>(i);
        float* primerLabelRow = res.ptr<float>(currentRow);
        newLabelRow[0] = primerLabelRow[0];
    }
    // Separate into the test set: the remaining 40% of each shuffled class.
    for (int i = 0; i < usefulSample - int(usefulSample * rate); i++)
    {
        float* newRow = testData.ptr<float>(i);
        int iii = i + int(usefulSample * rate);
        int currentRow = kuangVec[iii];
        float* primeRow = cvml.ptr<float>(currentRow);
        for (int j = 0; j < featureCol; j++)
            newRow[j] = primeRow[j];
        float* newLabelRow = testLabel.ptr<float>(i);
        float* primerLabelRow = res.ptr<float>(currentRow);
        newLabelRow[0] = primerLabelRow[0];
    }
    for (int i = usefulSample - int(usefulSample * rate); i < testNum; i++)
    {
        int ii = int(uselessSample * rate) + i - (usefulSample - int(usefulSample * rate));
        float* newRow = testData.ptr<float>(i);
        int currentRow = stoneVec[ii];
        float* primeRow = cvml.ptr<float>(currentRow);
        for (int j = 0; j < featureCol; j++)
            newRow[j] = primeRow[j];
        float* newLabelRow = testLabel.ptr<float>(i);
        float* primerLabelRow = res.ptr<float>(currentRow);
        newLabelRow[0] = primerLabelRow[0];
    }
    // The training set and test set are ready.
    // Check trainLabel and testLabel by printing them.
    for (int i = 0; i < trainNum; i++)
    {
        float* row = trainLabel.ptr<float>(i);
        cout << row[0] << endl;
    }
    cout << "trainlabel=" << trainNum << endl;
    for (int i = 0; i < testNum; i++)
    {
        float* row = testLabel.ptr<float>(i);
        cout << row[0] << endl;
    }
    CvMat trainData2 = trainData, trainLabel2 = trainLabel;
    // Note: cvCreateMat() only allocates a matrix of the same size and type.
    // No values are copied from trainData2 or trainLabel2 here.
    const CvMat* trainData3 = cvCreateMat(trainData2.rows, trainData2.cols, trainData2.type);
    const CvMat* trainLabel3 = cvCreateMat(trainLabel2.rows, trainLabel2.cols, trainLabel2.type);
    printf("Ready for training ...\n");
    float priors[1000] = {1, 1, 1};
    CvBoostParams params(CvBoost::GENTLE, 10, 0.95, 1, false, priors);
    CvBoost boost;
    bool update = false;
    const CvMat* varIdx = 0;
    const CvMat* sampleIdx = 0;
    const CvMat* varType = 0;
    const CvMat* missingDataMask = 0;
    boost.train(trainData3, CV_ROW_SAMPLE, trainLabel3, varIdx, sampleIdx, varType, missingDataMask, params, update);
    cout << "Training done!" << endl;
    // 1. Declare a couple of vectors to hold the prediction for each sample.
    std::vector<float> train_responses, test_responses;
    // calc_error() takes a CvMLData*, so the loader object is passed here rather than the cv::Mat.
    // 2. Calculate the training error.
    float fl1 = boost.calc_error(&cvmlPrimer, CV_TRAIN_ERROR, &train_responses);
    // 3. Calculate the test error.
    float fl2 = boost.calc_error(&cvmlPrimer, CV_TEST_ERROR, &test_responses);
    printf("Error train %f\n", fl1);
    printf("Error test %f\n", fl2);
    // Save the trained classifier.
    boost.save("./trained_boost.xml", "boost");
    return EXIT_SUCCESS;
}
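
The split sizes used in the listing can be checked on their own. The following is a minimal sketch, not part of the original program: `SplitSizes` and `splitSizes` are hypothetical names, and the code only reproduces the `int(count * rate)` truncation that the listing applies to each class.

```cpp
#include <cassert>

// Hypothetical helper (not part of OpenCV): how many samples of each class
// go to the training and test sets when a fraction `rate` is kept for training.
struct SplitSizes {
    int trainA, trainB;  // training samples taken from class A and class B
    int testA, testB;    // remaining samples, used for testing
    int trainTotal() const { return trainA + trainB; }
    int testTotal() const { return testA + testB; }
};

SplitSizes splitSizes(int countA, int countB, float rate) {
    assert(countA >= 0 && countB >= 0 && rate >= 0.0f && rate <= 1.0f);
    SplitSizes s;
    // Truncate toward zero, like int(usefulSample * rate) in the listing.
    s.trainA = static_cast<int>(countA * rate);
    s.trainB = static_cast<int>(countB * rate);
    s.testA = countA - s.trainA;
    s.testB = countB - s.trainB;
    return s;
}
```

With 90 and 134 samples and a rate of 0.6, `splitSizes(90, 134, 0.6f)` gives 54 + 80 = 134 training rows and 36 + 54 = 90 test rows, which matches `trainNum` and `testNum` above.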

The program failed at the training step: boost.train() complained that boosting can only be used for two-class problems. But I checked my labels, and this is a two-class problem. The training labels are 1 and 2; 60% of each class is taken as the training set and the remaining 40% of each class as the test set. So why did train() report such an error?

I finally figured out why. Going from trainData to a CvMat* the way I did does not copy the data: only the header information (size and type) is carried over, so the matrices passed to train() contained none of my data ... That was a silly mistake.
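
The difference between allocating a matrix of the same shape and actually copying the values can be sketched without OpenCV. This is a toy model only: `ToyMat`, `createLike` and `copyValues` are hypothetical names standing in for the allocate-then-copy pattern, and the toy zero-fills its storage, whereas a real allocation may leave the contents undefined.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a matrix: dimensions plus owned row-major storage.
struct ToyMat {
    int rows, cols;
    std::vector<float> data;  // rows * cols values
};

// Allocation-only step: same shape as src, but none of src's values.
ToyMat createLike(const ToyMat& src) {
    ToyMat dst;
    dst.rows = src.rows;
    dst.cols = src.cols;
    dst.data.assign(static_cast<std::size_t>(src.rows) * src.cols, 0.0f);
    return dst;
}

// Explicit copy step: only after this does dst hold src's values.
void copyValues(const ToyMat& src, ToyMat& dst) {
    assert(src.rows == dst.rows && src.cols == dst.cols);
    dst.data = src.data;
}
```

A classifier trained on the result of `createLike` alone never sees the real labels, which is consistent with the complaint about the number of classes.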

I changed the code to this:

#include "opencv2/core/core.hpp"
#include "opencv2/highgui/highgui.hpp"
#include "opencv2/imgproc/imgproc.hpp"
#include "opencv2/ml/ml.hpp"
#include <iostream>
#include <algorithm>
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <opencv2/opencv.hpp>
using namespace cv;
using namespace std;
int main(int argc, char** argv)
{
    // kuangVec[] and stoneVec[] hold the row indices of the two classes in shuffled order.
    const int uselessSample = 134, usefulSample = 90, allSample = 224, featureCol = 1000;
    // Rows 0-89 are ore ("kuang") samples; the following 134 rows are waste rock ("stone").
    int kuangVec[usefulSample], stoneVec[uselessSample];
    for (int i = 0; i < usefulSample; i++)
        kuangVec[i] = i;
    for (int i = 0; i < uselessSample; i++)
        stoneVec[i] = usefulSample + i;
    random_shuffle(kuangVec, kuangVec + usefulSample);
    random_shuffle(stoneVec, stoneVec + uselessSample);
    // Print the shuffled order.
    for (int i = 0; i < usefulSample; i++)
        cout << kuangVec[i] << endl;
    cout << endl;
    for (int i = 0; i < uselessSample; i++)
        cout << stoneVec[i] << endl;
    // Read the features.
    CvMLData cvmlPrimer;
    cvmlPrimer.read_csv("Features.csv");
    cv::Mat cvml = cv::Mat(cvmlPrimer.get_values(), true);
    // Read the labels.
    CvMLData resPrimer;
    resPrimer.read_csv("Labels.csv");
    cv::Mat res = cv::Mat(resPrimer.get_values(), true);
    // Separate into the training set randomly: the first 60% of each shuffled class.
    const float rate = 0.6;
    int trainNum = int(usefulSample * rate) + int(uselessSample * rate), testNum = allSample - trainNum;
    Mat trainData = Mat::zeros(trainNum, featureCol, cvml.type()),
        testData = Mat::zeros(testNum, featureCol, cvml.type()),
        trainLabel = Mat::zeros(trainNum, 1, res.type()),
        testLabel = Mat::zeros(testNum, 1, res.type());
    for (int i = 0; i < int(usefulSample * rate); i++)
    {
        float* newRow = trainData.ptr<float>(i);
        int currentRow = kuangVec[i];
        float* primeRow = cvml.ptr<float>(currentRow);
        for (int j = 0; j < featureCol; j++)
            newRow[j] = primeRow[j];
        float* newLabelRow = trainLabel.ptr<float>(i);
        float* primerLabelRow = res.ptr<float>(currentRow);
        newLabelRow[0] = primerLabelRow[0];
    }
    for (int i = int(usefulSample * rate); i < trainNum; i++)
    {
        float* newRow = trainData.ptr<float>(i);
        int ii = i - int(usefulSample * rate);
        int currentRow = stoneVec[ii];
        float* primeRow = cvml.ptr<float>(currentRow);
        for (int j = 0; j < featureCol; j++)
            newRow[j] = primeRow[j];
        float* newLabelRow = trainLabel.ptr<float>(i);
        float* primerLabelRow = res.ptr<float>(currentRow);
        newLabelRow[0] = primerLabelRow[0];
    }
    // Separate into the test set: the remaining 40% of each shuffled class.
    for (int i = 0; i < usefulSample - int(usefulSample * rate); i++)
    {
        float* newRow = testData.ptr<float>(i);
        int iii = i + int(usefulSample * rate);
        int currentRow = kuangVec[iii];
        float* primeRow = cvml.ptr<float>(currentRow);
        for (int j = 0; j < featureCol; j++)
            newRow[j] = primeRow[j];
        float* newLabelRow = testLabel.ptr<float>(i);
        float* primerLabelRow = res.ptr<float>(currentRow);
        newLabelRow[0] = primerLabelRow[0];
    }
    for (int i = usefulSample - int(usefulSample * rate); i < testNum; i++)
    {
        int ii = int(uselessSample * rate) + i - (usefulSample - int(usefulSample * rate));
        float* newRow = testData.ptr<float>(i);
        int currentRow = stoneVec[ii];
        float* primeRow = cvml.ptr<float>(currentRow);
        for (int j = 0; j < featureCol; j++)
            newRow[j] = primeRow[j];
        float* newLabelRow = testLabel.ptr<float>(i);
        float* primerLabelRow = res.ptr<float>(currentRow);
        newLabelRow[0] = primerLabelRow[0];
    }
    // The training set and test set are ready.
    // Check trainLabel and testLabel by printing them.
    for (int i = 0; i < trainNum; i++)
    {
        float* row = trainLabel.ptr<float>(i);
        cout << row[0] << endl;
    }
    cout << "trainlabel=" << trainNum << endl;
    for (int i = 0; i < testNum; i++)
    {
        float* row = testLabel.ptr<float>(i);
        cout << row[0] << endl;
    }
    cout << "testlabel=" << testNum << endl;
    CvMat trainData2 = trainData, trainLabel2 = trainLabel;
    // Allocate matrices of the same size and type, then copy the values into them.
    CvMat* trainData3 = cvCreateMat(trainData2.rows, trainData2.cols, trainData2.type);
    cvCopy(&trainData2, trainData3);
    CvMat* trainLabel3 = cvCreateMat(trainLabel2.rows, trainLabel2.cols, trainLabel2.type);
    cvCopy(&trainLabel2, trainLabel3);
    cout << "Ready for training ..." << endl;
    float priors[1000];
    // The loop bound is missing in the source; the array size (1000) is assumed here.
    for (int i = 0; i < 1000; i++)
        priors[i] = 1;
    // The weak-classifier count is missing in the source; 10 is taken from the first listing.
    CvBoostParams params(CvBoost::GENTLE, 10, 0.95, 1, false, priors);
    CvBoost boost;
    bool update = false;
    const CvMat* varIdx = 0;
    const CvMat* sampleIdx = 0;
    const CvMat* varType = 0;
    const CvMat* missingDataMask = 0;
    boost.train(trainData3, CV_ROW_SAMPLE, trainLabel3, varIdx, sampleIdx, varType, missingDataMask, params, update);
    cout << "Training done!!! Prepare for testing ..." << endl << endl;
    // Begin the test.
    CvMat testData2 = testData, testLabel2 = testLabel;
    CvMat* testData3 = cvCreateMat(testData2.rows, testData2.cols, testData2.type);
    cvCopy(&testData2, testData3);
    CvMat* testLabel3 = cvCreateMat(testLabel2.rows, testLabel2.cols, testLabel2.type);
    cvCopy(&testLabel2, testLabel3);
    const CvMat* missing = 0;
    CvMat* weak_responses = 0;
    const int numForTest = testData.rows;
    float outputs;
    // testData3 holds every test row, but predict() expects a single sample (see the note after the listing).
    outputs = boost.predict(testData3, missing, weak_responses, CV_WHOLE_SEQ, false);
    // 1. Declare a couple of vectors to hold the prediction for each sample.
    std::vector<float> train_responses, test_responses;
    // calc_error() takes a CvMLData*, so the loader object is passed here rather than the cv::Mat.
    // 2. Calculate the training error.
    float fl1 = boost.calc_error(&cvmlPrimer, CV_TRAIN_ERROR, &train_responses);
    // 3. Calculate the test error.
    float fl2 = boost.calc_error(&cvmlPrimer, CV_TEST_ERROR, &test_responses);
    printf("Error train %f\n", fl1);
    printf("Error test %f\n", fl2);
    // Save the trained classifier.
    boost.save("./trained_boost.xml", "boost");
    return EXIT_SUCCESS;
}

In this version the data and labels are randomly divided into a training set and a test set, and the earlier copying error is fixed. The result showed that training now works, but the test step went wrong. I asked an expert at my company, who told me that predict() must be given a single sample. Unlike in MATLAB, it cannot take a whole batch of samples and return a vector of labels. It can only test one sample at a time, so I loop over the test rows.
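
The per-sample loop can be sketched as follows. This is an illustration under stated assumptions rather than the final program: `predictOne` is a hypothetical stand-in for a trained classifier that accepts one sample (it is not CvBoost and simply thresholds the first feature), and `predictAll` is a hypothetical name for the loop that calls it once per row.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a classifier that accepts ONE sample at a time.
// It returns label 1 or 2 based on the first feature so the loop can be run.
float predictOne(const float* sample, int featureCount) {
    assert(sample != 0 && featureCount > 0);
    return sample[0] > 0.5f ? 1.0f : 2.0f;
}

// Calls the single-sample predictor once per row of a row-major matrix
// and collects one label per row.
std::vector<float> predictAll(const std::vector<float>& samples, int rows, int cols) {
    assert(rows >= 0 && cols > 0);
    assert(samples.size() == static_cast<std::size_t>(rows) * cols);
    std::vector<float> labels(rows);
    for (int i = 0; i < rows; ++i) {
        const float* row = &samples[static_cast<std::size_t>(i) * cols];
        labels[i] = predictOne(row, cols);
    }
    return labels;
}
```

In the OpenCV program the same shape applies: each iteration extracts one row of the test matrix and passes only that row to boost.predict(), storing the returned label.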

Now it finally works, haha ... The only addition is the accuracy calculation. I used a dynamic array, outputlabels, to store every output label. A dynamic array avoids the hassle of having to write a fixed size every time an ordinary array is defined, but you must remember to delete[] it at the end. With that, AdaBoost classifies my own data. I then re-tested it with new data.

The results are almost identical to the ones I got with AdaBoost in MATLAB, which suggests there is no mistake. The code can be reused directly later.
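
The accuracy calculation with a dynamic array can be sketched like this. The code is a minimal illustration with made-up labels: `accuracy` and `demoAccuracy` are hypothetical names, and only the array name `outputLabels` follows the text above.

```cpp
#include <cassert>

// Fraction of predicted labels that match the true labels.
float accuracy(const float* predicted, const float* truth, int n) {
    assert(predicted != 0 && truth != 0 && n > 0);
    int correct = 0;
    for (int i = 0; i < n; ++i)
        if (predicted[i] == truth[i])
            ++correct;
    return static_cast<float>(correct) / n;
}

// Shows the new[] / delete[] pairing on four made-up labels.
float demoAccuracy() {
    const int n = 4;
    const float truth[n] = {1.0f, 2.0f, 2.0f, 1.0f};
    float* outputLabels = new float[n];  // in real code the size is known only at run time
    outputLabels[0] = 1.0f;
    outputLabels[1] = 2.0f;
    outputLabels[2] = 1.0f;  // the one wrong prediction
    outputLabels[3] = 1.0f;
    float acc = accuracy(outputLabels, truth, n);
    delete[] outputLabels;  // release what new[] allocated
    return acc;
}
```

A std::vector<float> would remove the need for the manual delete[], at the cost of departing from the dynamic-array approach described above.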
