When evaluating machine-learning classification results, the area under the ROC curve (AUC) is an important metric. Here is the source code that calls the Weka classes to output the AUC:
try {
    // 1. Read in the data set
    Instances data = new Instances(new BufferedReader(
            new FileReader("E:\\develop/weka-3-6/data/contact-lenses.arff")));
    data.setClassIndex(data.numAttributes() - 1);

    // 2. Train the classifier and obtain an Evaluation object via cross-validation.
    //    Note that this differs from the validation method used in the previous sections.
    Classifier cl = new NaiveBayes();
    Evaluation eval = new Evaluation(data);
    eval.crossValidateModel(cl, data, 10, new Random(1));

    // 3. Output the AUC value for the chosen class index
    int classIndex = 0;
    System.out.println("The area under the ROC curve: " + eval.areaUnderROC(classIndex));
    System.out.println(eval.toClassDetailsString());
    System.out.println(eval.toSummaryString());
    System.out.println(eval.toMatrixString());
} catch (Exception e) {
    e.printStackTrace();
}
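Conceptually, the value returned by areaUnderROC is the probability that a randomly chosen positive instance is ranked above a randomly chosen negative one. As a rough illustration (plain Java, independent of Weka; the class name, method name, and the 0.5 credit for ties are my own choices, not Weka's implementation), AUC can be computed directly from scores and labels:

```java
public class Main {
    // Illustrative AUC via pairwise comparison (Mann-Whitney style):
    // count positive/negative pairs where the positive scores higher;
    // ties contribute 0.5. Not Weka's actual implementation.
    static double auc(double[] scores, boolean[] positive) {
        long pos = 0;
        double wins = 0;
        for (int i = 0; i < scores.length; i++) {
            if (!positive[i]) continue;
            pos++;
            for (int j = 0; j < scores.length; j++) {
                if (positive[j]) continue;
                if (scores[i] > scores[j]) wins += 1.0;
                else if (scores[i] == scores[j]) wins += 0.5;
            }
        }
        long neg = scores.length - pos;
        return wins / (pos * (double) neg);
    }

    public static void main(String[] args) {
        // A perfect ranking: all positives scored above all negatives
        double[] scores = {0.9, 0.8, 0.4, 0.3};
        boolean[] labels = {true, true, false, false};
        System.out.println(auc(scores, labels)); // prints 1.0
    }
}
```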
Next, cross-validation.
If you have not separated the data into a training set and a test set, you can use cross-validation. The four parameters of Evaluation's crossValidateModel method are: first, the classifier; second, the data set to evaluate on; third, the number of folds (10 is common); and fourth, a random-number object.
Note: when using crossValidateModel, the classifier does not need to be trained first.
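The fold mechanics behind those four parameters can be sketched without Weka: shuffle once with the given seed, then for fold i hold out every instance whose position modulo numFolds equals i. This is a simplified illustration of the idea, not Weka's exact trainCV/testCV code; the instance count of 24 matches the contact-lenses data set.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class Main {
    public static void main(String[] args) {
        int numFolds = 10;
        List<Integer> data = new ArrayList<>();
        for (int i = 0; i < 24; i++) data.add(i);    // contact-lenses has 24 instances
        Collections.shuffle(data, new Random(1));    // fixed seed, like new Random(1) above

        for (int fold = 0; fold < numFolds; fold++) {
            List<Integer> train = new ArrayList<>();
            List<Integer> test = new ArrayList<>();
            for (int i = 0; i < data.size(); i++) {
                if (i % numFolds == fold) test.add(data.get(i));
                else train.add(data.get(i));
            }
            // every instance lands in exactly one test fold
            System.out.println("fold " + fold + ": train=" + train.size()
                    + " test=" + test.size());
        }
    }
}
```

Each instance is used for testing exactly once and for training numFolds - 1 times, which is why no separate test set is required.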
The source code of crossValidateModel is as follows:
public void crossValidateModel(Classifier classifier, Instances data,
        int numFolds, Random random, Object... forPredictionsPrinting)
        throws Exception {

    // Make a copy of the data we can reorder
    data = new Instances(data);
    data.randomize(random);
    if (data.classAttribute().isNominal()) {
        data.stratify(numFolds);
    }

    // We assume that the first element is a StringBuffer, the second a Range
    // (attributes to output) and the third a Boolean (whether or not to output
    // a distribution instead of just a classification)
    if (forPredictionsPrinting.length > 0) {
        // print the header first
        StringBuffer buff = (StringBuffer) forPredictionsPrinting[0];
        Range attsToOutput = (Range) forPredictionsPrinting[1];
        boolean printDist = ((Boolean) forPredictionsPrinting[2]).booleanValue();
        printClassificationsHeader(data, attsToOutput, printDist, buff);
    }

    // Do the folds
    for (int i = 0; i < numFolds; i++) {
        Instances train = data.trainCV(numFolds, i, random);
        setPriors(train);
        Classifier copiedClassifier = Classifier.makeCopy(classifier);
        copiedClassifier.buildClassifier(train);
        Instances test = data.testCV(numFolds, i);
        evaluateModel(copiedClassifier, test, forPredictionsPrinting);
    }
    m_NumFolds = numFolds;
}
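Note the call to data.stratify(numFolds) for nominal class attributes: before splitting, the instances are reordered so that each fold receives roughly the same class distribution. A rough sketch of the idea (not Weka's actual stratify code; the round-robin dealing and the example labels are my own illustration):

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class Main {
    // Group instances by class label, then deal them out round-robin so
    // that consecutive positions cycle through the classes. Modulo-based
    // fold splits then see similar class proportions.
    static List<String> stratify(List<String> labels) {
        Map<String, Deque<String>> byClass = new LinkedHashMap<>();
        for (String l : labels)
            byClass.computeIfAbsent(l, k -> new ArrayDeque<>()).add(l);
        List<String> out = new ArrayList<>();
        while (out.size() < labels.size())
            for (Deque<String> q : byClass.values())
                if (!q.isEmpty()) out.add(q.poll());
        return out;
    }

    public static void main(String[] args) {
        List<String> labels = Arrays.asList("soft", "soft", "hard",
                "none", "none", "none");
        System.out.println(stratify(labels));
    }
}
```

Without stratification, a small or skewed data set could produce folds in which some class never appears in the training portion.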
Output Result:
(To be updated.)