Using Weka to simulate the "active learning" algorithm

Source: Internet
Author: User

Active learning:

Active learning requires the classifier to interact with a labeling expert. A typical process:

(1) Build a model from a small number of labeled samples.

(2) Select the most informative samples from the unlabeled pool and have an expert label them.

(3) Merge these samples with the previously labeled samples and rebuild the model.

(4) Repeat steps (2) and (3) until a stopping criterion is met (for example, no unlabeled samples remain).
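The four steps can be illustrated without Weka. The following is a minimal sketch, assuming a toy 1-D problem: the class name `ActiveLearningLoopSketch`, the `oracle` rule (x >= 0.5) standing in for the expert, and the threshold "model" are all hypothetical and are not part of the original simulation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy pool-based active learning on 1-D points in [0, 1]; every name here is hypothetical. */
final class ActiveLearningLoopSketch {

    /** Stand-in for the labeling expert: the hidden rule is x >= 0.5 (an assumption). */
    static boolean oracle(double x) {
        return x >= 0.5;
    }

    /** Steps (1) and (3): place the threshold halfway between the largest negative and smallest positive. */
    static double fitThreshold(List<Double> xs, List<Boolean> ys) {
        double maxNeg = 0.0;
        double minPos = 1.0;
        for (int i = 0; i < xs.size(); i++) {
            if (ys.get(i)) {
                minPos = Math.min(minPos, xs.get(i));
            } else {
                maxNeg = Math.max(maxNeg, xs.get(i));
            }
        }
        return (maxNeg + minPos) / 2.0;
    }

    /** Runs the loop for at most `budget` queries and returns the final threshold. */
    static double run(List<Double> pool, int budget) {
        // Step (1): a small labeled seed set.
        List<Double> xs = new ArrayList<>(Arrays.asList(0.0, 1.0));
        List<Boolean> ys = new ArrayList<>(Arrays.asList(false, true));
        List<Double> unlabeled = new ArrayList<>(pool);
        double threshold = fitThreshold(xs, ys);

        // Step (4): stop when the budget is spent or the pool is empty.
        for (int q = 0; q < budget && !unlabeled.isEmpty(); q++) {
            // Step (2): the most informative point is the one closest to the threshold.
            int best = 0;
            for (int i = 1; i < unlabeled.size(); i++) {
                if (Math.abs(unlabeled.get(i) - threshold)
                        < Math.abs(unlabeled.get(best) - threshold)) {
                    best = i;
                }
            }
            double x = unlabeled.remove(best);
            xs.add(x);
            ys.add(oracle(x));
            // Step (3): merge the new label and rebuild the model.
            threshold = fitThreshold(xs, ys);
        }
        return threshold;
    }

    public static void main(String[] args) {
        List<Double> pool = new ArrayList<>();
        for (int i = 1; i < 100; i++) {
            pool.add(i / 100.0);
        }
        System.out.println("threshold after 10 queries: " + run(pool, 10));
    }
}
```

Because each query goes to the point nearest the current decision boundary, the loop behaves like a binary search and needs far fewer labels than labeling the whole pool.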

Simulation approach:

1. Split the data into a labeled set and an unlabeled set.

2. Divide the unlabeled set into groups of 100 samples. For each group, compute the entropy of every sample's prediction, sort by entropy, and add the top 5 samples (those with the highest entropy) to the labeled set.
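The selection rule in step 2 can be tried in isolation. This is a minimal sketch, assuming a binary problem in which each unlabeled sample already has a predicted probability for one class; the class `EntropySelectionSketch` and its methods are hypothetical helpers, not Weka API.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Picks the most uncertain predictions per group; every name here is hypothetical. */
final class EntropySelectionSketch {

    /** Binary entropy in bits of a predicted probability p; defined as 0 at p = 0 and p = 1. */
    static double binaryEntropy(double p) {
        if (p < 1e-9 || 1.0 - p < 1e-9) {
            return 0.0;
        }
        return -(p * Math.log(p) + (1.0 - p) * Math.log(1.0 - p)) / Math.log(2.0);
    }

    /** Returns the indices of the k highest-entropy probabilities within each group of groupSize. */
    static List<Integer> selectMostUncertain(double[] probs, int groupSize, int k) {
        List<Integer> chosen = new ArrayList<>();
        for (int start = 0; start < probs.length; start += groupSize) {
            int end = Math.min(start + groupSize, probs.length);
            List<Integer> group = new ArrayList<>();
            for (int i = start; i < end; i++) {
                group.add(i);
            }
            // Sort the group by descending entropy, then keep at most k indices.
            group.sort(Comparator.comparingDouble((Integer i) -> binaryEntropy(probs[i])).reversed());
            chosen.addAll(group.subList(0, Math.min(k, group.size())));
        }
        return chosen;
    }

    public static void main(String[] args) {
        double[] probs = {0.9, 0.5, 0.1, 0.6, 0.99, 0.45};
        System.out.println(selectMostUncertain(probs, 3, 1));
    }
}
```

Entropy peaks at a probability of 0.5 and falls to 0 as the prediction becomes certain, so sorting in descending order puts the samples the model is least sure about first.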

    package demo;

    import java.io.FileReader;
    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.Random;

    import weka.classifiers.Evaluation;
    import weka.classifiers.bayes.NaiveBayes;
    import weka.core.Instance;
    import weka.core.Instances;

    // Pairs an instance with its entropy so that instances sort by descending entropy.
    class InstanceSort implements Comparable<InstanceSort> {
        public Instance instance;
        public double entropy;

        public InstanceSort(Instance instance, double entropy) {
            this.instance = instance;
            this.entropy = entropy;
        }

        @Override
        public int compareTo(InstanceSort o) {
            // Reversed comparison: the highest entropy comes first.
            return Double.compare(o.entropy, this.entropy);
        }
    }

    public class ActiveLearning {

        // Load an ARFF file and use the last attribute as the class.
        public static Instances getInstances(String fileName) throws Exception {
            Instances data = new Instances(new FileReader(fileName));
            data.setClassIndex(data.numAttributes() - 1);
            return data;
        }

        // Binary entropy (in bits) of a predicted class probability.
        public static double computeEntropy(double predictValue) {
            if (1 - predictValue < 0.000000001d || predictValue < 0.000000001d) {
                return 0;
            }
            return -predictValue * (Math.log(predictValue) / Math.log(2.0d))
                    - (1 - predictValue) * (Math.log(1 - predictValue) / Math.log(2.0d));
        }

        // Train on the labeled set and print per-class results on the test set.
        public static void classify(Instances train, Instances test) throws Exception {
            NaiveBayes classifier = new NaiveBayes();
            classifier.buildClassifier(train);
            Evaluation eval = new Evaluation(train);
            eval.evaluateModel(classifier, test);
            System.out.println(eval.toClassDetailsString());
        }

        // Uncertainty sampling over unlabeled[start, end).
        public static Instances uncertaintySample(Instances labeled, Instances unlabeled,
                int start, int end) throws Exception {
            // Train a model on the labeled set first.
            NaiveBayes classifier = new NaiveBayes();
            classifier.buildClassifier(labeled);

            ArrayList<InstanceSort> l = new ArrayList<InstanceSort>();
            for (int i = start; i < end; i++) {
                // Use the predicted probability of the first class (binary class assumed).
                // classifyInstance() returns only the class index, whose entropy is always 0.
                double p = classifier.distributionForInstance(unlabeled.instance(i))[0];
                l.add(new InstanceSort(unlabeled.instance(i), computeEntropy(p)));
            }
            // Sort by entropy, highest first.
            Collections.sort(l);

            // Empty dataset that shares the header of the unlabeled set.
            Instances chosenInstances = new Instances(unlabeled, 0);
            // Select the 5 instances with the highest entropy in this group.
            for (int i = 0; i < Math.min(5, l.size()); i++) {
                chosenInstances.add(l.get(i).instance);
            }
            return chosenInstances;
        }

        // Split into labeled and unlabeled sets, then sample group by group.
        public static void sample(Instances instances, Instances test) throws Exception {
            Random rand = new Random(1023);
            instances.randomize(rand);
            instances.stratify(10);
            Instances unlabeled = instances.trainCV(10, 0);
            Instances labeled = instances.testCV(10, 0);

            // Process the unlabeled set in groups of 100; the last group may be smaller.
            for (int start = 0; start < unlabeled.numInstances(); start += 100) {
                int end = Math.min(start + 100, unlabeled.numInstances());
                Instances resultInstances = uncertaintySample(labeled, unlabeled, start, end);
                for (int j = 0; j < resultInstances.numInstances(); j++) {
                    labeled.add(resultInstances.instance(j));
                }
                classify(labeled, test);
            }
        }

        public static void main(String[] args) throws Exception {
            Instances instances = getInstances("Nasa//pc1.arff");
            // Hold out fold 0 of a stratified 10-fold split as the test set.
            Random rand = new Random(1023);
            instances.randomize(rand);
            instances.stratify(10);
            Instances train = instances.trainCV(10, 0);
            Instances test = instances.testCV(10, 0);
            sample(train, test);
        }
    }
