A command-line method for using Le Zhang's C++ maximum entropy toolkit


Recently we completed a project that applied a maximum entropy model to binary sentiment classification of movie reviews.

The maximum entropy implementation used is Professor Le Zhang's maximum entropy toolkit: http://homepages.inf.ed.ac.uk/lzhang10/maxent_toolkit.html

The data analyzed is Bo Pang's movie-review dataset: http://www.cs.cornell.edu/people/pabo/movie-review-data/

Because the movie-review files are not stored in the format the maximum entropy toolkit requires, the data has to be reorganized first.


The toolkit requires the first word on each line to be the class label, so the movie-review TXT files need to be converted to that layout.
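As a minimal sketch of that layout, the function below builds one line from a label and the tokens of a review. The name makeEventLine is hypothetical and used only for illustration; it is not part of the toolkit.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: builds one line in the required layout,
// i.e. the class label first, then the tokens separated by single spaces.
std::string makeEventLine(const std::string& label,
                          const std::vector<std::string>& tokens) {
    assert(!label.empty());  // every line must start with a class label
    std::string line = label;
    for (const std::string& tok : tokens) {
        line += ' ';
        line += tok;
    }
    return line;
}
```

Called with the label "pos" and the tokens "a", "great", "film", it returns the line "pos a great film".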

The data can be converted with the following C++ program:

/************************************************
 * Creator: Hangyuan li
 * Creation time: 2014.12.14
 * Purpose: format conversion for Le Zhang's maximum entropy classifier
 ************************************************/
#include <io.h>      // _findfirst, _findnext, _findclose (Windows only)
#include <cstdint>   // intptr_t
#include <cstdlib>   // system
#include <fstream>
#include <iostream>
#include <set>
#include <string>
using namespace std;

// Collect the names of all files matching the pattern dir
// (for example "pos\\*.txt") into the set a.
void ReadFile(set<string>& a, const char* dir) {
    _finddata_t fileDir;
    // intptr_t, not long: a long handle is truncated on 64-bit Windows
    intptr_t lfDir = _findfirst(dir, &fileDir);
    if (lfDir == -1) {
        cout << "No file is found" << endl;
        return;
    }
    do {
        a.insert(fileDir.name);
    } while (_findnext(lfDir, &fileDir) == 0);
    _findclose(lfDir);
}

// Write one line per file to outfile: the class label first,
// then every whitespace-separated word of the file.
void WriteEvents(ofstream& outfile, const set<string>& files,
                 const string& folder, const string& label) {
    for (set<string>::const_iterator it = files.begin(); it != files.end(); ++it) {
        ifstream infile((folder + "\\" + *it).c_str());
        if (!infile) {
            cout << "Can not open file " << *it << endl;
            continue;
        }
        outfile << label;
        string word;
        // Reading in the loop condition stops at end of file
        // without dropping the last word.
        while (infile >> word)
            outfile << " " << word;
        outfile << endl;
    }
}

int main() {
    set<string> posfile;  // names of all txt files in the pos folder
    set<string> negfile;  // names of all txt files in the neg folder
    ReadFile(posfile, "pos\\*.txt");
    ReadFile(negfile, "neg\\*.txt");

    ofstream outfile("result.txt");  // converted output file
    WriteEvents(outfile, posfile, "pos", "pos");
    WriteEvents(outfile, negfile, "neg", "neg");
    outfile.close();

    system("pause");  // keep the console window open (Windows)
    return 0;
}

We use 90% of the data for training and the remaining 10% for testing. Converting the training data produces a result.txt file in the required format; the model is trained on this file and then evaluated on the test data.
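The text does not say how the 90/10 split was made. One possible approach, sketched here as an assumption, is to hold out every tenth converted line for testing. Because the conversion program writes all positive reviews before all negative ones, an interleaved hold-out keeps both classes in both sets. The names Split and splitEvents are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical container for the two parts of the split.
struct Split {
    std::vector<std::string> train;
    std::vector<std::string> test;
};

// Sends every holdOutEvery-th line to the test set and the rest to the
// training set. With holdOutEvery = 10 this gives roughly a 90/10 split.
Split splitEvents(const std::vector<std::string>& events,
                  std::size_t holdOutEvery) {
    assert(holdOutEvery >= 2);  // a value of 1 would leave no training data
    Split s;
    for (std::size_t i = 0; i < events.size(); ++i) {
        if (i % holdOutEvery == holdOutEvery - 1)
            s.test.push_back(events[i]);
        else
            s.train.push_back(events[i]);
    }
    return s;
}
```

A random shuffle before splitting is another common choice; the interleaved version is used here only because it is deterministic and easy to check.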

1) Training command:

maxent -m modelname -i <iterations> train.txt

Here maxent is the executable; -m gives the name of the model file that training writes, in this case modelname; -i sets the number of training iterations; and train.txt is the input feature file. Run in this form, the command does not display training information.

2) Testing:

The test run can output the predicted result for each event, or detailed probability information for each prediction.
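To turn the per-event predictions into a single score, the predicted labels can be compared with the labels at the start of each test line. The sketch below assumes the prediction output has one line per test event with the predicted label as its first word; the source does not show the output format, so treat that as an assumption. The names firstToken and accuracy are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Returns the first whitespace-separated word of a line (the class label),
// or an empty string if the line is blank.
std::string firstToken(const std::string& line) {
    std::istringstream in(line);
    std::string tok;
    in >> tok;
    return tok;
}

// Fraction of test events whose predicted label matches the true label.
// Assumption: predictedLines[i] starts with the prediction for testLines[i].
double accuracy(const std::vector<std::string>& testLines,
                const std::vector<std::string>& predictedLines) {
    assert(testLines.size() == predictedLines.size());
    if (testLines.empty()) return 0.0;
    std::size_t correct = 0;
    for (std::size_t i = 0; i < testLines.size(); ++i) {
        if (firstToken(testLines[i]) == firstToken(predictedLines[i]))
            ++correct;
    }
    return static_cast<double>(correct) / testLines.size();
}
```

For four test events of which three are predicted correctly, the function returns 0.75.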






