(Note: author and source of this blog post: finallyliuyu's blog)
The experiment data is published so that readers with the same interests can download the intermediate data and quickly reproduce the experiments.
Download host: download.csdn.net
Statistical dictionary and associated table data structure
Each package contains four files: keywords.dat, testvsm.dat, trainingvsm.dat, and evaluation.txt (the .dat files need to be viewed with UltraEdit)
Select 2000 keywords using the global DF method
Select 2000 keywords using the IG (information gain) method
Select 2000 keywords using the chi-square method
Select 4000 keywords using the local DF method
Select 2000 keywords using the pointwise mutual information method
Select 1000 feature words using the local DF method
Select 1000 feature words using the global DF method
Select 1000 feature words using the pointwise mutual information method
Select 1000 feature words using the IG method
Select 1000 feature words using the chi-square method
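
The packages listed above differ only in the scoring criterion and in how many terms are kept. For readers who want a feel for what each criterion computes, here is a minimal sketch using the standard textbook definitions for a single term and a single class. It is not the code that produced the downloadable data. The function names and the a, b, c, d count arguments are hypothetical, and the information gain shown is the two-class form (a class versus everything else).

```python
import math

# Hypothetical 2x2 document-count table for one term and one class:
#   a: docs in the class that contain the term
#   b: docs outside the class that contain the term
#   c: docs in the class that lack the term
#   d: docs outside the class that lack the term
# Under this layout, local DF is a and global DF is a + b.

def chi_square(a, b, c, d):
    """Chi-square statistic of the term/class association."""
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return n * (a * d - c * b) ** 2 / denom if denom else 0.0

def pmi(a, b, c, d):
    """Pointwise mutual information log(P(t, c) / (P(t) * P(c))), natural log."""
    n = a + b + c + d
    if a == 0:
        return float("-inf")  # the term never occurs in the class
    return math.log(a * n / ((a + b) * (a + c)))

def entropy(counts):
    """Shannon entropy in bits of a list of counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum(k / total * math.log2(k / total) for k in counts if k)

def information_gain(a, b, c, d):
    """Reduction in class entropy from knowing whether the term is present."""
    n = a + b + c + d
    h_class = entropy([a + c, b + d])
    h_given_term = (a + b) / n * entropy([a, b]) + (c + d) / n * entropy([c, d])
    return h_class - h_given_term

def top_k(scores, k):
    """Return the k highest-scoring terms from a {term: score} mapping."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [term for term, _ in ranked[:k]]
```

With scores computed for every candidate term, a call such as `top_k(scores, 2000)` would give a keyword list of the size used in the first group of packages.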