1. Prepare the training file
Segmentor_train.txt
File contents: already-segmented text, with words separated by spaces
Chinese import and Export bank and Bank of China to strengthen cooperation Xinhua News agency, Beijing, December 26 (reporter Zhou Genliang) today, the three major indices are small open, followed by the Shanghai and Shenzhen Index in the weight plate group pulled up slightly, but the gem has continued to fall 。 Afternoon weight diving led to the Shanghai and Shenzhen Index also appeared a wave of killing and falling, gem performance is very different, not a wave pulled up, today once plunged 3%. From the face of the plate, today's weight plate still dominate, banks, brokers, real estate rose sharply, but the insurance sector today performance is poor, insurance stocks rose dull. Today, Guo Xin Securities (002736), Western Securities (002673) both trading, Haitong Securities (600837), Guo Yuan Securities (000728), Citic Securities (600030) also have a decent performance. Bank shares, only has been China Citic Bank (601998) trading. Shanghai Composite Index Change
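As a rough illustration of this format, the sketch below splits one pre-segmented line on whitespace to recover its word list. The class name TrainLineCheck and the sample line are made up for illustration and are not part of the Stanford segmenter.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Stanford NLP: shows how one line of
// Segmentor_train.txt breaks down into words.
class TrainLineCheck {

    // Split a pre-segmented line on runs of whitespace, dropping empty pieces.
    static List<String> words(String line) {
        List<String> out = new ArrayList<>();
        for (String w : line.trim().split("\\s+")) {
            if (!w.isEmpty()) {
                out.add(w);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Illustrative sample line only; substitute a line from your own corpus.
        String line = "中国 进出口 银行 与 中国 银行 加强 合作";
        List<String> ws = words(line);
        System.out.println(ws.size() + " words: " + ws);
    }
}
```

A line that yields a single very long "word" usually means the spaces were lost somewhere, which is worth checking before training.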
2. Run the class edu.stanford.nlp.ie.crf.CRFClassifier
Eclipse Run Settings
Program arguments for training the model
-prop chinese_models/edu/stanford/nlp/models/segmenter/chinese/ctb.prop
-serDictionary chinese_models/edu/stanford/nlp/models/segmenter/chinese/dict-chris6.ser.gz
-sighanCorporaDict chinese_models/edu/stanford/nlp/models/segmenter/chinese/
-trainFile Segmentor_train.txt
-serializeTo chinese_models/edu/stanford/nlp/models/segmenter/chinese/newmodel.ser.gz
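Outside Eclipse, the same settings amount to an ordinary java command line. The sketch below only assembles and prints that command; it does not start training. The class name TrainCommand, the build method and the classpath value are assumptions for illustration, and the flag names use the camel-case spelling found in the Stanford segmenter documentation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: assembles the training command line from the
// settings listed above. Nothing here calls into Stanford NLP itself.
class TrainCommand {

    static List<String> build(String classpath, String modelDir,
                              String trainFile, String outModel) {
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        cmd.add("-Xmx1g");                          // heap size for the JVM
        cmd.add("-cp");
        cmd.add(classpath);                         // assumed: jar(s) containing CRFClassifier
        cmd.add("edu.stanford.nlp.ie.crf.CRFClassifier");
        cmd.add("-prop");
        cmd.add(modelDir + "/ctb.prop");
        cmd.add("-serDictionary");
        cmd.add(modelDir + "/dict-chris6.ser.gz");
        cmd.add("-sighanCorporaDict");
        cmd.add(modelDir + "/");
        cmd.add("-trainFile");
        cmd.add(trainFile);
        cmd.add("-serializeTo");
        cmd.add(modelDir + "/" + outModel);
        return cmd;
    }

    public static void main(String[] args) {
        String dir = "chinese_models/edu/stanford/nlp/models/segmenter/chinese";
        // "stanford-segmenter.jar" is a placeholder classpath, not a known file name.
        List<String> cmd = build("stanford-segmenter.jar", dir,
                                 "Segmentor_train.txt", "newmodel.ser.gz");
        System.out.println(String.join(" ", cmd));
    }
}
```

The returned list can also be handed to new ProcessBuilder(cmd).inheritIO().start() if the training should be launched from Java rather than from a shell.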
Parameter description
prop: ctb.prop, the properties file used for training. CTB stands for Chinese Treebank (the Penn Chinese Treebank)
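serDictionary: path to the serialized dictionary used by the segmenter, here dict-chris6.ser.gz
sighanCorporaDict: directory that holds the segmenter's dictionary and data files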
trainFile: your own training corpus file
serializeTo: where the trained model is saved
Training needs at least 1 GB of heap memory: add -Xmx1g to the VM arguments
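To confirm that the heap setting actually reached the JVM (in Eclipse it belongs under VM arguments, not program arguments), a small check like the following can be run with the same run configuration. The class name HeapCheck and the 900 MiB threshold are assumptions; the threshold sits a little below 1024 because the JVM may report slightly less than the -Xmx value.

```java
// Hypothetical sanity check, not part of Stanford NLP: reports the maximum
// heap the current JVM is allowed to use.
class HeapCheck {

    // Maximum heap in MiB, as reported by the running JVM.
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    // Rough test for "about 1 GB or more"; 900 is an assumed tolerance.
    static boolean looksSufficient(long maxHeapMiB) {
        return maxHeapMiB >= 900;
    }

    public static void main(String[] args) {
        long mib = maxHeapMiB();
        System.out.println("Max heap: " + mib + " MiB");
        if (!looksSufficient(mib)) {
            System.out.println("Heap looks too small; add -Xmx1g to the VM arguments.");
        }
    }
}
```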
3. The generated model file is written to the following path
chinese_models/edu/stanford/nlp/models/segmenter/chinese/newmodel.ser.gz
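A quick way to see whether training produced a usable file is to check that it exists and starts with the gzip signature. This is a minimal sketch that assumes, from the .ser.gz suffix, that the model is gzip-compressed; the class name ModelFileCheck is made up, and the check says nothing about the quality of the model.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Stanford NLP.
class ModelFileCheck {

    // gzip streams begin with the two bytes 0x1f 0x8b.
    static boolean isGzipMagic(int b0, int b1) {
        return b0 == 0x1f && b1 == 0x8b;
    }

    // True if the path is a regular file whose first two bytes are the gzip signature.
    static boolean looksLikeModel(Path p) {
        if (!Files.isRegularFile(p)) {
            return false;
        }
        try (InputStream in = Files.newInputStream(p)) {
            int b0 = in.read();   // -1 if the file is empty
            int b1 = in.read();
            return isGzipMagic(b0, b1);
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Path model = Paths.get(
            "chinese_models/edu/stanford/nlp/models/segmenter/chinese/newmodel.ser.gz");
        System.out.println(model + " looks like a gzip file: " + looksLikeModel(model));
    }
}
```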
4. Run the word segmenter test case
edu.stanford.nlp.lxf.segmentor/segdemo.java
Training Word Segmentation Model