Text classification is now relatively mature and there are many open-source tools for it. A few commonly used, simple ones are recommended here:

1. scikit-learn: http://scikit-learn.org/stable/index.html Called from Python. It offers various classification algorithms such as SVM, random forest, and naive Bayes, plus feature extraction such as word and n-gram features. You can build a classification task in a few lines of code.
2. WEKA: http://www.cs.waikato.ac.nz/ml/index.html Has a graphical interface, but feels a little slow.
3. libsvm: http://www.csie.ntu.edu.tw/~cjlin/libsvm/ Provides only the core SVM algorithm, with no feature extraction. You have to write that part yourself, which makes it relatively flexible.

Word segmentation tools:

1. The word segmentation tool from the Institute of Computing Technology (ICT), Chinese Academy of Sciences: http://ictclas.org/ This tool is not being maintained at the moment.
2. Harbin Institute of Technology's word segmentation tool: it builds on the ICT tool and adds training data, so its segmentation quality is comparatively better than the ICT tool's.
3. ansjsun: http://www.nlpcn.org/demo# Derived from the ICT word segmentation tool. Both its results and the way it is called are relatively simple.
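As a rough illustration of the "few lines of code" point about scikit-learn above, here is a minimal sketch that combines word and bigram count features with a linear SVM. The training sentences, the labels `sports` and `tech`, and the variable names are all invented for this example and are not from any real dataset.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical toy data, made up purely for illustration.
train_texts = [
    "the team won the football match",
    "the striker scored a late goal",
    "the coach praised the players after the game",
    "the new phone has a fast processor",
    "the laptop ships with more memory",
    "the software update fixes a security bug",
]
train_labels = ["sports", "sports", "sports", "tech", "tech", "tech"]

# Feature extraction (word unigrams and bigrams) followed by a linear SVM.
model = make_pipeline(
    CountVectorizer(ngram_range=(1, 2)),
    LinearSVC(),
)
model.fit(train_texts, train_labels)

# Classify two unseen, equally hypothetical sentences.
new_texts = ["the players scored a goal", "a fast processor and more memory"]
predicted = model.predict(new_texts)
print(list(zip(new_texts, predicted)))
```

For Chinese text, the input would presumably need to be segmented first with one of the word segmentation tools listed above and the tokens joined with spaces, since the default `CountVectorizer` tokenizer splits on word boundaries and drops single-character tokens.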