Note: this article is based on Lucene 5.5.x.
I. A brief introduction to IK Analyzer
IK Analyzer is the work of linliangyi2007. With thanks to him, here is his blog: http://linliangyi2007.iteye.com/
IK Analyzer supports two segmentation modes: fine-grained segmentation (recommended, and the IK default) and smart segmentation (in my tests, smart segmentation was less accurate than the segmentation that ships with Lucene).
II. Solving the IK Analyzer compatibility problem
The current release of IKAnalyzer only supports Lucene 4.x and Solr 4.x, so its source code has to be modified to work with Lucene 5.5.
My modified build of IK Analyzer, compatible with Lucene 5.x, is available here: http://download.csdn.net/detail/eguid_1/9576005
Note: this build targets Lucene 5.5.2 on JDK 1.7. For Lucene 6.x, use JDK 1.8. The Lucene 5.5.x API has a few minor changes compared with earlier versions.
III. Why use a Chinese analyzer
Back to the title: why use a Chinese analyzer? Lucene's built-in StandardAnalyzer does handle Chinese text, but its segmentation is not good enough: it fails to recognize even some obvious Chinese words as words.
IV. How to use the Chinese analyzer
I pulled the analyzer out into its own class and handle it separately (this has many benefits; for one, it makes it easy to plug in a different analyzer later).
The rest of the source code stays completely unchanged; only analyzerserv needs to be changed.
IK Analyzer comes with three related configuration files by default:
ext.dic (extension dictionary);
IKAnalyzer.cfg.xml (configuration of the extension dictionary and the stop word dictionary);
stopword.dic (stop word dictionary)
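As a rough sketch of what IKAnalyzer.cfg.xml looks like: it is a standard Java XML properties file, and to my knowledge the IK Analyzer 2012 release reads the keys ext_dict and ext_stopwords from it, each holding a semicolon-separated list of dictionary paths. The class IkConfigSketch and its helper methods below are made up for illustration; they use only java.util.Properties and are not part of IK Analyzer, so check the key names against the cfg file bundled with your copy.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical helper, not part of IK Analyzer: parses a cfg file in the
// standard Java XML properties format and lists the configured dictionaries.
final class IkConfigSketch {

    // Assumed content of IKAnalyzer.cfg.xml (key names as recalled from IK 2012).
    static final String SAMPLE_XML =
          "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
        + "<!DOCTYPE properties SYSTEM \"http://java.sun.com/dtd/properties.dtd\">\n"
        + "<properties>\n"
        + "  <comment>IK Analyzer extension configuration</comment>\n"
        + "  <entry key=\"ext_dict\">ext.dic;</entry>\n"
        + "  <entry key=\"ext_stopwords\">stopword.dic;</entry>\n"
        + "</properties>\n";

    // Load the XML text into a Properties object.
    static Properties load(String xml) {
        Properties props = new Properties();
        try {
            props.loadFromXML(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Split a semicolon-separated entry into individual paths, skipping blanks.
    static List<String> paths(Properties props, String key) {
        List<String> out = new ArrayList<>();
        for (String part : props.getProperty(key, "").split(";")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                out.add(trimmed);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Properties props = load(SAMPLE_XML);
        System.out.println("extension dictionaries: " + paths(props, "ext_dict"));
        System.out.println("stop word dictionaries: " + paths(props, "ext_stopwords"));
    }
}
```

To register an additional dictionary, one would append its path to the matching entry, separated by a semicolon.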
(1) When indexing:
// false: fine-grained segmentation; true: smart segmentation
Analyzer analyzer = new IKAnalyzer(false);
IndexWriterConfig indexWriterConfig = new IndexWriterConfig(analyzer);
(2) When searching:
// false: fine-grained segmentation; true: smart segmentation
Analyzer analyzer = new IKAnalyzer(false);
QueryBuilder parser = new QueryBuilder(analyzer);
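The boolean passed to the constructor switches between the two segmentation modes. As a rough intuition only, the toy sketch below contrasts exhaustive dictionary matching (every dictionary word at every position, overlaps included) with forward maximum matching (one non-overlapping pass) on the phrase 中华人民. This is not IK's actual algorithm; the three-word dictionary and the class SegmentationModesSketch are hypothetical, and the Chinese strings are written as Unicode escapes so the file compiles under any source encoding.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy illustration only; real analyzers use far larger dictionaries and
// more careful ambiguity handling.
final class SegmentationModesSketch {

    static final String ZHONGHUA = "\u4e2d\u534e"; // zhonghua, "China"
    static final String HUAREN   = "\u534e\u4eba"; // huaren, "ethnic Chinese"
    static final String RENMIN   = "\u4eba\u6c11"; // renmin, "the people"

    // Hypothetical mini dictionary.
    static final Set<String> DICT = new HashSet<>(Arrays.asList(ZHONGHUA, HUAREN, RENMIN));
    static final int MAX_WORD_LEN = 2; // length of the longest entry in DICT

    // Exhaustive matching: emit every dictionary word found at every start offset.
    static List<String> fineGrained(String text) {
        List<String> out = new ArrayList<>();
        for (int start = 0; start < text.length(); start++) {
            int limit = Math.min(text.length(), start + MAX_WORD_LEN);
            for (int end = start + 1; end <= limit; end++) {
                String candidate = text.substring(start, end);
                if (DICT.contains(candidate)) {
                    out.add(candidate);
                }
            }
        }
        return out;
    }

    // Forward maximum matching: take the longest dictionary word at each
    // position, falling back to a single character when nothing matches.
    static List<String> longestMatch(String text) {
        List<String> out = new ArrayList<>();
        int start = 0;
        while (start < text.length()) {
            int end = Math.min(text.length(), start + MAX_WORD_LEN);
            while (end > start + 1 && !DICT.contains(text.substring(start, end))) {
                end--;
            }
            out.add(text.substring(start, end));
            start = end;
        }
        return out;
    }

    public static void main(String[] args) {
        String text = ZHONGHUA + RENMIN; // the four-character phrase from the lead-in
        System.out.println("exhaustive:    " + fineGrained(text));
        System.out.println("longest match: " + longestMatch(text));
    }
}
```

In this toy, the overlapping word 华人 appears only in the exhaustive output, which hints at why a fine-grained mode tends to help recall when indexing.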