Create the IKAnalyzer.cfg.xml file in the src directory of the web project, with the following content:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
    <comment>IK Analyzer Extended Configuration</comment>
    <!-- The user can configure their own extension dictionary here -->
    <entry key="ext_dict">Use.dic.dic;googlepy.dic</entry>
    <!-- The user can configure their own extension stop word dictionary here -->
    <entry key="ext_stopwords">Dicdata/ext_stopword.dic</entry>
</properties>
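Because the file follows the standard Java properties XML format (it declares the properties.dtd DOCTYPE), one way to sanity-check it before handing it to IK Analyzer is to load it with java.util.Properties. The sketch below is only an illustration under that assumption: the class name IkConfigCheck, the method dictionaryPaths, and the inline SAMPLE string are hypothetical, and this is not how IK Analyzer itself necessarily reads the file.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

final class IkConfigCheck {

    // Inline copy of the configuration, used here instead of a real file (assumption).
    static final String SAMPLE =
            "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
          + "<!DOCTYPE properties SYSTEM \"http://java.sun.com/dtd/properties.dtd\">\n"
          + "<properties>\n"
          + "  <comment>IK Analyzer Extended Configuration</comment>\n"
          + "  <entry key=\"ext_dict\">Use.dic.dic;googlepy.dic</entry>\n"
          + "  <entry key=\"ext_stopwords\">Dicdata/ext_stopword.dic</entry>\n"
          + "</properties>\n";

    /** Reads one entry from a properties XML stream and splits its value on ';'. */
    static List<String> dictionaryPaths(InputStream xml, String key) throws IOException {
        Properties props = new Properties();
        props.loadFromXML(xml); // throws InvalidPropertiesFormatException on malformed input
        List<String> paths = new ArrayList<>();
        for (String part : props.getProperty(key, "").split(";")) {
            String path = part.trim();
            if (!path.isEmpty()) {
                paths.add(path); // skip empty segments such as a trailing ';'
            }
        }
        return paths;
    }

    public static void main(String[] args) throws IOException {
        byte[] bytes = SAMPLE.getBytes(StandardCharsets.UTF_8);
        System.out.println(dictionaryPaths(new ByteArrayInputStream(bytes), "ext_dict"));
        System.out.println(dictionaryPaths(new ByteArrayInputStream(bytes), "ext_stopwords"));
    }
}
```

If the element names or the key attribute are misspelled (for example entryKey instead of entry key), loadFromXML rejects the file, which makes this a quick check for a hand-edited configuration.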
Notes:
1. The use.dic dictionary is a Chinese text file encoded as UTF-8 without a BOM; any file extension may be used. Each word goes on its own line, and lines end with \r\n (DOS style). (If you are not sure what UTF-8 without BOM means, at least save the dictionary as UTF-8 and leave the first line of the file blank, so that a stray BOM does not become part of the first word.) For reference, see the .dic files in the org.wltea.analyzer.dic package of the tokenizer's source.
2. The use.dic file should be deployed under src. (Putting it next to IKAnalyzer.cfg.xml is recommended.)
3. Dictionary paths in IKAnalyzer.cfg.xml must not start with /; with a leading / the path is treated as an absolute path.
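The format rules in the notes above can be checked with a few lines of plain Java. This is a minimal sketch, not part of IK Analyzer: the class DictionaryFormat and its methods are hypothetical names introduced for illustration, and the sample words are arbitrary.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

final class DictionaryFormat {

    /** True if the bytes start with the UTF-8 byte order mark EF BB BF. */
    static boolean hasUtf8Bom(byte[] data) {
        return data.length >= 3
                && (data[0] & 0xFF) == 0xEF
                && (data[1] & 0xFF) == 0xBB
                && (data[2] & 0xFF) == 0xBF;
    }

    /** Encodes one word per line with \r\n line endings, as UTF-8 without a BOM. */
    static byte[] encode(List<String> words) {
        StringBuilder sb = new StringBuilder();
        for (String word : words) {
            sb.append(word).append("\r\n");
        }
        // Java's UTF-8 encoder does not write a BOM.
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    /** True if a path from the configuration has no leading '/'. */
    static boolean isRelativeConfigPath(String path) {
        return !path.startsWith("/");
    }

    public static void main(String[] args) throws IOException {
        // Two sample Chinese words written as Unicode escapes (arbitrary examples).
        List<String> words = Arrays.asList("\u4e2d\u6587", "\u5206\u8bcd");
        Path file = Files.createTempFile("use", ".dic"); // temporary stand-in for src/use.dic
        Files.write(file, encode(words));
        System.out.println("BOM present: " + hasUtf8Bom(Files.readAllBytes(file)));
        System.out.println("relative path: " + isRelativeConfigPath("Dicdata/ext_stopword.dic"));
        Files.delete(file);
    }
}
```

Running hasUtf8Bom over an existing dictionary is a simple way to find out whether an editor silently added a BOM when saving the file.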
http://www.cnblogs.com/dennisit/archive/2013/04/07/3005847.html
http://blog.sina.com.cn/s/blog_4c9d7da201013wv2.html
http://www.itzhai.com/ikanalyzer-lucene-demo-performance-test.html#read-more
Lucene dictionary extension based on IKAnalyzer configuration