Configuration and use of IKAnalyzer

Source: Internet
Author: User

 

I. Configuration

Configuring the IKAnalyzer Chinese tokenizer is very simple.

First, download the IKAnalyzer Chinese tokenizer. Pay attention to the version: it does not seem to be backward compatible, and with a mismatched version the Logging page of the Solr admin interface will report an error.

I am providing IK Analyzer 2012FF_hf1 (which includes the source code and the Chinese user manual). My Solr is 4.7, so the corresponding Lucene is also 4.7. Download links:

http://code.google.com/p/ik-analyzer/downloads/detail?name=ik31620analyzer%202012ff_hf1.zip&can=2&q= (hosted by Google, but the download no longer seems to be available, and Google is blocked in China, which is a real pain.)

http://down.51cto.com/data/894638 (on 51CTO; downloading requires an account with points. I also got my copy from someone else.)

Baidu online storage, shared by me; the link may stop working after a while. Link: http://pan.baidu.com/s/1bngYiKZ password: g7dp

The downloaded folder contains at least IKAnalyzer.cfg.xml, IKAnalyzer2012FF_u1.jar, and stopword.dic. These three files are all you need to configure.

Copy IKAnalyzer2012FF_u1.jar into the lib directory of the Solr webapp under the Tomcat installation (mine is C:\apache-tomcat-8.0.8\webapps\solr\WEB-INF\lib\), and copy IKAnalyzer.cfg.xml and stopword.dic into C:\apache-tomcat-8.0.8\webapps\solr\WEB-INF\classes\. If the classes directory does not exist, create it yourself.

The IKAnalyzer Chinese tokenizer is now configured. Simple, isn't it? Just make sure you do not copy the files into the wrong directories.
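
The copy step above can also be scripted. The following is only a minimal sketch in Python, not part of the IKAnalyzer distribution: the function name install_ik is made up for illustration, and the download and webapp paths are whatever you pass in (for example the Tomcat path mentioned above). It copies the jar into WEB-INF/lib and the two config files into WEB-INF/classes, creating the directories if they are missing.

```python
import shutil
from pathlib import Path


def install_ik(download_dir, webapp_dir):
    """Copy the IK jar into WEB-INF/lib and its config files into WEB-INF/classes.

    install_ik is a hypothetical helper; both arguments are directories
    chosen by the caller. Returns the list of destination paths.
    """
    web_inf = Path(webapp_dir) / "WEB-INF"
    lib_dir = web_inf / "lib"
    classes_dir = web_inf / "classes"
    # The classes directory may not exist in a fresh Solr webapp, so create it.
    lib_dir.mkdir(parents=True, exist_ok=True)
    classes_dir.mkdir(parents=True, exist_ok=True)

    # File names are the three files listed in the article.
    targets = {
        "IKAnalyzer2012FF_u1.jar": lib_dir,
        "IKAnalyzer.cfg.xml": classes_dir,
        "stopword.dic": classes_dir,
    }
    copied = []
    for name, dest in targets.items():
        source = Path(download_dir) / name
        if not source.is_file():
            raise FileNotFoundError(f"missing {name} in {download_dir}")
        copied.append(Path(shutil.copy2(source, dest / name)))
    return copied


if __name__ == "__main__":
    # Example paths only; adjust both to your own machine.
    for path in install_ik(r"C:\downloads\IKAnalyzer",
                           r"C:\apache-tomcat-8.0.8\webapps\solr"):
        print("copied", path)
```

Raising an error on a missing file is a deliberate choice here: a silently skipped stopword.dic or IKAnalyzer.cfg.xml is exactly the kind of "wrong directory" mistake that only shows up later in the Logging page.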

II. Use

In schema.xml, add the following inside <types></types>:

<!-- IKAnalyzer tokenizer -->
<fieldType name="text_IKFENCHI" class="solr.TextField">
  <analyzer type="index" isMaxWordLength="false" class="org.wltea.analyzer.lucene.IKAnalyzer"/>
  <analyzer type="query" isMaxWordLength="true" class="org.wltea.analyzer.lucene.IKAnalyzer"/>
</fieldType>

 

Add a node under the <fields> node:

<field name="PRODUCTNAME" type="text_IKFENCHI" indexed="true" stored="true"/>

Note that the value of the type attribute on the field node is the name of the fieldType configured above. If defining the fieldType is like defining a class, then defining the field is like declaring a variable of that class.

With this in place, your PRODUCTNAME field uses the IKAnalyzer tokenizer for word segmentation.
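
A common mistake at this step is a field whose type does not match any fieldType name. The following is a minimal sketch in Python of how that could be checked before restarting Tomcat. The helper name undefined_field_types and the stripped-down SCHEMA_SNIPPET are made up for illustration; a real schema.xml contains many more entries, and this check is not something Solr itself provides.

```python
import xml.etree.ElementTree as ET

# A reduced stand-in for schema.xml, holding only the entries from this article.
SCHEMA_SNIPPET = """
<schema name="example" version="1.5">
  <types>
    <fieldType name="text_IKFENCHI" class="solr.TextField">
      <analyzer type="index" isMaxWordLength="false" class="org.wltea.analyzer.lucene.IKAnalyzer"/>
      <analyzer type="query" isMaxWordLength="true" class="org.wltea.analyzer.lucene.IKAnalyzer"/>
    </fieldType>
  </types>
  <fields>
    <field name="PRODUCTNAME" type="text_IKFENCHI" indexed="true" stored="true"/>
  </fields>
</schema>
"""


def undefined_field_types(schema_xml):
    """Return {field name: type} for every field whose type has no fieldType."""
    root = ET.fromstring(schema_xml.strip())
    # Collect the names of all declared field types.
    defined = {ft.get("name") for ft in root.iter("fieldType")}
    # Keep only the fields that point at a type that was never declared.
    return {
        field.get("name"): field.get("type")
        for field in root.iter("field")
        if field.get("type") not in defined
    }


if __name__ == "__main__":
    problems = undefined_field_types(SCHEMA_SNIPPET)
    print(problems if problems else "all field types are defined")
```

To run it against a real file, read schema.xml from the core's conf directory and pass its text to the function instead of SCHEMA_SNIPPET.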

Now let's test the tokenizer in the Solr admin interface. Restart the Tomcat service and open http://localhost:8080/solr/#/

 

Here we find our core. Multiple cores can be configured, and doing so is quite simple. What is a core? Take an e-commerce search system in which users can search for either products or shops: you could configure two cores, one for products and one for shops, and define different fields in each core's configuration files. This is my understanding for now; if it is wrong, please correct me. It does not matter if this is unclear at this point, since I plan to cover it in a later series of posts.

The default core is collection1. If the tokenizer is misconfigured, nothing may appear here; in that case, check Logging for error logs. Otherwise, click Analysis and find the PRODUCTNAME field you just configured in the drop-down list.

 

If you look closely, the drop-down list is grouped into Fields and Types. PRODUCTNAME appears under Fields and text_IKFENCHI under Types, matching what was configured in schema.xml. Select either one to run a segmentation test: enter some text in the Field Value box and click Analyse Values to display the segmentation result.
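
The same test can be run without the admin page by calling Solr's field analysis request handler over HTTP. The following is a minimal sketch in Python that only builds the request URL. It assumes the handler is registered at /analysis/field in solrconfig.xml (as in the Solr 4.x example configuration) and that the core is named collection1; the helper name analysis_url is made up for illustration.

```python
from urllib.parse import urlencode


def analysis_url(field_value, field_type="text_IKFENCHI",
                 base="http://localhost:8080/solr", core="collection1"):
    """Build a field analysis URL for the given text and field type.

    Assumes a /analysis/field handler exists on the core; base and core
    are example values and may differ on your installation.
    """
    params = {
        "analysis.fieldtype": field_type,    # analyze by type, like the Types group
        "analysis.fieldvalue": field_value,  # the text from the Field Value box
        "wt": "json",                        # ask for a JSON response
    }
    # urlencode percent-encodes the Chinese text as UTF-8.
    return f"{base}/{core}/analysis/field?{urlencode(params)}"


if __name__ == "__main__":
    print(analysis_url("中文分词测试"))
```

With Tomcat running, opening the printed URL in a browser (or fetching it with urllib.request.urlopen) should return the token stream for the text. To analyze by field rather than by type, analysis.fieldname can be sent in place of analysis.fieldtype.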

 

As for what the segmentation output means, you can study it in more depth yourself. Once I understand it better, I will write about it in a later post.
