Add the IK Chinese word segmenter to a local Solr server for full-text search

Source: Internet
Author: User
Tags solr

The previous article covered the configuration of the <field/> element in schema.xml. The element has four attributes: name, type, indexed, and stored. This article describes how to enable Chinese word segmentation search by setting the value of the type attribute.
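As a reminder of what those four attributes look like together, here is a minimal sketch of a field declaration. The field name title and the string type are assumptions chosen for illustration and are not taken from the previous article.

```xml
<!-- Hypothetical field: "title" is an illustrative name, "string" is assumed to be a fieldType defined in your schema.xml -->
<field name="title" type="string" indexed="true" stored="true"/>
```

The rest of this article only changes what the type attribute points to.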

 

First, download the IK Chinese word segmentation project from https://code.google.com/archive/p/ik-analyzer/downloads?page=1.

  

Besides the jar package, there are three related configuration files.

  

Step 1: Add IKAnalyzer2012FF_u1.jar to the project's WEB-INF\lib directory.

  

Step 2: Add IKAnalyzer.cfg.xml and stopword.dic to the classes directory of the project.

  

You can add your own entries to the extended dictionary in the ext.dic file. The IKAnalyzer.cfg.xml configuration file contains the corresponding settings:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
  <comment>IK Analyzer Extension Configuration</comment>
  <!-- You can configure your own extended dictionary here -->
  <entry key="ext_dict">ext.dic;</entry>
  <!-- You can configure your own extended stopword dictionary here -->
  <entry key="ext_stopwords">stopword.dic;</entry>
</properties>
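If you need more than one extended dictionary, the semicolon after each file name suggests how entries are separated. The sketch below assumes that IK Analyzer accepts a semicolon-separated list in ext_dict and that both files sit in the classes directory next to IKAnalyzer.cfg.xml; my_terms.dic is a hypothetical file name used only for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
  <comment>IK Analyzer Extension Configuration</comment>
  <!-- Assumption: several dictionaries can be listed, separated by semicolons -->
  <!-- my_terms.dic is a hypothetical second dictionary file -->
  <entry key="ext_dict">ext.dic;my_terms.dic;</entry>
  <entry key="ext_stopwords">stopword.dic;</entry>
</properties>
```

Dictionary files are plain text with one term per line. Saving them as UTF-8 is the usual recommendation, but check this against the documentation shipped with your IK Analyzer download.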

 

Step 3: Configure the word segmentation field type in the schema.xml file.

  

<fieldType name="text_ik" class="solr.TextField">
  <analyzer type="index" isMaxWordLength="false" class="org.wltea.analyzer.lucene.IKAnalyzer"/>
  <analyzer type="query" isMaxWordLength="true" class="org.wltea.analyzer.lucene.IKAnalyzer"/>
</fieldType>

After the configuration is complete, start the local Solr service and run a word segmentation test on the Analysis page.

 

 
With this in place, a custom field supports word segmentation search when its type attribute is set to the name of the fieldType defined above (text_ik).
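A minimal sketch of such a field in schema.xml follows. The field name content is an assumption for illustration; only type="text_ik" comes from the configuration in Step 3.

```xml
<!-- Hypothetical field: "content" is an illustrative name -->
<!-- type refers to the text_ik fieldType, so values are segmented by IKAnalyzer at index and query time -->
<field name="content" type="text_ik" indexed="true" stored="true"/>
```

After restarting Solr and reindexing, queries against this field should be matched on the segmented terms rather than on the whole string.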

For the record, here are two other tags used in schema.xml: uniqueKey and solrQueryParser.
  • uniqueKey sets the name of the primary key field. The default value is id.

  • solrQueryParser sets the operator (AND or OR) used to combine the terms produced by segmenting the query. The default is OR, and the tag is commented out. When it is set to AND, a document matches only if the field contains all of the terms produced from the entered keywords.
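The two tags might appear in schema.xml as sketched below. This assumes a Solr version whose schema.xml still accepts the solrQueryParser element; the AND value is shown only to illustrate the stricter matching described above.

```xml
<!-- The field named here acts as the primary key for documents -->
<uniqueKey>id</uniqueKey>

<!-- Uncommented and set to AND: every segmented query term must be present in the field -->
<!-- Leaving this tag commented out keeps the default OR behavior -->
<solrQueryParser defaultOperator="AND"/>
```

With OR, a document that contains any one of the segmented terms is returned, so AND is the choice when results should match the whole phrase the user typed.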

 
