Integrating the ANSJ Chinese Word Segmenter with Solr

Source: Internet
Author: User
Tags: solr

  

For ANSJ usage and download links for the related files, see: http://iamyida.iteye.com/blog/2220833

  

For configuring Solr with Tomcat, refer to http://www.cnblogs.com/luxh/p/5016894.html

1. Download the files ANSJ needs from http://iamyida.iteye.com/blog/2220833. An already downloaded copy is available below.

ANSJ files: http://pan.baidu.com/s/1kTLGp7L

2. Copy the ANSJ files into the Solr project

1) Put ansj_seg-2.0.8.jar, nlp-lang-0.2.jar and solr-analyzer-ansj-5.1.0.jar in the Solr project

Target directory: /luxh/solr/apache-tomcat-8.0.29/webapps/solr/WEB-INF/lib
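The copy in step 1) can also be scripted. The following is a minimal sketch rather than part of the original instructions: `install_ansj_jars` is a hypothetical helper, the source directory is wherever you unpacked the download, and the jar file names are assumed to be all lowercase, as they are usually distributed.

```shell
# Sketch: copy the three ANSJ jars into Solr's WEB-INF/lib directory.
# install_ansj_jars is a hypothetical helper name, not part of Solr or ANSJ.
install_ansj_jars() {
  local src="$1" lib="$2" jar
  local jars="ansj_seg-2.0.8.jar nlp-lang-0.2.jar solr-analyzer-ansj-5.1.0.jar"

  # Refuse to continue if the target directory does not exist.
  if [ ! -d "$lib" ]; then
    echo "no such directory: $lib" >&2
    return 1
  fi

  # Check that every jar is present before copying anything.
  for jar in $jars; do
    if [ ! -f "$src/$jar" ]; then
      echo "missing: $src/$jar" >&2
      return 1
    fi
  done

  # Copy the jars one by one and stop at the first failure.
  for jar in $jars; do
    cp "$src/$jar" "$lib/" || return 1
  done
}
```

It could be called as `install_ansj_jars /path/to/download /luxh/solr/apache-tomcat-8.0.29/webapps/solr/WEB-INF/lib`, where the first argument is a placeholder for your own download location.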

2) Place library.properties and the library and stopwords directories in the Solr project

Target directory:

pwd
/luxh/solr/apache-tomcat-8.0. /webapps/solr/WEB-INF/
ls
library  library.properties  log4j.properties

3) Configure library.properties

Set the paths to match your own installation.

vi library.properties

# file path
ambiguityLibrary=/luxh/solr/apache-tomcat-8.0. /webapps/solr/WEB-INF/classes/library/ambiguity.dic
# path of userLibrary, this is the default library
userLibrary=/luxh/solr/apache-tomcat-8.0. /webapps/solr/WEB-INF/classes/library
# set real name
isRealName=true
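A wrong path in library.properties tends to show up only when the tokenizer is first loaded, so it may help to check the paths beforehand. The following is a minimal sketch, assuming the two path properties are keyed `ambiguityLibrary` and `userLibrary` (the all-lowercase spellings are accepted as well); `check_library_paths` is a hypothetical helper name.

```shell
# Sketch: verify that the dictionary paths named in library.properties exist.
# check_library_paths is a hypothetical helper; the key names are assumptions.
check_library_paths() {
  local file="$1" key value status=0

  # Split each line at the first "="; the "|| [ -n ... ]" part also
  # handles a final line that has no trailing newline.
  while IFS='=' read -r key value || [ -n "$key" ]; do
    case "$key" in
      ambiguityLibrary|ambiguitylibrary|userLibrary|userlibrary)
        if [ ! -e "$value" ]; then
          echo "$key points to a missing path: $value" >&2
          status=1
        fi
        ;;
    esac
  done < "$file"

  # 0 if every path exists, 1 if at least one is missing.
  return "$status"
}
```

Comment lines and other keys such as the real-name switch are ignored, so the function can be run directly against the edited file.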

3. Create a collection under solr_home

1) Create a collection called collection1

pwd
/luxh/solr/
mkdir collection1

2) Copy the contents of /solr-5.3.1/server/solr/configsets/basic_configs into the new collection1

pwd
/luxh/solr/solr-5.3.1/server/solr/configsets/
cp -r ./* /luxh/solr/solr_home/collection1/
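The two sub-steps above (create the directory, then copy a config set into it) can be combined in one helper. This is a sketch under assumptions: `create_collection` is a hypothetical name, and the arguments are your own solr_home and the config set directory you want to start from.

```shell
# Sketch: create a collection directory and seed it from a config set.
# create_collection is a hypothetical helper, not a Solr command.
create_collection() {
  local solr_home="$1" configset="$2" name="${3:-collection1}"

  if [ ! -d "$configset" ]; then
    echo "configset not found: $configset" >&2
    return 1
  fi

  mkdir -p "$solr_home/$name" || return 1

  # "$configset/." copies the contents of the directory (hidden files
  # included) rather than the directory itself.
  cp -r "$configset/." "$solr_home/$name/"
}
```

A call such as `create_collection /luxh/solr/solr_home /luxh/solr/solr-5.3.1/server/solr/configsets/basic_configs` would then correspond to the manual commands.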

4. Edit schema.xml in collection1 and add the ANSJ tokenizer configuration

pwd
/luxh/solr/solr_home/collection1/
ls
currency.xml  lang  protwords.txt  _rest_managed.json  schema.xml  solrconfig.xml  stopwords.txt
vi schema.xml

Add the following content:

<fieldType name="text_ansj" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="org.apache.lucene.analysis.ansj.AnsjTokenizerFactory" query="false" pstemming="true" stopwordsDir="stopwords/stopwords.dic"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="org.apache.lucene.analysis.ansj.AnsjTokenizerFactory" query="true" pstemming="false"/>
  </analyzer>
</fieldType>

5. Start Tomcat

Run from the Tomcat installation directory:

bin/startup.sh

6. Add a core through http://<your-ip>:8080/solr/admin.html

Point instanceDir to the collection1 directory you just created.
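As an alternative to the admin page, Solr's CoreAdmin API accepts a CREATE action with `name` and `instanceDir` parameters. The sketch below only builds the request URL. It assumes Solr is served under /solr on port 8080 as in the steps above, and that the values contain no characters that need URL encoding. `core_create_url` is a hypothetical helper name.

```shell
# Sketch: build a CoreAdmin CREATE request URL for a new core.
# core_create_url is a hypothetical helper; values are inserted as-is,
# so they must not contain characters that require URL encoding.
core_create_url() {
  local base="$1" name="$2" instance_dir="$3"
  printf '%s/admin/cores?action=CREATE&name=%s&instanceDir=%s\n' \
    "$base" "$name" "$instance_dir"
}
```

The request itself could then be sent with `curl "$(core_create_url http://localhost:8080/solr collection1 collection1)"`, replacing localhost with your own host.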

7. Testing

1) English

2) Chinese
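The same checks can be run from the command line through Solr's field analysis handler (/analysis/field), assuming that handler is available on the core. The sketch below passes the field type and the sample text as query parameters and lets curl do the URL encoding. `analyze_text` and the `CURL` override variable are hypothetical, and the field type argument should be whatever name you gave the ANSJ field type in schema.xml.

```shell
# Sketch: ask Solr how a field type tokenizes a piece of text.
# analyze_text is a hypothetical helper. Setting CURL lets you substitute
# another command for curl (for example, for a dry run).
analyze_text() {
  local base="$1" core="$2" field_type="$3" text="$4"

  # -G sends the --data-urlencode values as URL-encoded query parameters,
  # which is what makes Chinese sample text safe to pass.
  "${CURL:-curl}" -sG "$base/$core/analysis/field" \
    --data-urlencode "analysis.fieldtype=$field_type" \
    --data-urlencode "analysis.fieldvalue=$text" \
    --data-urlencode "wt=json"
}
```

For example, `analyze_text http://localhost:8080/solr collection1 text_ansj "中文分词测试"` should return a JSON description of the tokens produced at each analysis stage, provided the core is running.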
