[Lucene 3.6.2 Getting Started series] Section 14: SolrJ index and search operations, and Chinese word segmentation integration

Source: Internet
Author: User
Tags: solr
package com.jadyer.solrj;

import java.util.ArrayList;
import java.util.List;

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrDocument;
import org.apache.solr.common.SolrDocumentList;
import org.apache.solr.common.SolrInputDocument;

import com.jadyer.model.MyMessage;

/**
 * [Lucene 3.6.2 Getting Started series] Section 14: SolrJ index and search operations, and Chinese word segmentation integration
 * @see ------------------------------------------------------------
 * @see Integrating Solr with Chinese word segmentation through schema.xml
 * @see 0) By default, most of the field types Solr defines do not support Chinese word segmentation.
 * @see    To add it, the first thing to add is a <fieldType/> under <types>.
 * @see    The steps below use mmseg4j-1.8.5 as an example of integrating Solr with a Chinese tokenizer
 * @see    (the core of it is configuring a fieldType in schema.xml).
 * @see    For an introduction to mmseg4j, see http://blog.csdn.net/jadyer/article/details/10049525
 * @see 1) Copy mmseg4j-all-1.8.5.jar to D:\develop\apache-solr-3.6.2\server\solr\WEB-INF\lib\
 * @see 2) Create the folder D:\develop\apache-solr-3.6.2\home\dic\
 * @see 3) Copy the dictionary files in mmseg4j-1.8.5.zip\data\ to D:\develop\apache-solr-3.6.2\home\dic\
 * @see 4) Add the fieldType definitions for Chinese word segmentation to schema.xml.
 * @see    Chinese word segmentation packages usually ship a readme.txt that describes the fieldType to configure when extending Solr.
 * @see    Open it and copy the three fieldTypes it describes to line 68 of schema.xml.
 * @see 5) Change the dicPath attribute of the three fieldTypes to "dic" (this is the dic folder created in step 2).
 * @see 6) Finally, test the Chinese word segmentation.
 * @see    In the Solr admin console (http://127.0.0.1:8088/solr/admin/), click [ANALYSIS] (shown in bold blue).
 * @see    Under Field Analysis, change the Field drop-down to type, and enter text_general (a fieldType defined in schema.xml) in the text box to its right.
 * @see    Then, in the Field value (Index) text box, enter the test text (in Chinese) 'I'm from Team 4, Changchun Township, Xinglong town, Bayan County, Heilongjiang Province, China'.
 * @see    Click the Analyze button below to see how text_general segments the text, then change text_general to the custom textComplex to see the difference.
 * @see    You can also tick the verbose output checkbox under Field value (Index) so that all token attributes (position, offset, type, and so on) are displayed.
 * @see ------------------------------------------------------------
 * @see This example uses 9 jar files:
 * @see apache-solr-core-3.6.2.jar
 * @see apache-solr-solrj-3.6.2.jar
 * @see commons-codec-1.6.jar
 * @see commons-io-2.1.jar
 * @see httpclient-4.1.3.jar
 * @see httpcore-4.1.4.jar
 * @see httpmime-4.1.3.jar
 * @see jcl-over-slf4j-1.6.1.jar
 * @see slf4j-api-1.6.1.jar
 * @see ------------------------------------------------------------
 * @create Aug 7, 2013 11:06:20 PM
 * @author Xuan Yu
 */
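Steps 4 and 5 in the comment above refer to three fieldType definitions without showing them. As a rough illustration only, the fragment below is how the mmseg4j readme is commonly quoted, with dicPath already set to "dic" as step 5 describes. The names textComplex, textMaxWord and textSimple and the tokenizer class name are recalled from that readme rather than taken from this article, so check them against the readme.txt shipped with your mmseg4j version before relying on them.

```xml
<!-- Assumed fieldType definitions, to be placed inside <types> in schema.xml -->
<fieldType name="textComplex" class="solr.TextField">
  <analyzer>
    <tokenizer class="com.chenlb.mmseg4j.solr.MMSegTokenizerFactory" mode="complex" dicPath="dic"/>
  </analyzer>
</fieldType>
<fieldType name="textMaxWord" class="solr.TextField">
  <analyzer>
    <tokenizer class="com.chenlb.mmseg4j.solr.MMSegTokenizerFactory" mode="max-word" dicPath="dic"/>
  </analyzer>
</fieldType>
<fieldType name="textSimple" class="solr.TextField">
  <analyzer>
    <tokenizer class="com.chenlb.mmseg4j.solr.MMSegTokenizerFactory" mode="simple" dicPath="dic"/>
  </analyzer>
</fieldType>
```

A field only uses one of these analyzers once its own type attribute in schema.xml points at the corresponding fieldType name.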

The example uses the following JavaBean class:

package com.jadyer.model;

import org.apache.solr.client.solrj.beans.Field;

public class MyMessage {
    @Field
    private String id;
    @Field("my_title")
    private String title;
    @Field("my_content")
    private String[] content;

    /*-- setters and getters of the three attributes --*/

    public MyMessage() {}

    public MyMessage(String id, String title, String[] content) {
        this.id = id;
        this.title = title;
        this.content = content;
    }
}
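The annotations in the bean decide which Solr field each attribute is written to: a bare annotation keeps the Java field name, while an annotation with a value ("my_title", "my_content") overrides it. The following is a minimal sketch of that naming rule using only the JDK. SolrField, FieldMappingSketch and toSolrFields are hypothetical names introduced here for illustration, and the "#default" sentinel mirrors what SolrJ's own annotation is understood to use; this is not SolrJ's actual binder code.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.Map;
import java.util.TreeMap;

public class FieldMappingSketch {

    /** Hypothetical stand-in for SolrJ's @Field annotation (an assumption, not the real type). */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface SolrField {
        String DEFAULT = "#default";
        String value() default DEFAULT;
    }

    /** Same shape as the article's bean, annotated with the stand-in annotation. */
    static class MyMessage {
        @SolrField
        private String id;
        @SolrField("my_title")
        private String title;
        @SolrField("my_content")
        private String[] content;

        MyMessage(String id, String title, String[] content) {
            this.id = id;
            this.title = title;
            this.content = content;
        }
    }

    /** Maps each annotated attribute to its Solr field name and current value. */
    static Map<String, Object> toSolrFields(Object bean) {
        // TreeMap keeps the keys sorted, so the result does not depend on reflection order.
        Map<String, Object> fields = new TreeMap<>();
        for (Field f : bean.getClass().getDeclaredFields()) {
            SolrField annotation = f.getAnnotation(SolrField.class);
            if (annotation == null) {
                continue; // attributes without the annotation are not indexed
            }
            f.setAccessible(true);
            // No explicit value: fall back to the Java field name.
            String name = SolrField.DEFAULT.equals(annotation.value())
                    ? f.getName()
                    : annotation.value();
            try {
                fields.put(name, f.get(bean));
            } catch (IllegalAccessException e) {
                throw new IllegalStateException("Cannot read field " + f.getName(), e);
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        MyMessage msg = new MyMessage("1", "a title", new String[] {"first", "second"});
        System.out.println(toSolrFields(msg).keySet()); // prints [id, my_content, my_title]
    }
}
```

With a mapping like this, the schema needs fields named id, my_title and my_content, and my_content has to be multi-valued because the bean attribute is an array.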

Finally, here is a small test written with JUnit 4.x.

package com.jadyer.test;

import org.junit.Test;

import com.jadyer.solrj.HelloSolrJ;

public class HelloSolrJTest {
    @Test
    public void deleteAllIndex() {
        HelloSolrJ.INSTANCE.deleteAllIndex();
    }

    @Test
    public void addIndexAndSearchFile() {
        HelloSolrJ.INSTANCE.addIndex();
        HelloSolrJ.INSTANCE.searchFile();
    }

    @Test
    public void addIndexAndSearchFileByBean() {
        HelloSolrJ.INSTANCE.addIndexByBean();
        HelloSolrJ.INSTANCE.searchFileByBean();
    }
}
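The calls through HelloSolrJ.INSTANCE suggest a singleton helper, possibly an enum with a single INSTANCE constant, and each test adds documents and searches them through that one shared object. As a sketch of that shape only, the hypothetical InMemoryIndex below keeps documents in a map instead of talking to Solr; its names and its naive substring search are assumptions for illustration and are not the article's HelloSolrJ. With SolrJ, clearing the index would typically be a deleteByQuery("*:*") followed by commit() on the server object.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory stand-in for a Solr-backed helper; it does not use SolrJ. */
public enum InMemoryIndex {
    INSTANCE; // the JVM creates exactly one instance

    // id -> title, kept in insertion order so search results are predictable
    private final Map<String, String> docs = new LinkedHashMap<>();

    /** Adds a document, replacing any existing document with the same id. */
    public synchronized void add(String id, String title) {
        docs.put(id, title);
    }

    /** Removes every document from the index. */
    public synchronized void deleteAll() {
        docs.clear();
    }

    /** Returns the ids of documents whose title contains the term (plain substring match, no analysis). */
    public synchronized List<String> search(String term) {
        List<String> ids = new ArrayList<>();
        for (Map.Entry<String, String> entry : docs.entrySet()) {
            if (entry.getValue().contains(term)) {
                ids.add(entry.getKey());
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        INSTANCE.deleteAll();
        INSTANCE.add("1", "first title");
        INSTANCE.add("2", "second title");
        System.out.println(INSTANCE.search("second")); // prints [2]
    }
}
```

Because the instance is shared, state left behind by one test is visible to the next, which is one reason to have a separate method that clears the whole index.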
