schema.xml configuration and SolrJ usage

Last Update:2018-12-05 Source: Internet

Author: User

Tags solr


Earlier articles in this series described how to set up a Solr runtime environment and how to perform word segmentation on Chinese query strings. This article covers the configuration of schema.xml and the use of SolrJ.

For a search application, the most important thing is to understand its overall architecture. Solr is a full-text search server built on Lucene. It extends Lucene with a richer query language, is configurable and extensible, optimizes query performance, and provides a complete administration interface. Its execution flow, however, is essentially the same as Lucene's.

[Figure: typical components of a search application. The shaded parts are handled by Lucene.]

Let's start with schema.xml.
schema.xml plays a role similar to a database table definition: it defines the data types of the data to be indexed. It mainly consists of types, fields, and a few other default settings.
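
As a rough orientation, the overall layout of such a file might look like the skeleton below. This is only a sketch modeled on the Solr 3.x example schema: the name and version values and the defaultOperator setting are assumptions, not taken from the author's configuration.

```xml
<?xml version="1.0" encoding="UTF-8" ?>
<!-- Skeleton only: name, version and defaultOperator are assumed values -->
<schema name="example" version="1.4">
  <types>
    <!-- fieldType definitions go here -->
    <fieldType name="string" class="solr.StrField" sortMissingLast="true"/>
  </types>
  <fields>
    <!-- field and dynamicField definitions go here -->
    <field name="id" type="string" indexed="true" stored="true" required="true"/>
  </fields>
  <!-- copyField definitions and other default settings follow the fields node -->
  <uniqueKey>id</uniqueKey>
  <solrQueryParser defaultOperator="OR"/>
</schema>
```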

1) First, define a fieldType child node inside the types node, with attributes such as name, class, and positionIncrementGap. name is the name of the fieldType, and class names the Solr class that defines the behavior of this type (field type classes such as solr.TextField live in org.apache.solr.schema, while the tokenizer and filter factories used by its analyzer live in org.apache.solr.analysis). The most important part of a fieldType definition is the analyzer used to index and query data of this type, including tokenization and filtering. The second article in this series explains in detail how to add a Chinese tokenizer; see http://3961409.blog.51cto.com/3951409/833417.
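
A minimal sketch of what such a fieldType might look like is given here. The type name text_example is hypothetical, and the whitespace tokenizer and lowercase filter are stand-ins chosen only because they ship with Solr; the textComplex type used later in this article relies on a Chinese tokenizer that is not reproduced here.

```xml
<types>
  <!-- "text_example" is a hypothetical name; tokenizer and filter are placeholders -->
  <fieldType name="text_example" class="solr.TextField" positionIncrementGap="100">
    <!-- analyzer applied when documents are indexed -->
    <analyzer type="index">
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
    <!-- analyzer applied to query strings -->
    <analyzer type="query">
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>
</types>
```

Using the same tokenizer and filters at index time and query time keeps indexed terms and query terms comparable; a single analyzer element without a type attribute would apply to both.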

2) Next, define the concrete fields (similar to columns in a database table) inside the fields node. A field definition includes name, type (one of the fieldTypes defined earlier), indexed, stored, multiValued, and so on.
Example:

  <field name="id" type="string" indexed="true" stored="true" required="true" />
  <field name="ant_title" type="textComplex" indexed="true" stored="true" />
  <field name="ant_content" type="textComplex" indexed="true" stored="true" />
  <field name="all" type="textComplex" indexed="true" stored="false" multiValued="true"/>

Field definitions matter, and a few tips are worth noting. If a field may hold multiple values, set its multiValued attribute to true to avoid errors when the index is built. If you do not need to store a field's value, set its stored attribute to false.

3) It is recommended to create a copy field that gathers all full-text fields into a single field for unified searching. With this setup, querying all:Jason is equivalent to querying ant_title:Jason OR ant_content:Jason.

  <field name="all" type="textComplex" indexed="true" stored="false" multiValued="true"/>

Then complete the copy settings with copyField nodes:

  <copyField source="ant_title" dest="all"/>
  <copyField source="ant_content" dest="all"/>

4) You can also define dynamic fields. A dynamic field does not need a specific name; you only define a naming rule. For example, define a dynamicField whose name is *_i and whose type is text. Any field whose name ends with _i, such as name_i, gender_i, or school_i, is then treated as matching this definition.
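
The rule described in step 4 could be written roughly as follows. This is a sketch: it assumes a fieldType named text is already defined in the types node, and the indexed and stored values are illustrative choices rather than settings from the original configuration.

```xml
<fields>
  <!-- Any field whose name ends in _i (name_i, gender_i, school_i) uses this definition -->
  <!-- Assumes a fieldType named "text" exists; indexed/stored values are illustrative -->
  <dynamicField name="*_i" type="text" indexed="true" stored="true"/>
</fields>
```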

That covers the basics of the schema.xml configuration file. For more details, see the Solr wiki: http://wiki.apache.org/solr/SchemaXml.

Next, use SolrJ to operate on the index:

1) Create a project and add the following JAR files (see http://wiki.apache.org/solr/Solrj):

From /dist:

apache-solr-solrj-*.jar

From /dist/solrj-lib:

commons-codec-1.3.jar
commons-httpclient-3.1.jar
commons-io-1.4.jar
jcl-over-slf4j-1.5.5.jar
slf4j-api-1.5.5.jar

That is, commons-codec-x.x.jar, commons-httpclient-x.x.jar, commons-io-x.x.jar, jcl-over-slf4j-x.x.jar, and slf4j-api-x.x.jar from solr/dist/solrj-lib/, plus apache-solr-solrj-x.x.x.jar and apache-solr-core-x.x.x.jar from solr/dist/.

2) Create a test class

  package cn.edu.ccut.blackant;

  import java.io.IOException;
  import java.net.MalformedURLException;

  import org.apache.solr.client.solrj.SolrServerException;
  import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
  import org.apache.solr.common.SolrInputDocument;
  import org.junit.Test;

  public class SolrTest {

      @Test
      public void test() {
          final String url = "http://localhost:8080/solr";
          try {
              // Create a SolrServer object (CommonsHttpSolrServer)
              CommonsHttpSolrServer server = new CommonsHttpSolrServer(url);

              SolrInputDocument doc = new SolrInputDocument();
              // The id field is required; its value type depends on the
              // type declared for id in schema.xml.
              doc.addField("id", "2");
              doc.addField("ant_title", "atitle");
              doc.addField("ant_content", "Jason");

              // Send the document to Solr and make it searchable
              server.add(doc);
              server.commit();
          } catch (MalformedURLException e) {
              // The server URL is not well formed
              e.printStackTrace();
          } catch (SolrServerException e) {
              // Solr rejected the request or could not be reached
              e.printStackTrace();
          } catch (IOException e) {
              // Low-level communication failure
              e.printStackTrace();
          }
      }
  }

Add JUnit to the project: right-click the project and choose "Add Library" > "JUnit" > "JUnit 4" > "Finish".

3) Run the test class (check the console output or the Tomcat log files for run-time information).

You can use Luke to inspect the result. Choose the Luke release that matches your Solr version. Solr 3.5 is used here, so Luke must also be version 3.5:

http://code.google.com/p/luke/downloads/detail?name=lukeall-3.5.0.jar

Usage:

3.1) Change to the directory that contains the downloaded file.

3.2) Start the program from the command line: java -jar ./lukeall-3.5.0.jar

[Figure: Luke running.]

Note that you must point Luke at the Solr index directory, which here is /home/Jason/SOLR-Tomcat/SOLR/data/index.

After you specify the path, and if the test ran successfully, a new index entry is shown in the lower right corner of the Luke window. If the id value in the program stays the same, the document with id 2 is overwritten on each run, which is how you update the index.

4) Open http://127.0.0.1:8080/solr/admin/

Query *:* (match all documents). If the results contain the data added by the program, congratulations!

