3 Solr configuration file schema.xml

Last Update:2015-06-15 Source: Internet

Author: User

Tags solr


1 Adding your own tokenizer (mmseg4j)

The definitions below declare the textComplex type (and its max-word and simple variants), which uses the com.chenlb.mmseg4j.solr.MMSegTokenizerFactory tokenizer. The dictionary is read from the dic directory under solr.home. Note that mmseg4j.jar 1.9 ships with a dictionary inside the jar; if you want to use an external dictionary, you need to remove the bundled one. The <filter class="solr.LowerCaseFilterFactory"/> line is optional, and you can add further filters of your own below it.

<fieldType name="textComplex" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
        <tokenizer class="com.chenlb.mmseg4j.solr.MMSegTokenizerFactory" mode="complex" dicPath="dic"/>
        <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
</fieldType>
<fieldType name="textMaxWord" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
        <tokenizer class="com.chenlb.mmseg4j.solr.MMSegTokenizerFactory" mode="max-word" dicPath="dic"/>
        <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
</fieldType>
<fieldType name="textSimple" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
        <tokenizer class="com.chenlb.mmseg4j.solr.MMSegTokenizerFactory" mode="simple" dicPath="dic"/>
        <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
</fieldType>

2 Add your own fields

name: the field name
type: the field type
indexed: whether the field is indexed
stored: whether the field value is stored
multiValued: whether the field can hold multiple values

Index	Store	Typical use
NOT_ANALYZED_NO_NORMS	YES	Identifier (primary key, file name), phone number, social security number, name, date
ANALYZED	YES	Document title and summary
ANALYZED	NO	Document body
NO	YES	Document type, database primary key (not indexed)
NOT_ANALYZED	NO	Hidden keywords

Field.Store.*
YES: the field value is stored. The original string is kept in the index so that it can be retrieved later. Suitable for the primary key, the title, and similar fields.
NO: the field value is not stored. Usually combined with Index.ANALYZED to index content, such as the body of an article, that does not need to be retrieved.

Field.Index.*
ANALYZED: tokenize and index. Suitable for the title, content, and so on.
NOT_ANALYZED: index without tokenizing. Suitable for values such as a social security number, name, or ID that are searched by exact match.
ANALYZED_NO_NORMS: tokenize but do not store norms. Norms hold the index-time boost and related information.
NOT_ANALYZED_NO_NORMS: neither tokenize nor store norms.
NO: do not index.

<field name="msg_title" type="textComplex" indexed="true" stored="true" multiValued="false"/>
<field name="msg_content" type="textComplex" indexed="true" stored="false" multiValued="false"/>
<field name="msg_text" type="textComplex" indexed="true" stored="false" multiValued="true"/>
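
The Store/Index combinations in the table above can be approximated with schema.xml attributes. This is a minimal sketch, assuming your schema defines the stock string type (solr.StrField, which is not tokenized); the field names msg_id, msg_keywords and msg_type are hypothetical and only serve as illustrations.

```xml
<!-- Indexed without tokenizing, norms omitted, stored: an exact-match identifier -->
<field name="msg_id" type="string" indexed="true" stored="true" omitNorms="true"/>
<!-- Indexed without tokenizing, not stored: hidden keywords -->
<field name="msg_keywords" type="string" indexed="true" stored="false" multiValued="true"/>
<!-- Not indexed, stored: a value kept only so it can be returned with results -->
<field name="msg_type" type="string" indexed="false" stored="true"/>
```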

3 Merging fields

Copy msg_title and msg_content into msg_text. For this to work, the multiValued attribute of the msg_text field defined above must be true.

<copyField source="msg_title" dest="msg_text"/>
<copyField source="msg_content" dest="msg_text"/>
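
To illustrate the effect of the copy rules, here is a sketch of a Solr XML update message. It assumes the schema requires no other fields (for example a uniqueKey); the field values are made up.

```xml
<add>
  <doc>
    <field name="msg_title">Solr schema notes</field>
    <field name="msg_content">How to configure fields in schema.xml</field>
    <!-- msg_text is not sent by the client; the copy rules fill it
         from msg_title and msg_content at index time -->
  </doc>
</add>
```

After a commit, a query against msg_text should be able to match terms that came from either source field.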

4 Setting the default search field

Uncomment the defaultSearchField element in schema.xml. On its own this may not take effect, because un-commenting defaultSearchField will be insufficient if your request handler in solrconfig.xml defines "df", which takes precedence. That would need to be removed.

<defaultSearchField>text</defaultSearchField>

The df setting in solrconfig.xml has the higher priority, so for defaultSearchField to take effect you must delete the <str name="df">text</str> line from the handler defaults:

<lst name="defaults">
    <str name="echoParams">explicit</str>
    <int name="rows">10</int>
    <str name="df">text</str>
</lst>
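
Instead of deleting df, another option is to point it at the field you want as the default. A minimal sketch, assuming the handler is registered as /select and that msg_text from section 2 is the intended default field:

```xml
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <int name="rows">10</int>
    <!-- df takes precedence over defaultSearchField in schema.xml -->
    <str name="df">msg_text</str>
  </lst>
</requestHandler>
```

A client can still override this for a single request by passing its own df parameter.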

5 Filters

1 Stop word filter: defines which words are ignored. See stopwords.txt (e.g. a an and are as at be but)
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>

2 Synonym filter: defines which words share a meaning. See synonyms.txt (e.g. pixima => pixma)
<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>

3 Lowercase filter
<filter class="solr.LowerCaseFilterFactory"/>
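
Filters only take effect inside an analyzer, after the tokenizer. The sketch below chains all three onto the mmseg4j tokenizer from section 1; the type name textComplexFiltered is hypothetical, and stopwords.txt and synonyms.txt are assumed to sit in the core's conf directory.

```xml
<fieldType name="textComplexFiltered" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="com.chenlb.mmseg4j.solr.MMSegTokenizerFactory" mode="complex" dicPath="dic"/>
    <!-- Filters run in the order they are listed -->
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```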

6 Dynamic Fields

This means: when a field name ends with _i and no explicitly named field matches it, the dynamic field is used and the value is treated as type int (e.g. xxoo_i, when no <field name="xxoo_i"> is defined).

<dynamicField name="*_i" type="int" indexed="true" stored="true"/>
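
As an illustration, here is a sketch of an update message that relies on the *_i rule. The field name view_count_i and the values are hypothetical, and the schema is assumed to require no other fields.

```xml
<add>
  <doc>
    <field name="msg_title">Dynamic field demo</field>
    <!-- No <field name="view_count_i"> is declared, so the *_i
         dynamic field applies and the value is handled as an int -->
    <field name="view_count_i">42</field>
  </doc>
</add>
```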


