1 Adding your own tokenizer (mmseg4j)
The textComplex type below uses the com.chenlb.mmseg4j.solr.MMSegTokenizerFactory tokenizer. Its dictionary is read from the dic directory under the solr.home directory. Note that mmseg4j.jar 1.9 ships with a dictionary inside the jar; to use an external dictionary, remove the bundled one. The <filter class="solr.LowerCaseFilterFactory"/> element is optional, and you can add further filters of your own below it.
<fieldType name="textComplex" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="com.chenlb.mmseg4j.solr.MMSegTokenizerFactory" mode="complex" dicPath="dic"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
<fieldType name="textMaxWord" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="com.chenlb.mmseg4j.solr.MMSegTokenizerFactory" mode="max-word" dicPath="dic"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
<fieldType name="textSimple" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="com.chenlb.mmseg4j.solr.MMSegTokenizerFactory" mode="simple" dicPath="dic"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
2 Add your own fields
name: the field name
type: the field type
indexed: whether the field is indexed
stored: whether the field value is stored
multiValued: whether the field can hold multiple values
| Field.Index | Field.Store | Typical use |
| NOT_ANALYZED_NO_NORMS | YES | Identifiers (primary key, file name), phone number, social security number, name, date |
| ANALYZED | YES | Document title and summary |
| ANALYZED | NO | Document body |
| NO | YES | Document type, database primary key (not indexed) |
| NOT_ANALYZED | NO | Hidden keywords |
Field.Store.*
YES: the field value is stored. The original string is saved in the index so that it can be retrieved later. Suitable for primary keys, titles, and similar fields.
NO: the field value is not stored. Usually combined with Index.ANALYZED to index content, such as the body of an article, that does not need to be retrieved.

Field.Index.*
ANALYZED: tokenize and index. Suitable for titles, content, and so on.
NOT_ANALYZED: index without tokenizing. Suitable for values such as a social security number, name, or ID that are searched by exact match.
ANALYZED_NO_NORMS: tokenize and index, but do not store norms. Norms hold information such as the boost (weight) recorded when the index is built.
NOT_ANALYZED_NO_NORMS: neither tokenize nor store norms.
NO: do not index.
<field name="msg_title" type="textComplex" indexed="true" stored="true" multiValued="false"/>
<field name="msg_content" type="textComplex" indexed="true" stored="false" multiValued="false"/>
<field name="msg_text" type="textComplex" indexed="true" stored="false" multiValued="true"/>
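The Lucene options listed above have rough counterparts in schema.xml attributes. The following is a minimal sketch, assuming the stock Solr string field type (solr.StrField, which indexes a value as a single untokenized term) is defined in the schema. The field names doc_id, doc_type and hidden_keywords are hypothetical and only illustrate the mapping.

```xml
<!-- Roughly NOT_ANALYZED_NO_NORMS + Store.YES: exact-match identifier, stored, no norms -->
<field name="doc_id" type="string" indexed="true" stored="true" omitNorms="true"/>

<!-- Roughly Index.NO + Store.YES: kept for display only, not searchable -->
<field name="doc_type" type="string" indexed="false" stored="true"/>

<!-- Roughly NOT_ANALYZED + Store.NO: searchable as an exact term, never returned -->
<field name="hidden_keywords" type="string" indexed="true" stored="false" omitNorms="false"/>
```

The ANALYZED rows of the table correspond to fields like msg_title and msg_content, whose type runs a tokenizer.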
3 Merging fields
Copy msg_title and msg_content into msg_text. For this to work, the msg_text field defined above must have multiValued set to true.
<copyField source="msg_title" dest="msg_text"/>
<copyField source="msg_content" dest="msg_text"/>
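To illustrate the effect, here is a sketch of a document in Solr's XML update format. It assumes the schema also defines a uniqueKey field named id, which this post does not show, and the field values are made-up samples. Only msg_title and msg_content are sent; msg_text is filled from them at index time.

```xml
<add>
  <doc>
    <!-- "id" is an assumed uniqueKey field, not defined in this post -->
    <field name="id">1</field>
    <field name="msg_title">Sample title</field>
    <field name="msg_content">Sample message body</field>
    <!-- msg_text is not sent: the copy into msg_text adds both values above -->
  </doc>
</add>
```

Because two source values end up in msg_text for a single document, the destination has to be multi-valued.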
4 Setting the default search field
Uncomment the defaultSearchField element in schema.xml:
<defaultSearchField>text</defaultSearchField>
On its own this does not take effect. Uncommenting defaultSearchField is insufficient if your request handler in solrconfig.xml defines "df", which takes precedence. For the schema setting to apply, the <str name="df">text</str> line must be deleted from the handler defaults:
<lst name="defaults">
  <str name="echoParams">explicit</str>
  <int name="rows">10</int>
  <str name="df">text</str>
</lst>
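As an alternative to deleting the df entry, it can be pointed at the field you want searched by default. A sketch, assuming the handler in question is the stock /select search handler and that msg_text (the merged field from section 3) should be the default:

```xml
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <int name="rows">10</int>
    <!-- df set here overrides defaultSearchField in schema.xml -->
    <str name="df">msg_text</str>
  </lst>
</requestHandler>
```

Either approach works; keeping df in solrconfig.xml keeps the default next to the handler that uses it.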
5 Filters
1 Stop word filter: defines which words are ignored. See stopwords.txt (e.g. a an and is as at being but).
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
2 Synonym filter: defines which words mean the same thing. See synonyms.txt (e.g. pixima => pixma).
<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
3 Lowercase filter: converts tokens to lowercase.
<filter class="solr.LowerCaseFilterFactory"/>
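Put together, the three filters sit inside an analyzer, after the tokenizer. A sketch, assuming the mmseg4j tokenizer from section 1. The type name textComplexFiltered is hypothetical, and the filter order shown is one reasonable choice rather than a requirement:

```xml
<fieldType name="textComplexFiltered" class="solr.TextField">
  <analyzer>
    <!-- Tokenizer runs first and splits the text into tokens -->
    <tokenizer class="com.chenlb.mmseg4j.solr.MMSegTokenizerFactory" mode="complex" dicPath="dic"/>
    <!-- Filters are applied to the token stream in the order listed -->
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

Since both the stop word and synonym filters set ignoreCase="true", they match regardless of case even though lowercasing happens last.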
6 Dynamic Fields
This means: when a field name ends with _i and no explicitly declared field has that name, the dynamic field matches it and the field is treated as type int (e.g. xxoo_i, when no <field name="xxoo_i"> is defined).
<dynamicField name="*_i" type="int" indexed="true" stored="true"/>
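The same wildcard mechanism works for any field type. A sketch, assuming the textComplex type from section 1 and the stock string type are defined; the suffixes _cn and _s are illustrative choices, not something the schema requires:

```xml
<!-- Any undeclared field ending in _cn is tokenized with mmseg4j -->
<dynamicField name="*_cn" type="textComplex" indexed="true" stored="true"/>
<!-- Any undeclared field ending in _s is indexed as a single exact term -->
<dynamicField name="*_s" type="string" indexed="true" stored="true"/>
```

With these in place, a document field named title_cn would use the textComplex analysis and author_s would be treated as a plain string, without either being declared explicitly.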
3 Solr configuration file schema.xml