SolrJ configuration:
- Two ways to enable highlighting:
// solrParams.setHighlight(true);
solrParams.setParam("hl", "true"); // enable highlighting
- Set the highlighted fields:
// Fields to highlight, separated by spaces or commas.
solrParams.addHighlightField("title,content"); // add a highlighted field
Or:
solrParams.setParam("hl.fl", "title,content");
- Add HTML markup before and after each highlighted term:
solrParams.setHighlightSimplePre("<font color=\"red\">");
solrParams.setHighlightSimplePost("</font>");
- Set the snippet length in characters:
solrParams.setHighlightFragsize(params.getViewNums());
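Each of the setters above ends up as an ordinary request parameter (hl, hl.fl, hl.simple.pre, hl.simple.post, hl.fragsize). As a rough illustration only, the sketch below builds the equivalent query string with the Java standard library. It does not use SolrJ, and the class name HlParamsSketch, its method names, and the fragment size of 100 are assumptions made for the example.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

class HlParamsSketch {

    // Collects the highlighting parameters discussed above, in insertion order.
    static Map<String, String> highlightParams(String fields, int fragsize) {
        Map<String, String> p = new LinkedHashMap<>();
        p.put("hl", "true");
        p.put("hl.fl", fields);
        p.put("hl.simple.pre", "<font color=\"red\">");
        p.put("hl.simple.post", "</font>");
        p.put("hl.fragsize", String.valueOf(fragsize));
        return p;
    }

    // URL-encodes the parameters into a query string of the form "hl=true&hl.fl=...".
    static String toQueryString(Map<String, String> params) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (sb.length() > 0) {
                sb.append('&');
            }
            sb.append(encode(e.getKey())).append('=').append(encode(e.getValue()));
        }
        return sb.toString();
    }

    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            throw new IllegalStateException(ex); // UTF-8 is always supported
        }
    }

    public static void main(String[] args) {
        // Hypothetical values: two fields and a fragment size of 100 characters.
        System.out.println(toQueryString(highlightParams("title,content", 100)));
    }
}
```

Seeing the raw parameters can help when comparing what SolrJ sends with the defaults configured in solrconfig.xml.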
- Processing the result set:
// The key of the outer map is the document ID; the key of the inner map is the highlighted field name.
if (highViews != null && highViews.length() > 0 && params.getIsHighlight()) {
    String[] views = highViews.split(",");
    // Highlighting section of the response
    Map<String, Map<String, List<String>>> map = response.getHighlighting();
    for (SolrDocument doc : list) {
        // Highlights for this document, looked up by its unique key (null if there are none)
        Map<String, List<String>> hlMap = (map == null) ? null : map.get(doc.getFieldValue(Config.Uniquekey));
        for (String view : views) {
            if (hlMap != null && hlMap.get(view) != null && hlMap.get(view).size() > 0) {
                // Use the first highlighted snippet for this field
                doc.setField(view + Config.Underline + Config.Highlt, hlMap.get(view).get(0));
            } else {
                // No snippet: fall back to the stored value, shortened by the project's StringUtils helper
                doc.setField(view + Config.Underline + Config.Highlt, StringUtils.substring(doc.get(view), params.getViewNums()));
            }
        }
        hlList.add(doc);
    }
    result.setResults(hlList);
} else {
    result.setResults(list);
}
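The merge step above can be tried without a running Solr instance. The following is a minimal sketch, assuming documents are modeled as plain maps rather than SolrDocument objects; the class name HighlightMergeSketch, the "_hl" suffix (standing in for Config.Underline + Config.Highlt, whose real values are not shown here), and the simple prefix truncation used as the fallback are all assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class HighlightMergeSketch {

    // Returns the first snippet for (docId, field), or null if there is none.
    static String firstSnippet(Map<String, Map<String, List<String>>> highlighting,
                               String docId, String field) {
        if (highlighting == null) {
            return null;
        }
        Map<String, List<String>> perField = highlighting.get(docId);
        if (perField == null) {
            return null;
        }
        List<String> snippets = perField.get(field);
        return (snippets == null || snippets.isEmpty()) ? null : snippets.get(0);
    }

    // Adds "<field>_hl" to a copy of each document: the first snippet if present,
    // otherwise the stored value cut to at most maxLen characters.
    static List<Map<String, Object>> merge(List<Map<String, Object>> docs,
                                           Map<String, Map<String, List<String>>> highlighting,
                                           String idField, String[] fields, int maxLen) {
        List<Map<String, Object>> out = new ArrayList<>();
        for (Map<String, Object> doc : docs) {
            Map<String, Object> copy = new HashMap<>(doc);
            String id = String.valueOf(doc.get(idField));
            for (String field : fields) {
                String snippet = firstSnippet(highlighting, id, field);
                if (snippet == null) {
                    Object stored = doc.get(field);
                    String text = (stored == null) ? "" : stored.toString();
                    snippet = text.length() > maxLen ? text.substring(0, maxLen) : text;
                }
                copy.put(field + "_hl", snippet);
            }
            out.add(copy);
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical sample data: one document with a highlight on "title" only.
        Map<String, Object> doc = new HashMap<>();
        doc.put("id", "1");
        doc.put("title", "Solr highlighting basics");
        doc.put("content", "A long body of text");

        Map<String, List<String>> perField = new HashMap<>();
        perField.put("title", Arrays.asList("<em>Solr</em> highlighting"));
        Map<String, Map<String, List<String>>> highlighting = new HashMap<>();
        highlighting.put("1", perField);

        List<Map<String, Object>> merged =
                merge(Arrays.asList(doc), highlighting, "id", new String[] {"title", "content"}, 6);
        System.out.println(merged.get(0).get("title_hl"));
        System.out.println(merged.get(0).get("content_hl"));
    }
}
```

Copying each document before adding the extra field keeps the original result list untouched, which is a design choice of this sketch rather than something the code above does.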
In addition, configure solrconfig.xml.
Solr is built on Lucene, so some of its features closely resemble Lucene's.
Solr highlighting (including automatic summarization) is controlled through the hl parameter and its related parameters; hl is short for highlight. In Lucene, both highlighting and summarization are handled by the Highlighter.
Configure the highlighting attributes in solrconfig.xml, inside <requestHandler name="search" class="solr.SearchHandler" default="true">. The solrconfig.xml file contains several requestHandler elements, but this configuration only takes effect in the one named search shown above. An example configuration follows:
XML code
<requestHandler name="search" class="solr.SearchHandler" default="true">
  <!-- Default values for query parameters can be specified, these
       will be overridden by parameters in the request
    -->
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <int name="rows">10</int>
    <str name="df">text</str>
    <str name="hl">true</str>
    <str name="hl.fl">content</str>
    <str name="f.name.hl.fragsize">50</str>
    <str name="hl.simple.pre">&lt;font color=&quot;red&quot;&gt;</str>
    <str name="hl.simple.post">&lt;/font&gt;</str>
  </lst>
</requestHandler>
Here, hl specifies whether highlighting is enabled. hl.fl specifies which fields are highlighted; multiple fields are separated by commas. f.name.hl.fragsize sets the snippet (summary) length for the field called name; the default fragsize is 100, and a value of 0 means the field is not fragmented and its whole value is returned. hl.simple.pre and hl.simple.post specify the markup placed around highlighted terms; the default is <em></em>. For details, see http://wiki.apache.org/solr/HighlightingParameters.
Once this is configured, SolrJ exposes the highlighting through the getHighlighting() method of the QueryResponse object. This method returns data of type Map<String, Map<String, List<String>>>. The key of the outer map is the document ID, the key of the inner map is the field name, and the List<String> holds the highlighted, summarized snippets.
Map<String, Map<String, List<String>>> map = response.getHighlighting();
Use this API to obtain the highlighted content. Each entry is linked to the corresponding document in the result list through its key, which is the document ID.
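To make the shape of that nested map concrete, here is a small sketch that walks a hand-built map of the same type and flattens it into one line per snippet. The sample data is invented, and the class name HighlightingShapeSketch and its flatten method are hypothetical helpers, not part of SolrJ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class HighlightingShapeSketch {

    // Flattens id -> field -> snippets into lines of the form "id/field: snippet".
    static List<String> flatten(Map<String, Map<String, List<String>>> highlighting) {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, Map<String, List<String>>> byId : highlighting.entrySet()) {
            for (Map.Entry<String, List<String>> byField : byId.getValue().entrySet()) {
                for (String snippet : byField.getValue()) {
                    lines.add(byId.getKey() + "/" + byField.getKey() + ": " + snippet);
                }
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        // Invented sample: one document ID, one field, two snippets.
        Map<String, List<String>> fields = new LinkedHashMap<>();
        fields.put("content", Arrays.asList("about <em>Solr</em>", "<em>Solr</em> is built on Lucene"));
        Map<String, Map<String, List<String>>> highlighting = new LinkedHashMap<>();
        highlighting.put("doc-1", fields);

        for (String line : flatten(highlighting)) {
            System.out.println(line);
        }
    }
}
```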
Parameter details:
- hl.fl: A list of fields separated by spaces or commas. For a field to be highlighted, it must be stored in the schema. If this parameter is omitted, the standard handler highlights the default search field (the df parameter), and the dismax handler highlights the fields in qf. You can use an asterisk to conveniently highlight all fields. If you use a wildcard, consider enabling hl.requireFieldMatch.
- hl.requireFieldMatch:
If set to true, a field is highlighted only if the query actually matched in that field. The default is false, which means a query may match in one field while terms are highlighted in a different one. If hl.fl uses a wildcard, you will probably want to enable this parameter. Even so, if you query a catch-all field (probably populated with copyField), leave it false so that the results can show which fields the query text was found in.
- hl.usePhraseHighlighter:
If the query contains a phrase (enclosed in quotation marks), text is highlighted only where the whole phrase matches.
- hl.highlightMultiTerm:
If wildcard or fuzzy queries are used, the terms they match are highlighted. The default is false, and hl.usePhraseHighlighter must be true.
- hl.snippets:
The maximum number of highlighted snippets per field. The default is 1, which rarely needs changing. Setting it to 0 for a specific field (for example, f.alltext.hl.snippets=0) disables highlighting for that field. You might use this when hl.fl=*.
- hl.fragsize:
The maximum number of characters returned in each snippet. The default is 100. If it is 0, the field is not fragmented and its entire value is returned. Avoid this for large fields.
- hl.mergeContiguous:
If set to true, overlapping snippets are merged.
- hl.maxAnalyzedChars:
The maximum number of characters in a field that are examined for highlighting. The default is 51200. To remove the limit, set it to -1.
- hl.alternateField:
If no snippet is generated (no terms match), the value of this other field is returned instead.
- hl.maxAlternateFieldLength:
If hl.alternateField is enabled, you may want to cap the length of the alternate field value in characters. The default, 0, means no limit. A reasonable value is hl.snippets * hl.fragsize, so that the returned text has a consistent size whether or not a snippet was generated.
- hl.formatter: An extension point for pluggable formatting algorithms. The default is simple, which is currently the only option, so the choice is limited. To see how the highlighting element is configured, look at org.apache.solr.highlight.HtmlFormatter.java and solrconfig.xml.
Note that any markup already present in the source text, such as existing em tags, is not escaped, which can sometimes produce false highlighting.
- hl.fragmenter:
An extension point for Solr's fragmenting algorithm. gap is the default. regex is the other option; it uses a regular expression to determine snippet boundaries. This is an advanced, rarely needed option. To see how the defaults and the fragmenters (and formatters) are configured, look at the highlighting section of solrconfig.xml.
The regex fragmenter has the following options:
- hl.regex.pattern: The regular expression pattern.
- hl.regex.slop: The factor by which hl.fragsize may vary to accommodate the regular expression. The default is 0.6, which means that with hl.fragsize=100 the fragment size ranges from 40 to 160.
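Two of the sizing rules in the list above are simple arithmetic: the suggested hl.maxAlternateFieldLength is hl.snippets * hl.fragsize, and the hl.regex.slop range is fragsize * (1 - slop) to fragsize * (1 + slop). The sketch below only reproduces that arithmetic as described here; it is not taken from Solr's source, and the class and method names are assumptions.

```java
class HighlightSizingSketch {

    // Suggested hl.maxAlternateFieldLength: hl.snippets * hl.fragsize.
    static int alternateFieldLength(int snippets, int fragsize) {
        return snippets * fragsize;
    }

    // Fragment size range implied by hl.regex.slop, returned as {min, max}.
    static int[] regexFragmentRange(int fragsize, double slop) {
        int min = (int) Math.round(fragsize * (1.0 - slop));
        int max = (int) Math.round(fragsize * (1.0 + slop));
        return new int[] {min, max};
    }

    public static void main(String[] args) {
        int[] range = regexFragmentRange(100, 0.6);
        System.out.println(range[0] + ".." + range[1]);
        // Hypothetical setting: 3 snippets of 100 characters each.
        System.out.println(alternateFieldLength(3, 100));
    }
}
```

With hl.fragsize=100 and the default slop of 0.6, the computed range matches the 40 to 160 quoted above.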