Most people have seen an autocomplete feature in action. Solr provides several mechanisms for building one. Today I will show how to use faceting to add autocomplete.
Index
Imagine you want to suggest product names to the users of your online store. Assume the index is structured as follows:
<field name="id" type="string" indexed="true" stored="true" multiValued="false" required="true"/>
<field name="name" type="text" indexed="true" stored="true" multiValued="false" />
<field name="description" type="text" indexed="true" stored="true" multiValued="false" />
The text type is defined as follows:
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
Configuration
Before you begin, decide whether you want to suggest single words or whole phrases (full names). The choice determines which field the suggestions are built on.
Single-word suggestions
To suggest single words, the field must be tokenized. In our case the name field is enough. Note, however, that facets are computed on the indexed terms, so if the analysis chain applies stemming, the suggestions will be stems rather than whole words. In that case it is better to define a separate field type without stemming.
Full-name suggestions
To suggest whole phrases we need a different field configuration, ideally a field that is not tokenized. A plain string field will not do, though, because string fields are not analyzed and we want the values lowercased. For this reason, we define the following field:
<field name="name_auto" type="text_auto" indexed="true" stored="true" multiValued="false" />
The text_auto type is defined as follows:
<fieldType name="text_auto" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
To leave the original name field unchanged, copy its content into the new field:
<copyField source="name" dest="name_auto" />
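The difference between the two field types can be illustrated with a small Python sketch. It is only a rough simulation, not Solr code: the two analyzer functions are simplified stand-ins (the WordDelimiterFilterFactory step is left out), and the product names are assumed sample data.

```python
from collections import Counter

def whitespace_lowercase_terms(value):
    """Rough stand-in for the "text" type: split on whitespace, then lowercase.
    The WordDelimiterFilterFactory step is deliberately left out."""
    return [token.lower() for token in value.split()]

def keyword_lowercase_terms(value):
    """Rough stand-in for "text_auto": the whole value is one lowercased term."""
    return [value.lower()]

def facet_prefix(values, analyze, prefix, mincount=1):
    """Count documents per indexed term and keep the terms that start with prefix."""
    counts = Counter()
    for value in values:
        # A document is counted once per distinct term it contains.
        for term in set(analyze(value)):
            counts[term] += 1
    return {
        term: count
        for term, count in sorted(counts.items())
        if term.startswith(prefix) and count >= mincount
    }

# Assumed sample product names (hypothetical data).
names = ["Hard Disk", "Hard Disk Samsung", "Hard Disk Seagate", "Hard Disk Toshiba"]

word_suggestions = facet_prefix(names, whitespace_lowercase_terms, "har")
phrase_suggestions = facet_prefix(names, keyword_lowercase_terms, "har")
print(word_suggestions)
print(phrase_suggestions)
```

With the tokenized stand-in, the only suggestion is the single word "hard". With the keyword stand-in, every full product name is its own suggestion.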
How to Use
To use this data, we send a simple query:
q=*:*&facet=true&facet.field=FIELD&facet.mincount=1&facet.prefix=USER_QUERY
Replace the placeholders as follows:
FIELD: the field the suggestions are built from. In this example it is name or name_auto.
USER_QUERY: the characters the user has typed so far.
You can also set rows=0 so that only the facet results are returned, without the matching documents. This is optional.
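A minimal Python sketch of how a client might assemble this query string. The function name build_suggest_query is hypothetical. The sketch lowercases the user input because facet.prefix is compared with the indexed terms as they are, and the text_auto type lowercases its values.

```python
from urllib.parse import urlencode

def build_suggest_query(field, user_query, rows=0):
    """Build the query string for a facet-based suggestion request."""
    params = [
        ("q", "*:*"),                # match all documents
        ("rows", rows),              # 0 = return facet results only
        ("facet", "true"),
        ("facet.field", field),      # FIELD placeholder
        ("facet.mincount", 1),
        # USER_QUERY placeholder, lowercased to match the indexed terms
        ("facet.prefix", user_query.lower()),
    ]
    return urlencode(params)

query_string = build_suggest_query("name_auto", "Har")
print(query_string)
```

The resulting string is URL-encoded, so it can be appended to the select handler URL of your Solr instance (the exact URL depends on your installation).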
An example of a query can be written as follows:
fl=id,name&rows=0&q=*:*&facet=true&facet.field=name_auto&facet.mincount=1&facet.prefix=har
The query returns the following response:
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">0</int>
  </lst>
  <result name="response" numFound="4" start="0"/>
  <lst name="facet_counts">
    <lst name="facet_queries"/>
    <lst name="facet_fields">
      <lst name="name_auto">
        <int name="hard disk">1</int>
        <int name="hard disk samsung">1</int>
        <int name="hard disk seagate">1</int>
        <int name="hard disk toshiba">1</int>
      </lst>
    </lst>
    <lst name="facet_dates"/>
  </lst>
</response>
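On the client side, the suggestions and their counts have to be read out of this response. A minimal Python sketch, assuming the XML response format shown above. The function name extract_suggestions is hypothetical, and the sample is a shortened copy of the response.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the response shown above.
SAMPLE_RESPONSE = """
<response>
  <result name="response" numFound="4" start="0"/>
  <lst name="facet_counts">
    <lst name="facet_queries"/>
    <lst name="facet_fields">
      <lst name="name_auto">
        <int name="hard disk">1</int>
        <int name="hard disk samsung">1</int>
        <int name="hard disk seagate">1</int>
        <int name="hard disk toshiba">1</int>
      </lst>
    </lst>
    <lst name="facet_dates"/>
  </lst>
</response>
"""

def extract_suggestions(xml_text, field):
    """Return (suggestion, document count) pairs for the given facet field."""
    root = ET.fromstring(xml_text.strip())
    path = (
        "lst[@name='facet_counts']/lst[@name='facet_fields']/"
        f"lst[@name='{field}']/int"
    )
    return [(entry.get("name"), int(entry.text)) for entry in root.findall(path)]

suggestions = extract_suggestions(SAMPLE_RESPONSE, "name_auto")
for text, count in suggestions:
    print(f"{text} ({count})")
```

Each pair holds the suggestion text and the number of documents behind it, so the count can be shown next to the suggestion.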
Extended Functions
Here are a few additional things this approach can do.
The first is showing the user extra information, such as the number of results they will get if they pick a given suggestion. The facet counts already provide this.
Another is sorting with the facet.sort parameter, depending on your requirements. Suggestions can be sorted by document count (facet.sort=count, the default; written as true in older Solr versions) or alphabetically (facet.sort=index; false in older versions).
We can also set facet.mincount so that only suggestions matching at least the given number of documents are shown.
Another useful feature is that suggestions can be restricted not only by what the user typed but also by other attributes, such as a category. For example, if we want to suggest only home appliance products, assuming the user is not interested in DVDs, we add the parameter fq=department:homeapplications (assuming such a department field exists in the index). The suggestions are then computed only from the documents in the selected department rather than from the whole index.
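The options above combine into a single request. A minimal Python sketch; the function name build_extended_query is hypothetical, the department field is the assumed one from the example above, and facet.limit (a standard Solr parameter that caps the number of returned facet values) is an addition not discussed in the text.

```python
from urllib.parse import urlencode

def build_extended_query(field, user_query, sort="count", mincount=1,
                         limit=10, filter_query=None):
    """Build a suggestion query with sorting, a minimum count and an optional filter."""
    params = [
        ("q", "*:*"),
        ("rows", 0),
        ("facet", "true"),
        ("facet.field", field),
        ("facet.prefix", user_query.lower()),
        ("facet.sort", sort),          # "count" or "index"
        ("facet.mincount", mincount),  # hide suggestions with fewer documents
        ("facet.limit", limit),        # assumption: cap the number of suggestions
    ]
    if filter_query:
        # Restrict the documents that the facets are computed from.
        params.append(("fq", filter_query))
    return urlencode(params)

extended_query = build_extended_query(
    "name_auto", "har",
    sort="index", mincount=2,
    filter_query="department:homeapplications",
)
print(extended_query)
```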
End
Like any method, this one has advantages and disadvantages. The advantages: it is easy to use, it needs no additional components, and filters can narrow the suggestions to a small subset that better matches the user's needs. Another major advantage is that each suggestion comes with its result count. The disadvantages: extra field types and fields have to be added, and because it relies on faceting, it puts a heavy load on the server.
PS: I tested this myself. Because the feature sends requests in real time (one request per typed letter), with a large index the facet computation takes a lot of memory, and on a machine with little memory (mine has 2 GB) it easily runs out of memory (OOM). Use this feature with caution.
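One way to soften the request-per-letter problem is to send fewer requests from the client. The Python sketch below is an assumption of mine, not part of the original article: it skips very short prefixes and caches repeated ones. The threshold, the cache size and the fetch_suggestions_from_solr placeholder are all hypothetical. It does not reduce the memory that faceting needs inside Solr; it only cuts the number of requests.

```python
from functools import lru_cache

MIN_PREFIX_LENGTH = 3  # hypothetical threshold: ignore shorter inputs

request_log = []  # records which prefixes would actually reach Solr

def fetch_suggestions_from_solr(prefix):
    """Placeholder for the real HTTP call; here it only records the prefix."""
    request_log.append(prefix)
    return ()

@lru_cache(maxsize=1024)
def _cached_fetch(prefix):
    # Repeated prefixes are answered from the in-process cache.
    return fetch_suggestions_from_solr(prefix)

def suggest(user_input):
    """Return suggestions, skipping short prefixes and repeated requests."""
    prefix = user_input.strip().lower()
    if len(prefix) < MIN_PREFIX_LENGTH:
        return ()
    return _cached_fetch(prefix)

# Simulate a user typing, deleting a letter, and typing again.
for typed in ["h", "ha", "har", "hard", "har"]:
    suggest(typed)
print(request_log)
```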
facet.prefix was recommended to me by someone online. I have no pressing need for this feature right now, but it is a good place to start when the need arises.
Original article: http://java.dzone.com/news/solr-and-autocomplete-part-1