Solr similarity queries: MoreLikeThis

Source: Internet
Author: User
Tags: solr

Reference:

MoreLikeThis

MoreLikeThisHandler

There are two ways to use MoreLikeThis in Solr:
First, the MoreLikeThisComponent runs as a component inside a SearchHandler and is suitable for simple applications.
Second, the MoreLikeThisHandler runs as a separate request handler, which supports filtering and other more complex operations.

1. Fields used for similarity queries should ideally be indexed with termVectors. If a field has no term vectors, MoreLikeThis generates the terms from the stored field value instead.

 <field name="cat" ... termVectors="true" />

2. Parameter description:

    • mlt.fl: the fields to use for similarity, preferably fields indexed with termVectors.
    • mlt.mintf: minimum term frequency. Terms that occur fewer times than this in the source document are ignored. TF: the number of times a term occurs in the document.
    • mlt.mindf: minimum document frequency. Terms that occur in fewer documents than this value are not used for similarity. DF: the number of documents that contain the term.
    • mlt.minwl: minimum word length. Words shorter than this value are not used for similarity.
    • mlt.maxwl: maximum word length. Words longer than this value are not used for similarity.
    • mlt.maxqt: maximum number of query terms included in the generated similarity query.
    • mlt: set to true to enable the similarity query.
    • mlt.count: the number of similar documents to return for each result.
    • mlt.boost: whether to boost the generated query by the relevance of the interesting terms (true/false).
    • mlt.qf: query fields and their boosts for the similarity query.

      For example: mlt.qf=text^0.5 features^1.0 name^1.2

Or:

       <str name="mlt.qf">
         text^0.5 features^1.0 name^1.2 sku^1.5 id^10.0 manu^1.1 cat^1.4
         title^10.0 description^5.0 keywords^5.0 author^2.0 resourcename^1.0
       </str>
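To make the thresholds in the parameter list concrete, here is a toy Python sketch of how mlt.mintf, mlt.mindf, mlt.minwl, mlt.maxwl and mlt.maxqt could prune candidate terms. It is a simplified illustration only, not Lucene's actual term selection (which also weighs terms by relevance). The function name select_terms, its default values and the sample data are invented for the example.

```python
from collections import Counter


def select_terms(doc_tokens, doc_freq, min_tf=2, min_df=5,
                 min_wl=0, max_wl=0, max_qt=25):
    """Toy filter mirroring the mlt.* thresholds (hypothetical helper).

    doc_tokens: tokens of the source document.
    doc_freq:   term -> number of documents containing the term.
    A length limit of 0 means "no limit" in this sketch.
    """
    term_freq = Counter(doc_tokens)
    kept = []
    for term, freq in term_freq.items():
        if freq < min_tf:                    # mlt.mintf
            continue
        if doc_freq.get(term, 0) < min_df:   # mlt.mindf
            continue
        if min_wl and len(term) < min_wl:    # mlt.minwl
            continue
        if max_wl and len(term) > max_wl:    # mlt.maxwl
            continue
        kept.append((term, freq))
    # Most frequent terms first, capped at max_qt (mlt.maxqt).
    kept.sort(key=lambda item: (-item[1], item[0]))
    return [term for term, _ in kept[:max_qt]]


tokens = "solr search solr index a a a lucene search solr".split()
doc_freq = {"solr": 40, "search": 12, "index": 9, "a": 900, "lucene": 3}
selected = select_terms(tokens, doc_freq, min_tf=2, min_df=5, min_wl=2)
print(selected)
```

In this sample, "index" and "lucene" fall below the term-frequency threshold and "a" is dropped by the minimum word length, so only "solr" and "search" remain as query terms.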

 

Method 1: The MoreLikeThis component returns similar documents for every document in the response. This could be called "MoreLikeThese".

Example:

http://localhost:8983/solr/select?q=apache&mlt=true&mlt.fl=manu,cat&mlt.mindf=1&mlt.mintf=1&fl=id,score
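A request like the one above can also be assembled programmatically, which avoids escaping mistakes in values such as boosts. The following is a minimal Python sketch; build_mlt_query is a hypothetical helper and its default values are chosen for illustration. Only the parameter names come from Solr.

```python
from urllib.parse import urlencode


def build_mlt_query(q, fields, min_tf=1, min_df=1, count=5, qf=None):
    """Return a URL query string that enables MoreLikeThis for query q.

    Hypothetical helper: only the parameter names are Solr's.
    """
    params = {
        "q": q,
        "mlt": "true",               # enable the similarity query
        "mlt.fl": ",".join(fields),  # fields to take terms from
        "mlt.mintf": min_tf,
        "mlt.mindf": min_df,
        "mlt.count": count,
    }
    if qf:
        # {"text": 0.5, "name": 1.2} becomes "text^0.5 name^1.2"
        params["mlt.qf"] = " ".join(f"{f}^{b}" for f, b in qf.items())
    return urlencode(params)


query_string = build_mlt_query("apache", ["manu", "cat"],
                               qf={"text": 0.5, "name": 1.2})
print("http://localhost:8983/solr/select?" + query_string)
```

urlencode percent-encodes the comma, the caret and the space, which Solr decodes back to the original values.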

 

Method 2: Use the MoreLikeThisHandler to query for similar documents.

The MoreLikeThisHandler accepts the following parameters:

rows

The maximum number of results to return.

mlt.match.include

Whether the result set includes the original (matched) document.

mlt.match.offset

The offset into the results of q to use as the source document. By default, the MoreLikeThis query operates on the first result for q.

mlt.interestingTerms

Values: "list", "details", "none". Controls whether the terms used in the similarity query are shown. These are the highest-scoring terms. With "details", the boost of each term is shown as well.

The MoreLikeThis parameters described earlier are accepted as additional input parameters.

The MoreLikeThisHandler can also use a ContentStream to find similar documents. It extracts the interesting terms from the text that is sent.

Example:

  <requestHandler name="/mlt" class="solr.MoreLikeThisHandler">
    <lst name="defaults">
      <str name="mlt.fl">title</str>
      <str name="mlt.mintf">1</str>
      <str name="mlt.minwl">2</str>
      <int name="rows">3</int>
    </lst>
  </requestHandler>

<lst name="defaults"> sets the parameter values that are used when a query does not specify them.
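One way to picture how the defaults interact with a request: a default applies only when the request does not supply that parameter. The Python sketch below models this with a dictionary merge. It illustrates the behavior and is not Solr's implementation; effective_params and the document id doc1 are hypothetical.

```python
# Values copied from the <lst name="defaults"> block of the /mlt handler.
HANDLER_DEFAULTS = {
    "mlt.fl": "title",
    "mlt.mintf": "1",
    "mlt.minwl": "2",
    "rows": "3",
}


def effective_params(request_params, defaults=HANDLER_DEFAULTS):
    """Request parameters win; defaults fill in whatever is missing."""
    merged = dict(defaults)
    merged.update(request_params)
    return merged


params = effective_params({"q": "id:doc1", "mlt.fl": "manu,cat"})
print(params)
```

Here mlt.fl from the request replaces the default title, while rows, mlt.mintf and mlt.minwl keep their configured values.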

Simple examples:

http://localhost:8983/solr/mlt?q=id:UTF8TEST&mlt.fl=manu,cat&mlt.mindf=1&mlt.mintf=1

http://localhost:8983/solr/mlt?q=id:UTF8TEST&mlt.fl=manu,cat&mlt.mindf=1&mlt.mintf=1&mlt.match.include=false

http://localhost:8983/solr/mlt?q=id:SP2514N&mlt.fl=manu,cat&mlt.mindf=1&mlt.mintf=1&fq=inStock:true&mlt.interestingTerms=details

Using ContentStreams:

If you post text in the request body, that text is used as the basis for similarity. Alternatively, you can pass the content in the URL with something like:

http://localhost:8983/solr/mlt?stream.body=electronics%20memory&mlt.fl=manu,cat&mlt.interestingTerms=list&mlt.mintf=0
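For the first option, posting the text in the body, a request could be prepared as in the following Python sketch. The base URL assumes a local Solr with a handler registered at /mlt, and the text/plain content type is an assumption about what the server accepts. build_mlt_post is a hypothetical helper, and the request is only constructed here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_mlt_post(text, fields, base_url="http://localhost:8983/solr/mlt"):
    """Prepare a POST whose body is the text to find similar documents for."""
    params = urlencode({
        "mlt.fl": ",".join(fields),
        "mlt.interestingTerms": "list",
        "mlt.mintf": 0,
    })
    return Request(
        f"{base_url}?{params}",
        data=text.encode("utf-8"),  # the posted content stream
        headers={"Content-Type": "text/plain; charset=utf-8"},
        method="POST",
    )


request = build_mlt_post("electronics memory", ["manu", "cat"])
print(request.get_method(), request.full_url)
# With a running Solr instance, the request could be sent with
# urllib.request.urlopen(request).
```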

If remote streaming is enabled, you can find documents similar to the text on a web page:

http://localhost:8983/solr/mlt?stream.url=http://lucene.apache.org/solr/&mlt.fl=manu,cat&mlt.interestingTerms=list&mlt.mintf=0

 
