[Elasticsearch] Controlling Relevance (VI): the filter, functions, and random_score parameters in function_score queries

Last Update:2014-12-28 Source: Internet

Author: User

This chapter is translated from the Controlling Relevance chapter of the official Elasticsearch guide.

Boosting Filtered Subsets

Let's return to the problem we dealt with in Ignoring TF/IDF, where we needed to score resorts based on the number of selling points (features) each one has. We want to use cached filters to influence the score, and function_score lets us do exactly that.

In the examples so far, we have applied a single function to all documents. Now we want to use filters to divide the results into subsets (one filter per selling point) and apply a different function to each subset.

The function we use is named weight. It is similar to the boost parameter accepted by queries. The difference is that weight is not normalized by Lucene into some other floating-point value; it is used as is.

The structure of the query needs to be changed to accommodate multiple functions:

GET /_search
{
  "query": {
    "function_score": {
      "filter": {
        "term": { "City": "Barcelona" }
      },
      "functions": [
        { "filter": { "term": { "features": "wifi" }},   "weight": 1 },
        { "filter": { "term": { "features": "garden" }}, "weight": 1 },
        { "filter": { "term": { "features": "pool" }},   "weight": 2 }
      ],
      "score_mode": "sum"
    }
  }
}
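When the features and their weights live in application code, a request body of this shape can be assembled programmatically. The following is a minimal sketch in Python. The helper name build_query and the feature_weights mapping are hypothetical and not part of any Elasticsearch client library. The code only builds and serializes the body; it does not contact a cluster.

```python
import json


def build_query(city, feature_weights):
    """Build a function_score body with one weighted filter per feature.

    build_query and feature_weights are illustrative names only.
    """
    functions = [
        {"filter": {"term": {"features": feature}}, "weight": weight}
        for feature, weight in feature_weights.items()
    ]
    return {
        "query": {
            "function_score": {
                "filter": {"term": {"City": city}},
                "functions": functions,
                "score_mode": "sum",
            }
        }
    }


body = build_query("Barcelona", {"wifi": 1, "garden": 1, "pool": 2})
# Serializing with the json module avoids syntax slips such as trailing commas.
print(json.dumps(body, indent=2))
```

Adding another selling point then only requires a new entry in the mapping, not a hand-edited JSON document.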

The new features that appear in the example above are explained in the following subsections:

Filter vs Query

First, we use filter instead of query in function_score. In the above example, we do not need full-text search. We just want all documents that have Barcelona in the City field, and that logic is better expressed as a filter. Every document returned by the filter has a _score of 1. function_score accepts either a query or a filter. If neither is specified, the match_all query is used by default.

functions

The functions array specifies a list of functions to apply. Each function in the array can also accept an optional filter, in which case the function is applied only to documents that match that filter. In the above example, each document that matches a filter receives a weight of 1 (or 2 for pool).
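The effect of the per-function filters can be sketched in plain Python. This only illustrates the matching logic and is not how Elasticsearch evaluates filters. The sample documents and the helper name function_results are made up for the example.

```python
# (feature required by the filter, weight) pairs mirroring the query above.
FUNCTIONS = [("wifi", 1), ("garden", 1), ("pool", 2)]


def function_results(doc_features, functions=FUNCTIONS):
    """Return the weights of the functions whose filter matches the document."""
    return [weight for feature, weight in functions if feature in doc_features]


# Hypothetical documents, represented only by their "features" values.
print(function_results({"wifi", "pool"}))  # [1, 2]
print(function_results({"garden"}))        # [1]
print(function_results({"sauna"}))         # []
```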

score_mode

Each function returns a result, and we need a way to reduce these multiple results to a single value that can then be combined with the original _score. The score_mode parameter specifies this reduce operation and accepts the following values:

multiply: The function results are multiplied together (default behavior)
sum: The function results are added together
avg: The average of all function results is used
max: The highest function result is used
min: The lowest function result is used
first: Only the result of the first function is used, that is, the first function that either has no filter or has a filter matching the document
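The modes listed above can be sketched as a small reduce function. This is an illustrative Python model, not Elasticsearch code. The name combine is hypothetical, and returning 1.0 for an empty list is an assumption of the sketch, chosen so that a document matched by no function keeps its score unchanged.

```python
from functools import reduce
from operator import mul


def combine(results, score_mode="multiply"):
    """Reduce a list of function results to a single number."""
    if not results:
        # Assumption of this sketch: with no matching function, use a
        # neutral value of 1 so the original _score is left unchanged.
        return 1.0
    if score_mode == "multiply":
        return float(reduce(mul, results, 1))
    if score_mode == "sum":
        return float(sum(results))
    if score_mode == "avg":
        return sum(results) / len(results)
    if score_mode == "max":
        return float(max(results))
    if score_mode == "min":
        return float(min(results))
    if score_mode == "first":
        return float(results[0])
    raise ValueError("unknown score_mode: %r" % score_mode)


results = [1, 1, 2]  # wifi, garden, and pool all matched
for mode in ("multiply", "sum", "avg", "max", "min", "first"):
    print(mode, combine(results, mode))
```

With these three results, sum yields 4.0 while multiply yields only 2.0, because the two weights of 1 contribute nothing to a product.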

In the example above, we want to add up the results of the functions to get the final score, so we set score_mode to sum.

Documents that do not match any of the filters retain their original _score, which is 1.

Random Scoring

You may wonder what random scoring is, or why you would want to use it. The previous example provides a good use case. With the weights used there (1, 1, and 2, summed), the final _score of every result is 1, 2, 3, or 4. There may be only a few resorts with the top score, but we can assume that many will score 2 or 3.

As a website owner, you want to give your advertisers as many opportunities as possible to show their content. With the current query, results that share the same _score are returned in the same order every time. It is better to introduce some randomness so that documents with the same score get equal exposure.

We want each user to see a different random order, but the same user should see the same order when moving to the second page, the third page, and so on. This is called consistently random.

The random_score function outputs a number between 0 and 1. It produces consistently random results when given the same seed value, which could be the user's session ID:

GET /_search
{
  "query": {
    "function_score": {
      "filter": {
        "term": { "City": "Barcelona" }
      },
      "functions": [
        { "filter": { "term": { "features": "wifi" }},   "weight": 1 },
        { "filter": { "term": { "features": "garden" }}, "weight": 1 },
        { "filter": { "term": { "features": "pool" }},   "weight": 2 },
        { "random_score": { "seed": "the user's session ID" }}
      ],
      "score_mode": "sum"
    }
  }
}

The random_score clause does not have a filter, so it applies to all documents.

Of course, if you index a new document that matches the query, the order of the results will change, whether or not you use consistently random scoring.
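The idea of consistent randomness can be illustrated outside Elasticsearch. The sketch below is not the algorithm that random_score uses internally. It assumes a hypothetical helper, consistent_random, that hashes the seed together with a document ID to derive a repeatable number in the range [0, 1). The document IDs and session strings are made up.

```python
import hashlib


def consistent_random(seed, doc_id):
    """Map (seed, doc_id) to a repeatable float in [0, 1)."""
    digest = hashlib.sha256(("%s:%s" % (seed, doc_id)).encode("utf-8")).digest()
    # Keep 53 bits of the hash so the division below is exact and below 1.
    return (int.from_bytes(digest[:8], "big") >> 11) / float(1 << 53)


def order_for(seed, doc_ids):
    """Sort documents that tie on score by their seeded random value."""
    return sorted(doc_ids, key=lambda doc_id: consistent_random(seed, doc_id))


docs = ["resort-1", "resort-2", "resort-3", "resort-4"]  # made-up IDs
# The same seed always yields the same order, so paging stays stable.
assert order_for("session-abc", docs) == order_for("session-abc", docs)
print(order_for("session-abc", docs))
# A different seed will usually produce a different order.
print(order_for("session-xyz", docs))
```

Because the value depends only on the seed and the document, adding a new document inserts it somewhere in the order without reshuffling the relative order of the others in this sketch.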
