Sorting and relevance

This article is a translation of the first section of the "Sorting and Relevance" chapter of the official Elasticsearch guide.

Original address: http://www.elastic.co/guide/en/elasticsearch/guide/current/_sorting.html

Sort

By default, Elasticsearch sorts results by relevance, with the most relevant documents first. Later in this chapter, we explain what we mean by relevance and how it is calculated, but let's start by looking at the sort parameter and how to use it.

To sort by relevance, we need to represent relevance as a value. In Elasticsearch, the relevance score is the floating-point number returned in the search results as _score, so the default sort order is _score descending.

Sometimes, though, there is no meaningful relevance score. For example, the following query just returns all tweets whose user_id field has the value 1:

GET /_search
{
    "query": {
        "filtered": {
            "filter": {
                "term": {
                    "user_id": 1
                }
            }
        }
    }
}
    
Filters have no bearing on _score, and the implied match_all query (used when no query is specified) simply sets the _score of every document to 1. In other words, all documents are considered equally relevant.

Sorting by field values

In this case, it probably makes sense to sort tweets by time, with the most recent tweets first. We can do this with the sort parameter:

GET /_search
{
    "query": {
        "filtered": {
            "filter": { "term": { "user_id": 1 }}
        }
    },
    "sort": { "date": { "order": "desc" }}
}

Note two things in the results:

"hits": {
    "total":           6,
    "max_score":       null,
    "hits": [ {
        "_index":      "us",
        "_type":       "tweet",
        "_id":         "+",
        "_score":      null,
        "_source":     {
             "date":    "2014-09-24",
             ...
        },
        "sort":        [ 1411516800000 ]
    },
    ...
    ]
}

_score is not calculated, because it is not used for sorting.

The value of the date field, expressed as milliseconds since the epoch, is returned in the sort values.

First, each result contains a new element, sort, which holds the value or values used for sorting. In this example, we sorted on date, which internally is indexed as milliseconds since the epoch. The long number 1411516800000 is equivalent to the date string 2014-09-24 00:00:00 UTC.
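
The correspondence between the sort value and the date string can be checked with a few lines of Python. This is only an illustrative sketch using the standard library; the helper names sort_value_to_utc and utc_date_to_sort_value are made up for this example and are not part of Elasticsearch.

```python
from datetime import datetime, timezone


def sort_value_to_utc(millis: int) -> datetime:
    """Convert an epoch-milliseconds sort value to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(millis / 1000, tz=timezone.utc)


def utc_date_to_sort_value(year: int, month: int, day: int) -> int:
    """Return milliseconds since the epoch for midnight UTC on the given date."""
    midnight = datetime(year, month, day, tzinfo=timezone.utc)
    return int(midnight.timestamp() * 1000)


print(sort_value_to_utc(1411516800000).isoformat())  # 2014-09-24T00:00:00+00:00
print(utc_date_to_sort_value(2014, 9, 24))           # 1411516800000
```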

Second, _score and max_score are both null. Calculating _score can be quite expensive, and usually its only purpose is sorting. Since we are not sorting by relevance, it makes no sense to keep track of _score. If you want _score to be calculated regardless, set the track_scores parameter to true.
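
As a rough sketch of where track_scores goes, the following Python builds the request body from the example above as a plain dictionary and adds track_scores at the top level, next to query and sort. The function name user_tweets_by_date is hypothetical, and the body is only printed, not sent to a cluster.

```python
import json


def user_tweets_by_date(user_id: int, track_scores: bool = False) -> dict:
    """Build the 'tweets of one user, newest first' search body as a dict."""
    body = {
        "query": {"filtered": {"filter": {"term": {"user_id": user_id}}}},
        "sort": {"date": {"order": "desc"}},
    }
    if track_scores:
        # Ask for _score to be computed even though it is not used for sorting.
        body["track_scores"] = True
    return body


print(json.dumps(user_tweets_by_date(1, track_scores=True), indent=4))
```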

TIP: As a shortcut, you can specify just the name of the field to sort on:

"sort": "number_of_children"

Fields are sorted in ascending order by default, and _score in descending order.
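
To make the default orders concrete, here is a small Python sketch that expands shorthand sort clauses into their explicit form. The helper expand_sort is hypothetical and only mimics the defaults described above; it is not how Elasticsearch itself parses the sort parameter.

```python
def expand_sort(sort):
    """Expand shorthand sort clauses into explicit {"field": {"order": ...}} form."""
    clauses = sort if isinstance(sort, list) else [sort]
    expanded = []
    for clause in clauses:
        if isinstance(clause, str):
            # Bare field names default to ascending; _score defaults to descending.
            order = "desc" if clause == "_score" else "asc"
            expanded.append({clause: {"order": order}})
        else:
            expanded.append(clause)  # already explicit; left untouched
    return expanded


print(expand_sort("number_of_children"))
# [{'number_of_children': {'order': 'asc'}}]
print(expand_sort(["date", "_score"]))
# [{'date': {'order': 'asc'}}, {'_score': {'order': 'desc'}}]
```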

Multilevel sorting

Perhaps we want to combine the _score from a query with the date, showing all matching results sorted first by date and then by relevance:

GET /_search
{
    "query": {
        "filtered": {
            "query":  { "match": { "tweet": "manage text search" }},
            "filter": { "term": { "user_id": 2 }}
        }
    },
    "sort": [
        { "date":   { "order": "desc" }},
        { "_score": { "order": "desc" }}
    ]
}

Order is important. Results are sorted by the first criterion first. Only results whose first sort value is identical are then sorted by the second criterion, and so on.

Multilevel sorting does not have to involve _score. You could sort on several different fields, on geo-distance, or on a custom value calculated in a script.
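
The "first criterion, then second criterion" behavior can be imitated in memory with Python's stable sort. This is a minimal sketch of the ordering semantics only, with made-up sample hits; the function multilevel_sort is hypothetical and says nothing about how Elasticsearch implements sorting internally.

```python
def multilevel_sort(docs, criteria):
    """Sort docs by a list of (field, order) pairs, most significant first."""
    ordered = list(docs)
    # Apply the least significant criterion first; because list.sort is stable,
    # ties on a later (more significant) pass keep the earlier ordering.
    for field, order in reversed(criteria):
        ordered.sort(key=lambda doc: doc[field], reverse=(order == "desc"))
    return ordered


# Made-up sample hits; ISO date strings compare correctly as plain strings.
hits = [
    {"_id": "a", "date": "2014-09-24", "_score": 0.2},
    {"_id": "b", "date": "2014-09-22", "_score": 0.9},
    {"_id": "c", "date": "2014-09-24", "_score": 0.7},
    {"_id": "d", "date": "2014-09-23", "_score": 0.5},
]

result = multilevel_sort(hits, [("date", "desc"), ("_score", "desc")])
print([hit["_id"] for hit in result])  # ['c', 'a', 'd', 'b']
```

Hits a and c share the same date, so only for them does the second criterion (_score) decide the order.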

NOTE: Query-string search also supports custom sorting, using the sort parameter in the query string:

GET /_search?sort=date:desc&sort=_score&q=search
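
If such a request path is assembled in code, the repeated sort parameter can be produced with the Python standard library. This is a sketch under the assumption that the colon in date:desc should stay unescaped for readability; the helper search_path is hypothetical.

```python
from urllib.parse import urlencode


def search_path(q, sort):
    """Build a /_search path with one sort parameter per sort clause."""
    params = [("sort", clause) for clause in sort] + [("q", q)]
    # safe=":" keeps the colon in clauses such as "date:desc" unescaped.
    return "/_search?" + urlencode(params, safe=":")


print(search_path("search", ["date:desc", "_score"]))
# /_search?sort=date:desc&sort=_score&q=search
```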

Sorting on multivalued fields

When sorting on a field that has more than one value, remember that the values have no intrinsic order; a multivalued field is just a bag of values (translator's note: the values can be treated as a single whole). Which one do you choose to sort on?

For numbers and dates, you can reduce a multivalued field to a single value by using the min, max, avg, or sum sort modes. For example, you could sort on the earliest date in each dates field as follows:

"sort": {
    "dates": {
        "order": "asc",
        "mode":  "min"
    }
}

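The effect of a sort mode can be sketched in memory: each multivalued field is first reduced to a single number, and the documents are then ordered by that number. The documents below are made up (the dates values are epoch milliseconds for dates between 2014-09-21 and 2014-09-25), and sort_by_multivalue is a hypothetical helper, not an Elasticsearch API.

```python
from statistics import mean

# Reducers corresponding to the sort modes named above.
MODES = {"min": min, "max": max, "avg": mean, "sum": sum}


def sort_by_multivalue(docs, field, mode="min", order="asc"):
    """Reduce each document's list of values with the mode, then sort on the result."""
    reduce_values = MODES[mode]
    return sorted(
        docs,
        key=lambda doc: reduce_values(doc[field]),
        reverse=(order == "desc"),
    )


docs = [
    {"_id": "a", "dates": [1411516800000, 1411344000000]},  # 09-24, 09-22
    {"_id": "b", "dates": [1411430400000]},                 # 09-23
    {"_id": "c", "dates": [1411603200000, 1411257600000]},  # 09-25, 09-21
]

print([d["_id"] for d in sort_by_multivalue(docs, "dates", mode="min")])  # ['c', 'a', 'b']
print([d["_id"] for d in sort_by_multivalue(docs, "dates", mode="max")])  # ['b', 'a', 'c']
```

The same documents come back in a different order depending on the mode, which is why a mode has to be chosen deliberately for multivalued fields.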