ElasticSearch (vii)--Request Body Query

Source: Internet
Author: User

Lite search (the query-string search) is useful for ad hoc queries from the command line. To take full advantage of search, however, you should use the request body search API, so called because most of its parameters are passed in a JSON request body rather than in the query string.

Request body search not only handles the query itself, but can also return highlighted snippets from the results.

1. Empty query

    GET /_search
    {}

As with query-string search, you can search one or more indexes and types:

    GET /index_2014*/type1,type2/_search
    {}

You can also use the from and size parameters for pagination:

    GET /website/_search
    {
      "from": 1,
      "size": 3
    }

Note that the from and size values may reach beyond the documents that actually exist; in that case an empty array of hits is returned, not an error.
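As a minimal sketch of how page numbers map onto these two parameters, the helper below computes the request body for a given page. The function name `page_params` and the 1-based page numbering are assumptions made for illustration; the only facts relied on are that from is a zero-based offset and size is the number of hits to return.

```python
import json

def page_params(page, size=3):
    """Return the from/size pair for a 1-based page number."""
    if page < 1 or size < 1:
        raise ValueError("page and size must be positive")
    # "from" is the zero-based offset of the first hit on the page.
    return {"from": (page - 1) * size, "size": size}

# Request bodies for the first three pages of three hits each.
for page in (1, 2, 3):
    print(json.dumps(page_params(page)))
```

With this convention, page 1 starts at offset 0, so the from value of 1 in the request above skips the first hit.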

So request body search sends a GET request that carries a body?

The HTTP libraries of some languages (notably JavaScript) do not allow GET requests to have a request body, and some users are surprised that a GET request may carry a body at all.

The truth is that the RFC defining HTTP semantics does not specify what should happen to a GET request with a body. As a result, some HTTP servers allow it and some do not.

The Elasticsearch authors prefer GET for submitting search requests because they feel the verb describes the action, retrieving information, better than POST does. However, because GET with a request body is not universally supported, the search API also accepts POST requests:

    POST /website/_search
    {
      "from": 1,
      "size": 3
    }
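For clients that cannot send a body with GET, the POST form can be prepared as in the sketch below, which uses only the Python standard library and builds the request without sending it. The helper name `search_request` and the base URL `http://localhost:9200` are assumptions for illustration.

```python
import json
import urllib.request

def search_request(base_url, index, body, use_post=True):
    """Build (but do not send) a search request carrying a JSON body."""
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index}/_search",
        data=data,
        headers={"Content-Type": "application/json"},
        # POST is the safe default when GET bodies are not supported.
        method="POST" if use_post else "GET",
    )

# Assumed local node address; replace with your own.
req = search_request("http://localhost:9200", "website", {"from": 1, "size": 3})
print(req.get_method(), req.full_url)
```

Passing the prepared request to `urllib.request.urlopen` would send it, provided a node is actually listening at that address.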

Request body search lets us write queries in the query DSL (domain-specific language) instead of the cryptic query-string syntax.

2. Query DSL

The query DSL is a flexible, expressive search language with which Elasticsearch exposes most of the power of Lucene through a simple JSON interface.

It is what you should use for queries in production: it makes your queries more flexible, more precise, easier to read, and easier to debug.

To use the query DSL, pass a query in the query parameter:

    GET /_search
    {
        "query": YOUR_QUERY_HERE
    }

The empty query, for example, is in fact equivalent to using the match_all query clause:

    POST /website/_search
    {
        "query": {
            "match_all": {}
        }
    }

match_all is a query clause that, as its name suggests, matches all documents.
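The equivalence between the empty query and match_all can be expressed as a small helper that wraps any clause under the query key and falls back to match_all when none is given. This is only a sketch; `search_body` is a hypothetical name, not part of any client library.

```python
import json

def search_body(query=None, **params):
    """Wrap a query clause under the "query" key of a search body.

    With no clause given, fall back to match_all, which the text
    describes as equivalent to the empty query.
    """
    body = dict(params)
    body["query"] = query if query is not None else {"match_all": {}}
    return body

print(json.dumps(search_body(size=3), indent=2))
```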

Structure of query clauses

A query clause typically has this structure:

    {
        QUERY_NAME: {
            ARGUMENT: VALUE,
            ARGUMENT: VALUE,...
        }
    }

If it refers to one particular field, it has this structure:

    {
        QUERY_NAME: {
            FIELD_NAME: {
                ARGUMENT: VALUE,
                ARGUMENT: VALUE,...
            }
        }
    }

For example, you can use a match query clause to find documents whose tweet field mentions elasticsearch.

The query clause on its own:

    {
        "match": {
            "tweet": "elasticsearch"
        }
    }

The full search request:

    GET /_search
    {
        "query": {
            "match": {
                "tweet": "elasticsearch"
            }
        }
    }

The match clause is called a query clause because it is the part placed under the query parameter.

Merging multiple clauses

Query clauses are simple building blocks that can be combined with each other to create complex queries.

Clauses come in two kinds:

Leaf clauses (like the match clause), which compare a field (or fields) to a query string.

Compound clauses, which combine other query clauses. For example, a bool clause combines other clauses that must match, must_not match, or should match if possible. It can also include non-scoring filters for structured search:

    {
        "bool": {
            "must":     { "match": { "tweet": "elasticsearch" }},
            "must_not": { "match": { "name":  "mary" }},
            "should":   { "match": { "tweet": "full text" }},
            "filter":   { "range": { "age":   { "gt": MIN_AGE }}}
        }
    }

It is important to note that a compound clause can combine any other query clauses, including other compound clauses. This means that compound clauses can be nested within each other, which allows very complex logic to be expressed.

For example, the following query looks for emails that contain business opportunity and that either are starred, or are in the inbox folder and not marked as spam:

    {
        "bool": {
            "must": { "match": { "email": "business opportunity" }},
            "should": [
                { "match": { "starred": true }},
                { "bool": {
                    "must":     { "match": { "folder": "inbox" }},
                    "must_not": { "match": { "spam": true }}
                }}
            ],
            "minimum_should_match": 1
        }
    }

Don't worry about the details of this example yet; we will explain them later. The point to take away is that a compound clause can combine multiple clauses, whether leaf clauses or other compound clauses, into a single query.
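To make the nesting logic concrete, here is a toy evaluator that answers yes or no for match and bool clauses against a plain Python dict. It is a sketch under loud assumptions: the word matching is a crude stand-in for real analysis, only these two clause types are handled, and the rule that a bool without must needs at least one should clause is a simplification. It is not how Elasticsearch executes queries.

```python
def _as_list(value):
    """The DSL accepts either one clause or a list of clauses."""
    return value if isinstance(value, list) else [value]

def matches(clause, doc):
    """Toy yes/no evaluation of match and bool clauses against a dict."""
    (name, body), = clause.items()
    if name == "match":
        (field, wanted), = body.items()
        actual = doc.get(field)
        if isinstance(wanted, str) and isinstance(actual, str):
            # Crude stand-in for analysis: lowercase, split on whitespace,
            # and accept the document if any query word appears in it.
            return bool(set(wanted.lower().split()) & set(actual.lower().split()))
        return actual == wanted
    if name == "bool":
        must = _as_list(body.get("must", []))
        must_not = _as_list(body.get("must_not", []))
        should = _as_list(body.get("should", []))
        # Assumption: with no must clause, one should clause is required.
        needed = body.get("minimum_should_match", 0 if must else 1 if should else 0)
        hits = sum(matches(c, doc) for c in should)
        return (all(matches(c, doc) for c in must)
                and not any(matches(c, doc) for c in must_not)
                and hits >= needed)
    raise ValueError(f"unsupported clause: {name}")

query = {"bool": {
    "must": {"match": {"email": "business opportunity"}},
    "should": [
        {"match": {"starred": True}},
        {"bool": {"must": {"match": {"folder": "inbox"}},
                  "must_not": {"match": {"spam": True}}}},
    ],
    "minimum_should_match": 1,
}}
starred = {"email": "A business opportunity for you", "starred": True, "folder": "archive"}
spam = {"email": "business opportunity", "starred": False, "folder": "inbox", "spam": True}
print(matches(query, starred), matches(query, spam))  # True False
```

The starred message passes through the first should clause, while the spam message fails both should clauses because the nested bool rejects it.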

3. Querying and Filtering

The Elasticsearch DSL has a single set of components called queries, which can be used in two contexts: filtering context and query context.

When used in filtering context, a query is called a non-scoring or filtering query. It simply asks the question "Does this document match?", and the answer is always a plain yes or no.

For example:

Is the created date in the range 2013 to 2014?

Does the status field contain the term published?

Is the lat_lon field within 10km of a specified point?

When used in query context, the query becomes a scoring query. Besides deciding whether the document matches, it also asks "How well does this document match?"

Typical uses of a scoring query are to find documents:

Best matching the words full text search

Containing the word run, but perhaps also matching running, runs, jog, or sprint

Containing the words quick, brown, and fox, where the closer together they are, the more relevant the document

Tagged with lucene, search, or java, where the more tags match, the more relevant the document

A scoring query calculates how relevant each document is to the query and assigns it a relevance _score, which is then used to sort matching documents by relevance. This concept of relevance is well suited to full-text search.

Note:

Historically, queries and filters were separate components in Elasticsearch. Starting with Elasticsearch 2.0, filters were technically eliminated, and all queries gained the ability to run in non-scoring mode.

However, for clarity and simplicity, we still use the term "filter" for a non-scoring query. You can treat filter, filtering query, and non-scoring query as meaning the same thing.

Similarly, when the term query is used on its own, it refers to a scoring query.

Performance differences

Filtering queries are simple checks for inclusion or exclusion, which makes them very fast to compute.

Various optimizations apply when at least one filtering query is sparse (matches few documents), and frequently used non-scoring queries can be cached in memory for faster access.

In contrast, scoring queries must not only find the matching documents but also calculate how relevant each one is, which typically makes them heavier than non-scoring queries. Their results are also not cacheable.

Thanks to the inverted index, a simple scoring query that matches only a few documents may perform as well as, or even better than, a filter that spans millions of documents.

In general, though, a filter outperforms a scoring query.

The goal of filtering is to reduce the number of documents that the scoring queries have to examine.

When to use which

As a general rule, use scoring queries for full-text search or for any condition that should affect the relevance score, and use non-scoring filters for everything else.

4. Important Query Statements

Elasticsearch comes with many queries, but only a few are used frequently. We will study them in detail in a later chapter on searching in depth; for now, here is a quick introduction to the most important ones.

match_all

The match_all query simply matches all documents:

    { "match_all": {}}

This query is often used in combination with a filter.

match

The match query is the standard query to reach for whenever you want to query a field, whether it holds full text or an exact value.

If you run a match query against a full-text field, the query string is analyzed with the correct analyzer for that field before the search is executed:

    { "match": { "tweet": "About Search" }}

If you use it on a field containing an exact value, such as a number, a date, a Boolean, or a not_analyzed string, it searches for that exact value:

    { "match": { "age":    AGE          }}
    { "match": { "date":   "2014-09-01" }}
    { "match": { "public": true         }}
    { "match": { "tag":    "full_text"  }}

For exact-value searches, you will probably want to use a filter clause instead of a query; we will see some filtering examples soon.

A match query is also safer than a query-string search: it simply looks for the words given and does not interpret any query syntax.

multi_match

The multi_match query runs the same match query on multiple fields:

    {
        "multi_match": {
            "query":    "full text search",
            "fields":   [ "title", "body" ]
        }
    }

range

The range query finds numbers or dates that fall into a specified range. It accepts the following operators:

gt: greater than

gte: greater than or equal to

lt: less than

lte: less than or equal to

    {
        "range": {
            "age": {
                "gte": MIN_AGE,
                "lt":  MAX_AGE
            }
        }
    }
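The meaning of the four operators can be checked with a small predicate that applies every bound of a range clause body to a value. This is a sketch of the comparison semantics only; the name `in_range` and the bounds 20 and 30 are illustrative values chosen here, not taken from the text.

```python
import operator

# Map each range operator to the comparison it stands for.
_OPS = {"gt": operator.gt, "gte": operator.ge, "lt": operator.lt, "lte": operator.le}

def in_range(value, bounds):
    """Return True if value satisfies every bound in a range clause body."""
    return all(_OPS[name](value, limit) for name, limit in bounds.items())

bounds = {"gte": 20, "lt": 30}  # illustrative values, not from the text
print([age for age in (19, 20, 29, 30) if in_range(age, bounds)])  # [20, 29]
```

The lower bound is inclusive because of gte, while the upper bound is exclusive because of lt.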
term

The term query searches by exact value, whether a number, a date, a Boolean, or a not_analyzed exact-value string:

    { "term": { "age":    AGE          }}
    { "term": { "date":   "2014-09-01" }}
    { "term": { "public": true         }}
    { "term": { "tag":    "full_text"  }}

The term query performs no analysis on the input text, so it looks for exactly the value that is supplied.
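The practical consequence of skipping analysis shows up when a term query is pointed at an analyzed field. The sketch below assumes a toy analyzer that lowercases and splits on whitespace; the function names are hypothetical and the real analyzers do considerably more.

```python
def analyze(text):
    """Toy analyzer: lowercase and split on whitespace."""
    return text.lower().split()

def term_hit(indexed_terms, value):
    # term: the value is looked up exactly as given, with no analysis.
    return value in indexed_terms

def match_hit(indexed_terms, text):
    # match: the query text is analyzed first; any resulting term may hit.
    return any(t in indexed_terms for t in analyze(text))

indexed = analyze("Full Text Search")  # what an analyzed field would hold
print(term_hit(indexed, "Full"), match_hit(indexed, "Full"))  # False True
```

The term lookup misses because the stored term is full, not Full, which is why term is best suited to exact-value fields.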

terms

The terms query is the same as the term query, but lets you specify multiple values to match. A document matches if the field contains any of them:

    { "terms": { "tag": [ "search", "full_text", "nosql" ] }}

exists, missing

The exists and missing queries find documents in which the specified field either has a value (exists) or has no value (missing):

    {
        "exists": {
            "field": "title"
        }
    }

5. Combination Query

Real-world search requests are never simple: they search multiple fields for various input values and filter on a whole range of criteria. To build such a search, you need a way to combine multiple query clauses in a single search request.

For this you can use the bool query, which accepts the following parameters:

must: clauses that must match for the document to be included.

must_not: clauses that must not match for the document to be included.

should: clauses that increase the _score if they match and otherwise have no effect. They only refine the relevance score of each document.

filter: clauses that must match, but are run in non-scoring, filtering mode. They simply include or exclude documents and do not contribute to the score.

Because this is the first query we have seen that contains other queries, we need to talk about how scores are combined.

Each sub-query clause calculates a relevance score for the document individually. Once these scores have been calculated, the bool query merges them and returns a single score that represents the total score of the Boolean operation.
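One simplified way to picture the merging step is to add up the scores of the matching must and should clauses, while filter and must_not only decide whether the document is kept. The sketch below encodes exactly that assumption; real scoring involves further factors, and `bool_score` is a hypothetical helper, not an Elasticsearch function.

```python
def bool_score(must, should, filter_ok=True, must_not_hit=False):
    """Toy bool scoring.

    must / should: lists of per-clause scores, None meaning "did not match".
    Returns None when the document is excluded, else the combined score.
    """
    if must_not_hit or not filter_ok or any(s is None for s in must):
        return None
    # Assumption: matching must and should scores are simply summed.
    return sum(must) + sum(s for s in should if s is not None)

print(bool_score([1.5], [0.5, None]))   # 2.0
print(bool_score([1.5], [0.5, 0.25]))   # 2.25
```

A document that matches more should clauses ends up with a higher total, while a failing filter removes it without touching the arithmetic.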

The following query finds documents whose title field matches the query string how to make millions and that are not tagged as spam. Documents that are starred, or that date from 2014 onward, rank higher than they otherwise would:

    {
        "bool": {
            "must":     { "match": { "title": "how to make millions" }},
            "must_not": { "match": { "tag":   "spam" }},
            "should": [
                { "match": { "tag": "starred" }},
                { "range": { "date": { "gte": "2014-01-01" }}}
            ]
        }
    }

Adding a filter query

If we don't want the date of a document to affect its score, we can move the range clause into the filter clause:

    {
        "bool": {
            "must":     { "match": { "title": "how to make millions" }},
            "must_not": { "match": { "tag":   "spam" }},
            "should": [
                { "match": { "tag": "starred" }}
            ],
            "filter": {
                "range": { "date": { "gte": "2014-01-01" }}
            }
        }
    }

By moving the range query into the filter clause, we have converted it into a non-scoring query. It no longer contributes to the relevance score of the document, and because it is now non-scoring, it can benefit from the optimizations available to filters.

Any query can be used this way: simply move it into the filter clause of a bool query and it automatically becomes a non-scoring filter.
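Since the conversion is purely a matter of where the clause sits, it can be done mechanically on the query dict. The helper below is a sketch with a hypothetical name, `move_to_filter`; it returns a modified copy and stores filter clauses as a list.

```python
import copy

def move_to_filter(bool_query, section, index=0):
    """Return a copy of a bool query with one clause moved into "filter"."""
    query = copy.deepcopy(bool_query)   # leave the caller's query untouched
    body = query["bool"]
    clauses = body[section]
    if isinstance(clauses, list):
        clause = clauses.pop(index)
        if not clauses:
            del body[section]           # drop a section that became empty
    else:
        clause = body.pop(section)
    existing = body.get("filter", [])
    if not isinstance(existing, list):
        existing = [existing]
    body["filter"] = existing + [clause]
    return query

query = {"bool": {
    "must": {"match": {"title": "how to make millions"}},
    "must_not": {"match": {"tag": "spam"}},
    "should": [
        {"match": {"tag": "starred"}},
        {"range": {"date": {"gte": "2014-01-01"}}},
    ],
}}
filtered = move_to_filter(query, "should", index=1)
print(filtered["bool"]["filter"])
```

Note that moving a should clause into filter also changes its meaning from optional to required, as in the example in the text.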

If you need to filter on several criteria, a bool query can itself be used as a non-scoring query inside the filter clause:

    {
        "bool": {
            "must":     { "match": { "title": "how to make millions" }},
            "must_not": { "match": { "tag":   "spam" }},
            "should": [
                { "match": { "tag": "starred" }}
            ],
            "filter": {
                "bool": {
                    "must": [
                        { "range": { "date":  { "gte": "2014-01-01" }}},
                        { "range": { "price": { "lte": 29.99 }}}
                    ],
                    "must_not": [
                        { "term": { "category": "ebooks" }}
                    ]
                }
            }
        }
    }

constant_score query

Although it is not used nearly as often as the bool query, the constant_score query is still useful. It applies a static, constant score to all matching documents, and is mainly used when you want to execute a filter and nothing else.

You can use it in place of a bool query that has only a filter clause. Performance is the same, but the query may be simpler and clearer:

    {
        "constant_score": {
            "filter": {
                "term": { "category": "ebooks" }
            }
        }
    }
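The substitution described above can be written as a small rewrite step: if a bool query holds nothing but a single filter clause, wrap that clause in constant_score instead. This is a sketch with a hypothetical function name, `simplify`, and it deliberately leaves every other query unchanged.

```python
def simplify(query):
    """Rewrite a bool query holding only one filter clause as constant_score."""
    body = query.get("bool")
    if body is None or set(body) != {"filter"}:
        return query          # has scoring parts (or is not a bool): leave as is
    clause = body["filter"]
    if isinstance(clause, list):
        if len(clause) != 1:
            return query      # several filters still need a bool to combine them
        clause = clause[0]
    return {"constant_score": {"filter": clause}}

print(simplify({"bool": {"filter": {"term": {"category": "ebooks"}}}}))
```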
6. Verify the query

Queries can become quite complex, especially when combined with different analyzers and field mappings. The validate-query API can be used to check whether a query is valid.

Append /_validate/query to the request URL:

    GET /gb/tweet/_validate/query
    {
       "query": {
          "tweet": {
             "match": "really powerful"
          }
       }
    }

The response to this validate request tells us that the query is invalid:

    {
      "valid":         false,
      "_shards": {
        "total":       1,
        "successful":  1,
        "failed":      0
      }
    }

To find out why it is invalid, add the explain parameter:

    GET /gb/tweet/_validate/query?explain
    {
       "query": {
          "tweet": {
             "match": "really powerful"
          }
       }
    }

Apparently, we have mixed up the type of query (match) with the name of the field (tweet):

    {
      "valid":     false,
      "_shards":   { ... },
      "explanations": [ {
        "index":   "gb",
        "valid":   false,
        "error":   "org.elasticsearch.index.query.QueryParsingException:
                    [gb] No query registered for [tweet]"
      } ]
    }

For a valid query, the explain parameter also helps us understand how Elasticsearch interprets it:

    GET /us,gb/_validate/query?explain
    {
       "query": {
          "match": {
             "tweet": "really powerful"
          }
       }
    }

An explanation is returned for each index we query, because each index can have different mappings and analyzers:

    {
      "valid":         true,
      "_shards":       { "total": 2, "successful": 2, "failed": 0 },
      "explanations": [
        {
          "index":       "gb",
          "valid":       true,
          "explanation": "tweet:realli tweet:power"
        },
        {
          "index":       "us",
          "valid":       true,
          "explanation": "tweet:really tweet:powerful"
        }
      ]
    }

From the explanation, we can see how the match query for the query string really powerful has been rewritten as two single-term queries against the tweet field.

The rewritten terms differ between the two indexes because the tweet field in the gb index uses the english analyzer.
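The particular mistake shown above, swapping the query type and the field name, can also be caught on the client side before the request is sent. The sketch below is only a local sanity check under stated assumptions: `precheck` is a hypothetical helper, `KNOWN_QUERIES` lists just the queries named in this article, and it is no substitute for the validate-query API.

```python
# Assumption: only the query types mentioned in this article are listed.
KNOWN_QUERIES = {"match_all", "match", "multi_match", "range", "term",
                 "terms", "exists", "missing", "bool", "constant_score"}

def precheck(body):
    """Return a list of problems found in the top-level query clause."""
    query = body.get("query")
    if not isinstance(query, dict) or len(query) != 1:
        return ["query must be an object with exactly one clause"]
    (name, _), = query.items()
    if name not in KNOWN_QUERIES:
        return [f"No query registered for [{name}]"]
    return []

print(precheck({"query": {"tweet": {"match": "really powerful"}}}))
print(precheck({"query": {"match": {"tweet": "really powerful"}}}))
```

The first call reports the misplaced field name, and the second returns an empty list for the corrected query.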
