[Elasticsearch] Partial Matching (II): Wildcard and Regular Expression Queries

Source: Internet
Author: User

Wildcard and regexp queries

The wildcard query is a low-level, term-based query similar in nature to the prefix query, but it allows you to specify a pattern instead of just a prefix. It uses the standard shell wildcards: ? matches any single character, and * matches zero or more characters.

The following query matches the documents containing W1F 7HW and W2F 8HW. The ? matches the 1 and the 2, while the * matches the space followed by the 7 or the 8:

GET /my_index/address/_search
{
    "query": {
        "wildcard": {
            "postcode": "W?F*HW"
        }
    }
}
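To make the pattern semantics concrete, here is a minimal Python sketch that emulates them outside Elasticsearch. The helper names `wildcard_to_regex` and `wildcard_match` and the sample postcodes are hypothetical, written only for illustration; they are not part of any Elasticsearch client. The sketch assumes each postcode is indexed as a single exact-value term, so the pattern has to cover the whole value, including the space.

```python
import re

def wildcard_to_regex(pattern):
    """Translate a ?/* wildcard pattern into a regex (illustration only)."""
    parts = []
    for ch in pattern:
        if ch == "?":
            parts.append(".")            # ? matches exactly one character
        elif ch == "*":
            parts.append(".*")           # * matches zero or more characters
        else:
            parts.append(re.escape(ch))  # everything else is a literal
    return re.compile("".join(parts), re.DOTALL)

def wildcard_match(pattern, term):
    # The pattern must match the whole term, not just part of it.
    return wildcard_to_regex(pattern).fullmatch(term) is not None

# Hypothetical sample of exact-value postcode terms.
postcodes = ["W1V 3DG", "W2F 8HW", "W1F 7HW", "WC1N 1LZ", "SW5 0BE"]
matches = [p for p in postcodes if wildcard_match("W?F*HW", p)]
print(matches)  # ['W2F 8HW', 'W1F 7HW']
```

Matching here is case-sensitive, which is why the case of the pattern has to agree with the case of the indexed terms.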

Imagine now that you want to match all postcodes in just the W area. A prefix match would also include postcodes starting with WC, and you would run into the same problem with a wildcard query. We want to match only postcodes that begin with a W followed by a number. The regexp query allows you to write these more complicated patterns:

GET /my_index/address/_search
{
    "query": {
        "regexp": {
            "postcode": "W[0-9].+"
        }
    }
}

This regular expression requires the term to begin with a W, followed by any digit from 0 to 9, followed by one or more other characters.
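The difference between the prefix approach and the regular expression can be checked with a small Python sketch. The sample postcodes are hypothetical, and Python's `re` module is only a stand-in: its syntax is not identical to the one Elasticsearch uses, although this particular pattern reads the same in both. `fullmatch()` is used on the assumption that the pattern has to match the entire term.

```python
import re

# Hypothetical sample of exact-value postcode terms.
postcodes = ["W1V 3DG", "W2F 8HW", "W1F 7HW", "WC1N 1LZ", "SW5 0BE"]

# W, then one digit, then one or more further characters.
pattern = re.compile(r"W[0-9].+")

w_area = [p for p in postcodes if pattern.fullmatch(p)]
prefix_only = [p for p in postcodes if p.startswith("W")]

print(w_area)       # ['W1V 3DG', 'W2F 8HW', 'W1F 7HW']
print(prefix_only)  # ['W1V 3DG', 'W2F 8HW', 'W1F 7HW', 'WC1N 1LZ']
```

The plain prefix check also returns the WC postcode, while the regular expression excludes it.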

The wildcard and regexp queries work in exactly the same way as the prefix query. They also have to scan the list of terms in the inverted index to find all matching terms, and then gather the document IDs term by term. The only difference between them and the prefix query is that they support more complex patterns.
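The scan-then-collect behaviour described above can be sketched with a toy inverted index in Python. This is a deliberately simplified model, not how Lucene is implemented: the `inverted_index` dictionary, its document IDs, and the `pattern_query` function are all hypothetical.

```python
import re

# Toy inverted index: term -> list of document IDs (hypothetical data).
inverted_index = {
    "SW5 0BE": [5],
    "W1F 7HW": [3],
    "W1V 3DG": [1],
    "W2F 8HW": [2],
    "WC1N 1LZ": [4],
}

def pattern_query(index, regex):
    """Scan every term, keep those matching in full, and merge their doc IDs."""
    compiled = re.compile(regex)
    doc_ids = set()
    terms_examined = 0
    for term, postings in index.items():
        terms_examined += 1
        if compiled.fullmatch(term):
            doc_ids.update(postings)
    return sorted(doc_ids), terms_examined

hits, examined = pattern_query(inverted_index, r"W[0-9].+")
print(hits)      # [1, 2, 3]
print(examined)  # 5
```

Even though only three terms match, every term in the toy index is examined, which is where the cost comes from when a field has many unique terms.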

This means that the same caveats apply. Running these queries on a field with many unique terms can be very resource intensive. Avoid patterns that start with a wildcard (for example, *foo or, as a regular expression, .*foo).
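One way to picture why a leading wildcard is costly is the following rough Python model. It assumes the terms are kept in sorted order, so that a literal prefix narrows the range of terms that need to be examined, while a pattern with no literal prefix leaves nothing to narrow by. The synthetic term list and the `terms_to_examine` function are hypothetical and only approximate what a real term dictionary does.

```python
import bisect

# Sorted term dictionary with many distinct terms (synthetic, for illustration).
terms = sorted(
    f"{prefix}{n:04d}" for prefix in ("bar", "foo", "qux") for n in range(1000)
)

def terms_to_examine(sorted_terms, literal_prefix):
    """Count the terms that share the pattern's literal prefix.

    An empty prefix models a pattern that starts with a wildcard,
    in which case every term has to be examined.
    """
    if not literal_prefix:
        return len(sorted_terms)
    lo = bisect.bisect_left(sorted_terms, literal_prefix)
    hi = bisect.bisect_left(sorted_terms, literal_prefix + "\uffff")
    return hi - lo

print(terms_to_examine(terms, "foo"))  # pattern foo*  -> 1000
print(terms_to_examine(terms, ""))     # pattern *foo  -> 3000
```

In this model the pattern foo* only has to look at the terms beginning with foo, whereas *foo has to look at all of them.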

Whereas prefix matching can be made more efficient by preparing your data at index time, wildcard and regular expression matching can be done only at query time. Their use cases are limited, but these queries still have their place.

Note

The prefix, wildcard, and regexp queries operate on terms. If you use them to query an analyzed field, they examine each term in the field, not the field as a whole.

For example, suppose our title field contains "Quick Brown Fox", which produces the terms quick, brown, and fox.

This query would match:

{ "regexp": { "title": "br.*" }}

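But neither of these queries would match:

{ "regexp": { "title": "Qu.*" }}
{ "regexp": { "title": "quick br*" }}

The first fails because the term in the index is quick, not Quick. The second fails because quick and brown are separate terms.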
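The term-level behaviour on an analyzed field can be reproduced with a short Python sketch. The `analyze` function is a rough, hypothetical stand-in for an analyzer (it only splits on whitespace and lowercases), and `regexp_matches` assumes each pattern is tested against one term at a time and must match that term in full.

```python
import re

def analyze(text):
    """Rough stand-in for an analyzer: split on whitespace and lowercase."""
    return [token.lower() for token in text.split()]

def regexp_matches(terms, pattern):
    # The pattern is applied to each term separately and must match it in full.
    return any(re.fullmatch(pattern, term) for term in terms)

terms = analyze("Quick Brown Fox")
print(terms)  # ['quick', 'brown', 'fox']

for pattern in ("br.*", "Qu.*", "quick br*"):
    print(pattern, regexp_matches(terms, pattern))
# br.* True
# Qu.* False
# quick br* False
```

An uppercase pattern such as Qu.* finds nothing because the indexed term is lowercase, and a pattern containing a space finds nothing because no single term contains a space.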

