Advanced Search Using Lucene

Source: Internet
Author: User

Lucene supports multiple forms of advanced search, which we will discuss in this section. Then we will use the Lucene API to demonstrate how to implement these advanced search functions.

Boolean operators

Most search engines provide boolean operators that let users combine queries. Typical boolean operators are AND, OR, and NOT. Lucene supports five: AND, OR, NOT, plus (+), and minus (-). The word operators must be written in uppercase. Each operator is described below.

    • OR: To search for documents that contain term A or term B, use the OR operator. OR is the default: if you separate two keywords with only a space, the query parser inserts OR between them. For example, "Java OR Lucene" and "Java Lucene" both return documents that contain Java or Lucene.
    • AND: To search for documents that contain every one of several keywords, use the AND operator. For example, "Java AND Lucene" returns all documents that contain both Java and Lucene.
    • NOT: The NOT operator excludes documents that contain the term following NOT. For example, to find all documents that contain Java but not Lucene, use the query "Java NOT Lucene". NOT cannot be used with a single search term on its own: the query "NOT Java" returns no results.
    • Plus sign (+): This operator is similar to AND, but it applies only to the single term that immediately follows it and marks that term as required. For example, to find documents that must contain Java and may or may not contain Lucene, use the query "+Java Lucene".
    • Minus (-): This operator works the same way as NOT. The query "Java -Lucene" returns all documents that contain Java but do not contain Lucene.
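
The operator semantics above can be illustrated without Lucene itself. The following is a minimal sketch, not Lucene's implementation: the class BooleanMatchSketch and its methods are hypothetical names invented for this example. It assumes a document is just a set of lowercase terms, and that a query has been split into required terms (AND, +), optional terms (OR), and prohibited terms (NOT, -).

```java
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration of boolean matching; not part of the Lucene API.
final class BooleanMatchSketch {

    // Splits text on anything that is not a letter or digit and lowercases the terms.
    static Set<String> terms(String text) {
        Set<String> result = new HashSet<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                result.add(t);
            }
        }
        return result;
    }

    // required:   every term must occur in the document (AND, +)
    // optional:   when nothing is required, at least one must occur (OR)
    // prohibited: none of these terms may occur (NOT, -)
    static boolean matches(Set<String> doc, List<String> required,
                           List<String> optional, List<String> prohibited) {
        for (String t : prohibited) {
            if (doc.contains(t)) {
                return false;
            }
        }
        for (String t : required) {
            if (!doc.contains(t)) {
                return false;
            }
        }
        if (!required.isEmpty()) {
            return true; // optional terms are then not needed for a match
        }
        for (String t : optional) {
            if (doc.contains(t)) {
                return true;
            }
        }
        return false; // a query with only prohibited terms matches nothing
    }

    public static void main(String[] args) {
        Set<String> doc = terms("Indexing text with Java");
        // "+Java Lucene": Java is required, Lucene is optional.
        System.out.println(matches(doc, List.of("java"), List.of("lucene"), List.of()));
        // "Java -Lucene": Java is optional, Lucene is prohibited.
        System.out.println(matches(doc, List.of(), List.of("java"), List.of("lucene")));
    }
}
```

In this sketch a query made only of prohibited terms falls through to the final return and matches nothing, which mirrors the "NOT Java" case described in the list.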

Field search

Lucene supports field search: you can specify the field in which each part of a query is evaluated. For example, if the indexed documents contain two fields, title and content, the query "title:Lucene AND content:Java" returns all documents that contain Lucene in the title field and Java in the content field.
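
As a rough illustration of what field-scoped matching means, the sketch below models a document as a map from field name to text. The class FieldMatchSketch and its methods are hypothetical names for this example only and are not Lucene classes; the whole-word, case-insensitive comparison is also an assumption.

```java
import java.util.Arrays;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration of field-scoped matching; not part of the Lucene API.
final class FieldMatchSketch {

    // True if the named field exists and contains the term as a whole word (case-insensitive).
    static boolean fieldContains(Map<String, String> doc, String field, String term) {
        String text = doc.get(field);
        if (text == null) {
            return false;
        }
        String wanted = term.toLowerCase(Locale.ROOT);
        String[] words = text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+");
        return Arrays.asList(words).contains(wanted);
    }

    // Same intent as the query: title:Lucene AND content:Java
    static boolean titleLuceneAndContentJava(Map<String, String> doc) {
        return fieldContains(doc, "title", "Lucene") && fieldContains(doc, "content", "Java");
    }

    public static void main(String[] args) {
        Map<String, String> doc = Map.of(
                "title", "Lucene in practice",
                "content", "Building search with Java");
        System.out.println(titleLuceneAndContentJava(doc));
    }
}
```

A document with the two words swapped between the fields would not match, because each term is checked only against its own field.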

Wildcard search

Lucene supports two wildcards: the question mark (?) and the asterisk (*). The question mark matches exactly one character, and the asterisk matches any sequence of characters, including none. For example, to search for tiny or tony, use the query "t?ny"; to search for teach, teacher, and teaching, use the query "teach*".
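
One way to picture the two wildcards is to translate a wildcard pattern into a regular expression and test terms against it. This is only a sketch under that assumption: WildcardSketch is a hypothetical class written for this example, and Lucene's own wildcard matching is implemented differently.

```java
import java.util.regex.Pattern;

// Hypothetical illustration of ? and * matching; not part of the Lucene API.
final class WildcardSketch {

    // Converts a wildcard pattern to a regex: ? becomes ".", * becomes ".*",
    // and every other character is quoted so it is matched literally.
    static Pattern toRegex(String wildcard) {
        StringBuilder sb = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '?') {
                sb.append('.');
            } else if (c == '*') {
                sb.append(".*");
            } else {
                sb.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(sb.toString());
    }

    // True if the whole term matches the wildcard pattern.
    static boolean matches(String wildcard, String term) {
        return toRegex(wildcard).matcher(term).matches();
    }

    public static void main(String[] args) {
        for (String term : new String[] {"tiny", "tony", "tny"}) {
            System.out.println(term + " vs t?ny: " + matches("t?ny", term));
        }
        for (String term : new String[] {"teach", "teacher", "teaching"}) {
            System.out.println(term + " vs teach*: " + matches("teach*", term));
        }
    }
}
```

Here "tny" does not match "t?ny" because the question mark needs exactly one character, while "teach" matches "teach*" because the asterisk may match nothing.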

Fuzzy search

Lucene's fuzzy queries are based on the edit distance algorithm. Append a tilde (~) to a search term to perform a fuzzy search. For example, the query "think~" returns all documents that contain terms similar to think.
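
To make "edit distance" concrete, the sketch below computes the classic Levenshtein distance: the minimum number of single-character insertions, deletions, and substitutions needed to turn one term into another. EditDistanceSketch and the maxEdits threshold are hypothetical names for this example; Lucene's own fuzzy matching may differ in both algorithm and thresholds.

```java
// Hypothetical illustration of edit distance; not part of the Lucene API.
final class EditDistanceSketch {

    // Levenshtein distance computed row by row with two rolling arrays.
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into b[0..j) takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning a[0..i) into "" takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insert = curr[j - 1] + 1;
                int delete = prev[j] + 1;
                int substitute = prev[j - 1] + cost;
                curr[j] = Math.min(Math.min(insert, delete), substitute);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // A term counts as "similar" when it is within maxEdits edits of the query term.
    static boolean isSimilar(String query, String term, int maxEdits) {
        return distance(query, term) <= maxEdits;
    }

    public static void main(String[] args) {
        for (String term : new String[] {"think", "thing", "thank", "throne"}) {
            System.out.println(term + ": " + distance("think", term));
        }
    }
}
```

With a threshold of one edit, "thing" and "thank" would count as similar to "think", since each differs from it by a single substituted letter.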

Range search

You can search for documents in which the value of a field falls within a specified range. For example, the query "age:[18 TO 35]" returns all documents whose age field has a value between 18 and 35, inclusive.
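
The inclusive bounds of a range such as [18 TO 35] can be sketched as a simple filter. RangeSketch is a hypothetical class for this example, and it assumes the values are compared as integers; in a real index, how values are compared depends on how the field was indexed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of an inclusive range filter; not part of the Lucene API.
final class RangeSketch {

    // Both bounds are included, mirroring the square brackets in age:[18 TO 35].
    static boolean inRange(int value, int lower, int upper) {
        return value >= lower && value <= upper;
    }

    // Keeps only the values that fall inside the inclusive range, in input order.
    static List<Integer> filter(List<Integer> values, int lower, int upper) {
        List<Integer> hits = new ArrayList<>();
        for (int v : values) {
            if (inRange(v, lower, upper)) {
                hits.add(v);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<Integer> ages = List.of(17, 18, 25, 35, 36);
        System.out.println(filter(ages, 18, 35));
    }
}
```

The boundary values 18 and 35 are kept, while 17 and 36 fall just outside the range.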
