Lucene Query Method Introduction

Last Update:2018-07-26 Source: Internet

Author: User


This article first introduces the core Lucene classes and then focuses on the query types that Lucene provides.

1. Analyzer: tokenizer

The analysis package includes several built-in analyzers, such as WhitespaceAnalyzer, which splits text on whitespace; StopAnalyzer, which additionally filters out stop words; and StandardAnalyzer, the most commonly used one.
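To make the difference between whitespace splitting and stop-word filtering concrete, here is a minimal plain-Java sketch that does not use Lucene at all. The class `AnalyzerSketch`, its method names, and the tiny stop-word set are assumptions made for illustration; Lucene's real analyzers are more elaborate.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration only; this is not Lucene code.
class AnalyzerSketch {
    // Illustrative stop-word set (an assumption, not Lucene's built-in list).
    static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("a", "an", "the", "of", "is"));

    // Whitespace-style analysis: split on runs of whitespace, keep tokens as-is.
    static List<String> whitespaceTokens(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return new ArrayList<>();
        }
        return new ArrayList<>(Arrays.asList(trimmed.split("\\s+")));
    }

    // Stop-word-style analysis: lowercase, split on non-letters, drop stop words.
    static List<String> stopFilteredTokens(String text) {
        List<String> out = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}]+")) {
            if (!token.isEmpty() && !STOP_WORDS.contains(token)) {
                out.add(token);
            }
        }
        return out;
    }
}
```

The choice of analyzer matters because a query term only matches if it equals a token that the analyzer actually produced at index time.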

2. Document: document

A Document is the structure that wraps our source data. We split the source data into fields and put them into the Document; at search time we can specify which fields to search.

3. Directory: index storage location

This is an abstraction of where the index is stored. It can be a directory on the file system (FSDirectory), a block of memory (RAMDirectory), or a memory-mapped index (MMapDirectory). Keeping the index in memory avoids disk I/O; choose the implementation that fits your needs.

4. IndexWriter: the index writer, the class that maintains the index by adding, updating, and deleting documents.

5. IndexReader: the index reader, used to read the index in a given Directory.

6. IndexSearcher: the index searcher, the class that runs the user's query against the index.

Note that a search returns document IDs (in a TopDocs object), not the documents themselves.

7. Query: the search statement. The query string must be wrapped in a Query object before it can be handed to the searcher. The smallest unit of a query is a Term. Lucene offers many kinds of Query; choose among them according to your needs.

I. TermQuery: single-term query

If you want to run a query such as "find the documents whose content field contains lucene", you can use TermQuery:

Term t = new Term("content", "lucene");

Query query = new TermQuery(t);

II. BooleanQuery: combining multiple queries with AND/OR relationships

If you want to run a query such as "find the documents whose content field contains java or perl", you can create two TermQuery objects and combine them with a BooleanQuery:

TermQuery termQuery1 = new TermQuery(new Term("content", "java"));

TermQuery termQuery2 = new TermQuery(new Term("content", "perl"));

BooleanQuery booleanQuery = new BooleanQuery();

booleanQuery.add(termQuery1, BooleanClause.Occur.SHOULD);

booleanQuery.add(termQuery2, BooleanClause.Occur.SHOULD);
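The effect of combining term clauses can be pictured with a toy inverted index in plain Java. This is only a sketch of the idea, not Lucene's implementation: the class `BooleanSketch` and its methods are hypothetical, scoring is ignored, and only the two-clause case is shown. SHOULD clauses behave like a union of the matching document IDs, and MUST clauses like an intersection.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical toy index; this is not Lucene code.
class BooleanSketch {
    // term -> sorted set of IDs of the documents that contain the term
    private final Map<String, TreeSet<Integer>> postings = new HashMap<>();

    // Registers a document under each of its terms.
    void add(int docId, String... terms) {
        for (String term : terms) {
            postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
        }
    }

    // Analogue of a single term query: documents containing the term.
    Set<Integer> term(String term) {
        return postings.getOrDefault(term, new TreeSet<>());
    }

    // Two SHOULD clauses: a document matches if it contains either term.
    Set<Integer> should(String a, String b) {
        Set<Integer> result = new TreeSet<>(term(a));
        result.addAll(term(b));
        return result;
    }

    // Two MUST clauses: a document matches only if it contains both terms.
    Set<Integer> must(String a, String b) {
        Set<Integer> result = new TreeSet<>(term(a));
        result.retainAll(term(b));
        return result;
    }
}
```

With documents 1 (java, lucene), 2 (perl), and 3 (java, perl), `should("java", "perl")` yields all three IDs, while `must("java", "perl")` yields only document 3.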

III. WildcardQuery: wildcard query

If you want to run a wildcard query on a word, you can use WildcardQuery. The wildcard characters are '?', which matches any single character, and '*', which matches zero or more arbitrary characters. For example, a search for 'use*' may find 'useful' or 'useless':

Query query = new WildcardQuery(new Term("content", "use*"));
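The two wildcard characters can be illustrated by translating a pattern into a regular expression. The helper below is a hypothetical plain-Java sketch of the matching rules only; it is not how Lucene evaluates a WildcardQuery internally, and it does not handle escaped wildcard characters.

```java
import java.util.regex.Pattern;

// Hypothetical helper; shows the semantics of '?' and '*' only.
class WildcardSketch {
    // '?' becomes "exactly one character", '*' becomes "zero or more characters";
    // every other character is quoted so that it matches literally.
    static Pattern toPattern(String wildcard) {
        StringBuilder regex = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '?') {
                regex.append('.');
            } else if (c == '*') {
                regex.append(".*");
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString(), Pattern.DOTALL);
    }

    // The whole term must match the pattern, not just a substring of it.
    static boolean matches(String wildcard, String term) {
        return toPattern(wildcard).matcher(term).matches();
    }
}
```

Note that the match is against whole indexed terms: 'use*' matches 'useful' but not 'abuse', because the pattern is anchored at the start of the term.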

IV. PhraseQuery: querying terms that appear within a given distance of each other

Suppose you are interested in relations between China and Japan and want to find articles in which '中' (China) and '日' (Japan) appear close together, within a distance of 5 characters; anything farther apart is ignored. You can write:

PhraseQuery query = new PhraseQuery();

query.setSlop(5);

query.add(new Term("content", "中"));

query.add(new Term("content", "日"));

This may find "中日合作…" ("Sino-Japanese cooperation…") and "中方和日方…" ("the Chinese side and the Japanese side…"), but not "中国某高层领导说日本…" ("a senior Chinese leader said that Japan…"), where the two characters are more than five positions apart.
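The slop setting above can be read as "how many other tokens may sit between the two terms". The following plain-Java sketch checks exactly that for two terms in their original order. It is a simplification under stated assumptions: the class `SlopSketch` is hypothetical, tokens are given as a ready-made list, and matches with the terms in reversed order (which Lucene's sloppy phrase matching can also accept at a higher slop cost) are not considered.

```java
import java.util.List;

// Hypothetical helper; in-order, two-term simplification of sloppy phrase matching.
class SlopSketch {
    // Returns true if some occurrence of `first` is followed by `second`
    // with at most `slop` other tokens in between.
    static boolean matchesInOrder(List<String> tokens, String first, String second, int slop) {
        for (int i = 0; i < tokens.size(); i++) {
            if (!tokens.get(i).equals(first)) {
                continue;
            }
            // j - i - 1 is the number of tokens strictly between positions i and j.
            for (int j = i + 1; j < tokens.size() && j - i - 1 <= slop; j++) {
                if (tokens.get(j).equals(second)) {
                    return true;
                }
            }
        }
        return false;
    }
}
```

With a slop of 5, two adjacent terms match (zero tokens in between), and so do terms separated by five other tokens; six tokens in between is one too many.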

V. PrefixQuery: querying terms that begin with a given prefix

If you want to search for words that begin with '中', you can use PrefixQuery:

PrefixQuery query = new PrefixQuery(new Term("content", "中"));
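One way to picture a prefix search is as a scan over a sorted term dictionary: all terms sharing a prefix sit next to each other, so the scan can start at the prefix and stop at the first term that no longer has it. The sketch below shows this with a `TreeSet`; the class `PrefixSketch` is hypothetical and the code only illustrates the idea, not Lucene's internal term enumeration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical helper; illustrates a prefix scan over sorted terms.
class PrefixSketch {
    // Starts at the first term >= prefix and collects terms until one lacks the prefix.
    // In sorted order, every later term lacks the prefix as well, so stopping is safe.
    static List<String> termsWithPrefix(TreeSet<String> sortedTerms, String prefix) {
        List<String> matches = new ArrayList<>();
        for (String term : sortedTerms.tailSet(prefix, true)) {
            if (!term.startsWith(prefix)) {
                break;
            }
            matches.add(term);
        }
        return matches;
    }
}
```

For the terms apple, use, useful, useless, user, and utility, the prefix 'use' selects use, useful, useless, and user, in that order.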

VI. FuzzyQuery: similarity search

FuzzyQuery searches for terms similar to a given term, using the Levenshtein (edit distance) algorithm. If you want to search for words similar to 'wuzza', you can write:

Query query = new FuzzyQuery(new Term("content", "wuzza"));

You may get 'fuzzy' and 'wuzzy'.
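The Levenshtein distance counts the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into another. The following is a minimal textbook implementation in plain Java, given here only to show why 'wuzzy' and 'fuzzy' count as similar to 'wuzza'; the class name `LevenshteinSketch` is an assumption, and Lucene's own fuzzy matching is implemented differently.

```java
// Hypothetical helper; classic dynamic-programming edit distance.
class LevenshteinSketch {
    static int distance(String a, String b) {
        // prev[j] holds the distance between the first i-1 chars of a and the first j chars of b.
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into b[0..j) takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning a[0..i) into "" takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(
                        Math.min(curr[j - 1] + 1, prev[j] + 1), // insertion, deletion
                        prev[j - 1] + cost);                    // substitution or match
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }
}
```

Here 'wuzza' is one edit away from 'wuzzy' (the final 'a' becomes 'y') and two edits away from 'fuzzy' (the first and last characters change).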

VII. TermRangeQuery: range search

If you want to search for documents whose time field lies between 20060101 and 20060130, you can use TermRangeQuery:

TermRangeQuery query2 = TermRangeQuery.newStringRange("time", "20060101", "20060130", true, true);

The last two arguments state whether the lower and upper bounds are included; with both set to true, the range is a closed interval.
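A term range compares terms as strings, not as numbers, which is why fixed-width, zero-padded dates such as 20060101 work well here. The plain-Java sketch below mirrors the four-argument shape of the range (bounds plus two inclusion flags). The class `RangeSketch` is hypothetical, and it uses `String.compareTo`, which agrees with Lucene's term ordering for plain ASCII digits but is not guaranteed to agree for every Unicode string.

```java
// Hypothetical helper; string-order range check with optional inclusive bounds.
class RangeSketch {
    static boolean inRange(String value, String lower, String upper,
                           boolean includeLower, boolean includeUpper) {
        int cmpLower = value.compareTo(lower); // > 0 means value sorts after lower
        int cmpUpper = value.compareTo(upper); // < 0 means value sorts before upper
        boolean aboveLower = includeLower ? cmpLower >= 0 : cmpLower > 0;
        boolean belowUpper = includeUpper ? cmpUpper <= 0 : cmpUpper < 0;
        return aboveLower && belowUpper;
    }
}
```

The string ordering has a well-known pitfall: "9" sorts after "10", so unpadded numbers do not behave as a numeric range would.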

8. TopDocs: the result set returned by a search. It contains ScoreDoc objects, and each ScoreDoc's doc member is the document ID.

To get the document itself, use this ID to fetch it: IndexSearcher provides a method for retrieving a Document by ID, and from that Document you can read the data.
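The two-step pattern described above (rank first, fetch afterwards) can be sketched in plain Java. Everything here is hypothetical and stands in for the Lucene objects: `Hit` plays the role of a ScoreDoc, the list returned by `top` plays the role of TopDocs, and the `Map` plays the role of the stored documents.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of "search returns IDs, documents are fetched by ID".
class TopDocsSketch {
    // Stand-in for a ScoreDoc: a document ID and a score, but no document content.
    static final class Hit {
        final int doc;
        final float score;

        Hit(int doc, float score) {
            this.doc = doc;
            this.score = score;
        }
    }

    // Step 1: keep the n best-scoring hits, highest score first.
    static List<Hit> top(List<Hit> hits, int n) {
        List<Hit> sorted = new ArrayList<>(hits);
        sorted.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
        return new ArrayList<>(sorted.subList(0, Math.min(n, sorted.size())));
    }

    // Step 2: resolve each ID against the document store to get the content.
    static List<String> fetch(List<Hit> hits, Map<Integer, String> store) {
        List<String> docs = new ArrayList<>();
        for (Hit hit : hits) {
            docs.add(store.get(hit.doc));
        }
        return docs;
    }
}
```

Separating the steps means that only the documents actually shown to the user need to be loaded, which is the reason the result set carries IDs rather than full documents.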

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
