A simple Lucene example

Source: Internet
Author: User

Working with Lucene divides into two tasks: creating an index and searching it.

I. Five basic classes for creating an index: Document, Field, IndexWriter, Analyzer, Directory

1. Document class: describes a document. A document here can be an HTML page, an email, or a text file. A Document object consists of multiple Field objects. You can think of a Document as a record in a database and of each Field as one field of that record.

2. Field class: describes one attribute of a document. For example, the title and the body of an email can be described by two Field objects.

3. Analyzer class: performs word segmentation (tokenization) on the document content before it is indexed. Analyzer is an abstract class with multiple implementations; choose one that suits the language and the application. The Analyzer hands the segmented content to IndexWriter, which builds the index.

4. IndexWriter class: the core class for creating an index. It adds Document objects to the index.

5. Directory class: an abstract class that represents the storage location of the index. It has two common subclasses:

* FSDirectory: stores the index in the actual file system. This is the appropriate choice for large indexes.
* RAMDirectory: stores the whole index in memory. It suits small indexes that fit entirely in memory, and the index is destroyed when the program terminates. Because the index is held in memory, it is relatively fast.
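To make the roles of these classes more concrete, here is a minimal sketch in plain Java that uses no Lucene classes at all. ToyIndex, tokenize, addField, and lookup are hypothetical names invented for this illustration. The tokenizer only approximates what a simple analyzer does (split at non-letters, then lowercase), and a real Lucene index stores far more than a map from tokens to document ids.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Toy stand-in for the indexing side. Not Lucene's API, only the idea. */
public class ToyIndex {

    // Postings: "field:token" -> ids of the documents that contain the token.
    private final Map<String, Set<Integer>> postings = new TreeMap<>();

    /** Analyzer role (assumed behavior): lowercase runs of letters. */
    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String piece : text.split("[^\\p{L}]+")) {
            if (!piece.isEmpty()) {
                tokens.add(piece.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    /** IndexWriter role: record one field of one document. */
    public void addField(int docId, String field, String value, boolean analyzed) {
        // A field that is not analyzed is indexed as one token, exactly as given.
        List<String> tokens = analyzed ? tokenize(value) : List.of(value);
        for (String token : tokens) {
            postings.computeIfAbsent(field + ":" + token, k -> new TreeSet<>()).add(docId);
        }
    }

    /** Ids of the documents whose field contains exactly this token. */
    public Set<Integer> lookup(String field, String token) {
        return postings.getOrDefault(field + ":" + token, Set.of());
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        index.addField(0, "partnum", "q36", false);
        index.addField(0, "Description", "illidium Space Modulator", true);

        System.out.println(tokenize("illidium Space Modulator")); // [illidium, space, modulator]
        System.out.println(index.lookup("partnum", "q36"));           // [0]
        System.out.println(index.lookup("Description", "modulator")); // [0]
        System.out.println(index.lookup("Description", "Modulator")); // []
    }
}
```

The lookup for Modulator with a capital M finds nothing, because the analyzed field was lowercased at index time while the lookup text is used exactly as given. This toy tokenizer would also reduce q36 to q, which is one reason to index an identifier such as a part number without analysis.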

 

II. Four basic classes for searching: Searcher, Term, Query, TopDocs

1. Searcher is an abstract base class with several overloaded search methods. IndexSearcher is the commonly used subclass; it searches an index stored in a given Directory. The search method returns the matching documents sorted by score, and Lucene computes a score for every document that matches the query. IndexSearcher is thread-safe: one instance can be used concurrently by multiple threads.

2. Term is the basic unit of search. It consists of two parts: the text of the word and the name of the field in which that text appears. Term objects are also involved in indexing, but there they are created by Lucene internally.

3. Query and its subclasses

Query is the abstract base class for queries. To search for a given word or phrase, wrap it in a Term, add the Term to a Query object, and pass the Query object to the search method of IndexSearcher.

Lucene includes many concrete Query implementations, such as TermQuery, BooleanQuery, PhraseQuery, PrefixQuery, RangeQuery, MultiTermQuery, FilteredQuery, and SpanQuery. The example at the end of this article uses TermQuery.

4. TopDocs: encapsulates the total number of hits and the array of ScoreDoc objects for the top-scoring results.
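The search-side classes can be sketched in the same spirit. The following plain-Java example is an illustration only and uses no Lucene classes: ToySearch, Hit, Result, and search are hypothetical names, the score is a bare count of term occurrences (Lucene's real scoring is considerably more involved), and the documents live in a simple list rather than in a Directory.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Toy stand-in for the search side. Not Lucene's API, only the idea. */
public class ToySearch {

    /** ScoreDoc role: a document id plus its score for one query. */
    public static final class Hit {
        public final int doc;
        public final int score;
        Hit(int doc, int score) { this.doc = doc; this.score = score; }
    }

    /** TopDocs role: total number of matches plus the best-scoring hits. */
    public static final class Result {
        public final int totalHits;
        public final List<Hit> hits;
        Result(int totalHits, List<Hit> hits) { this.totalHits = totalHits; this.hits = hits; }
    }

    /**
     * Term query role: match documents whose field contains the term text.
     * Each document is a map from field name to field text. The term text is
     * compared as given, so it must already be in lowercase.
     */
    public static Result search(List<Map<String, String>> docs, String field, String text, int n) {
        List<Hit> matches = new ArrayList<>();
        for (int id = 0; id < docs.size(); id++) {
            String value = docs.get(id).getOrDefault(field, "");
            int count = 0;
            for (String token : value.toLowerCase(Locale.ROOT).split("[^\\p{L}]+")) {
                if (token.equals(text)) {
                    count++;
                }
            }
            if (count > 0) {
                matches.add(new Hit(id, count));
            }
        }
        // Highest score first; ties are broken by document id.
        matches.sort(Comparator.comparingInt((Hit h) -> h.score).reversed()
                .thenComparingInt(h -> h.doc));
        List<Hit> top = new ArrayList<>(matches.subList(0, Math.min(n, matches.size())));
        return new Result(matches.size(), top);
    }

    public static void main(String[] args) {
        List<Map<String, String>> docs = List.of(
                Map.of("body", "search with lucene"),
                Map.of("body", "search the index, then search again"),
                Map.of("body", "nothing to see here"));

        Result rs = search(docs, "body", "search", 1);
        System.out.println(rs.totalHits);       // 2
        System.out.println(rs.hits.size());     // 1
        System.out.println(rs.hits.get(0).doc); // 1
    }
}
```

The total hit count (2) is larger than the number of hits returned (1) because the caller asked for only the single best result. This mirrors the split described above between the total number of hits and the array of top-scoring entries.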

 
The following minimal example shows basic usage of IndexWriter, Directory, Analyzer, Document, Field, IndexSearcher, Term, Query, TermQuery, and TopDocs.

package org.apache.lucene.demo;

import java.io.IOException;

import org.apache.lucene.analysis.SimpleAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.RAMDirectory;

// Uses the older Lucene API (2.9/3.0 era); later releases changed these constructors.
public class DeleeTest {

    public static void main(String[] args) throws IOException {
        // Keep the whole index in memory; it disappears when the program exits.
        RAMDirectory directory = new RAMDirectory();
        IndexWriter writer =
            new IndexWriter(directory, new SimpleAnalyzer(), true, IndexWriter.MaxFieldLength.UNLIMITED);

        // One document with two fields: the part number is indexed as a single
        // unanalyzed token, the description is tokenized by the analyzer.
        Document doc = new Document();
        doc.add(new Field("partnum", "q36", Field.Store.YES, Field.Index.NOT_ANALYZED));
        doc.add(new Field("Description", "illidium Space Modulator", Field.Store.YES, Field.Index.ANALYZED));
        writer.addDocument(doc);
        writer.close();

        // Search the "partnum" field for the exact term and keep at most 10 hits.
        IndexSearcher searcher = new IndexSearcher(directory);
        Query query = new TermQuery(new Term("partnum", "q36"));
        TopDocs rs = searcher.search(query, null, 10);
        System.out.println(rs.totalHits);

        // Load the stored document behind the best-scoring hit.
        Document firstHit = searcher.doc(rs.scoreDocs[0].doc);
        System.out.println(firstHit.getField("partnum").name());
        searcher.close();
    }
}

 

More: http://wiki.apache.org/lucene-java/HowTo
