Lucene-2.0 learning document (2)

Source: Internet
Author: User

The following describes how to create an index.

As the example above shows, three classes are used to create an index: Document, IndexWriter, and Field.

The simplest procedure is:

1. Create a new Document, a new IndexWriter, and the Fields you need.

2. Add each Field to the Document with Document.add().

3. Add the Document to the index with IndexWriter.addDocument().

4. Call IndexWriter.close() to close the index writer. This step is very important, and many beginners overlook it: the index is only guaranteed to be fully written to the index directory once this method has been called.
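The steps above can be sketched as follows. This is a minimal sketch that assumes the Lucene 2.0 jar is on the classpath; the class name CreateIndexSketch, the index directory name "index", and the field names and values are made up for illustration.

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;

public class CreateIndexSketch {
    public static void main(String[] args) throws Exception {
        // "index" is a hypothetical directory name; true means "create a new index there".
        IndexWriter writer = new IndexWriter("index", new StandardAnalyzer(), true);
        try {
            // Create a Document and add Fields to it.
            Document doc = new Document();
            doc.add(new Field("title", "Lucene 2.0 notes",
                    Field.Store.YES, Field.Index.TOKENIZED));
            doc.add(new Field("path", "/docs/notes.txt",
                    Field.Store.YES, Field.Index.NO));

            // Hand the Document to the writer.
            writer.addDocument(doc);
        } finally {
            // Always close the writer, even if adding a document fails.
            writer.close();
        }
    }
}
```

Putting close() in a finally block is one way to make sure it is not forgotten when an exception is thrown while documents are being added.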

There is not much to say about Document. It can be thought of as one row of a database table.

Field is important and more complex.

Let's look at its five constructors:

Field(String name, byte[] value, Field.Store store)

Field(String name, Reader reader)

Field(String name, Reader reader, Field.TermVector termVector)

Field(String name, String value, Field.Store store, Field.Index index)

Field(String name, String value, Field.Store store, Field.Index index, Field.TermVector termVector)

Field has three inner classes, Field.Index, Field.Store, and Field.TermVector, all of which are used by the constructors.

Note: termVector was added in Lucene 1.4. It provides a vector mechanism for fuzzy queries and is not commonly used. It is turned off by default, which does not affect ordinary queries.

Their different combinations play different roles in full-text search. Let's look at the following table:

Field.Index             Field.Store   Description
TOKENIZED (tokenized)   YES           The title or content of an article (if the content is not too long). It can be searched.
TOKENIZED               NO            The title or content of an article (the content can be very long). It can also be searched, but the text is not kept in the index.
NO                      YES           Cannot be searched. It is only carried along with the search result, such as a URL.
UN_TOKENIZED            YES/NO        Not tokenized. It is searched as a whole; searching for only part of it finds nothing.
NO                      NO            No such usage.
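To make the combinations in the table concrete, here is a toy in-memory model of the two flags. It is plain Java, not Lucene code: the class ToyIndex, its methods, and the sample values (including the example.com URL) are all made up for illustration, and the "analyzer" is just a lowercase-and-split stand-in.

```java
import java.util.*;

/** Toy model of the Store/Index flags; an illustration only, not Lucene's implementation. */
class ToyIndex {
    enum Store { YES, NO }
    enum Index { NO, TOKENIZED, UN_TOKENIZED }

    // Postings: "field:term" -> ids of the documents that contain the term.
    private final Map<String, Set<Integer>> postings = new HashMap<>();
    // Stored values: for each document, field name -> original text.
    private final List<Map<String, String>> stored = new ArrayList<>();

    /** Starts a new empty document and returns its id. */
    int newDocument() {
        stored.add(new HashMap<>());
        return stored.size() - 1;
    }

    void addField(int docId, String name, String value, Store store, Index index) {
        if (store == Store.NO && index == Index.NO) {
            throw new IllegalArgumentException("field is neither stored nor indexed");
        }
        if (store == Store.YES) {
            stored.get(docId).put(name, value);   // keep the original text for display
        }
        if (index == Index.TOKENIZED) {
            // Crude stand-in for an analyzer: lowercase, split on non-letters/digits.
            for (String term : value.toLowerCase().split("[^\\p{L}\\p{N}]+")) {
                if (!term.isEmpty()) post(name, term, docId);
            }
        } else if (index == Index.UN_TOKENIZED) {
            post(name, value, docId);             // the whole value is one term
        }
    }

    private void post(String field, String term, int docId) {
        postings.computeIfAbsent(field + ":" + term, k -> new TreeSet<>()).add(docId);
    }

    /** Returns the ids of documents whose field contains exactly this term. */
    Set<Integer> search(String field, String term) {
        return postings.getOrDefault(field + ":" + term, Collections.emptySet());
    }

    /** Returns the stored text of a field, or null if it was not stored. */
    String get(int docId, String field) {
        return stored.get(docId).get(field);
    }

    public static void main(String[] args) {
        ToyIndex idx = new ToyIndex();
        int doc = idx.newDocument();
        idx.addField(doc, "title", "Lucene in Action", Store.YES, Index.TOKENIZED);
        idx.addField(doc, "content", "a very long article about Lucene", Store.NO, Index.TOKENIZED);
        idx.addField(doc, "url", "http://example.com/a1", Store.YES, Index.NO);
        idx.addField(doc, "id", "A-001", Store.YES, Index.UN_TOKENIZED);

        System.out.println(idx.search("title", "lucene"));             // [0]
        System.out.println(idx.search("content", "lucene"));           // [0]
        System.out.println(idx.get(doc, "content"));                   // null
        System.out.println(idx.search("url", "http://example.com/a1")); // []
        System.out.println(idx.get(doc, "url"));                       // http://example.com/a1
        System.out.println(idx.search("id", "A-001"));                 // [0]
        System.out.println(idx.search("id", "A"));                     // []
    }
}
```

The content field is found by a search but comes back as null when read, the url field is the reverse, and the id field matches only when the whole value is given.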

For the two Reader-based constructors,

Field(String name, Reader reader)

Field(String name, Reader reader, Field.TermVector termVector)

the field is always Field.Index.TOKENIZED and Field.Store.NO. This is why the content in the example above is null: it is indexed but not stored. If you want to see the content of the article, you can fetch it through the article's path, since the path is returned along with each search result. In web development, large data is usually kept in a database rather than in the file system or the index directory, because handling that much data there increases the load on the server.

The following describes IndexWriter:

It is the index writer, and its tasks are relatively simple:

1. Use addDocument() to add the documents that are ready to be indexed.

2. Call close() to write the index to the index directory.

Let's take a look at its constructor:

IndexWriter(Directory d, Analyzer a, boolean create)

(Unfinished)

 
