TOKENIZED and UN_TOKENIZED explained in Lucene

Source: Internet
Author: User

new Field("Content", curArt.getContent(), Field.Store.NO, Field.Index.TOKENIZED); The last two arguments changed considerably between Lucene versions.
A Field has two properties: storage and indexing. Field.Store controls whether the field's value is stored; Field.Index controls whether the field is indexed. This may look redundant, but it is important to combine the two correctly.
Field.Index    Field.Store   Meaning
TOKENIZED      YES           Tokenized, indexed, and stored
TOKENIZED      NO            Tokenized and indexed, but not stored
NO             YES           Not searchable; stored only as data that accompanies a hit, such as a URL
UN_TOKENIZED   YES/NO        Not split into words; indexed as a whole, so a search on only part of the value finds nothing
NO             NO            Not a valid combination
If you want to search on a field, set its Field.Index to TOKENIZED or UN_TOKENIZED. TOKENIZED splits the field's content into terms; UN_TOKENIZED does not, so the field is selected only when the query matches the whole value.
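The difference between the two indexing modes can be sketched with a toy inverted index. This is not Lucene's API: the class ToyTermIndex and its methods are hypothetical, and whitespace splitting with lower-casing stands in for whatever analyzer is actually configured.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Toy model only: NOT Lucene's API, just a stand-in for the idea.
class ToyTermIndex {
    // term -> ids of the documents that contain it
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // "TOKENIZED": split the value on whitespace and index each lower-cased word.
    void addTokenized(int docId, String value) {
        for (String word : value.trim().toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!word.isEmpty()) {
                postings.computeIfAbsent(word, k -> new HashSet<>()).add(docId);
            }
        }
    }

    // "UN_TOKENIZED": index the whole value as one single term, unchanged.
    void addUntokenized(int docId, String value) {
        postings.computeIfAbsent(value, k -> new HashSet<>()).add(docId);
    }

    // Exact term lookup, the toy equivalent of a term query.
    Set<Integer> search(String term) {
        return postings.getOrDefault(term, Collections.emptySet());
    }

    public static void main(String[] args) {
        ToyTermIndex tokenized = new ToyTermIndex();
        tokenized.addTokenized(1, "Apache Lucene");

        ToyTermIndex whole = new ToyTermIndex();
        whole.addUntokenized(1, "Apache Lucene");

        System.out.println(tokenized.search("lucene"));     // [1]
        System.out.println(whole.search("Lucene"));         // []
        System.out.println(whole.search("Apache Lucene"));  // [1]
    }
}
```

In the tokenized index a single word finds the document, while the un-tokenized index answers only to the complete value.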

If Field.Store is NO, the field's value cannot be read back from the index in the search results; trying to retrieve it yields null.
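A minimal sketch of why the value comes back as null, again as a toy model rather than Lucene's API. The class ToyStoredDocument and the boolean store flag are hypothetical; the flag stands in for Field.Store.YES (true) and Field.Store.NO (false).

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: NOT Lucene's API. It mimics a hit document that keeps
// only the fields that were added with store == true.
class ToyStoredDocument {
    private final Map<String, String> stored = new HashMap<>();

    // store == true stands for Field.Store.YES, false for Field.Store.NO.
    void add(String name, String value, boolean store) {
        if (store) {
            stored.put(name, value);
        }
        // With store == false the value could still be indexed elsewhere,
        // but nothing is kept here to hand back to the caller.
    }

    // Returns the stored value, or null when the field was not stored.
    String get(String name) {
        return stored.get(name);
    }

    public static void main(String[] args) {
        ToyStoredDocument hit = new ToyStoredDocument();
        hit.add("Title", "Lucene field options", true);
        hit.add("Content", "a very long article body", false);

        System.out.println(hit.get("Title"));    // Lucene field options
        System.out.println(hit.get("Content"));  // null
    }
}
```

Code that displays search results should therefore null-check any field that was not stored.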

Supplement for version 2.4

Take our article table, ArticleInfo, as an example. Its columns are Id, Title, Sumary (abstract), Content, and UserName.
Title and Sumary are the first case: tokenized, indexed, and also stored.
Content is tokenized and indexed, but not stored, because it is too large and the interface never has to display the entire content.
The Id is stored without being indexed, because nobody queries by it. It is very much needed to assemble the URL, though, so it must be stored.
UserName is indexed but not tokenized, and it can be stored. Why not tokenize it? Take "Genghis Khan": I do not want a search for "Khan" alone to find it. I want it found by "Genghis Khan" or by a wildcard query on part of the name, such as "*Jisi*".
The conclusions are as follows:
1. If you want to search on a field, set its Field.Index to TOKENIZED or UN_TOKENIZED. TOKENIZED splits the field's content into terms; UN_TOKENIZED does not, so the field is selected only when the query matches the whole value.
2. If Field.Store is NO, the field's value cannot be read back from the index in the search results; trying to retrieve it yields null.
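The field choices for the ArticleInfo example can be written down as a small table in code. This is a sketch under assumptions: the enums below only mirror the names of Lucene's options and are defined locally so the example runs without Lucene, and the class ArticleFieldPlan, the Spec type, and storing UserName are illustrative choices rather than anything the library prescribes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model only: these enums mirror the names of Lucene's options but are
// defined here so that the sketch runs without Lucene on the classpath.
class ArticleFieldPlan {
    enum Store { YES, NO }
    enum Index { TOKENIZED, UN_TOKENIZED, NO }

    static final class Spec {
        final Store store;
        final Index index;

        Spec(Store store, Index index) {
            this.store = store;
            this.index = index;
        }

        // Conclusion 1: a field is searchable only if it is indexed.
        boolean searchable() { return index != Index.NO; }

        // Conclusion 2: a field's value can be read back only if it is stored.
        boolean retrievable() { return store == Store.YES; }
    }

    // Column name -> chosen options, following the article's reasoning.
    static Map<String, Spec> plan() {
        Map<String, Spec> p = new LinkedHashMap<>();
        p.put("Id",       new Spec(Store.YES, Index.NO));            // needed for the URL only
        p.put("Title",    new Spec(Store.YES, Index.TOKENIZED));
        p.put("Sumary",   new Spec(Store.YES, Index.TOKENIZED));
        p.put("Content",  new Spec(Store.NO,  Index.TOKENIZED));     // too large to store
        p.put("UserName", new Spec(Store.YES, Index.UN_TOKENIZED));  // matched as a whole
        return p;
    }

    public static void main(String[] args) {
        plan().forEach((name, s) -> System.out.println(
            name + ": searchable=" + s.searchable() + ", retrievable=" + s.retrievable()));
    }
}
```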
Supplement:
       Field.Store.YES: the field value is stored (the original value, before tokenizing)
       Field.Store.NO: the value is not stored; storage is independent of indexing
       Field.Store.COMPRESS: the value is stored compressed; intended for long text or binary values, at a cost in performance

Field.Index.ANALYZED: tokenized and indexed (the 2.4 name for TOKENIZED)
Field.Index.ANALYZED_NO_NORMS: tokenized and indexed, but norms are not stored for the field. Norms normally take one byte per indexed field per document, so leaving them out saves space
Field.Index.NOT_ANALYZED: indexed without tokenizing (the 2.4 name for UN_TOKENIZED)
Field.Index.NOT_ANALYZED_NO_NORMS: indexed without tokenizing, and without the one-byte norm for the field
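To get a feel for how much the NO_NORMS variants can save, here is a back-of-the-envelope calculation. It assumes a cost of one byte per indexed field per document; the class name, the document count, and the field count are made-up inputs for illustration.

```java
// Rough arithmetic only; the inputs below are invented for illustration.
class NormsSizeEstimate {
    // Assumes one byte of norm data per indexed field per document.
    static long normsBytes(long documents, int fieldsWithNorms) {
        return documents * fieldsWithNorms;
    }

    public static void main(String[] args) {
        // 10 million documents with 3 indexed fields that keep norms.
        System.out.println(normsBytes(10_000_000L, 3)); // 30000000
    }
}
```

Under that assumption, dropping norms on three fields of a ten-million-document index saves about 30 MB.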

A TermVector records the terms of one field of one document (it is addressed by a document and a field) and the number of times each term occurs in that document.
Field.TermVector.YES: store a term vector for this field in each document
Field.TermVector.NO: do not store a term vector
Field.TermVector.WITH_POSITIONS: store the term vector together with token positions
Field.TermVector.WITH_OFFSETS: store the term vector together with token offsets
Field.TermVector.WITH_POSITIONS_OFFSETS: store the term vector together with both positions and offsets
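What these options add can be seen by computing the same data by hand for one field value. The sketch below is a toy model, not Lucene's API: ToyTermVector and TermInfo are hypothetical names, tokens are split on whitespace, positions are 0-based token indexes, and offsets are character ranges with an exclusive end.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Toy model only: NOT Lucene's API. It computes, for one field value,
// the kind of data a term vector holds.
class ToyTermVector {
    static final class TermInfo {
        final List<Integer> positions = new ArrayList<>(); // token index, 0-based
        final List<int[]> offsets = new ArrayList<>();     // {startChar, endCharExclusive}

        // The plain term vector (YES) corresponds to this count alone.
        int frequency() { return positions.size(); }
    }

    static Map<String, TermInfo> build(String text) {
        Map<String, TermInfo> vector = new TreeMap<>();
        Matcher m = Pattern.compile("\\S+").matcher(text);
        int position = 0;
        while (m.find()) {
            String term = m.group().toLowerCase(Locale.ROOT);
            TermInfo info = vector.computeIfAbsent(term, k -> new TermInfo());
            info.positions.add(position++);                     // WITH_POSITIONS
            info.offsets.add(new int[] { m.start(), m.end() }); // WITH_OFFSETS
        }
        return vector;
    }

    public static void main(String[] args) {
        Map<String, TermInfo> v = build("to be or not to be");
        TermInfo to = v.get("to");
        System.out.println(to.frequency());  // 2
        System.out.println(to.positions);    // [0, 4]
    }
}
```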
