Understanding of ES Index Search

Source: Internet
Author: User

The types above are the commonly used ones; for the others, refer to the official website. The other important aspect is the analyzer (word segmentation): in any search system it largely determines search recall, the algorithm, and the index inflation rate. In ES the analyzer can be set at three scopes: cluster, index, and field. The cluster-level setting lives in the configuration file; what follows is field-based. (Details specific to word segmentation will be covered in the article on word segmentation.)

Analyzer is Lucene's word breaker concept. ES is built on Lucene, so the same Analyzer applies here. In a mapping, the analyzer setting mainly specifies which word breaker a field uses. The specific procedures for installing and configuring word breakers have been described in the plugin and configuration articles.

Analyzer in ES is divided into index_analyzer and search_analyzer.

index_analyzer: the word breaker used during indexing.

search_analyzer: the word breaker used during retrieval.

Indexing and searching are two separate processes, but they should use the same word segmentation as far as possible so that recall and precision are preserved. Otherwise even the best word breaker is useless: if indexing and search segment text differently, the query terms will not match the indexed terms.
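The effect of mismatched analyzers can be illustrated with a toy inverted index. This is only a sketch in plain Python, not Elasticsearch or Lucene code: `whitespace_tokens` and `bigram_tokens` are hypothetical stand-ins for two different word breakers, and the documents are made up for illustration.

```python
from collections import defaultdict

def whitespace_tokens(text):
    """Split on whitespace and lowercase (a stand-in for one analyzer)."""
    return text.lower().split()

def bigram_tokens(text):
    """Emit character bigrams per word (a stand-in for a different analyzer)."""
    tokens = []
    for word in text.lower().split():
        tokens.extend(word[i:i + 2] for i in range(len(word) - 1))
    return tokens

def build_index(docs, analyzer):
    """Build a toy inverted index: term -> set of document ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in analyzer(text):
            index[term].add(doc_id)
    return index

def search(index, query, analyzer):
    """Return ids of documents that contain every term of the analyzed query."""
    terms = analyzer(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result

# Hypothetical documents, indexed with the whitespace analyzer.
docs = {1: "Elasticsearch index tuning", 2: "Lucene analyzer basics"}
index = build_index(docs, whitespace_tokens)

# Same analyzer on both sides: the query terms match the indexed terms.
same_analyzer_hits = search(index, "index tuning", whitespace_tokens)

# Different analyzer at search time: the bigrams never appear in the index.
mixed_analyzer_hits = search(index, "index tuning", bigram_tokens)
```

With the same analyzer on both sides the query finds document 1; with the bigram analyzer at search time nothing matches, because the index only contains whole words.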

Another item related to the analyzer is the index option:

"HC": {"type": "string", "index": "no", "store": "no"}

The index option indicates whether the field is indexed; if index is set to no, the analyzer setting has no effect.

Finally, the store option indicates whether the field's original value is stored in the index itself, separately from _source. Many other items in a mapping can be set and tuned, and they will be discussed later. If the difference between index, store, and _source in a mapping is unclear, refer to Lucene's Field.Store.YES, Field.Index.NOT_ANALYZED, Field.Index and related settings, which make it quite clear.
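As a rough sketch, the options discussed so far could be combined in one mapping body. The syntax below follows the legacy style used in this article (string type, index_analyzer) and may not be accepted by newer Elasticsearch versions. The type name `doc`, the `title` field, and the helper `searchable_fields` are assumptions made for illustration; only the `HC` field comes from the text.

```python
import json

# Legacy-style mapping, matching the syntax used in the text.
# The type name "doc" and the "title" field are hypothetical.
mapping = {
    "doc": {
        "properties": {
            # Indexed and analyzed, with the same analyzer at index and search time.
            "title": {
                "type": "string",
                "index": "analyzed",
                "index_analyzer": "standard",
                "search_analyzer": "standard",
                "store": "no",
            },
            # Not indexed at all, so an analyzer setting would have no effect.
            "HC": {"type": "string", "index": "no", "store": "no"},
        }
    }
}

def searchable_fields(mapping_body):
    """Return the names of fields whose "index" option is not "no"."""
    names = []
    for type_body in mapping_body.values():
        for name, options in type_body["properties"].items():
            if options.get("index", "analyzed") != "no":
                names.append(name)
    return names

# Print the body that would be sent when creating the mapping.
print(json.dumps(mapping, indent=2))
fields = searchable_fields(mapping)
```

Here only `title` can be searched; `HC` is neither indexed nor stored separately, so it would be available only through _source.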

-----------------------------------------------
Cloud Computing Platform (Search): Elasticsearch Index Optimization

ES index optimization addresses two aspects: the indexing process and the retrieval process.

Previous articles covered how to create indexes and import data, but you may find the indexing process slow. Once you understand how indexing works, you can optimize it in a targeted way. Compared with Lucene's indexing process, ES adds distributed data scaling, and for this ES mainly uses the translog to balance data between nodes. So the first optimization can be made through the index settings:

"index.translog.flush_threshold_ops": "100000",

"index.refresh_interval": "-1"

The first parameter is the number of translog operations that triggers a balance (flush); the default is 5000, and this step is relatively expensive in time and resources. So we can either raise this value or set it to -1 to disable it and then flush the translog manually. The second parameter is the refresh interval; the default is 1s. During the life of an index it is refreshed on this schedule so that newly arrived data becomes searchable, much like a commit in Lucene: after addDocument, the data cannot be retrieved until a commit has happened. So you can turn refresh off during the initial load, refresh manually once the initial indexing is finished, and then set index.refresh_interval in the index settings as needed. This improves the efficiency of the indexing process.
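The two phases described above can be sketched as a pair of settings bodies. This is a minimal illustration that only builds and prints the JSON; it does not contact a cluster. The function names are hypothetical, the flush_threshold_ops setting is the legacy one from the text and may not be accepted by newer releases, and the "30s" value is just an example. Such bodies would typically be sent to the index's _settings endpoint, followed by a manual refresh after the load.

```python
import json

def bulk_load_settings(flush_threshold_ops=100000):
    """Settings to apply before a large initial import."""
    return {
        # Legacy setting taken from the text; newer releases may not accept it.
        "index.translog.flush_threshold_ops": str(flush_threshold_ops),
        # -1 turns periodic refresh off for the duration of the load.
        "index.refresh_interval": "-1",
    }

def post_load_settings(refresh_interval="1s"):
    """Settings to apply once the import has finished."""
    return {"index.refresh_interval": refresh_interval}

before = bulk_load_settings()
after = post_load_settings("30s")

# Print the two request bodies in the order they would be applied.
print(json.dumps(before, indent=2))
print(json.dumps(after, indent=2))
```

Keeping the two bodies in separate functions makes it harder to forget the second step, since an index left with refresh disabled would never show new documents until refreshed by hand.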

Also note that if the index has replicas, data is synchronized to them immediately during indexing. I personally recommend setting the number of replicas to 0 while indexing and changing it back to the required number once indexing is finished, which also improves indexing efficiency.

"number_of_replicas": 0

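One way to make sure the replica count is restored after the load is to wrap the load in a context manager. This is a sketch under assumptions: `apply_settings` is a hypothetical callable standing in for whatever sends a settings body to the cluster, and here it simply records the bodies in a list so the example runs without Elasticsearch.

```python
from contextlib import contextmanager

@contextmanager
def replicas_disabled(apply_settings, restore_to=1):
    """Set number_of_replicas to 0 for the duration of a bulk load.

    apply_settings is a hypothetical callable that sends a settings body
    to the cluster; it is not part of any Elasticsearch client.
    """
    apply_settings({"number_of_replicas": 0})
    try:
        yield
    finally:
        # Restore the replicas even if the load fails part-way through.
        apply_settings({"number_of_replicas": restore_to})

sent = []  # records the settings bodies instead of calling a real cluster

with replicas_disabled(sent.append, restore_to=2):
    sent.append("bulk load runs here")
```

The `finally` clause is the point of the design: a failed import should not leave the index running without replicas.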
Having covered optimization of the indexing process, let's turn to slow retrieval. Retrieval speed depends heavily on index quality, and index quality in turn depends on many factors.


Http://www.cnblogs.com/zhangchenliang/p/4186702.html
