Some problems encountered with Elasticsearch in production

Source: Internet
Author: User

1. Disk low watermark

When a node's disk usage exceeds 85%, ES no longer allocates replica shards to that node. After a restart in this state, the cluster status stays yellow and some shards remain unassigned. The thresholds can be changed dynamically through the cluster settings API, without downtime.

PUT _cluster/settings
{
  "transient": {
    "cluster.routing.allocation.disk.watermark.low": "80%",
    "cluster.routing.allocation.disk.watermark.high": "50gb"
  }
}


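Each watermark accepts either a percentage or an absolute byte value such as 50gb. A percentage refers to used disk space, while a byte value refers to free disk space. So when both are percentages, low must be smaller than high; when both are byte values, low is larger than high. Newer Elasticsearch releases require both watermarks to use the same kind of value (both percentages or both byte sizes).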
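To make the percentage-versus-bytes distinction concrete, here is a minimal Python sketch. It is not Elasticsearch code: the helper name exceeds_watermark and the unit table are assumptions made purely for illustration, and the exact comparison ES performs internally may differ. It only shows that a percentage is checked against used space and a byte value against free space.

```python
import re

# Hypothetical helper for illustration only; Elasticsearch does this internally.
_UNITS = {"b": 1, "kb": 1024, "mb": 1024**2, "gb": 1024**3, "tb": 1024**4}


def exceeds_watermark(total_bytes, free_bytes, watermark):
    """Return True if a disk with the given usage has crossed the watermark.

    A value such as "85%" is treated as a ceiling on USED space.
    A value such as "50gb" is treated as a floor on FREE space.
    """
    value = watermark.strip().lower()
    if value.endswith("%"):
        used_pct = 100.0 * (total_bytes - free_bytes) / total_bytes
        return used_pct > float(value[:-1])
    match = re.fullmatch(r"(\d+(?:\.\d+)?)\s*(tb|gb|mb|kb|b)", value)
    if match is None:
        raise ValueError(f"unrecognised watermark: {watermark!r}")
    min_free = float(match.group(1)) * _UNITS[match.group(2)]
    return free_bytes < min_free


GB = 1024**3
# A 500 GB disk with 60 GB free is 88% used.
over_low = exceeds_watermark(500 * GB, 60 * GB, "85%")    # percentage: used space
over_high = exceeds_watermark(500 * GB, 60 * GB, "50gb")  # bytes: free space
print(over_low, over_high)  # True False
```

With these numbers the disk is past an 85% watermark but still has more than 50gb free, which is why the two kinds of value order low and high in opposite directions.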


2. org.elasticsearch.common.breaker.CircuitBreakingException: [FIELDDATA] Data too large

This "Data too large" error is also a frequent problem. It is caused by how ES is configured to handle fielddata. When sorting or aggregating (aggs), ES reads all values of the field involved into memory (the JVM heap) to operate on them. This works as a data cache that improves query efficiency.

By default the upper limit for fielddata is 60% of the JVM heap. When memory_size_in_bytes reaches that limit while evictions is 0, the cache is in a state where nothing can be evicted: new data cannot be loaded, old data is not expelled, and ES raises the error.

In short:
indices.fielddata.cache.size configures the size of the fielddata cache, as a percentage of the heap or as an exact value. When the cache reaches this size it cleans up automatically, evicting part of the fielddata to make room for new data. The default is unbounded.
indices.fielddata.cache.expire specifies how long unaccessed data is kept before it is evicted. The default is -1, which means never. Setting expire is not recommended, because evicting data by expiry time costs a lot of performance, and the setting is due to be deprecated in a later version.

So the Data too large exception is caused by indices.fielddata.cache.size being unbounded by default.

This is a configuration problem, so add indices.fielddata.cache.size: 40% to the config file (elasticsearch.yml). Once the cache reaches 40% of the heap, old cache data is evicted.

Remember to make this change.
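The interaction between the two limits can be illustrated with a toy model in Python. This is only a sketch under simplifying assumptions: the class FielddataCacheSketch, its parameters and the made-up sizes are hypothetical, and the real ES cache and circuit breaker are considerably more involved. It shows why an unbounded cache eventually trips the 60% limit with evictions at 0, while a 40% cache size keeps evicting old entries instead.

```python
from collections import OrderedDict


class CircuitBreakingError(Exception):
    """Stand-in for Elasticsearch's CircuitBreakingException."""


class FielddataCacheSketch:
    """Toy model: each entry maps a field name to its size in bytes."""

    def __init__(self, heap_bytes, breaker_pct=60, cache_size_pct=None):
        self.breaker_limit = heap_bytes * breaker_pct // 100
        # None models the unbounded default of indices.fielddata.cache.size.
        self.cache_limit = (
            None if cache_size_pct is None else heap_bytes * cache_size_pct // 100
        )
        self.entries = OrderedDict()
        self.evictions = 0

    @property
    def memory_size_in_bytes(self):
        return sum(self.entries.values())

    def load(self, field, size):
        # Evict the oldest entries first, but only if a cache size is configured.
        if self.cache_limit is not None:
            while self.entries and self.memory_size_in_bytes + size > self.cache_limit:
                self.entries.popitem(last=False)
                self.evictions += 1
        # The breaker refuses any load that would push usage past its limit.
        if self.memory_size_in_bytes + size > self.breaker_limit:
            raise CircuitBreakingError(f"[FIELDDATA] Data too large, data for [{field}]")
        self.entries[field] = size


heap = 1000  # made-up heap size, in bytes, to keep the arithmetic obvious
unbounded = FielddataCacheSketch(heap)                  # default: no cache size
bounded = FielddataCacheSketch(heap, cache_size_pct=40)  # cache.size: 40%

for name, cache in (("unbounded", unbounded), ("bounded", bounded)):
    try:
        for i in range(5):
            cache.load(f"field_{i}", 200)
        print(name, "ok, evictions =", cache.evictions)
    except CircuitBreakingError as err:
        print(name, "tripped:", err, "evictions =", cache.evictions)
```

In this run the unbounded cache fills to the breaker limit and fails on the fourth field with no evictions, while the bounded cache loads all five fields by evicting three older ones.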


3. The difference between bool query and bool filter

First, a bool query is a combination of multiple subquery clauses. ES considers a document a match only if it satisfies the criteria of the bool query as a whole. A bool query supports four types of clause: must, should, must_not and filter:

must: AND

should: OR, one or more

must_not: NOT

filter: the document must match the filter; the difference from must is that it does not affect the score.

The one that deserves the most attention is should. Typically the should clause is an array containing several subqueries. When the bool query has no must or filter clause, a matching document must satisfy at least one of the should subqueries by default. To change the default matching behavior, the query DSL must explicitly set the bool query's minimum_should_match parameter, which controls how many should subqueries a document must match. What I ran into is that when the parameter is not set explicitly and the bool query also has a must or filter clause, the default is 0, even if the should clause contains two queries. I recommend always setting minimum_should_match explicitly in a bool query.

Otherwise the results may well differ from what you expect, so if you write such a query and need at least one should clause to match, set the parameter to 1.

This one really caught me out.

In other words, if minimum_should_match is not specified in that case, documents that match a should clause simply get a higher score, and that is all.
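As an illustration, the following Python sketch builds a bool query body that sets minimum_should_match explicitly. The helper bool_query and the field names status and title are hypothetical, chosen only for the example; the resulting dictionary can be sent as the JSON body of a _search request.

```python
import json


def bool_query(must=None, should=None, must_not=None, filters=None,
               minimum_should_match=None):
    """Build a bool query body, leaving out clauses that were not given."""
    clauses = {"must": must, "should": should, "must_not": must_not, "filter": filters}
    body = {name: value for name, value in clauses.items() if value}
    if minimum_should_match is not None:
        body["minimum_should_match"] = minimum_should_match
    return {"query": {"bool": body}}


query = bool_query(
    filters=[{"term": {"status": "published"}}],  # hypothetical field
    should=[
        {"match": {"title": "elasticsearch"}},    # hypothetical field
        {"match": {"title": "lucene"}},
    ],
    # Without this line the should clauses here would only influence scoring.
    minimum_should_match=1,
)
print(json.dumps(query, indent=2))
```

Because a filter clause is present, leaving out minimum_should_match would return every published document, with the should clauses affecting only the ranking.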


4. Phrase query: match_phrase

When you need to find several words that appear next to each other, use the match_phrase query:


GET /my_index/my_type/_search
{
  "query": {
    "match_phrase": {
      "title": "quick brown fox"
    }
  }
}

Like the match query, the match_phrase query first analyzes the query string to produce a list of terms. It then searches for all the terms, but keeps only documents that contain all of them in positions adjacent to each other. A query for the phrase quick fox does not match any of our documents, because no document contains the terms quick and fox directly next to each other.
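The adjacency requirement can be sketched in a few lines of Python. This is a deliberately simplified model, not how ES implements it: the functions analyze and matches_phrase are hypothetical, and the "analyzer" here only lowercases and splits on whitespace.

```python
def analyze(text):
    """Very rough stand-in for an analyzer: lowercase and split on whitespace."""
    return text.lower().split()


def matches_phrase(document, phrase):
    """True if all phrase terms occur in the document at consecutive positions."""
    doc_terms = analyze(document)
    phrase_terms = analyze(phrase)
    if not phrase_terms:
        return False
    last_start = len(doc_terms) - len(phrase_terms)
    return any(
        doc_terms[start:start + len(phrase_terms)] == phrase_terms
        for start in range(last_start + 1)
    )


title = "The quick brown fox jumps"
print(matches_phrase(title, "quick brown fox"))  # True
print(matches_phrase(title, "quick fox"))        # False
```

Both terms of quick fox occur in the title, but brown sits between them, so the phrase does not match.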


5. The new ES 2.3.1 differs from the previous version 1.6 in many places.

Use the plugin script to install the head plugin online. Note that there is no "-" in front of install:

bin/plugin install mobz/elasticsearch-head


Starting as the root user:

Modify bin/elasticsearch to set

ES_JAVA_OPTS="-Des.insecure.allow.root=true"

Or add the option to the start command: bin/elasticsearch -Des.insecure.allow.root=true


network.host in the configuration file: set it to the current server's IP. The value shown there by default is 192.168.0.1.


If nodes cannot find each other over the network, you can specify the IPs or hosts to be discovered: discovery.zen.ping.unicast.hosts: ["host1", "host2"]

