[Repost] The split-brain problem in an Elasticsearch cluster


Reposted from http://blog.csdn.net/cnweike/article/details/39083089

The so-called split-brain problem (loosely analogous to schizophrenia) occurs when different nodes in the same cluster hold conflicting views of the cluster's state.

Today our Elasticsearch cluster started answering queries extremely slowly. I checked the cluster status with the following command:

  curl -XGET 'es-1:9200/_cluster/health'

The overall cluster status was red, and of the original 9 nodes only 4 appeared in the result. Worse, when I sent the same request to different nodes, the overall status was still red, but the number of available nodes each one reported was inconsistent.

Under normal circumstances, every node in the cluster should agree on which node is the elected master, so the state information returned by any node should be identical. Inconsistent state information means that different nodes disagree about who the master is - the so-called split-brain problem. In such a split state, nodes lose the correct view of the cluster, and the cluster cannot work properly.
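As an illustration, a quick way to see which master each node believes in is to ask every node through the _cat/master API. A minimal sketch, assuming the nodes are reachable as es-1 through es-9 on port 9200 (the hostnames beyond es-1 are my assumption, following the article's example):

  # Ask each node which master it currently recognizes (hostnames assumed)
  for host in es-1 es-2 es-3 es-4 es-5 es-6 es-7 es-8 es-9; do
    echo -n "$host -> "
    curl -s "http://$host:9200/_cat/master"   # prints the master this node follows
  done

If the nodes print different master names, the cluster has indeed split into multiple partial views.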

Possible causes:

1. Network: Since all communication is over the intranet, a network problem could cause some nodes to believe the master had died and to elect a replacement, but this is unlikely. Checking the Ganglia cluster monitoring also showed no abnormal intranet traffic, so this cause can be ruled out.

2. Node load: The master role and the data role were mixed on the same servers. When a data node's workload is heavy (and it indeed was), the corresponding ES instance can stop responding. If that server also happens to be acting as the master, some nodes will conclude the master has failed and elect a new one, and a split brain appears. In addition, because the ES process on a data node occupies a large amount of memory, large-scale garbage collection can also make the ES process unresponsive. This cause is therefore the most likely one.

How we addressed the problem:

1. Based on the analysis above, the inferred root cause is that node load made the master process stop responding, which led different nodes to disagree about which node was the master. An intuitive fix is therefore to separate the master role from the data role. To do this, we added three servers to the ES cluster whose only role is master: they neither store data nor serve search, so they run as relatively lightweight processes. The role can be restricted with the following configuration:


  node.master: true
  node.data: false

Naturally, the other (data) nodes must no longer be master-eligible, which is done by simply reversing the two settings above. This cleanly separates master nodes from data nodes. In addition, so that newly joined nodes can locate the master quickly, the data nodes' default multicast master discovery can be switched to unicast (a consolidated data-node example is sketched after the snippet below):


  discovery.zen.ping.multicast.enabled: false
  discovery.zen.ping.unicast.hosts: ["master1", "master2", "master3"]
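Putting the pieces together, a data node's elasticsearch.yml would then look roughly like this sketch; master1 through master3 are the same placeholder hostnames used above, and the file as a whole is only an illustration, not the article's exact configuration:

  # elasticsearch.yml on a data node (illustrative sketch)
  node.master: false                                   # may never be elected master
  node.data: true                                      # stores data and serves search
  discovery.zen.ping.multicast.enabled: false          # disable multicast discovery
  discovery.zen.ping.unicast.hosts: ["master1", "master2", "master3"]

The three dedicated master servers carry the opposite role settings (node.master: true, node.data: false) shown earlier.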

2. There are also two intuitive parameters that can reduce the likelihood of a split brain; both are shown together in a small configuration sketch after the two descriptions below.

discovery.zen.ping_timeout (default 3 seconds): by default, if the master does not answer a node's ping within 3 seconds, that node considers the master dead. Increasing this value gives the master more time to respond and, to some extent, reduces false failure detection.

discovery.zen.minimum_master_nodes (default 1): this parameter controls the minimum number of master-eligible nodes a node must see before it can take part in electing a master and operating in the cluster. The officially recommended value is (N/2) + 1, where N is the number of master-eligible nodes. In our case N is 3, so the parameter is set to 2. Note that for a cluster with only 2 master-eligible nodes, setting it to 2 is problematic: once one of them goes down, the survivor can never see 2 master-eligible nodes and the cluster can no longer elect a master.
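For reference, a hedged sketch of how these two settings might look in elasticsearch.yml for the setup described above (three dedicated master-eligible nodes); the 10-second timeout is an arbitrary illustrative value of my own, not a figure from the original article:

  discovery.zen.ping_timeout: 10s        # wait longer before declaring the master dead (illustrative value)
  discovery.zen.minimum_master_nodes: 2  # (3 / 2) + 1 = 2 with three master-eligible nodes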
