Elasticsearch node type

Last Update:2018-07-24 Source: Internet

Author: User

Starting an instance of Elasticsearch starts at least one node. Multiple nodes that share the same cluster name connect to form a cluster.
By default, every node in the cluster can handle HTTP requests and the transport traffic between cluster nodes. Every node knows about all the other nodes in the cluster and can forward client requests to the appropriate node.
Nodes come in the following types:
1. Master node
When node.master is set to true (the default), the node is eligible to be elected as the master node, which controls the entire cluster.

2. Data node
When node.data is set to true (the default), the node holds data and performs data-related operations such as indexing, deletion, search, and aggregation.

3. Client node
When both node.master and node.data are set to false, the node can neither hold data nor become the master. It acts as a client node that accepts user requests and forwards the related operations to other nodes.

4. Tribe node
When a node is configured with tribe.* settings, it is a special client node that can connect to multiple clusters and perform searches and other operations across all of them.
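
The four types above follow from just a few settings. As a rough illustration only, the Python sketch below derives a role label from a flat dictionary of elasticsearch.yml settings. The function name `node_role` and the label strings are hypothetical, and the sketch assumes the settings have already been parsed into dotted, lowercase keys.

```python
def node_role(settings):
    """Return a rough role label for a node from its elasticsearch.yml settings."""
    # Any tribe.* key marks the node as a tribe node.
    if any(key.startswith("tribe.") for key in settings):
        return "tribe"
    # Both settings default to true when they are absent.
    master = settings.get("node.master", True)
    data = settings.get("node.data", True)
    if master and data:
        return "master-eligible data"
    if master:
        return "dedicated master"
    if data:
        return "data"
    return "client"


print(node_role({}))
print(node_role({"node.master": False, "node.data": False}))
```

A node with no explicit settings is both master-eligible and a data node, which is why separating the roles requires explicit configuration.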

Besides acting as a data node, an Elasticsearch node can also act as a master and as a client. In a large cluster with heavy user traffic, the master and client roles may hit performance bottlenecks or, in extreme cases, run out of memory, which brings down the data node running in the same process. Recovering a data node involves data migration, which consumes cluster resources and can delay writes or slow down queries.
If the master and client roles run on their own nodes, a failed node can be restarted almost instantly and users are barely affected. Separating these roles also removes their resource consumption from the data nodes, which makes it easier to see how data node resource usage relates to write volume and query volume, and so helps with capacity management and planning.

Master node description
The master node is mainly responsible for cluster-level operations, such as creating or deleting indices, tracking which nodes are part of the cluster, and deciding which shards to allocate to which nodes. A stable master node is very important to the health of the cluster.
By default, any node in a cluster may be elected as the master node. Indexing and search operations consume large amounts of CPU, memory, and I/O, so separating master nodes from data nodes is a good way to keep the cluster stable. Although a master node can also act as a coordinating node, routing search and indexing requests from clients to the data nodes, it is best not to use dedicated master nodes this way. The guiding principle is that a dedicated master node should do as little work as possible.
To create a dedicated master node, add the following to the configuration file:
node.master: true
node.data: false
To prevent data loss, it is critical to configure discovery.zen.minimum_master_nodes (the default is 1). This setting tells each master-eligible node the minimum number of master-eligible nodes that must be visible in order to form a cluster.
The reasoning is as follows:
Suppose a cluster has 3 master-eligible nodes, and a network failure leaves one of them unable to communicate with the other two. If discovery.zen.minimum_master_nodes is set to 1, the cluster splits into two small independent clusters, and when the network recovers there will be inconsistent or lost data. If discovery.zen.minimum_master_nodes is set to 2, the side of the network with two master-eligible nodes keeps working, while the other side, with only one master-eligible node, cannot form a cluster of its own. When the network recovers, the isolated node rejoins the cluster.
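
The scenario above can be checked with a few lines of Python. This is a simplified model rather than Elasticsearch's actual election logic: it only asks whether each side of a network partition sees enough master-eligible nodes to elect a master. The function name `sides_that_can_elect` is hypothetical.

```python
def sides_that_can_elect(partition_sizes, minimum_master_nodes):
    """Return the sizes of the partition sides that could elect a master.

    partition_sizes lists the number of master-eligible nodes on each side.
    """
    return [size for size in partition_sizes if size >= minimum_master_nodes]


# Three master-eligible nodes, split 2 + 1 by a network failure.
split = [2, 1]
print(sides_that_can_elect(split, 1))  # both sides qualify: split brain
print(sides_that_can_elect(split, 2))  # only the two-node side qualifies
```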
The rule for setting this value is a majority of the master-eligible nodes, with the division rounded down:
(master_eligible_nodes / 2) + 1
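
As a small worked example of this rule, the following Python sketch computes the value for a few cluster sizes. It assumes the division is integer division, and the function name is chosen here for illustration.

```python
def minimum_master_nodes(master_eligible_nodes):
    """Quorum of master-eligible nodes: half of them, rounded down, plus one."""
    return master_eligible_nodes // 2 + 1


for count in (1, 2, 3, 5):
    print(count, "master-eligible ->", minimum_master_nodes(count))
```

With two master-eligible nodes the quorum is also two, so losing either node stops master election; three is the smallest count that tolerates the loss of one.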
This parameter can also be set dynamically:

PUT localhost:9200/_cluster/settings
{
  "transient": {
    "discovery.zen.minimum_master_nodes": 2
  }
}
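
For readers who script this call, here is a minimal sketch that builds the same request with Python's standard library. It only constructs the request and does not send it. The helper name `build_settings_request` is hypothetical, and the base URL assumes a node listening on localhost:9200 as in the request above.

```python
import json
import urllib.request


def build_settings_request(base_url, minimum_master_nodes):
    """Build (but do not send) the PUT request for the transient setting."""
    body = {
        "transient": {
            "discovery.zen.minimum_master_nodes": minimum_master_nodes
        }
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/_cluster/settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


request = build_settings_request("http://localhost:9200", 2)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# To send it against a running cluster: urllib.request.urlopen(request)
```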

Data node description
A data node stores index data and handles document operations such as create, read, update, and delete, as well as aggregations. Data nodes have high CPU, memory, and I/O requirements. Monitor the state of the data nodes when tuning the cluster, and add new data nodes when resources run short. A data node is configured as follows:
node.master: false
node.data: true
Data path settings: every master-eligible node and data node needs to know the physical storage location of shards, indices, and metadata. path.data defaults to $ES_HOME/data and can be changed in the configuration file elasticsearch.yml, for example:
path.data: /data/es/data/
This setting can also be passed on the command line, for example:
./bin/elasticsearch --path.data /data/es/data
It is best to configure this path separately so that the Elasticsearch home directory and the data directory are kept apart. Deleting the Elasticsearch home directory then leaves the data untouched. RPM installations separate the two by default.
A data directory can be shared by multiple nodes, even nodes that belong to different clusters. To prevent multiple nodes from sharing the same data path, add the following to the configuration file elasticsearch.yml: node.max_local_storage_nodes: 1
Note: Do not run different types of nodes (for example master, data, and client) from the same data directory, as this can easily lead to unexpected data loss.
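
One way to catch a shared data directory before starting the nodes is to compare the configured paths. The Python sketch below is only an offline check over settings dictionaries and does not talk to Elasticsearch. The node names, the second path, and the function name `shared_data_paths` are made up for illustration.

```python
import posixpath
from collections import defaultdict


def shared_data_paths(nodes):
    """Map each data path used by more than one node to the nodes using it."""
    by_path = defaultdict(list)
    for name, settings in nodes.items():
        # Normalize so that "/data/es/data/" and "/data/es/data" compare equal.
        by_path[posixpath.normpath(settings["path.data"])].append(name)
    return {path: names for path, names in by_path.items() if len(names) > 1}


nodes = {
    "master-1": {"path.data": "/data/es/data/"},
    "data-1": {"path.data": "/data/es/data"},
    "data-2": {"path.data": "/data/es/data2"},
}
print(shared_data_paths(nodes))
```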

Client node description
When both node.master and node.data are set to false, the node can only route requests, handle searches, distribute indexing operations, and so on. In essence, a client node behaves as a smart load balancer. Dedicated client nodes are very useful in large clusters: a client node coordinates requests across the master and data nodes, and because it joins the cluster and receives the cluster state, it can route requests directly based on that state.
Warning: Adding too many client nodes burdens the cluster, because the master node must wait for every node to acknowledge each cluster state update. The benefit of client nodes should not be overstated, since data nodes can serve the same purpose. The configuration is as follows:
node.master: false
node.data: false
Tribe node description
A tribe node can span multiple clusters. It receives the state of each cluster and merges them into a single global cluster state, and it can read and write data on all nodes of those clusters. A tribe node is configured in elasticsearch.yml as follows:
tribe:
    t1:
        cluster.name: cluster_one
    t2:
        cluster.name: cluster_two
t1 and t2 are arbitrary names, each representing the connection to one cluster. The example above configures connections to two clusters, named t1 and t2. By default, the tribe node connects to each cluster as a client node using multicast discovery. In most cases, operations on a tribe node work the same way as on a single cluster.
Note: The following behavior differs from a single cluster. If two clusters have the same name, the tribe node connects to only one of them. Because a tribe node has no master, master-level read operations (for example cluster statistics and cluster health) are automatically executed with the local flag set to true. Master-level write operations are rejected and should be performed on the individual cluster instead. A tribe node can block all write operations and all metadata operations through the blocks setting, for example:
tribe:
    blocks:
        write: true
        metadata: true
A tribe node can also apply blocks to selected indices only, for example:
tribe:
    blocks:
        write.indices: hk*,ldn*
        metadata.indices: hk*,ldn*
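
To see which index names a pattern list such as hk*,ldn* would cover, the following Python sketch approximates the matching with the standard fnmatch module. This is an assumption for illustration: Elasticsearch's own wildcard handling may differ in details, and the index names used here are invented.

```python
from fnmatch import fnmatchcase


def is_blocked(index_name, patterns):
    """Return True if index_name matches any pattern in a comma-separated list."""
    return any(
        fnmatchcase(index_name, pattern.strip())
        for pattern in patterns.split(",")
    )


print(is_blocked("hk_orders", "hk*,ldn*"))
print(is_blocked("ny_orders", "hk*,ldn*"))
```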
When multiple clusters contain an index with the same name, the tribe node by default picks one of them. This behavior is controlled by the tribe.on_conflict setting, which can be set to drop (exclude the conflicting indices) or to prefer_[tribeName] (always take the index from the named tribe).
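
The conflict handling can be pictured with a small simulation. The Python sketch below is not the tribe node's real implementation. It assumes the setting values any, drop, and prefer_<tribeName>, which should be checked against the documentation for your version, and it resolves "any" by keeping the first cluster seen, which is an arbitrary choice made for this example. The index names are invented.

```python
def merge_indices(clusters, on_conflict="any"):
    """Merge per-cluster index lists into a mapping of index name -> tribe name."""
    merged = {}
    conflicts = set()
    for tribe, indices in clusters.items():
        for index in indices:
            if index in merged and merged[index] != tribe:
                conflicts.add(index)  # same index name seen in another cluster
            else:
                merged[index] = tribe
    for index in conflicts:
        if on_conflict == "drop":
            del merged[index]
        elif on_conflict.startswith("prefer_"):
            preferred = on_conflict[len("prefer_"):]
            if index in clusters.get(preferred, []):
                merged[index] = preferred
        # "any": keep whichever cluster was seen first
    return merged


clusters = {"t1": ["logs", "users"], "t2": ["logs", "orders"]}
print(merge_indices(clusters))
print(merge_indices(clusters, "drop"))
print(merge_indices(clusters, "prefer_t2"))
```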

Reprint Address: http://blog.csdn.net/ljhabc1982/article/details/53994562

This article is an English version of an article originally published in Chinese on aliyun.com.
