Elasticsearch node types

Source: Internet
Author: User

When we start an instance of Elasticsearch, we start at least one node. Multiple nodes that share the same cluster name connect to form a cluster.
By default, every node in a cluster can handle both HTTP traffic and the transport traffic between cluster nodes. Every node knows about all the other nodes in the cluster and can forward client requests to the appropriate node.
Nodes come in the following types:
1. Master node
When node.master is set to true (the default), the node is eligible to be elected as the master node, which controls the entire cluster.


2. Data node
When node.data is set to true (the default), the node holds data and performs data-related operations such as adding and deleting documents, searching, and aggregating.


3. Client node
When both node.master and node.data are set to false, the node can neither hold data nor become the master node. It acts as a client node: it accepts user requests and forwards the related operations to other nodes.


4. Tribe node
When a node is configured with tribe.* settings, it is a special client node that can connect to multiple clusters and perform searches and other operations across all connected clusters.
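
The four node types above can be summarised as a small decision function. This is only an illustrative sketch: `node_type` is a hypothetical helper, the settings are assumed to be passed as a flat dictionary of elasticsearch.yml keys, and the labels it returns are names chosen here for readability.

```python
def node_type(settings):
    """Classify a node from a flat dict of elasticsearch.yml-style settings."""
    # Any tribe.* key marks the node as a tribe node.
    if any(key == "tribe" or key.startswith("tribe.") for key in settings):
        return "tribe"
    # node.master and node.data both default to true when absent.
    master = settings.get("node.master", True)
    data = settings.get("node.data", True)
    if master and data:
        return "master-eligible + data"
    if master:
        return "dedicated master"
    if data:
        return "dedicated data"
    return "client"


print(node_type({}))
print(node_type({"node.master": False, "node.data": False}))
```

A node started with no role settings falls into the first combined category, which is the default behaviour the list describes.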

An Elasticsearch data node can also take on the master and client roles. In a larger cluster with heavy user traffic, however, the master and client roles can become performance bottlenecks or, in extreme cases, run out of memory, and that takes down the data node running alongside them. Recovering a failed data node involves migrating data, which consumes cluster resources and leads to delayed writes and slow queries.
If the master and client roles are separated onto their own nodes, a restart after a problem is almost instantaneous and has little impact on users. In addition, because their resource consumption is moved off the data nodes, it becomes easier to understand how data node resource usage relates to write and query volume, which helps with capacity management and planning.

Master Node Description
The main responsibility of the master node is cluster-level operations, such as creating or deleting an index, tracking which nodes are part of the cluster, and deciding which shards to allocate to which nodes. A stable master node is very important to the health of the cluster.
By default, any node in a cluster may be elected as the master node. Operations such as indexing data and running search queries consume a lot of CPU, memory, and I/O, so to keep the cluster stable it is better to separate master nodes from data nodes. Although a master node can also act as a coordinating node, routing searches and forwarding indexing requests from clients to data nodes, it is best not to use dedicated master nodes this way. An important principle is that master nodes should do as little work as possible.
To create a dedicated master node, simply add the following to the configuration file:
node.master: true
node.data: false
To prevent data loss, it is critical to configure the discovery.zen.minimum_master_nodes setting (the default is 1), so that each master-eligible node knows the minimum number of master-eligible nodes that must be visible in order to form a cluster.
This is explained as follows:
Suppose we have a cluster with 3 master-eligible nodes, and a network failure leaves one of them unable to communicate with the other two. If discovery.zen.minimum_master_nodes is set to 1, the cluster splits into two small independent clusters, and when the network recovers there will be data errors or data loss. If discovery.zen.minimum_master_nodes is set to 2, the part of the network with two master-eligible nodes can continue to work, while the other part, which has only one master-eligible node, cannot form a separate cluster. When the network recovers, that node rejoins the cluster.
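
The scenario above can be checked with a few lines of Python. This is a simplified model, not how Elasticsearch itself is implemented: `sides_that_can_elect` is a hypothetical helper, and each side of the network split is reduced to a count of the master-eligible nodes it can see.

```python
def sides_that_can_elect(partition_sizes, minimum_master_nodes):
    """Return the partitions that still see enough master-eligible nodes."""
    return [size for size in partition_sizes if size >= minimum_master_nodes]


# Three master-eligible nodes split 2 + 1 by a network failure.
split = [2, 1]

# With the setting at 1, both sides can elect a master (split brain).
print(sides_that_can_elect(split, 1))

# With the setting at 2, only the two-node side keeps working.
print(sides_that_can_elect(split, 2))
```

Whenever more than one partition is returned, two independent clusters can form, which is the situation that leads to data errors or loss.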
The rule for setting this value, with the division rounded down, is:
(master_eligible_nodes / 2) + 1
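
As a quick worked example of this rule, the sketch below computes the value for a few cluster sizes. The function name `minimum_master_nodes` is chosen here for illustration, and the division is assumed to be integer division, which is what makes 3 master-eligible nodes give 2.

```python
def minimum_master_nodes(master_eligible_nodes):
    """Quorum size: (master_eligible_nodes / 2) + 1 using integer division."""
    if master_eligible_nodes < 1:
        raise ValueError("a cluster needs at least one master-eligible node")
    return master_eligible_nodes // 2 + 1


for count in range(1, 6):
    print(count, "master-eligible node(s) ->", minimum_master_nodes(count))

# 3 master-eligible nodes -> 2, matching the three-node example in the text.
```

Note that going from 3 to 4 master-eligible nodes raises the quorum to 3 without letting the cluster tolerate any more failures, which is why odd counts are the usual choice.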
This parameter can also be set dynamically:


PUT localhost:9200/_cluster/settings
{
  "transient": {
    "discovery.zen.minimum_master_nodes": 2
  }
}
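
The same request can be prepared from a script. The sketch below uses only the Python standard library; `build_min_master_request` is a hypothetical helper, and the host localhost:9200 is taken from the example above. It only builds the request, and the call that would actually send it is left commented out because it needs a running node.

```python
import json
import urllib.request


def build_min_master_request(value, host="localhost:9200"):
    """Build a PUT request that sets minimum_master_nodes as a transient setting."""
    body = json.dumps(
        {"transient": {"discovery.zen.minimum_master_nodes": value}}
    ).encode("utf-8")
    return urllib.request.Request(
        url=f"http://{host}/_cluster/settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


request = build_min_master_request(2)
print(request.get_method(), request.full_url)

# To apply the setting against a running node:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

Because the setting is sent under "transient", it is expected to last only until the cluster is fully restarted, so the value should also be kept in elasticsearch.yml.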

Data Node Description
A data node is mainly the node that stores index data and carries out document operations such as adding, deleting, searching, and aggregating. Data nodes are demanding on CPU, memory, and I/O. When tuning the cluster, monitor the state of the data nodes, and when resources run short, add new nodes to the cluster. A data node is configured as follows:
node.master: false
node.data: true
Data node path settings: every master-eligible node and data node needs to know the physical storage location of shards, indices, and metadata. path.data defaults to $ES_HOME/data and can be changed in the configuration file elasticsearch.yml, for example:
path.data: /data/es/data/
This setting can also be given on the command line, for example:
./bin/elasticsearch --path.data /data/es/data
It is best to configure this path separately so that the Elasticsearch directory and the data directory are kept apart; then deleting the Elasticsearch home directory does not affect the data. An RPM installation separates them by default.
The data directory can be shared by multiple nodes, even nodes belonging to different clusters. To prevent multiple nodes from sharing the same data path, add the following to the configuration file elasticsearch.yml:
node.max_local_storage_nodes: 1
Note: Do not run different types of nodes (e.g. master, data, client) from the same data directory, as this can easily result in unexpected data loss.

Client Node Description
When both node.master and node.data are set to false, the node can only route requests, handle searches, distribute indexing operations, and so on. In essence, a client node behaves like a smart load balancer. Dedicated client nodes are very useful in larger clusters: a client node joins the cluster, obtains the cluster state, and can route requests directly based on that state, coordinating between the master node and the data nodes.
Warning: Adding too many client nodes is a burden on the cluster, because the master node must wait for every node to acknowledge each cluster state update. The benefit of client nodes should not be overstated; data nodes can serve the same purpose. The configuration is as follows:
node.master: false
node.data: false
Tribe Node Description
A tribe node can span multiple clusters. It retrieves the cluster state from each cluster and merges them into a global cluster state, and it can read and write data on the nodes of all connected clusters. A tribe node is configured in elasticsearch.yml as follows:
tribe:
    t1:
        cluster.name: cluster_one
    t2:
        cluster.name: cluster_two
t1 and t2 are arbitrary names that represent the connection to each cluster. The example above configures two cluster connections, named t1 and t2. By default, the tribe node connects to each cluster as a client using multicast discovery. In most cases, a tribe node can operate on the merged clusters as if they were a single cluster.
Note: The following behaviors differ from working with a single cluster. If two clusters have the same name, the tribe node will connect to only one of them. Because the tribe node has no master, master-level read operations (for example, cluster stats and cluster health) are automatically executed with local set to true. Master-level write operations are rejected; they should be performed on a single cluster.
A tribe node can block all write operations and all metadata operations through blocks, for example:
tribe:
    blocks:
        write: true
        metadata: true
A tribe node can also apply blocks to selected indices only, for example:
tribe:
    blocks:
        write.indices: hk*,ldn*
        metadata.indices: hk*,ldn*
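
To see which index names a pattern list such as hk*,ldn* would cover, the matching can be approximated in Python. This is a rough sketch, not the Elasticsearch implementation: `is_blocked` is a hypothetical helper, the index names are made up, and it assumes shell-style matching where `*` stands for any run of characters (fnmatch also understands `?` and `[...]`, which Elasticsearch patterns may not).

```python
from fnmatch import fnmatchcase


def is_blocked(index_name, patterns):
    """Check an index name against a comma-separated list of wildcard patterns."""
    return any(
        fnmatchcase(index_name, pattern.strip())
        for pattern in patterns.split(",")
        if pattern.strip()
    )


blocked = "hk*,ldn*"
for name in ["hk_orders", "ldn-2016", "nyc_orders"]:
    print(name, is_blocked(name, blocked))
```

Under this model the first two hypothetical indices fall under the block, while nyc_orders stays writable through the tribe node.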
When multiple clusters hold an index with the same name, the tribe node picks one of them by default. This can be changed with the tribe.on_conflict setting, which can be set to drop the conflicting indices or to prefer the index from a specific tribe.
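
The effect of the conflict setting can be illustrated with a small model of the merge. Everything here is an assumption made for illustration: `merge_indices` is a hypothetical helper, the value names any, drop, and prefer_<tribe name> are taken to be those accepted by tribe.on_conflict in the legacy releases and should be checked against the documentation for your version, and "picks one" is modelled as "keeps the first tribe seen".

```python
def merge_indices(tribes, on_conflict="any"):
    """Merge per-tribe index lists into a map of index name -> owning tribe.

    tribes: dict mapping tribe name -> list of index names, in connection order.
    """
    prefix = "prefer_"
    preferred = on_conflict[len(prefix):] if on_conflict.startswith(prefix) else None
    merged = {}
    conflicted = set()
    for tribe, indices in tribes.items():
        for index in indices:
            if index not in merged:
                merged[index] = tribe  # first tribe seen owns the index
                continue
            conflicted.add(index)
            if tribe == preferred:
                merged[index] = tribe  # the preferred tribe wins the conflict
    if on_conflict == "drop":
        for index in conflicted:
            del merged[index]  # conflicting indices are excluded entirely
    return merged


tribes = {"t1": ["logs", "users"], "t2": ["logs", "orders"]}
print(merge_indices(tribes, "any"))
print(merge_indices(tribes, "drop"))
print(merge_indices(tribes, "prefer_t2"))
```

In this model only the shared index name (logs) is affected by the setting; indices that exist in a single cluster are always visible through the tribe node.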


Reprint Address: http://blog.csdn.net/ljhabc1982/article/details/53994562
