Elasticsearch Cluster Configuration and Management Tutorial


Elasticsearch Cluster Server Configuration

I. Installation

Elasticsearch is built on Lucene, an open-source search library written in Java, so it relies on a Java runtime environment. The Elasticsearch version used in this article is 1.6, which requires JDK 1.7 or later.
This article uses a Linux system. With a Java environment already installed, simply download the package, extract it, and run the startup script.

1. Install and start Elasticsearch:
cd to the directory containing elasticsearch-1.6.0.tar.gz,
extract it: tar -xvf elasticsearch-1.6.0.tar.gz
start it: ./elasticsearch-1.6.0/bin/elasticsearch. In the startup output, note that the default HTTP port is 9200 and the default transport port is 9300; both matter later.

Next, you can run the following commands in a terminal to view some basic information.
View cluster health:
curl 'localhost:9200/_cat/health?v'
View nodes:
curl 'localhost:9200/_cat/nodes?v'
View indices:
curl 'localhost:9200/_cat/indices?v'
All of the above information can also be viewed at http://localhost:9200/_plugin/head/ after the head plugin is installed.

2. Install the head plugin
cd to the elasticsearch-1.6.0/bin directory and run ./plugin -install mobz/elasticsearch-head.
After Elasticsearch is installed and started, open http://localhost:9200/_plugin/head/ in a browser to see the cluster, nodes, indices, data, and so on.


II. Startup

1. Start with the script

1) bin/elasticsearch: with no extra parameters, starts in the foreground by default.

2) bin/elasticsearch -d: the -d flag starts the service in the background.

You can also set more parameters: bin/elasticsearch -Xmx2g -Xms2g -Des.index.store.type=memory --node.name=my-node

Note: Running an Elasticsearch cluster on a LAN is also very simple: as long as cluster.name is set consistently and the machines are on the same network segment, the ES nodes will automatically discover each other on startup and form a cluster.


2. elasticsearch-servicewrapper

1) Installation

Download it from GitHub (https://github.com/elastic/elasticsearch-servicewrapper) and copy the service directory into ES_HOME/bin.

2) Usage

ES_HOME/bin/service/elasticsearch console|start|stop ...


Parameter   Description
console     Run Elasticsearch in the foreground.
start       Run Elasticsearch in the background.
stop        Stops Elasticsearch if it is running.
install     Installs Elasticsearch to run on system startup (init.d/service).
remove      Removes Elasticsearch from system startup (init.d/service).


The service directory contains an elasticsearch.conf configuration file that mainly sets Java runtime parameters. The more important parameters are:

# ES home path; the default value is usually fine
set.default.ES_HOME=<path to Elasticsearch home>

# Heap size allocated to ES
set.default.ES_HEAP_SIZE=1024

# Startup wait timeout (in seconds)
wrapper.startup.timeout=300

# Shutdown wait timeout (in seconds)
wrapper.shutdown.timeout=300

# Ping timeout (in seconds)
wrapper.ping.timeout=300



III. Configuration Overview

The Elasticsearch config folder contains two configuration files: elasticsearch.yml and logging.yml. The first is the basic ES configuration file; the second is the log configuration file. ES uses log4j for logging, so logging.yml can be set up like an ordinary log4j configuration file. The following is a brief explanation of what can be configured in elasticsearch.yml.


cluster.name: elasticsearch
Sets the ES cluster name; the default is elasticsearch. ES automatically discovers other ES nodes on the same network segment, so if there are multiple clusters on the same segment, this property distinguishes them.

node.name: "Franzkafka"
The node name. If not set, a name is randomly assigned from a name list in the names.txt file in the config folder inside the ES jar; the authors added many interesting names to it.

node.master: true
Specifies whether the node is eligible to be elected master; the default is true. By default the first machine started in the cluster becomes master, and if that machine goes down a new master is elected.

node.data: true
Specifies whether the node stores index data; the default is true.

index.number_of_shards: 5
Sets the default number of index shards; the default is 5.

index.number_of_replicas: 1
Sets the default number of index replicas; the default is 1.
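A quick sanity check on these two settings: the total number of shard copies an index occupies is primaries times (1 + replicas). A minimal sketch in Python (the function name is just for illustration):

```python
def total_shards(primaries, replicas):
    """Total shard copies an index occupies: the primaries plus one full copy per replica."""
    return primaries * (1 + replicas)

# With the defaults above: 5 primary shards, 1 replica each.
print(total_shards(5, 1))  # 10 shard copies in total
```

So an index created with the defaults spreads 10 shard copies across the cluster.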

path.conf: /path/to/conf
Sets the storage path for configuration files; the default is the config folder under the ES root directory.

path.data: /path/to/data
Sets the storage path for index data; the default is the data folder under the ES root directory. Multiple paths can be set, separated by commas, for example:
path.data: /path/to/data1,/path/to/data2

path.work: /path/to/work
Sets the storage path for temporary files; the default is the work folder under the ES root directory.

path.logs: /path/to/logs
Sets the storage path for log files; the default is the logs folder under the ES root directory.

path.plugins: /path/to/plugins
Sets the storage path for plugins; the default is the plugins folder under the ES root directory.

bootstrap.mlockall: true
Set to true to lock memory. ES becomes less efficient when the JVM starts swapping, so you can set the ES_MIN_MEM and ES_MAX_MEM environment variables to the same value and make sure the machine has enough memory to allocate to ES. You must also allow the Elasticsearch process to lock memory; on Linux this can be done with the 'ulimit -l unlimited' command.

network.bind_host: 192.168.0.1
Sets the bind address, which can be IPv4 or IPv6; the default is 0.0.0.0.

network.publish_host: 192.168.0.1
Sets the address other nodes use to communicate with this node. If not set, it is determined automatically; it must be a real IP address.

network.host: 192.168.0.1
This parameter sets both bind_host and publish_host at once.

transport.tcp.port: 9300
Sets the TCP port for communication between nodes; the default is 9300.

transport.tcp.compress: true
Sets whether to compress data sent over TCP; the default is no compression.

http.port: 9200
Sets the HTTP port for external services; the default is 9200.

http.max_content_length: 100mb
Sets the maximum HTTP content size; the default is 100mb.

http.enabled: false
Whether to provide HTTP service externally; the default is true (enabled).

gateway.type: local
The gateway type; the default is the local file system. It can be set to the local file system, a distributed file system, Hadoop HDFS, or Amazon S3; configuration of the other gateway types will be covered another time.

gateway.recover_after_nodes: 1
Data recovery begins once this many nodes in the cluster have started; the default is 1.

gateway.recover_after_time: 5m
Sets the wait time before initializing the data recovery process; the default is 5 minutes.

gateway.expected_nodes: 2
Sets the expected number of nodes in the cluster; the default is 2. Once this many nodes have started, data recovery begins immediately.
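The interaction of the three gateway settings can be sketched as follows: recovery starts immediately once expected_nodes have joined, or after recover_after_time has elapsed provided at least recover_after_nodes are present. This is a simplified model of the behavior for illustration, not ES code:

```python
def recovery_starts(nodes_up, elapsed_s,
                    recover_after_nodes=1,
                    recover_after_time_s=300,
                    expected_nodes=2):
    """Simplified model of gateway recovery scheduling."""
    if nodes_up >= expected_nodes:
        return True  # expected node count reached: recover immediately
    # otherwise wait out the timeout, but only if the minimum node count is present
    return nodes_up >= recover_after_nodes and elapsed_s >= recover_after_time_s

print(recovery_starts(2, 0))    # True: expected_nodes reached
print(recovery_starts(1, 60))   # False: still inside the 5-minute window
print(recovery_starts(1, 301))  # True: timeout elapsed with the minimum present
```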

cluster.routing.allocation.node_initial_primaries_recoveries: 4
The number of concurrent recovery threads when initializing data recovery; the default is 4.

cluster.routing.allocation.node_concurrent_recoveries: 2
The number of concurrent recovery threads when adding or deleting nodes or load balancing; the default is 4.

indices.recovery.max_size_per_sec: 0
Limits the bandwidth used during data recovery, e.g. 100mb; the default is 0, i.e. unlimited.

indices.recovery.concurrent_streams: 5
Limits the maximum number of concurrent streams opened when recovering data from other shards; the default is 5.

discovery.zen.minimum_master_nodes: 1
Ensures that a node in the cluster can see this many other master-eligible nodes. The default is 1; for large clusters, a larger value (2-4) is recommended.

discovery.zen.ping.timeout: 3s
The ping timeout used when automatically discovering other nodes in the cluster; the default is 3 seconds. A higher value helps prevent discovery errors in poor network environments.

discovery.zen.ping.multicast.enabled: false
Sets whether multicast node discovery is enabled; the default is true.

discovery.zen.ping.unicast.hosts: ["host1", "host2:port", "host3[portX-portY]"]
Sets the initial list of master-eligible nodes in the cluster; new nodes use this list to discover and join the cluster.


IV. Configuring a Multi-Node Cluster

1. Overview

The cluster in this example will deploy 4 nodes:

10.0.0.11 es1

10.0.0.209 es2

10.0.0.206 es3

10.0.0.208 es4


2. Cluster configuration

As mentioned above, as long as the cluster name is the same and the machines are on the same network segment of the same LAN, ES automatically discovers the other nodes.


2.1 Configuration of es2

vim ES_HOME/config/elasticsearch.yml


Add content to the end of the file:

cluster.name: elasticsearch                 # cluster name; must be identical on every node in the cluster

node.name: "es2"                            # this node's name

node.master: true                           # this node is eligible to become master

node.data: true                             # this node can store data

node.rack: rack2                            # the rack this node belongs to

index.number_of_shards: 5                   # number of shards

index.number_of_replicas: 3                 # number of data replicas

network.bind_host: 0.0.0.0                  # bind address; can be IPv4 or IPv6

network.publish_host: 10.0.0.209            # address other nodes use to reach this node

network.host: 10.0.0.209                    # sets both bind_host and publish_host

transport.tcp.port: 9300                    # port for inter-node communication

transport.tcp.compress: true                # whether to compress data sent over TCP

http.port: 9200                             # HTTP port for external service

http.max_content_length: 100mb              # maximum HTTP content size

http.enabled: true                          # whether to enable the external HTTP service

discovery.zen.minimum_master_nodes: 2       # number of other master-eligible nodes a node must see; default 1, use a larger value (2-4) for large clusters

discovery.zen.ping.timeout: 120s            # ping timeout for automatic node discovery

discovery.zen.ping.multicast.enabled: true  # whether multicast discovery is enabled

discovery.zen.ping.unicast.hosts: ["10.0.0.209:9300", "10.0.0.206:9300", "10.0.0.208:9300"]   # initial list of master-eligible nodes, used to discover new nodes joining the cluster


2.2 Configuration of es3

Similarly, on the 206 machine:

vim ES_HOME/config/elasticsearch.yml


Add content to the end of the file:

cluster.name: elasticsearch                 # cluster name; must be identical on every node in the cluster

node.name: "es3"                            # this node's name

node.master: true                           # this node is eligible to become master

node.data: true                             # this node can store data

node.rack: rack3                            # the rack this node belongs to

index.number_of_shards: 5                   # number of shards

index.number_of_replicas: 3                 # number of data replicas

network.bind_host: 0.0.0.0                  # bind address; can be IPv4 or IPv6

network.publish_host: 10.0.0.206            # address other nodes use to reach this node

network.host: 10.0.0.206                    # sets both bind_host and publish_host

transport.tcp.port: 9300                    # port for inter-node communication

transport.tcp.compress: true                # whether to compress data sent over TCP

http.port: 9200                             # HTTP port for external service

http.max_content_length: 100mb              # maximum HTTP content size

http.enabled: true                          # whether to enable the external HTTP service

discovery.zen.minimum_master_nodes: 2       # number of other master-eligible nodes a node must see; default 1, use a larger value (2-4) for large clusters

discovery.zen.ping.timeout: 120s            # ping timeout for automatic node discovery

discovery.zen.ping.multicast.enabled: true  # whether multicast discovery is enabled

discovery.zen.ping.unicast.hosts: ["10.0.0.209:9300", "10.0.0.206:9300", "10.0.0.208:9300"]   # initial list of master-eligible nodes, used to discover new nodes joining the cluster


2.3 For the ES configuration on the 208 machine, refer to the two nodes above.


2.4 Verifying the results

Start each node:

ES_HOME/bin/service/elasticsearch start

After the individual nodes have started successfully, open http://10.0.0.209:9200/_plugin/head/ in a browser; the interface lists the information for each node.


3. Adding and deleting nodes

3.1 Adding a node is very simple, almost identical to the steps above for setting up a node.

On the 10.0.0.11 machine: vim ES_HOME/config/elasticsearch.yml


cluster.name: elasticsearch                 # cluster name; must be identical on every node in the cluster

node.name: "es5"                            # this node's name

node.master: false                          # whether this node is eligible to become master (false here)

node.data: true                             # this node can store data

node.rack: rack5                            # the rack this node belongs to

index.number_of_shards: 5                   # number of shards

index.number_of_replicas: 3                 # number of data replicas

network.bind_host: 0.0.0.0                  # bind address; can be IPv4 or IPv6

network.publish_host: 10.0.0.11             # address other nodes use to reach this node

network.host: 10.0.0.11                     # sets both bind_host and publish_host

transport.tcp.port: 9300                    # port for inter-node communication

transport.tcp.compress: true                # whether to compress data sent over TCP

http.port: 9200                             # HTTP port for external service

http.max_content_length: 100mb              # maximum HTTP content size

http.enabled: true                          # whether to enable the external HTTP service

discovery.zen.minimum_master_nodes: 2       # number of other master-eligible nodes a node must see; default 1, use a larger value (2-4) for large clusters

discovery.zen.ping.timeout: 120s            # ping timeout for automatic node discovery

discovery.zen.ping.multicast.enabled: true  # whether multicast discovery is enabled

discovery.zen.ping.unicast.hosts: ["10.0.0.209:9300", "10.0.0.206:9300", "10.0.0.208:9300"]   # initial list of master-eligible nodes, used to discover new nodes joining the cluster


Once the configuration is written, start this ES node.

To view the status of the cluster:

http://10.0.0.11:9200/_nodes

Elasticsearch discovers nodes automatically, and it can take a while for a new node to be discovered:

Wait patiently... eventually the information for each node appears in the interface.


3.2 Deleting a node

On the machine whose node you want to remove, run ES_HOME/bin/service/elasticsearch stop, wait a while, then check the cluster state; you will find that the node is gone.



Elasticsearch Cluster Management


By setting the node name and the cluster name, ES automatically organizes nodes with the same cluster name into a cluster, keeping much of the underlying machinery transparent to the user.
If you want to inspect or manage the state of a cluster, you can do so with the REST APIs.
For translations of other ES documentation, see: Elasticsearch documentation summary.


Using the REST API

ES provides a wide range of APIs, which can be roughly divided into the following categories:

1) Check the health of clusters, nodes, and indices

2) Manage clusters, nodes, index data, and metadata

3) Perform CRUD operations: create, read, update, delete, and query

4) Perform advanced query operations such as paging, sorting, scripting, and aggregation

Viewing cluster status

The cluster's health can be queried by sending a REST command with curl:

curl 'localhost:9200/_cat/health?v'

localhost is the host address and 9200 is the listening port; ES listens on port 9200 by default.

Note that curl installed under Windows may not support single quotes; if you get errors, change them to double quotes and escape the quotes inside.

The result obtained:

epoch      timestamp cluster       status node.total node.data shards pri relo init unassign
1394735289 14:28:09  elasticsearch green           1         1      0   0    0    0        0

You can see that the cluster name is the default "elasticsearch" and the cluster status is "green". The colors mean:

1) green: the healthiest state; all shards, including replicas, are available

2) yellow: all primary shards are available, but some replicas are not (or could not be allocated)

3) red: only some shards are available, indicating that some shards are damaged. Queries can still find part of the data; if you hit this situation, resolve it quickly.

The results above also show one node and no shards; this is because there is no data in ES yet, and without data there are no shards.
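The `_cat/health?v` output is a header row plus one data row, so it is easy to turn into a dictionary for scripting. A small sketch using the sample output above (values in this output never contain spaces, so a plain split works):

```python
sample = (
    "epoch timestamp cluster status node.total node.data shards pri relo init unassign\n"
    "1394735289 14:28:09 elasticsearch green 1 1 0 0 0 0 0"
)

# Pair each header column with the value below it.
header, row = sample.splitlines()
health = dict(zip(header.split(), row.split()))

print(health["cluster"])     # elasticsearch
print(health["status"])      # green
print(health["node.total"])  # 1
```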



When the default cluster name elasticsearch is used, ES checks whether other nodes with the same cluster name are running on the network; if so, they form a cluster.

(With a shared cluster name, nodes may find each other via multicast. This comes up often in practice: several people use the same cluster name, their shards get allocated together, and when someone's machine goes offline, the others can no longer use the cluster either.)



You can query the list of nodes with the following command:

curl 'localhost:9200/_cat/nodes?v'

The result is as follows:

curl 'localhost:9200/_cat/nodes?v'
host      ip        heap.percent ram.percent load node.role master name
mwubuntu1 127.0.1.1            8           4 0.00 d         *      New Goblin
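In this output, the master column holds * for the elected master. One wrinkle when parsing it in a script: the node name may contain spaces ("New Goblin"), so it must be taken as everything after the first seven columns. An illustrative sketch:

```python
row = "mwubuntu1 127.0.1.1 8 4 0.00 d * New Goblin"

cols = row.split()
host, ip, heap, ram, load, role, master = cols[:7]
name = " ".join(cols[7:])  # the name column may contain spaces

print(master == "*")  # True: this node is the elected master
print(name)           # New Goblin
```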


Viewing all indices

The word "index" has two meanings in ES:

1) The verb: to index, i.e. the process of storing data in ES and making it searchable; while this runs, a search structure is being built.

2) The noun: an index, which is a storage unit in ES similar to a database; it contains types, and each type contains documents.

You can view all indices with the following command:

curl 'localhost:9200/_cat/indices?v'

The result is as follows:

curl 'localhost:9200/_cat/indices?v'
health index pri rep docs.count docs.deleted store.size pri.store.size

Because there is no data in the cluster yet, the result contains only the header row.

Creating an index

The following is an example of creating an index and then listing indices:

curl -XPUT 'localhost:9200/customer?pretty'
{
  "acknowledged": true
}

curl 'localhost:9200/_cat/indices?v'
health index    pri rep docs.count docs.deleted store.size pri.store.size
yellow customer   5   1          0            0       495b           495b



In the results above, the status of the customer index is yellow because it has 5 primary shards with 1 replica each, but the cluster is a single node: the primaries are running, while the replicas cannot be allocated, since a replica is never placed on the same node as its primary. When another node joins the cluster, the replicas are copied to it and the status becomes green.
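The yellow state can be reasoned about numerically: on an N-node cluster, each primary can have at most N-1 of its replicas assigned. A sketch of that rule (simplified: it assumes even allocation across data nodes):

```python
def unassigned_replicas(primaries, replicas_per_primary, data_nodes):
    """Replica copies that cannot be placed, because a replica never shares a node with its primary."""
    placeable = min(replicas_per_primary, data_nodes - 1)
    return primaries * (replicas_per_primary - placeable)

# The customer index: 5 primaries, 1 replica each, on a single-node cluster.
print(unassigned_replicas(5, 1, 1))  # 5 unassigned replicas -> status yellow
# After a second node joins, every replica can be placed -> status green.
print(unassigned_replicas(5, 1, 2))  # 0
```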

Indexing and searching documents

As mentioned before, an index contains types, so a type is set when indexing a document.

Execute the following command:

curl -XPUT 'localhost:9200/customer/external/1?pretty' -d '
{
  "name": "John Doe"
}'

After successful execution, you will receive the following response:

{
  "_index": "customer",
  "_type": "external",
  "_id": "1",
  "_version": 1,
  "created": true
}



Note that from version 2.0 on, ES does not allow the same field name to have different field types across different types within the same index.

In the example above, a document was created with its ID set to 1 in the URL.

ES does not require you to create an index before indexing a document: if the index does not exist when the command above is executed, it is created automatically.
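The document API path used above always has the shape /{index}/{type}/{id}. A tiny helper makes that structure explicit (the function is purely illustrative, not part of any ES client):

```python
def doc_url(host, index, doc_type, doc_id):
    """Build the document URL used by the ES document APIs: /{index}/{type}/{id}."""
    return "http://%s/%s/%s/%s" % (host, index, doc_type, doc_id)

# The PUT above created this document:
print(doc_url("localhost:9200", "customer", "external", 1))
# http://localhost:9200/customer/external/1
```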

Execute the following query command; it returns this information:

curl -XGET 'localhost:9200/customer/external/1?pretty'
{
  "_index": "customer",
  "_type": "external",
  "_id": "1",
  "_version": 1,
  "found": true, "_source": { "name": "John Doe" }
}



There are two new fields here:

1) found: indicates whether the document was found

2) _source: the original document that was indexed

Deleting an index

You can delete an index by executing the following command:

curl -XDELETE 'localhost:9200/customer?pretty'

Return result:

{
  "acknowledged": true
}


Summary

The commands covered above are summarized as follows:

curl -XPUT 'localhost:9200/customer'             // create index
Insert data:
curl -XPUT 'localhost:9200/customer/external/1' -d '
{
  "name": "John Doe"
}'
curl 'localhost:9200/customer/external/1'        // query data
curl -XDELETE 'localhost:9200/customer'          // delete index
