7.1 Node Discovery
When Elasticsearch starts, the node looks for a master node that has the same cluster name and is visible to it. If it finds one, it joins that cluster; if it does not, it becomes the master node itself. The discovery module is responsible for this and serves two purposes: electing the master node and discovering new nodes in the cluster.
7.1.1 Discovery types
Elasticsearch uses Zen discovery. Its settings are configured in the elasticsearch.yml file in the config directory. The version used here is 2.1.0.
7.1.2 Master Node
One of the functions of discovery is to elect the master node. Elasticsearch does this itself through Zen discovery rather than through an external coordinator such as ZooKeeper.
Configuring master and data nodes: Elasticsearch allows a node to be master-eligible and a data node at the same time, but you can set this to suit your own requirements, for example a node that is only master-eligible and holds no data, or a node that is only a data node.
Master election configuration: split brain is what happens when, say, three nodes of a 10-node cluster lose contact with the rest and form a cluster of their own, so that there are now two clusters. To avoid this, set discovery.zen.minimum_master_nodes to 50% of the total number of master-eligible nodes plus one. A group of nodes smaller than that cannot elect a master, which prevents the second cluster from forming.
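The quorum rule above can be sketched in a few lines of Python. This is only an illustration of the arithmetic; the function names are made up for this example and are not part of Elasticsearch.

```python
def minimum_master_nodes(master_eligible_nodes: int) -> int:
    """Return the quorum: 50% of the master-eligible nodes plus one."""
    if master_eligible_nodes < 1:
        raise ValueError("a cluster needs at least one master-eligible node")
    return master_eligible_nodes // 2 + 1


def can_elect_master(visible_nodes: int, quorum: int) -> bool:
    """A group of nodes may elect a master only if it sees at least `quorum` nodes."""
    return visible_nodes >= quorum


quorum = minimum_master_nodes(10)
print(quorum)                       # 6
print(can_elect_master(3, quorum))  # False: the three disconnected nodes cannot form a cluster
print(can_elect_master(7, quorum))  # True: the remaining seven nodes keep working
```

With 10 nodes the value is 6, so the three isolated nodes from the example stay without a master instead of forming a second cluster.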
7.1.3 Setting the cluster name
Set the value of cluster.name in the elasticsearch.yml file.
Configuring multicast: multicast is the default Zen discovery method in the 1.x releases (from 2.0 on it is a separate plugin and unicast is the default). In addition to the common settings, you can control the group address, the multicast port number, how long a multicast request is considered valid, and the address Elasticsearch should bind to.
Configuring unicast: you list the IP addresses and port numbers of the nodes to contact, so that the node can find the cluster.
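As a rough sketch, the unicast settings could be generated like this. The setting names discovery.zen.ping.unicast.hosts and discovery.zen.ping.multicast.enabled are the ones used by the 1.x/2.x Zen discovery; the helper function, the example addresses, and the assumption of IPv4 addresses or hostnames are only for illustration.

```python
import json


def unicast_discovery_settings(hosts, transport_port=9300, disable_multicast=True):
    """Render elasticsearch.yml lines that point Zen discovery at a fixed host list.

    Hosts without an explicit port get `transport_port` appended
    (assumes IPv4 addresses or hostnames, not bare IPv6 literals).
    """
    entries = [h if ":" in h else f"{h}:{transport_port}" for h in hosts]
    lines = []
    if disable_multicast:
        # Only meaningful where multicast is available (1.x, or 2.x with the plugin).
        lines.append("discovery.zen.ping.multicast.enabled: false")
    # A JSON array is also a valid YAML flow sequence.
    lines.append("discovery.zen.ping.unicast.hosts: " + json.dumps(entries))
    return "\n".join(lines)


print(unicast_discovery_settings(["10.0.0.1", "10.0.0.2:9301"]))
```

The printed lines can be pasted into elasticsearch.yml on each node.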
7.1.4 Node ping settings
You can change the default ping settings. A ping is the signal sent between nodes to check whether they are still running. You can set the interval between pings, the number of retries, and how long to wait for a reply.
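A minimal sketch of the retry logic behind these settings, assuming a hypothetical ping callable that returns True when the other node answers within the timeout. The defaults of 30 seconds and 3 retries mirror the Zen fault detection settings discovery.zen.fd.ping_timeout and discovery.zen.fd.ping_retries as I understand them; the code is a model, not how Elasticsearch implements it.

```python
from typing import Callable


def node_is_alive(ping: Callable[[float], bool],
                  ping_timeout: float = 30.0,
                  ping_retries: int = 3) -> bool:
    """Return True as soon as one ping succeeds, False once every retry has failed."""
    for _ in range(ping_retries):
        if ping(ping_timeout):
            return True
    return False


def make_flaky_ping(failures_before_success: int) -> Callable[[float], bool]:
    """Hypothetical stand-in for a network ping: fails N times, then succeeds."""
    state = {"calls": 0}

    def ping(timeout: float) -> bool:
        state["calls"] += 1
        return state["calls"] > failures_before_success

    return ping


print(node_is_alive(make_flaky_ping(2)))  # True: the third attempt succeeds
print(node_is_alive(make_flaky_ping(5)))  # False: all three attempts fail
```

Raising the retry count or the timeout makes the cluster more tolerant of a slow network, at the cost of noticing a dead node later.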
7.2 Gateway and recovery module
Some important information, such as the indices and their metadata, needs to be persisted somewhere so that it can be read back during recovery after a failure. This is the role of the gateway.
7.2.1 Gateway
The gateway type is set with the gateway.type property in elasticsearch.yml. The default type, local, stores the index in the local file system.
7.2.2 Recovery Control
You can set when the recovery process starts. For example, in a cluster of 10 nodes you can start recovery once eight nodes are present, or once eight minutes have passed.
Additional gateway recovery options:
gateway.recover_after_master_nodes: how many master-eligible nodes must be present in the cluster before recovery starts
gateway.recover_after_data_nodes: how many data nodes must be present in the cluster before recovery starts
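The recovery conditions can be modelled roughly as follows. This is a simplified sketch, not the actual gateway logic: the class and function are invented for illustration, and the way expected_nodes short-circuits the waiting time is my reading of the gateway.expected_nodes and gateway.recover_after_time settings, which the notes above do not spell out.

```python
from dataclasses import dataclass


@dataclass
class GatewayRecoverySettings:
    recover_after_nodes: int = 8        # do nothing until this many nodes are present
    expected_nodes: int = 10            # start at once when the whole cluster is present
    recover_after_time_s: float = 480   # otherwise wait this long (eight minutes)


def should_start_recovery(s: GatewayRecoverySettings,
                          nodes_present: int,
                          seconds_waited: float) -> bool:
    """Simplified model of when recovery may begin."""
    if nodes_present < s.recover_after_nodes:
        return False                    # too few nodes, keep waiting
    if nodes_present >= s.expected_nodes:
        return True                     # everyone is here, no reason to wait
    return seconds_waited >= s.recover_after_time_s


settings = GatewayRecoverySettings()
print(should_start_recovery(settings, nodes_present=7, seconds_waited=1000))  # False
print(should_start_recovery(settings, nodes_present=8, seconds_waited=100))   # False
print(should_start_recovery(settings, nodes_present=8, seconds_waited=480))   # True
print(should_start_recovery(settings, nodes_present=10, seconds_waited=0))    # True
```

Waiting like this avoids recovering shards onto a half-started cluster and then moving them again when the remaining nodes arrive.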
7.3 Preparing Elasticsearch clusters for high query and high index throughput
This section covers how to tune a cluster so that it can handle a high query load and high indexing throughput.
7.3.1 Filter Cache
The filter cache speeds up queries, especially those that contain filters which have already been executed. There are two kinds: the node-level filter cache, which is the default, and the index-level filter cache.
The node-level cache can be limited to a specific amount of memory or to a percentage of the total memory allocated to Elasticsearch.
7.3.2 Field data cache and circuit breaker
Field data cache: when a query sorts or facets on a field, Elasticsearch loads that field's data into memory for fast access. The size of this cache is controlled by the indices.fielddata.cache.size property, which is a node-level setting and can be an absolute value or a percentage. The indices.fielddata.cache.expire property sets the maximum inactivity time, but it is usually left unset because rebuilding the field data cache is very expensive.
Circuit breaker: it estimates the memory needed to load a field into the cache and, by throwing an exception, refuses a load that would exhaust memory. It has two properties: indices.fielddata.breaker.limit, which defaults to 80%, and indices.fielddata.breaker.overhead, which defaults to 1.03.
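The idea behind the circuit breaker can be sketched as below. This is a toy model under stated assumptions: the overhead is treated as a factor that the raw estimate is multiplied by, the limit is a percentage of the heap, and the class, exception, and helper names are invented for this example.

```python
class CircuitBreakingError(MemoryError):
    """Raised instead of loading field data that would exceed the limit."""


def parse_size(value: str, heap_bytes: int) -> int:
    """Accept either a percentage of the heap ("80%") or a plain number of bytes."""
    if value.endswith("%"):
        return int(heap_bytes * float(value[:-1]) / 100)
    return int(value)


class FieldDataBreaker:
    def __init__(self, heap_bytes: int, limit: str = "80%", overhead: float = 1.03):
        self.limit_bytes = parse_size(limit, heap_bytes)
        self.overhead = overhead
        self.used_bytes = 0

    def reserve(self, field: str, raw_estimate_bytes: int) -> int:
        """Account for loading `field`, or raise before any memory is used."""
        estimate = int(raw_estimate_bytes * self.overhead)
        if self.used_bytes + estimate > self.limit_bytes:
            raise CircuitBreakingError(
                f"loading [{field}] needs ~{estimate} bytes, "
                f"limit is {self.limit_bytes}, already used {self.used_bytes}"
            )
        self.used_bytes += estimate
        return estimate


breaker = FieldDataBreaker(heap_bytes=1000)
breaker.reserve("title", 500)          # fits under the 800-byte limit
try:
    breaker.reserve("body", 300)       # would push usage past the limit
except CircuitBreakingError as err:
    print("rejected:", err)
```

The point of the design is that the query fails with an exception while the node keeps running, instead of the node running out of memory.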
7.3.3 Storage Module
The store module controls how index data is written. Data can be kept in memory or on persistent disk: memory is fast but volatile, while disk is slower but survives failures.
The storage type is selected with the index.store.type property:
simplefs: disk-based storage, with poor performance under concurrent access
niofs: uses the Java NIO classes to access the index files; performs well under high concurrency, but not on Windows
mmapfs: disk-based storage that memory-maps the index files; the default on 64-bit systems; read operations are faster, but it needs enough virtual address space
memory: the index is kept in memory, so there must be enough memory to hold it or it will fail
7.3.4 Index buffer and refresh rate
Elasticsearch lets you set the maximum amount of memory used for indexing (indices.memory.index_buffer_size), for example as a percentage. The indices.memory.min_shard_index_buffer_size property, which defaults to 4MB, sets the minimum index buffer for each shard.
Index refresh rate: the index.refresh_interval property specifies how often the index is refreshed; the default is 1s. The smaller the value, the sooner new documents become searchable, but indexing and searching also become slower. While rebuilding an index, it is recommended to set it to -1.
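One way to switch the refresh off during a rebuild and back on afterwards is the index update settings API (PUT /<index>/_settings). The sketch below only builds the HTTP requests with the Python standard library; the host http://localhost:9200 and the index name books are placeholders, and nothing is sent unless you call urlopen yourself.

```python
import json
import urllib.request


def refresh_interval_request(base_url: str, index: str, interval: str) -> urllib.request.Request:
    """Build (but do not send) a request that changes index.refresh_interval."""
    body = json.dumps({"index": {"refresh_interval": interval}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


disable = refresh_interval_request("http://localhost:9200", "books", "-1")
restore = refresh_interval_request("http://localhost:9200", "books", "1s")

print(disable.get_method(), disable.full_url, disable.data.decode())
# To apply it against a running node: urllib.request.urlopen(disable)
```

The usual pattern is to send the first request before the bulk load and the second one when it has finished.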
7.3.5 Thread Pool Configuration
Elasticsearch offers two thread pool types: cache, an unbounded thread pool, and fixed, a thread pool of fixed size whose size can be set.
The most important thread pools are the following:
index: used for indexing and delete operations; the type defaults to fixed, the size to the number of available processors, and the queue size to 300
search: used for search and count requests; the type defaults to fixed, the size to the number of available processors multiplied by 3, and the queue size to 1000
suggest: used for suggest requests; the type defaults to fixed, the size to the number of available processors, and the queue size to 1000
get: used for real-time GET requests; the type defaults to fixed and the queue size to 50
bulk: used for bulk operations; the type defaults to fixed, the size to the number of available processors, and the queue size to 50
The thread pool configuration can be set in the yml file, or it can be changed at runtime with the cluster update settings API.
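A possible shape for such an update is sketched below. It builds the JSON body for the cluster update settings API (PUT /_cluster/settings), assuming the threadpool.<name>.size and threadpool.<name>.queue_size setting names used by the 1.x/2.x releases; the helper function and the numbers 30 and 2000 are illustrative only.

```python
import json


def threadpool_update_body(pool: str, size=None, queue_size=None, persistent: bool = False) -> str:
    """Return the JSON body for PUT /_cluster/settings that resizes one thread pool."""
    settings = {}
    if size is not None:
        settings[f"threadpool.{pool}.size"] = size
    if queue_size is not None:
        settings[f"threadpool.{pool}.queue_size"] = queue_size
    if not settings:
        raise ValueError("nothing to update: give size and/or queue_size")
    # "transient" settings are lost on a full cluster restart, "persistent" ones survive it.
    scope = "persistent" if persistent else "transient"
    return json.dumps({scope: settings}, sort_keys=True)


print(threadpool_update_body("search", size=30, queue_size=2000))
```

The same two keys can be written directly into the yml file if the change should apply from startup.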
7.3.6 Putting it together: some general recommendations
Choose the right storage: choose mmapfs on a 64-bit system; if the system is not 64-bit, choose niofs on UNIX systems and simplefs on Windows
Index refresh rate: the more frequent the refresh, the slower the queries and the lower the indexing throughput
Tune the thread pools: adjusting the default thread pools is strongly recommended, especially for query operations
Tune the merge process: queries are faster with fewer segments, while indexing is faster with more segments, so how aggressively to merge has to be chosen according to your workload
Field data cache and circuit breaker: limit the field data cache size and set the circuit breaker; together they make sure you do not run into memory problems
Index buffer: the larger the index buffer, the more documents can be held in memory; a value of 10%-30% is reasonable
Tune the transaction log: the internal translog module of Elasticsearch keeps up to 5,000 operations by default and no more than 200MB. If you want higher indexing throughput and can accept that data stays invisible to search for longer, you can raise these defaults; the higher they are, the longer it may take for indexed data to become visible to search
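The storage recommendation from the list above is simple enough to write down as a function. This is only a sketch of that rule of thumb; the function names are invented, and the detection helper inspects the Python interpreter and operating system, which is merely a stand-in for checking the JVM that actually runs Elasticsearch.

```python
import platform
import sys


def recommended_store_type(is_64bit: bool, os_name: str) -> str:
    """Apply the rule: mmapfs on 64-bit, else niofs on UNIX and simplefs on Windows."""
    if is_64bit:
        return "mmapfs"
    if os_name.lower() == "windows":
        return "simplefs"
    return "niofs"


def recommended_for_this_machine() -> str:
    # Assumption: the bitness of this Python process matches that of the JVM.
    return recommended_store_type(sys.maxsize > 2**32, platform.system())


print(recommended_store_type(True, "Linux"))     # mmapfs
print(recommended_store_type(False, "Linux"))    # niofs
print(recommended_store_type(False, "Windows"))  # simplefs
print("index.store.type:", recommended_for_this_machine())
```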
7.4 Templates and dynamic templates
Index templates mean that you do not need to define mappings and other settings every time an index is created.
Storing templates in files: templates can also be stored in the config/templates directory.
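As an illustration, an index template body might look like the one built below. It follows the 1.x/2.x template format as I understand it (a "template" key holding the index name pattern, plus settings and mappings) and would be sent with PUT /_template/<name> or saved as a JSON file in config/templates. The pattern log_*, the type name log, the field names, and the shard counts are all made up for this example.

```python
import json


def build_index_template(pattern: str) -> dict:
    """Return a template applied to every new index whose name matches `pattern`."""
    return {
        "template": pattern,   # index name pattern, for example "log_*"
        "order": 1,            # templates with a higher order override lower ones
        "settings": {
            "number_of_shards": 2,
            "number_of_replicas": 1,
        },
        "mappings": {
            "log": {           # hypothetical document type
                "properties": {
                    "message": {"type": "string"},
                    "timestamp": {"type": "date"},
                }
            }
        },
    }


template = build_index_template("log_*")
print(json.dumps(template, indent=2))
```

Once the template is registered, creating an index named log_2016 would pick up these settings and mappings without repeating them.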
7.4.2 Dynamic Templates
A dynamic template has two matching modes: with match, the template is applied to fields whose names match the pattern; with unmatch, the template is applied to fields whose names do not match the pattern.
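The two modes can be illustrated with a small sketch. The list below has the shape of a dynamic_templates section of a mapping, with invented template names and patterns; the helper then imitates how a new field name would be checked against match and unmatch. Python's fnmatchcase is used as an approximation of the wildcard matching, so this is a model of the behaviour rather than the actual implementation.

```python
from fnmatch import fnmatchcase

# Hypothetical dynamic templates; in a real mapping this list sits under
# "dynamic_templates" inside the mapping of a document type.
DYNAMIC_TEMPLATES = [
    {"ids_not_analyzed": {
        "match": "*_id",
        "mapping": {"type": "string", "index": "not_analyzed"},
    }},
    {"text_analyzed": {
        "match": "*",
        "unmatch": "num_*",
        "mapping": {"type": "string", "index": "analyzed"},
    }},
]


def pick_template(field_name: str, templates=DYNAMIC_TEMPLATES):
    """Return the name of the first template that applies to the field, or None."""
    for entry in templates:
        for name, rule in entry.items():
            if "match" in rule and not fnmatchcase(field_name, rule["match"]):
                continue  # the name does not match the required pattern
            if "unmatch" in rule and fnmatchcase(field_name, rule["unmatch"]):
                continue  # the name matches the excluded pattern
            return name
    return None


print(pick_template("user_id"))    # ids_not_analyzed
print(pick_template("title"))      # text_analyzed
print(pick_template("num_pages"))  # None: excluded by unmatch, default mapping applies
```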
For more, see: http://blog.csdn.net/molong1208?viewmode=contents