ElasticSearch configuration example



##################### ElasticSearch Configuration Example #####################


# This file contains an overview of various configuration settings,
# targeted at operations staff. Application developers should
# consult the guide.
#
# The installation procedure is covered in the documentation.
#
# ElasticSearch comes with reasonable defaults for most settings,
# so you can try it out without bothering with configuration.
#
# Most of the time, these defaults are just fine for running a production
# cluster. If you're fine-tuning your cluster, or wondering about the
# effect of a certain configuration option, please _do ask_ on the
# mailing list or IRC channel [http://elasticsearch.org/community].
#


# Any element in the configuration can be replaced with environment variables
# by placing them in ${...} notation. For example, if RACK_ENV_VAR is set to
# rack314 in the environment, the following line resolves to node.rack: rack314:
#
# node.rack: ${RACK_ENV_VAR}
#
# See the documentation for information on supported formats and syntax for
# the configuration file.

 


################################### Cluster ###################################


# Cluster name identifies your cluster for auto-discovery. If you're running
# multiple clusters on the same network, make sure you're using unique names.
#
# cluster.name: elasticsearch

 


#################################### Node #####################################


# Node names are generated dynamically on startup, so you're relieved
# from configuring them manually. You can tie this node to a specific name:
#
# node.name: "Franz Kafka"


# Every node can be configured to allow or deny being eligible as the master,
# and to allow or deny to store the data.
#
# Allow this node to be eligible as a master node (enabled by default):
#
# node.master: true
#
# Allow this node to store data (enabled by default):
#
# node.data: true


# You can exploit these settings to design advanced cluster topologies.
#
# 1. You want this node to never become a master node, only to hold data.
#    This will be the "workhorse" of your cluster.
#
# node.master: false
# node.data: true
#
# 2. You want this node to only serve as a master: to not store any data and
#    to have free resources. This will be the "coordinator" of your cluster.
#
# node.master: true
# node.data: false
#
# 3. You want this node to be neither master nor data node, but
#    to act as a "search load balancer" (fetching data from nodes,
#    aggregating results, etc.):
#
# node.master: false
# node.data: false


# Use the Cluster Health API [http://localhost:9200/_cluster/health],
# the Node Info API [http://localhost:9200/_cluster/nodes] or GUI tools
# to inspect the cluster state.
#


# A node can have generic attributes associated with it, which can later be used
# for customized shard allocation filtering, or allocation awareness. An attribute
# is a simple key value pair, similar to node.key: value, here is an example:
#
# node.rack: rack314


# By default, multiple nodes are allowed to start from the same installation location.
# To disable it, set the following:
#
# node.max_local_storage_nodes: 1
#


#################################### Index ####################################


# You can set a number of options (such as shard/replica options, mapping
# or analyzer definitions, translog settings, ...) for indices globally,
# in this file.
#
# Note, that it makes more sense to configure index settings specifically for
# a certain index, either when creating it or by using the index templates API.
#
# See the documentation for more information.
#


# Set the number of shards (splits) of an index (5 by default):
#
# index.number_of_shards: 5


# Set the number of replicas (additional copies) of an index (1 by default):
#
# index.number_of_replicas: 1


# Note, that for development on a local machine, with small indices, it usually
# makes sense to "disable" the distributed features:
#
# index.number_of_shards: 1
# index.number_of_replicas: 0


# These settings directly affect the performance of index and search operations
# in your cluster. Assuming you have enough machines to hold shards and
# replicas, the rule of thumb is:
#
# 1. Having more *shards* enhances the _indexing_ performance and allows to
#    _distribute_ a big index across machines.
# 2. Having more *replicas* enhances the _search_ performance and improves the
#    cluster _availability_.
#
# The "number_of_shards" is a one-time setting for an index.
#
# The "number_of_replicas" can be increased or decreased anytime,
# by using the Index Update Settings API.
#
# ElasticSearch takes care about load balancing, relocating, gathering the
# results from nodes, etc. Experiment with different settings to fine-tune
# your setup.
#


# Use the Index Status API to inspect the index status.
#

#################################### Paths ####################################


# Path to directory containing configuration (this file and logging.yml):
#
# path.conf: /path/to/conf


# Path to directory where to store index data allocated for this node.
#
# path.data: /path/to/data
#
# Can optionally include more than one location, causing data to be striped across
# the locations (a la RAID 0) on a file level, favoring locations with most free
# space on creation. For example:
#
# path.data: /path/to/data1,/path/to/data2


# Path to temporary files:
#
# path.work: /path/to/work


# Path to log files:
#
# path.logs: /path/to/logs


# Path to where plugins are installed:
#
# path.plugins: /path/to/plugins

 


#################################### Plugin ###################################


# If a plugin listed here is not installed for current node, the node will not start.
#
# plugin.mandatory: mapper-attachments, lang-groovy

 


################################### Memory ####################################


# ElasticSearch performs poorly when the JVM starts swapping: you should ensure
# that it _never_ swaps.
#
# Set this property to true to lock the memory:
#
# bootstrap.mlockall: true
#
# Make sure that the ES_MIN_MEM and ES_MAX_MEM environment variables are set
# to the same value, and that the machine has enough memory to allocate
# for ElasticSearch, leaving enough memory for the operating system itself.
#
# You should also make sure that the ElasticSearch process is allowed to lock
# the memory, eg. by using `ulimit -l unlimited`.
#
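
To verify that memory locking actually took effect after setting bootstrap.mlockall, the node info API can be asked for process information. This is a hedged sketch: the exact response layout varies between versions, and the default localhost:9200 address is assumed.

    import json
    from urllib.request import urlopen

    # Print, for every node, its name and whether mlockall succeeded.
    info = json.load(urlopen("http://localhost:9200/_nodes/process"))
    for node_id, node in info["nodes"].items():
        print(node.get("name"), node.get("process", {}).get("mlockall"))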

 


############################## Network And HTTP ###############################


# ElasticSearch, by default, binds itself to the 0.0.0.0 address, and listens
# on port [9200-9300] for HTTP traffic and on port [9300-9400] for node-to-node
# communication. (the range means that if the port is busy, it will automatically
# try the next port).
#


# Set the bind address specifically (IPv4 or IPv6):
#
# network.bind_host: 192.168.0.1


# Set the address other nodes will use to communicate with this node. If not
# set, it is automatically derived. It must point to an actual IP address.
#
# network.publish_host: 192.168.0.1


# Set both 'bind_host' and 'publish_host':
#
# network.host: 192.168.0.1


# Set a custom port for the node to node communication (9300 by default):
#
# transport.tcp.port: 9300


# Enable compression for all communication between nodes (disabled by default):
#
# transport.tcp.compress: true


# Set a custom port to listen for HTTP traffic:
#
# http.port: 9200


# Set a custom allowed content length:
#
# http.max_content_length: 100mb


# Disable HTTP completely:
#
# http.enabled: false

 


################################### Gateway ###################################


# The gateway allows for persisting the cluster state between full cluster
# restarts. Every change to the state (such as adding an index) will be stored
# in the gateway, and when the cluster starts up for the first time,
# it will read its state from the gateway.
#


# There are several types of gateway implementations. For more information,
# see the documentation.


# The default gateway type is the "local" gateway (recommended):
#
# gateway.type: local


# Settings below control how and when to start the initial recovery process on
# a full cluster restart (to reuse as much local data as possible when using a
# shared gateway).
#


# Allow recovery process after N nodes in a cluster are up:
#
# gateway.recover_after_nodes: 1


# Set the timeout to initiate the recovery process, once the N nodes
# from previous setting are up (accepts time value):
#
# gateway.recover_after_time: 5m


# Set how many nodes are expected in this cluster. Once these N nodes
# are up (and recover_after_nodes is met), begin recovery process immediately
# (without waiting for recover_after_time to expire). With the example values
# above, recovery starts as soon as 2 nodes have joined, or 5 minutes after the
# first node came up, whichever happens first:
#
# gateway.expected_nodes: 2

 


############################# Recovery Throttling #############################


# These settings allow to control the process of shards allocation between
# nodes during initial recovery, replica allocation, rebalancing,
# or when adding and removing nodes.
#


# Set the number of concurrent recoveries happening on a node:
#
# 1. During the initial recovery
#
# cluster.routing.allocation.node_initial_primaries_recoveries: 4
#
# 2. During adding/removing nodes, rebalancing, etc.
#
# cluster.routing.allocation.node_concurrent_recoveries: 2


# Set to throttle throughput when recovering (eg. 100mb, by default unlimited):
#
# indices.recovery.max_size_per_sec: 0


# Set to limit the number of open concurrent streams when
# recovering a shard from a peer:
#
# indices.recovery.concurrent_streams: 5

 


################################## Discovery ##################################


# Discovery infrastructure ensures nodes can be found within a cluster
# and the master node is elected. Multicast discovery is the default.


# Set to ensure a node sees N other master eligible nodes to be considered
# operational within the cluster. Set this option to a higher value (2-4)
# for large clusters (>3 nodes):
#
# discovery.zen.minimum_master_nodes: 1


# Set the time to wait for ping responses from other nodes when discovering.
# Set this option to a higher value on a slow or congested network
# to minimize discovery failures:
#
# discovery.zen.ping.timeout: 3s


# See the documentation for more information.


# Unicast discovery allows to explicitly control which nodes will be used
# to discover the cluster. It can be used when multicast is not present,
# or to restrict the cluster communication-wise.
#
# 1. Disable multicast discovery (enabled by default):
#
# discovery.zen.ping.multicast.enabled: false
#
# 2. Configure an initial list of master nodes in the cluster
#    to perform discovery when new nodes (master or data) are started:
#
# discovery.zen.ping.unicast.hosts: ["host1", "host2:port", "host3[portX-portY]"]


# EC2 discovery allows to use the AWS EC2 API in order to perform discovery.
#
# You have to install the cloud-aws plugin for enabling the EC2 discovery.
#
# See the documentation for more information and for a step-by-step tutorial.

 


################################## Slow Log ###################################


# Shard level query and fetch threshold logging.
#


# index.search.slowlog.threshold.query.warn: 10s
# index.search.slowlog.threshold.query.info: 5s
# index.search.slowlog.threshold.query.debug: 2s
# index.search.slowlog.threshold.query.trace: 500ms


# index.search.slowlog.threshold.fetch.warn: 1s
# index.search.slowlog.threshold.fetch.info: 800ms
# index.search.slowlog.threshold.fetch.debug: 500ms
# index.search.slowlog.threshold.fetch.trace: 200ms


# index.indexing.slowlog.threshold.index.warn: 10s
# index.indexing.slowlog.threshold.index.info: 5s
# index.indexing.slowlog.threshold.index.debug: 2s
# index.indexing.slowlog.threshold.index.trace: 500ms


################################## GC Logging #################################


# monitor.jvm.gc.ParNew.warn: 1000ms
# monitor.jvm.gc.ParNew.info: 700ms
# monitor.jvm.gc.ParNew.debug: 400ms


# monitor.jvm.gc.ConcurrentMarkSweep.warn: 10s
# monitor.jvm.gc.ConcurrentMarkSweep.info: 5s
# monitor.jvm.gc.ConcurrentMarkSweep.debug: 2s

 
