Elasticsearch Distributed Search configuration file

Source: Internet
Author: User

Elasticsearch is an open-source, distributed, near-real-time search and analytics engine that is well suited to cloud deployments. It is built on the Apache Lucene search library and provides full-text search, multilingual support, a dedicated query language, geolocation support, context-aware search suggestions, autocomplete, and search snippets. Elasticsearch exposes RESTful APIs, so its functions, including search, analysis, and monitoring, can be invoked with JSON over HTTP. The rest of this article describes what each parameter in the Elasticsearch configuration file means.


The Elasticsearch config folder contains two configuration files: elasticsearch.yml and logging.yml. The first is the basic configuration file for ES; the second is the logging configuration file. ES uses log4j for logging, so the settings in logging.yml follow the usual log4j conventions. The following explains the main settings that can be configured in elasticsearch.yml.

cluster.name: elasticsearch

Sets the ES cluster name, which is elasticsearch by default. ES automatically discovers other ES nodes on the same network segment; if several clusters share a network segment, use this setting to tell them apart.

node.name: "Franz Kafka"

Sets the node name. By default a name is picked at random from a list of names in the names.txt file, which sits in the config folder inside the ES jar and contains many interesting names added by the author.

node.master: true

Specifies whether the node is eligible to be elected master. The default is true. By default ES treats the first machine in the cluster as the master; if that machine goes down, a new master is elected.

node.data: true

Specifies whether the node stores index data, which is true by default.
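
As an illustration of how node.master and node.data combine, the sketch below configures a dedicated master node and leaves the other common combinations as comments. The role descriptions in the comments are informal labels, not settings.

```yaml
# Dedicated master: eligible for election, stores no index data.
node.master: true
node.data: false

# Data-only node: stores index data, is never elected master.
# node.master: false
# node.data: true

# Neither master nor data: only routes requests and merges results.
# node.master: false
# node.data: false
```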

index.number_of_shards: 5

Sets the default number of shards per index, which defaults to 5.

index.number_of_replicas: 1

Sets the default number of replicas per index, which defaults to 1.
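
To make the arithmetic concrete, here is a small sketch using the default values. With 5 primary shards and 1 replica of each, an index consists of 5 × (1 + 1) = 10 shard copies in total.

```yaml
index.number_of_shards: 5      # primary shards per index
index.number_of_replicas: 1    # replica copies of each primary shard
# Total shard copies per index: 5 * (1 + 1) = 10
```

The shard count is fixed once an index has been created, whereas the replica count can be changed later, so the shard count deserves more thought up front.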

path.conf: /path/to/conf

Sets the storage path of the configuration files, which is the config folder under the ES root directory by default.

path.data: /path/to/data

Sets the storage path of the index data, which is the data folder under the ES root directory by default. Multiple storage paths can be given, separated by commas, for example:

path.data: /path/to/data1,/path/to/data2

path.work: /path/to/work

Sets the storage path for temporary files, which is the work folder under the ES root directory by default.

path.logs: /path/to/logs

Sets the storage path for log files, which is the logs folder under the ES root directory by default.

path.plugins: /path/to/plugins

Sets the storage path for plugins, which is the plugins folder under the ES root directory by default.

bootstrap.mlockall: true

Set to true to lock the process memory. ES performance drops sharply once the JVM starts swapping, so make sure it does not swap: set the ES_MIN_MEM and ES_MAX_MEM environment variables to the same value, and ensure that the machine has enough memory to allocate to ES. The Elasticsearch process must also be allowed to lock memory; on Linux this can be done with the 'ulimit -l unlimited' command.

network.bind_host: 192.168.0.1

Sets the IP address to bind to, which can be either IPv4 or IPv6. The default is 0.0.0.0.

network.publish_host: 192.168.0.1

Sets the IP address that other nodes use to communicate with this node. If it is not set, it is determined automatically. The value must be a real IP address.

network.host: 192.168.0.1

This parameter sets both of the parameters above, bind_host and publish_host, at once.
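
The relationship between the three network settings can be sketched as two alternative configurations. The addresses are the placeholder values used in this article, not recommendations.

```yaml
# Option A: bind to and publish the same address in one setting.
network.host: 192.168.0.1

# Option B: listen on all interfaces but advertise one specific
# address to the other nodes in the cluster.
# network.bind_host: 0.0.0.0
# network.publish_host: 192.168.0.1
```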

transport.tcp.port: 9300

Set the TCP port for interaction between nodes, which is 9300 by default.

transport.tcp.compress: true

Sets whether data is compressed during TCP transport. The default is false (no compression).

http.port: 9200

Sets the HTTP port for the external service, which defaults to 9200.

http.max_content_length: 100mb

Sets the maximum size of the content of an HTTP request, which defaults to 100mb.

http.enabled: false

Sets whether to serve external requests over HTTP. The default is true (enabled).

gateway.type: local

Sets the gateway type. The default is local, meaning the local file system. It can be set to the local file system, a distributed file system, Hadoop HDFS, or Amazon S3. How to set up the other file systems will be covered another time.

gateway.recover_after_nodes: 1

Starts data recovery once N nodes in the cluster have started. The default is 1.

gateway.recover_after_time: 5m

Sets the time-out for initializing the data recovery process, which is 5 minutes by default.

gateway.expected_nodes: 2

Sets the number of nodes expected in this cluster. The default is 2. Once these N nodes have started, data recovery begins immediately.
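
The three gateway recovery settings are easiest to read together. The sketch below assumes a hypothetical cluster of 5 nodes; the node counts are illustrative values, not defaults.

```yaml
# Hypothetical 5-node cluster (sizes are assumptions for illustration).
gateway.expected_nodes: 5        # recover immediately once all 5 nodes have joined
gateway.recover_after_nodes: 3   # otherwise, do nothing until at least 3 have joined
gateway.recover_after_time: 5m   # then wait up to 5 minutes for the rest before recovering
```

Waiting for most of the cluster before recovering avoids needlessly copying shards around when nodes start a few seconds apart after a full restart.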

cluster.routing.allocation.node_initial_primaries_recoveries: 4

Sets the number of concurrent recovery threads during initial data recovery, which is 4 by default.

cluster.routing.allocation.node_concurrent_recoveries: 2

Sets the number of concurrent recovery threads used when nodes are added or removed or when shards are rebalanced. The default is 2.

indices.recovery.max_size_per_sec: 0

Sets the bandwidth limit for data recovery, for example 100mb. The default is 0, which means unlimited.

indices.recovery.concurrent_streams: 5

Limits the number of concurrent streams opened when recovering data from other shards. The default is 5.
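
As a sketch of how the recovery settings might be combined to keep recovery from saturating the network, the fragment below throttles bandwidth to the 100mb example value mentioned above and leaves the concurrency settings at the values shown in this article. Whether these values suit a given cluster is an assumption to verify against its hardware.

```yaml
# Concurrency of shard recoveries per node.
cluster.routing.allocation.node_initial_primaries_recoveries: 4
cluster.routing.allocation.node_concurrent_recoveries: 2

# Throttle recovery traffic instead of leaving it unlimited (0).
indices.recovery.max_size_per_sec: 100mb
indices.recovery.concurrent_streams: 5
```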

discovery.zen.minimum_master_nodes: 1

Ensures that a node can see N other master-eligible nodes in the cluster. The default is 1; for large clusters a larger value (2-4) can be set.
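
A commonly recommended rule of thumb, which is not stated in the text above, is to set this value to a majority (quorum) of the master-eligible nodes in order to reduce the risk of a split brain. The sketch below assumes a hypothetical cluster with 3 master-eligible nodes.

```yaml
# Assumption: 3 nodes in the cluster have node.master: true.
# Quorum = floor(3 / 2) + 1 = 2
discovery.zen.minimum_master_nodes: 2
```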

discovery.zen.ping.timeout: 3s

Sets the ping timeout used when automatically discovering other nodes in the cluster, which defaults to 3 seconds. On a poor network, a higher value helps prevent discovery errors.

discovery.zen.ping.multicast.enabled: false

Sets whether multicast discovery of nodes is enabled, which is true by default.

discovery.zen.ping.unicast.hosts: ["host1", "host2:port", "host3[portX-portY]"]

Sets the initial list of master nodes in the cluster. These hosts are used to automatically discover nodes that newly join the cluster.
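
The two discovery settings above are typically used together when multicast is unavailable or undesirable. In the sketch below the addresses and ports are hypothetical placeholders; the three entries show the plain host, host:port, and host[port range] forms.

```yaml
# Turn off multicast and list the nodes to contact explicitly.
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["192.168.0.1", "192.168.0.2:9300", "192.168.0.3[9300-9400]"]
```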

The following are the slow log parameter settings for searches:

index.search.slowlog.level: TRACE
index.search.slowlog.threshold.query.warn: 10s
index.search.slowlog.threshold.query.info: 5s
index.search.slowlog.threshold.query.debug: 2s
index.search.slowlog.threshold.query.trace: 500ms
index.search.slowlog.threshold.fetch.warn: 1s
index.search.slowlog.threshold.fetch.info: 800ms
index.search.slowlog.threshold.fetch.debug: 500ms
index.search.slowlog.threshold.fetch.trace: 200ms

Source: http://www.yoodb.com/
