http://nkcoder.github.io/blog/20141106/elkr-log-platform-deploy-ha/
1. Architecture for highly available scenarios
The previous article, "Using Elasticsearch + Logstash + Kibana + Redis to build a log management service", described the overall framework of the log service and the deployment of each component. This article discusses how to make that framework highly available, from three angles:
- Redis as the broker: use Redis Cluster or a primary/standby structure instead of a single instance to improve the availability of the broker component;
- Logstash as the indexer: deploy multiple Logstash instances that process log messages together to improve the availability of the indexer component;
- Elasticsearch as search & storage: use an Elasticsearch cluster to improve the performance and availability of the search & storage component.
A schematic diagram of a high-availability scenario for the log service is:
Each aspect is described in turn below.
2. Redis Cluster and primary/standby structure
2.1 Redis Cluster
First we need to deploy a Redis cluster. For convenience, I deployed a three-master, three-slave cluster on a single machine, on ports 7000, 7001, 7002, 7003, 7004, and 7005. Taking port 7000 as an example, the configuration file is:
include ../redis.conf
daemonize yes
pidfile /var/run/redis_7000.pid
port 7000
logfile /opt/logs/redis/7000.log
appendonly yes
cluster-enabled yes
cluster-config-file node-7000.conf
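The other five nodes differ from this file only in the port number. As a convenience, the six files could be generated from one template; the following is a minimal Python sketch under that assumption. The function names, the output file naming (`<port>.conf`), and the `base_conf` argument (the path of the shared base configuration to include) are my own choices for illustration, not something the setup above prescribes.

```python
from pathlib import Path
from typing import List

# Settings shared by every node, mirroring the port-7000 example.
TEMPLATE = """\
include {base_conf}
daemonize yes
pidfile /var/run/redis_{port}.pid
port {port}
logfile /opt/logs/redis/{port}.log
appendonly yes
cluster-enabled yes
cluster-config-file node-{port}.conf
"""

PORTS = range(7000, 7006)  # six nodes: three masters and three slaves


def render_node_conf(port: int, base_conf: str) -> str:
    """Return the configuration text for one cluster node."""
    return TEMPLATE.format(port=port, base_conf=base_conf)


def write_node_confs(out_dir: str, base_conf: str) -> List[Path]:
    """Write one <port>.conf file per node into out_dir and return the paths."""
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    paths = []
    for port in PORTS:
        path = target / f"{port}.conf"
        path.write_text(render_node_conf(port, base_conf))
        paths.append(path)
    return paths
```

Each generated file can then be passed to its own redis-server process; joining the six nodes into a cluster is a separate step.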
From Redis's point of view, both the remote Logstash (the shipper) and the central Logstash are clients of the Redis cluster, so each only needs to connect to any one of the cluster nodes. The Redis sections of the remote and central Logstash configurations are:
shipper.conf:
output {
  redis {
    host => "20.8.40.49:7000"
    data_type => "list"
    key => "Key_count"
  }
}
central.conf:
input {
  redis {
    host => "20.8.40.49"
    port => 7000
    type => "Redis-cluster-input"
    data_type => "list"
    key => "Key_count"
  }
}
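In Redis Cluster, every key is assigned to one of 16384 hash slots, and a node that does not own a key's slot replies with a redirection to the node that does. To see which slot the list key used here falls into, the following is a minimal Python sketch of the slot calculation as described in the Redis Cluster specification: CRC16 (XMODEM variant) of the key, modulo 16384, with hash-tag handling. The function names are my own and are not part of any Redis client library.

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC16-CCITT (XMODEM): polynomial 0x1021, initial value 0x0000."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def hash_slot(key: str) -> int:
    """Map a key to one of the 16384 Redis Cluster hash slots."""
    raw = key.encode("utf-8")
    # Hash tags: if the key contains a non-empty {...} section, only the
    # part between the first '{' and the next '}' is hashed.
    start = raw.find(b"{")
    if start != -1:
        end = raw.find(b"}", start + 1)
        if end != -1 and end != start + 1:
            raw = raw[start + 1:end]
    return crc16_xmodem(raw) % 16384


if __name__ == "__main__":
    print(hash_slot("Key_count"))
```

Because a list is a single key, the whole `Key_count` list lives on the one master that owns its slot. The cluster spreads different keys across nodes, not the elements of one list.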
Advantages and disadvantages of using Redis Cluster:
- It improves the availability of the broker component: when every master node has a slave, the failure of any single node does not affect the normal operation of the cluster, and if cluster-require-full-coverage no is set, the subset of nodes that is still alive remains available.
- It scales well when Redis, as the broker component, becomes a bottleneck.
- The painful part is scaling (adding and removing nodes), which requires resharding by hand. The official tool for this is redis-trib.rb; I have implemented a Java version, redis-toolkit, which can be used as a reference.
- At the time of writing, Redis Cluster is still at RC1, and the stable release is still some way off.
2.2 Redis primary/standby structure
Note that primary/standby here does not mean master/slave replication: the standby Redis instance is only a redundant node. When the primary node goes down, the standby node takes over, and at any moment only one node is serving. Both the remote Logstash and the central Logstash must explicitly list all primary and standby Redis nodes; Logstash polls the node list and selects an available node. For example, with two Redis instances, 6379 as the primary and 6380 as the standby, the remote and central Logstash configurations are:
shipper.conf:
output {
  redis {
    host => ["20.8.40.49:6379", "20.8.40.49:6380"]
    data_type => "list"
    key => "Key_count"
  }
}
central.conf:
input {
  redis {
    host => "20.8.40.49"
    port => 6379
    type => "Redis-cluster-input"
    data_type => "list"
    key => "Key_count"
  }
  redis {
    host => "20.8.40.49"
    port => 6380
    type => "Redis-cluster-input"
    data_type => "list"
    key => "Key_count"
  }
}
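The "poll the node list and pick an available node" behaviour can be pictured with a small Python sketch. This is only an illustration of the idea, not Logstash's actual implementation: the function name `first_available` and the use of a plain TCP connect as the availability check are my own assumptions.

```python
import socket
from typing import Iterable, Optional, Tuple

Node = Tuple[str, int]


def first_available(nodes: Iterable[Node], timeout: float = 1.0) -> Optional[Node]:
    """Return the first node that accepts a TCP connection, or None."""
    for host, port in nodes:
        try:
            # A successful connect is treated as "node is up"; the probe
            # connection is closed again immediately.
            with socket.create_connection((host, port), timeout=timeout):
                return (host, port)
        except OSError:
            continue  # node is down or unreachable; try the next one
    return None


if __name__ == "__main__":
    # Primary first, standby second, as in the configuration above.
    print(first_available([("20.8.40.49", 6379), ("20.8.40.49", 6380)]))
```

Listing the primary first means the standby is chosen only when the primary cannot be reached, which matches the "only one node is serving" rule.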
Pros and cons of the Redis primary/standby structure:
- It improves the availability of the broker component: as long as one of the primary and standby nodes is available, the broker service is available.
- It does not solve the problem of Redis becoming a bottleneck.
3. Deploying multiple central Logstash instances
When the volume of log messages is very large, Logstash as the indexer is likely to become a bottleneck. In that case you can deploy multiple central Logstash instances. They are peers: together they pull messages from the broker, and each node is independent of the others. Every central Logstash node uses exactly the same configuration. For example, when the broker is a Redis cluster, each central Logstash is configured as:
central.conf:
input {
  redis {
    host => "20.8.40.49"
    port => 7000
    type => "Redis-cluster-input"
    data_type => "list"
    key => "Key_count"
  }
}
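Several identically configured consumers can share one list because popping an element is atomic: each message is handed to exactly one consumer. The following Python sketch simulates this with threads and an in-process `queue.Queue` standing in for the Redis list. It involves no Redis or Logstash, and the function name `run_consumers` is hypothetical.

```python
import queue
import threading
from typing import Dict, Iterable, List


def run_consumers(messages: Iterable, worker_count: int = 3) -> Dict[int, List]:
    """Drain a shared queue with several workers; return what each worker got."""
    broker: queue.Queue = queue.Queue()  # stand-in for the Redis list
    for message in messages:
        broker.put(message)

    # Each worker appends only to its own list, so no extra locking is needed.
    consumed: Dict[int, List] = {i: [] for i in range(worker_count)}

    def worker(worker_id: int) -> None:
        while True:
            try:
                # get_nowait() removes and returns one item atomically,
                # playing the role of the pop command in this sketch.
                message = broker.get_nowait()
            except queue.Empty:
                return
            consumed[worker_id].append(message)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(worker_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return consumed


if __name__ == "__main__":
    result = run_consumers(range(1000))
    print({worker_id: len(items) for worker_id, items in result.items()})
```

How the messages are split between workers varies from run to run, but the union of all workers' lists always contains every message exactly once.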
Pros and cons of deploying multiple central Logstash instances:
- It improves the availability of the indexer component: the central Logstash nodes are independent of each other, so the failure of any one node affects neither the other nodes nor the indexer component as a whole.
- It improves the processing capacity and efficiency of the indexer component: when the broker is Redis, every central Logstash is a Redis client and pulls messages with the BLPOP command. Because every single Redis command is atomic, each message is popped by only one node, so adding nodes adds capacity.
- It raises deployment and maintenance costs, which can be reduced with configuration management tools such as Puppet or SaltStack.
4. Elasticsearch cluster
Elasticsearch natively supports cluster mode: nodes communicate via unicast or multicast, and the cluster automatically detects node additions, failures, and recoveries and reorganizes its indexes accordingly.
For example, start two Elasticsearch instances with the default configuration to form a cluster:
$ bin/elasticsearch -d
$ bin/elasticsearch -d
With the default configuration, the two instances listen for HTTP on ports 9200 and 9201 and use ports 9300 and 9301 for node-to-node communication. By default the cluster is formed via multicast, and its name is elasticsearch.
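Before pointing Logstash at the cluster, it can help to confirm that both instances really joined it. The following is a minimal Python sketch that queries Elasticsearch's `_cluster/health` HTTP endpoint and checks the `status` and `number_of_nodes` fields of the response. The helper names and the rule "not red and at least the expected number of nodes" are my own assumptions, and the base URL combines the host used in this article with the default HTTP port.

```python
import json
from urllib.request import urlopen


def fetch_cluster_health(base_url: str, timeout: float = 5.0) -> dict:
    """GET <base_url>/_cluster/health and return the decoded JSON body."""
    with urlopen(f"{base_url}/_cluster/health", timeout=timeout) as response:
        return json.load(response)


def is_serving(health: dict, expected_nodes: int) -> bool:
    """True when the expected nodes have joined and the status is not red."""
    return (
        health.get("status") in ("green", "yellow")
        and health.get("number_of_nodes", 0) >= expected_nodes
    )


if __name__ == "__main__":
    # Assumed address: the article's host with the default HTTP port.
    health = fetch_cluster_health("http://20.8.40.49:9200")
    print(health.get("cluster_name"), health.get("status"), is_serving(health, 2))
```

Either instance can be queried (9200 or 9201 here), since every node reports the state of the whole cluster.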
The central Logstash only needs to be configured with the Elasticsearch cluster name (if it cannot find the ES cluster, you can specify the host explicitly: host => "20.8.40.49"), for example:
output {
  elasticsearch {
    cluster => "elasticsearch"
    codec => "json"
    protocol => "http"
  }
}
Pros and cons of using an Elasticsearch cluster:
- Improved availability: when any node in the cluster goes down, indexes and replicas are automatically reallocated.
- Excellent horizontal scalability: when new nodes are added to the cluster, the indexes are automatically reorganized.
5. Follow-up work
For the ELKR log service, the next steps are: learning grok regular expressions to match custom log output formats; studying Elasticsearch's query features and how they work; and getting familiar with Kibana's rich chart display features.
That concludes this introduction to the high-availability setup. Running it in a real production environment will surely turn up many unexpected problems, which I will continue to summarize and share.