Redis China User Group | Redis Cluster in Large-Scale Production Practice

Tags failover redis cluster

Guest: Agronomy

I am delighted to have the opportunity to share our Redis Cluster production practices with the Redis China User Group. I am currently mainly responsible for Redis/HBase operations and development support, and I also take part in tool development.

Outline
I. Production application scenarios
II. Storage architecture evolution
III. Application best practices
IV. Summary of operations and maintenance experience


Parts I and II: An introduction to the Redis Cluster production application scenarios and the evolution of the storage architecture.
Part III: How stable and mature is Redis Cluster, which pitfalls did we run into, and how did we solve them? This is the part most people care about.
Part IV: A brief introduction to our experience with large-scale operations, including deployment, monitoring, management, and Redis tool development.

I. Production Application Scenarios
1. Business scope

Redis Cluster is used primarily in back-end services as an in-memory storage service. Its main users are three business areas: big data real-time recommendation/ETL, risk control, and marketing. Cluster replaces the current Twemproxy three-tier architecture as our general-purpose storage architecture. Redis Cluster greatly simplifies our storage architecture and also solves the problem that the Twemproxy architecture cannot add nodes online. We currently run dozens of clusters in production, about 2,000 instances in total, and the largest single cluster has 250+ instances.
These are our production scenarios: primarily storage for back-end services. We do not currently use Cluster as a cache.

2. Characteristics of the big data, risk control, and marketing systems
    • Data volumes are large: a single cluster holds from tens of GB up to the TB level in memory.
    • As storage for back-end applications, the data comes mainly from three sources:
      • Kafka -> Redis Cluster, via Storm/Spark real-time jobs
      • Hive -> Redis Cluster, via MapReduce programs
      • MySQL -> Redis Cluster, via Java/C++ programs
    • The data is generated by offline/real-time jobs; read and write request volumes are high, and so are the requirements on read and write performance.
    • Business traffic spikes at peak times, with reads and writes increasing several-fold, so multiple Redis instances are needed to carry the read/write load.
    • Business requirements change quickly and the schema changes frequently. With MySQL as storage, this would mean frequent DDL changes, performed as online schema changes.
    • Big promotions require frequent capacity expansion.
3. Why choose Redis Cluster
3.1 Cluster suits our back-end production applications
    • Online horizontal scaling, which meets our need for large-scale expansion.
    • Failover capability and high availability.
    • Although Cluster does not guarantee strong consistency between master and slave data, our back-end services can tolerate a small amount of data loss after a failover.
3.2 Simple Architecture
    • Decentralized architecture in which all nodes are peers. Slave nodes provide data redundancy and are promoted to master when the master node fails.
    • Replaces the Twemproxy three-tier architecture and reduces system complexity.
    • Saves a lot of hardware: our LVS + Twemproxy layers use close to a thousand physical machines.
    • With the LVS and Twemproxy layers removed, read and write performance improved significantly. Response time dropped from 100-200 us to 50-100 us.
    • Fewer system bottlenecks. The LVS layer is limited by NIC bandwidth and PPS throughput, and a single Twemproxy node performs poorly for workloads with large requests.
In short, we chose Redis Cluster for two main reasons: simplicity and scalability. We also use Cluster to replace our Twemproxy clusters, because the three-tier architecture is a real headache: complex, bottleneck-prone, and inconvenient to manage.
II. Storage Architecture Evolution
1. Architecture evolution

In July 2014, to prepare for that year's 814 big promotion, we migrated our standalone Redis services to Twemproxy. Twemproxy let us shard data and expand the back end quickly. To avoid having to scale again, we statically allocated enough resources up front.
Later, Twemproxy exposed many system bottlenecks, and its resource usage was somewhat wasteful. We decided to replace this complex three-tier architecture with Redis Cluster.
After Redis Cluster reached GA, we started using it in production. We began with version 3.0.2, then deployed 3.0.3 at scale, and started using version 3.0.7 last month.

Below is a brief comparison of the two architectures and their pros and cons.

2. Twemproxy architecture
Advantages
    • Sharding logic is transparent to developers; reads and writes work the same way as with a single Redis.
    • Can act as a proxy for both cache and storage (via auto-eject).
Disadvantages
    • The architecture is complex, with many layers: LVS, Twemproxy, Redis, Sentinel, and their control-layer programs.
    • Management costs and hardware costs are high.
    • An LVS machine with 2 x 1 Gbps NICs supports at most 1.4 million PPS.
    • In high-traffic systems, the number of proxy nodes approaches the number of Redis nodes.
    • The Redis layer still scales poorly, so enough Redis storage nodes must be pre-allocated.

This is the Twemproxy architecture: clients connect to the LVS (LB) layer at the top; the second layer consists of identical Twemproxy nodes; below them are the Redis master nodes and their hot-standby slave nodes, plus an independent Sentinel cluster and the failover control program. That is all on Twemproxy for now.

3. Redis Cluster architecture
Advantages
    • Decentralized architecture.
    • Data is distributed across multiple Redis instances by slot.
    • Slaves hold standby copies of the data for failover, so the cluster recovers quickly.
    • Automatic failover on failure. Nodes exchange state information through the gossip protocol, and a voting mechanism promotes a slave to master.
    • Manual failover is also available, which supports operations such as upgrades and migrations.
    • Reduces hardware and operations costs and improves the scalability and availability of the system.
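
As a rough illustration of how keys are distributed by slot, the Python sketch below computes a key's slot as the Redis Cluster specification describes it: CRC16 (XMODEM variant) of the key modulo 16384, hashing only the hash tag between the first "{" and the next "}" when a non-empty tag is present. The helper name key_slot is our own and is not part of any client library.

```python
import binascii

SLOT_COUNT = 16384  # fixed number of hash slots in Redis Cluster


def key_slot(key: str) -> int:
    """Return the hash slot for a key (illustrative helper, not a client API)."""
    data = key.encode("utf-8")
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        # Only a non-empty tag such as {user1000} restricts what gets hashed.
        if end != -1 and end > start + 1:
            data = data[start + 1:end]
    # binascii.crc_hqx with initial value 0 is the CRC16/XMODEM variant.
    return binascii.crc_hqx(data, 0) % SLOT_COUNT


if __name__ == "__main__":
    for name in ("foo", "hello", "{user1000}.following", "{user1000}.followers"):
        print(name, key_slot(name))
```

Keys that share a hash tag, such as the two {user1000} keys above, always map to the same slot and therefore to the same master.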
Disadvantages
    • The client implementation is complex: the driver must be a smart client that caches the slot mapping and updates it promptly.
    • At present only JedisCluster is relatively mature, and its exception handling is still incomplete, for example the common "max redirect" exception.
    • Client immaturity affects application stability and makes development harder.
    • A node that is blocked for some reason (for longer than cluster-node-timeout) is judged to be offline. Such a failover is unnecessary; Sentinel has the same switchover scenario.
The cluster architecture is as follows:

Cluster.jpg

Only master nodes are shown in the figure (slaves are omitted). All nodes form a complete graph, and a slave differs from a master only in its role and function within the cluster.

That concludes the architecture evolution. Part III is the part most people are interested in.

III. Application Best Practices
    • How stable is Redis Cluster?
    • What pitfalls are there?
    • Development guidelines & best practices
1. Stability
    • The cluster is very stable when it is not being expanded.
    • During expansion (resharding), earlier Jedis versions sometimes throw "max redirect" exceptions.
      Analysis of the Jedis source shows that the request retries hit the upper limit without succeeding. Two directions to investigate: can the client not connect to Redis, or is the cluster node information inconsistent?
    • Defect in the liveness detection mechanism
      Redis's liveness detection may judge a master to be in the failed state and trigger a switchover when a slow query, a blocking command, or another performance problem leaves the master unresponsive for a long time. Such a switchover is unnecessary. Optimization strategy:
      a) The default cluster-node-timeout is 15 s and can be increased appropriately;
      b) Avoid commands that block for a long time, such as SAVE/FLUSHDB, and slow queries such as KEYS pattern.

In general, Redis Cluster is already very stable, but a few small problems need attention in applications. Below are five pitfalls to watch out for.

2. What are the pitfalls?
2.1 Jedis "max redirect" exception during migration
    • The conclusion of the discussion on GitHub is that the program should retry.
    • Max redirect issue: https://github.com/xetorthio/jedis/issues/1238
    • Retry time should be greater than failover time.
    • Jedis parameter tuning: increase the DEFAULT_MAX_REDIRECTIONS parameter in Jedis; the default value is 5.
    • Avoid multi-key operations such as MSET/MGET. Some clients do not implement multi-key operations.
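
To make the "max redirect" failure mode concrete, here is a toy Python model of a client following MOVED redirections with an upper limit. It is a sketch, not Jedis code: FakeNode, MovedError, MaxRedirectsError, and get_with_redirects are hypothetical names, the nodes are in-memory objects, and the exact way Jedis counts attempts may differ. Only the reply format "MOVED <slot> <host>:<port>" comes from the Redis Cluster protocol.

```python
class MovedError(Exception):
    """Mimics the cluster reply 'MOVED <slot> <host>:<port>'."""

    def __init__(self, slot, address):
        super().__init__(f"MOVED {slot} {address}")
        self.slot = slot
        self.address = address


class MaxRedirectsError(Exception):
    """Raised when the redirect limit is exhausted (hypothetical name)."""


class FakeNode:
    """In-memory stand-in for a cluster node, not a real Redis connection."""

    def __init__(self, address, owned_slots, believed_owner):
        self.address = address
        self.owned_slots = set(owned_slots)
        self.believed_owner = believed_owner  # slot -> address this node points to

    def get(self, slot):
        if slot in self.owned_slots:
            return f"value-from-{self.address}"
        raise MovedError(slot, self.believed_owner[slot])


def get_with_redirects(nodes, start_address, slot, max_redirections=5):
    """Send a request and follow MOVED replies, up to max_redirections times."""
    address = start_address
    for _ in range(max_redirections + 1):  # first attempt plus the redirects
        try:
            return nodes[address].get(slot)
        except MovedError as moved:
            address = moved.address  # follow the redirect to the reported owner
    raise MaxRedirectsError(f"gave up after {max_redirections} redirections")
```

When the nodes agree on who owns the slot, one redirect is enough. When two nodes temporarily point at each other, as can happen while slot information is still propagating, the client bounces between them until the limit is reached, which is why a larger limit and an application-level retry both help.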
2.2 Unnecessary failover caused by long blocking
    • Blocking commands, such as SAVE/FLUSHALL/FLUSHDB.
    • Slow queries: KEYS *, operations on big keys, O(N) operations.
    • Rename dangerous commands:
      • rename-command FLUSHDB REDIS_FLUSHDB
      • rename-command FLUSHALL REDIS_FLUSHALL
      • rename-command KEYS REDIS_KEYS
2.3 Pitfall when listening on both IPv4 and IPv6

Symptom: Redis starts normally, but on the cluster bus port only the IPv6 socket is created. The affected node cannot join the cluster and cannot obtain the epoch.
Workaround: specify the NIC's IPv4 address, or 0.0.0.0, at startup by adding this line to the configuration file: bind 0.0.0.0
This problem occurred while setting up a cluster. bind 0.0.0.0 has some security implications, but it is a relatively simple and common solution.

2.4 Slow data migration
    • Data migration is done mainly with redis-trib.rb reshard.
    • Before redis-3.0.6, MIGRATE operated on a single key. Starting with redis-3.0.6, a single MIGRATE can move multiple keys.
    • Within a Redis cluster, only one slot may be in the migrating state at a time; slots cannot be migrated concurrently.
    • If redis-trib.rb reshard is interrupted, repair the cluster state with redis-trib.rb fix.
2.5 Version selection/upgrade recommendations
    • We have started using version 3.0.7; many bugs fixed in 3.2.0 have been backported to it.
    • We have also started testing version 3.2.0, which brings significant memory optimizations.
    • Tips
      • redis-trib.rb supports resharding/rebalance with assigned weights.
      • redis-trib.rb supports importing data from a standalone Redis into a cluster.

The last two points are not so much pitfalls as shortcomings, and the tips are quite practical. Now for the best practices.

3. Best practices
3.1 Build good fault tolerance into applications
    • On connection or request exceptions, retry and reconnect.
    • Retry time should be greater than cluster-node-timeout.
Again, the emphasis is on fault tolerance. This is not specific to Cluster; it applies to all application design.
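
The retry rule above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: call_with_retry, operation, and reconnect are hypothetical names, the 20-second window is only an example chosen to exceed the default 15-second cluster-node-timeout, and a real client would catch its own library's exception types rather than the built-in ConnectionError and TimeoutError.

```python
import time


def call_with_retry(operation, reconnect, retry_window_s=20.0, interval_s=1.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Retry operation() until it succeeds or retry_window_s has elapsed."""
    deadline = clock() + retry_window_s
    while True:
        try:
            return operation()
        except (ConnectionError, TimeoutError):
            if clock() + interval_s > deadline:
                raise  # window exhausted: surface the last error to the caller
            sleep(interval_s)
            reconnect()  # e.g. rebuild the connection or refresh the slot map
```

The sleep and clock parameters exist only so that the loop can be exercised in tests without real waiting.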
3.2 Establish development guidelines
    • Slow queries drive process CPU to 100% and make client requests slow or even time out.
    • Avoid hot keys, which turn a node into the weakest link of the system.
    • Avoid big keys, which saturate the NIC and cause slow queries.
    • TTL: set a reasonable TTL to free up memory. Avoid large numbers of keys expiring in the same time window; although Redis has many optimizations here, this can still slow requests down.
    • Key naming rules.
    • Avoid blocking operations; transactions are not recommended.
Guidelines help developers use NoSQL in the best possible way.
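
One common way to keep keys from expiring in the same time window is to add random jitter to the TTL when writing. The Python sketch below illustrates the idea; ttl_with_jitter and the 10% default are our own illustrative choices, not something prescribed by the talk or by Redis.

```python
import random


def ttl_with_jitter(base_ttl_s: int, jitter_ratio: float = 0.1, rng=random) -> int:
    """Return base_ttl_s plus or minus up to jitter_ratio, never below 1 second."""
    spread = int(base_ttl_s * jitter_ratio)
    return max(1, base_ttl_s + rng.randint(-spread, spread))


if __name__ == "__main__":
    # A batch job writing many keys would pass each result as the expiry in seconds.
    print([ttl_with_jitter(3600) for _ in range(5)])
```

With a one-hour base TTL, expirations are spread over roughly twelve minutes instead of landing in the same second.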
3.3 Optimize connection pool usage
    • Avoid making the server maintain a large number of connections.
    • A reasonable connection pool size.
    • A reasonable heartbeat detection interval.
    • Release connections promptly after use.
    • A Jedis connection creation exception (fixed):
      https://github.com/xetorthio/jedis/issues/1252

Connection problems are the most common problems in Redis development: connection timeouts, read timeouts, and connection-borrowing problems.

3.4 Use standalone Redis, Twemproxy, and Cluster differently
    • With standalone Redis, pipelines and multi-key operations are recommended to reduce the number of RTTs and increase request efficiency.
    • Twemproxy also supports pipelines, and it supports some multi-key operations.
    • With Redis Cluster, pipelines and multi-key operations are not recommended, to reduce the situations that produce max redirect.

The difference between standalone Redis and Cluster comes partly from data sharding and partly from how clients implement support for it.
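
To show why sharding complicates multi-key operations, the Python sketch below groups the keys of a would-be MGET by hash slot. It assumes the slot rule from the Redis Cluster specification (CRC16/XMODEM of the key, or of its {hash tag}, modulo 16384); key_slot and group_keys_by_slot are hypothetical helper names, not functions of any client library.

```python
import binascii
from collections import defaultdict


def key_slot(key: str) -> int:
    """CRC16/XMODEM of the key (or its non-empty {hash tag}) modulo 16384."""
    data = key.encode("utf-8")
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        if end > start + 1:  # also false when no closing brace was found
            data = data[start + 1:end]
    return binascii.crc_hqx(data, 0) % 16384


def group_keys_by_slot(keys):
    """Split a multi-key request into per-slot batches (illustrative helper)."""
    batches = defaultdict(list)
    for key in keys:
        batches[key_slot(key)].append(key)
    return dict(batches)
```

Unrelated keys usually end up in different batches, so one logical MGET becomes several requests to different nodes, while keys sharing a hash tag stay in a single batch.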

3.5 Parameters that need tuning

1) Set the kernel parameter vm.overcommit_memory=1 to avoid BGSAVE/AOF rewrite failures.
2) Setting timeout to a value greater than 0 allows Redis to actively release idle connections.
3) Set repl-backlog-size to 64MB. The default value is 1MB; when the write volume is large, backlog overflow causes incremental (partial) resynchronization to fail.
4) Client buffer parameter adjustment:
client-output-buffer-limit normal 256mb 128mb 60
client-output-buffer-limit slave 512mb 256mb 180
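
The reasoning behind the larger repl-backlog-size can be checked with simple arithmetic: a partial resynchronization is only possible if everything written while the slave was disconnected still fits in the backlog. The Python sketch below works through that check; the 2 MB/s write rate and 10-second disconnect are made-up example numbers, and both function names are hypothetical.

```python
MB = 1024 * 1024


def backlog_covers_outage(backlog_bytes, write_bytes_per_s, outage_s):
    """True if the backlog can hold everything written during the outage."""
    return write_bytes_per_s * outage_s <= backlog_bytes


def max_outage_seconds(backlog_bytes, write_bytes_per_s):
    """Longest disconnect that still allows a partial resynchronization."""
    return backlog_bytes / write_bytes_per_s


if __name__ == "__main__":
    write_rate = 2 * MB  # assumed write traffic, for illustration only
    print(backlog_covers_outage(1 * MB, write_rate, 10))   # default 1MB backlog
    print(backlog_covers_outage(64 * MB, write_rate, 10))  # 64MB backlog
    print(max_outage_seconds(64 * MB, write_rate))
```

Under these assumed numbers, 20 MB is written during the disconnect, so the default 1MB backlog overflows while a 64MB backlog tolerates disconnects of up to 32 seconds.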

IV. Summary of Operations and Maintenance Experience
1. Automated management
    • The CMDB manages all resource information.
    • Agents report hardware and software information.
    • Standardize the basic setup: machine model, OS kernel parameters, software versions.
    • Puppet manages and distributes standardized configuration files, common scheduled tasks, software packages, and operations tools.
    • Self-service resource requests.
2. Automated monitoring
    • Zabbix is the primary tool for collecting monitoring data.
    • A real-time performance dashboard was developed and made available to developers for queries.
    • Multiple Redis instances are deployed per machine and monitored with the help of Zabbix discovery.
    • We developed Titan, a DB response time monitoring tool.
    • The basic idea comes from pt-query-digest: logs are generated by analyzing TCP response packets. Flume Agent + Kafka handle collection, Spark does the real-time computation, and HBase is the storage. The result is performance data such as hot queries, slow queries, and request sources.
3. Automated operations
    • Self-service resource requests.
    • If the request is reasonable, the cluster can be deployed with one click.
Whatever does not have to be done by hand, we firmly avoid doing by hand. In addition, giving developers access to monitoring data is very important: it lets them know how their services perform, and sometimes developers are the ones who spot unusual cluster behavior, such as data problems. That is all on operations. Next comes the practical part: several useful tools developed by our colleague deep.
4. Introduction to open source Redis tools
4.1 Redis live data migration tool

1) Online live migration.
2) Migration between heterogeneous Redis/Twemproxy/Cluster deployments, in any direction.
3) GitHub: https://github.com/vipshop/redis-migrate-tool

4.2 Redis Cluster management tool

1) Batch changes to cluster parameters.
2) Cluster rebalance.
3) Many more features; see GitHub for details:
https://github.com/deep011/redis-cluster-tool

4.3 Multi-threaded Twemproxy

1) Significantly increases the throughput of a single proxy; the number of threads is configurable.
2) In stress tests, 20 threads reach 500,000+ QPS, and the optimal configuration of 6 threads reaches 290,000.
3) Fully compatible with Twemproxy.
4) GitHub:
https://github.com/vipshop/twemproxies

4.4 Multi-threaded Redis, in development

1) GitHub:
https://github.com/vipshop/vire

2) Contributions are welcome. This project is still under development, and we hope you will bring good ideas.

Questions and Answers (answered by Agronomy and Shen)

Question 1: Does a version upgrade have any impact on the data?

A: We upgraded from 2.8.17 to 3.0.3/3.0.7 by restarting, without any problems. We have not yet actually performed an upgrade from 3.0 to 3.2.

Question 2: Is there a good way to do read/write splitting under Sentinel mode?

A: We do not use read/write splitting; all reads and writes go to the master. We have too many clusters, and splitting would complicate management. In addition, we shard the data, so read/write splitting is not necessary. Almost all of our deployments are configured as one master with one slave.

Question 3: Redis fork is mainly for RDB, right? What is the reason for removing it?

A: Fork is unfriendly.

Question 4: Without fork, how do you ensure that the RDB snapshot is accurate? Are there other copy-on-write (COW) mechanisms?

A: There are other ways. This is still in the exploratory phase, but the goal is not to fork.

Question 5: Batch operations have many problems in Redis Cluster mode, but without batch operations the performance of the business system drops.

A: This is indeed a client support problem, and the Jedis authors are also reluctant to support pipelines or some multi-key operations. For large numbers of operations, you can use multithreading to improve client throughput.
(All rights reserved by the Redis China User Group. Please credit the source when reprinting.)

Appendix:

Guest: Qunchenmy
Technical blog: [http://mdba.cn]
Weibo: [http://weibo.com/sylarqun]
Redis China User Group official website: [http://redis.cn]
Redis China User Group official Weibo: @redis2016
Redis Knowledge Map:
[http://lib.csdn.net/base/redis]
[http://lib.csdn.net/mobile/base/34]
By: Redis China User Group (Jianshu author)
Original link: http://www.jianshu.com/p/ee2aa7fe341b
Copyright belongs to the author. To reprint, please contact the author for authorization and credit "Jianshu author".
