(Repost) Redis Cluster Solution

Source: Internet

Author: User

Tags: benchmark, redis version, redis cluster

A solution based on some tests:

1. redis Performance

Some simple tests on redis, for reference only:

Test environment: RedHat 6.2, 2 × Xeon E5520 (4 cores each), 8 GB RAM, gigabit NIC

Redis version: 2.6.9

 

The client machine uses redis-benchmark for simple get and set operations:

1.1 Single-instance test

1. Value size: 10 bytes to 1390 bytes

Processing speed: about 75,000 ops/s (7.5w/s), limited by the processing capability of the single redis thread.

2. Value size: about 1400 bytes

Processing speed suddenly drops to about 50,000 ops/s while the NIC is still not saturated. Because the request packet exceeds the MTU, TCP splits it into two segments, the server handles twice as many interrupts per request, and throughput drops sharply.

3. Value size: above 1.5 KB

The gigabit NIC is saturated; speed is limited by NIC bandwidth.

The approximate relationship between processing speed and packet size follows from the three cases above (the original chart is not included here).
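For reference, here is a minimal sketch of how such a value-size sweep could be reproduced with the redis-py client. The original tests used redis-benchmark; the host, port, sizes, and request count below are illustrative assumptions only.

```python
# Minimal single-connection throughput probe by value size (illustrative only;
# the original numbers come from redis-benchmark, which is more rigorous).
import time

import redis

r = redis.Redis(host="127.0.0.1", port=6379)  # hypothetical test instance

def set_throughput(value_size, requests=50_000):
    value = b"x" * value_size
    start = time.time()
    for i in range(requests):
        r.set(f"bench:{i}", value)
    return requests / (time.time() - start)

for size in (10, 100, 1390, 1400, 1500, 2000):
    print(f"value={size:5d} bytes -> {set_throughput(size):,.0f} ops/s")
```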

1.2 Multi-instance Test

The premise is that the NIC's soft interrupts are balanced across multiple CPU cores. The test machine's NIC has RSS enabled with 16 queues:

Operation: SET with 10-byte values. Eight redis instances run on the server, and two redis-benchmark processes run on each of four client machines. Each client reaches nearly 40,000 ops/s, and the server handles about 300,000 ops/s in total.

NIC traffic at this point (chart not included here).

At this point all eight physical cores are saturated (hyper-threading was not used); the result was considered good enough and testing was not continued. A single instance saturates one core at about 75,000 ops/s, and eight instances saturate eight cores, yet the total is only about 300,000 ops/s, so CPU usage and throughput do not scale proportionally. The reason is that RSS causes the redis-server thread to hop between CPU cores as requests arrive, and soft-interrupt CPU usage is too high. The RPS/RFS features may be a good fit here: map RSS to only one or two cores, then steer soft interrupts dynamically by redis-server port so that each redis process stays on one core, reducing unnecessary switching.

 

Running multiple instances makes full use of the system's CPU and the NIC's packet-processing capacity. For a specific business scenario, weigh the average packet size, CPU consumption, and request volume. If multiple instances are used to raise throughput, NIC soft-interrupt balancing must be configured; otherwise throughput will not improve.
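One way to sanity-check that soft-interrupt balancing is actually in effect is to look at how NET_RX softirqs are spread across cores. Below is a minimal sketch that parses /proc/softirqs on Linux; the file format is standard, everything else is illustrative.

```python
# Show how NET_RX soft interrupts are distributed across CPU cores (Linux only).
# A heavily skewed distribution means NIC soft-interrupt balancing is not working.
def net_rx_per_cpu(path="/proc/softirqs"):
    with open(path) as f:
        cpus = f.readline().split()            # header row: CPU0 CPU1 ...
        for line in f:
            fields = line.split()
            if fields[0] == "NET_RX:":
                return dict(zip(cpus, map(int, fields[1:])))
    return {}

if __name__ == "__main__":
    counts = net_rx_per_cpu()
    total = sum(counts.values()) or 1
    for cpu, n in counts.items():
        print(f"{cpu}: {n} ({100 * n / total:.1f}%)")
```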

2. redis persistence

Test policy: AOF plus periodic bgrewriteaof

1. Prepare the data set (a loading sketch follows this list):

100 million entries, key: 12 bytes, value: 15 bytes, stored as strings; the process occupies 12 GB of memory

2. Dump (RDB)

File size: 2.8 GB, execution time: 95 s, restart loading time: 112 s

3. bgrewriteaof

File size: 5.1 GB, execution time: 95 s, restart loading time: 165 s

4. Performance impact after AOF is enabled (fsync once per second):

At 8,000 set ops/s, CPU usage increased from 20% to 40%

5. Modify 10 million (1kw) entries:

File size: 5.6 GB, restart loading time: 194 s

6. Modify 2kw (20 million) entries:

File size: 6.1 GB, restart loading time: 200 s
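A data set of the shape used in step 1 can be produced with pipelined SET commands. The sketch below is illustrative only; the key/value formats, batch size, and connection details are assumptions, not the original loader.

```python
# Illustrative loader: ~100 million string entries with 12-byte keys and
# 15-byte values, pipelined to keep round-trip overhead low.
import redis

r = redis.Redis(host="127.0.0.1", port=6379)   # hypothetical target instance

TOTAL = 100_000_000
BATCH = 10_000

pipe = r.pipeline(transaction=False)
for i in range(TOTAL):
    pipe.set(f"k:{i:010d}", f"v:{i:013d}")     # 12-byte key, 15-byte value
    if (i + 1) % BATCH == 0:
        pipe.execute()
pipe.execute()
```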

In addition, fsync has been optimized since redis 2.4; bgrewriteaof and bgsave do not noticeably affect redis's ability to serve requests.
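Assuming a redis-py client, the AOF settings described above (append-only with fsync every second) and a periodic rewrite can be driven roughly as follows; this is a sketch, not the original tooling.

```python
# Sketch: enable AOF with everysec fsync, trigger a background rewrite,
# and poll until the rewrite finishes. Connection details are assumptions.
import time

import redis

r = redis.Redis(host="127.0.0.1", port=6379)

r.config_set("appendonly", "yes")
r.config_set("appendfsync", "everysec")    # fsync once per second, as in the test

r.bgrewriteaof()                           # the periodic rewrite step
while r.info("persistence").get("aof_rewrite_in_progress"):
    time.sleep(1)
print("AOF rewrite finished")
```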

3. redis master-slave Replication

Because the current version has no incremental replication like MySQL master/slave (every reconnect triggers a full resync), it places high demands on network stability; frequent TCP disconnections put a heavy burden on the server and the network.

In the current production environment, master and slave are deployed in the same rack, and the replication link has not been broken and re-established for several months.
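Because a broken link on this version means a full resync, it can be worth watching the replication link from the slave side. A minimal sketch, assuming a redis-py client and an illustrative host/port:

```python
# Sketch: poll the master link status on a slave; any state other than "up"
# will lead to a costly full resync on redis 2.6. Host/port are assumptions.
import time

import redis

slave = redis.Redis(host="127.0.0.1", port=6380)

while True:
    status = slave.info("replication").get("master_link_status", "n/a")
    if status != "up":
        print("replication link not up:", status)   # hook an alert here
    time.sleep(5)
```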

4. Introduction to keepalived

Reference: the official document http://keepalived.org/pdf/sery-lvs-cluster.pdf

Keepalived is routing/failover software written in C. It works together with IPVS load balancing and provides high availability through the VRRP protocol; the latest version at the time of writing is 1.2.7. Keepalived uses VRRP to switch virtual IP addresses (VIPs) between machines: switching takes only a few seconds and there is no split-brain problem.

It supports one master and multiple backup nodes, with automatic master/backup election and VIP drifting; switching completes within seconds, and a specified script can be run during the switch to change the service state.

For example, two hosts A and B can switch as follows:

1. Start A and B in sequence; A becomes the master and B the slave.

2. Master A fails, and B takes over the business.

3. A comes back up and becomes a slave of B (slaveof B).

4. B goes down and A switches back to master.

You can run with a single VIP, using one machine as the master and the other as the slave, for master-slave replication and read/write splitting. Alternatively, with multiple VIPs, each machine can be master for half of the data and slave for the other half, so both machines carry part of the business at the same time; when one machine goes down, all of the business is concentrated on the remaining one.

Installation and configuration are relatively simple:

Dependencies: openssl-devel (libssl-dev on Ubuntu) and popt-devel (libpopt-dev on Ubuntu).

The configuration file defaults to /etc/keepalived/keepalived.conf; a different path can be given manually, but it must be an absolute path. Make sure the configuration file is correct: keepalived does not validate whether the configuration follows the rules.

Run keepalived -D; it starts three daemon processes: a parent process, a health-check (checker) process, and a VRRP process. With -D, detailed logs are written to /var/log/messages, where you can watch the switchover status.

Note:

1. VRRP is a multicast protocol; the master, the backups, and the VIP must all be in the same VLAN.

2. Different VIPs must use different virtual router IDs (vrid), and a vrid within a VLAN must not conflict with that of any other group.

3. Keepalived has two roles: master (one) and backup (one or more). If one node is configured as master, then after it fails and recovers it will take the VIP back, forcing a second switchover, which is unacceptable for stateful services. Solution: configure both machines as backup, and set nopreempt on the higher-priority backup so that it does not preempt.

5. high-availability solution implemented through keepalived

 

Switching process:

1. When the master fails, the VIP drifts to the slave; keepalived on the slave tells redis to execute slaveof no one, and it starts serving requests.

2. When the old master comes back up, the VIP stays where it is; keepalived on that machine tells redis to execute slaveof pointing at the current master, and it starts up as a slave and synchronizes the data (a notify-script sketch follows this list).

3. And so on.
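Keepalived can invoke an arbitrary executable on a VRRP state change (notify_master / notify_backup). A minimal notify-script sketch in Python that performs the slaveof transitions described above; the peer address, port, and the way the state is passed in are all illustrative assumptions.

```python
#!/usr/bin/env python
# Sketch of a keepalived notify script: on MASTER, detach redis from replication;
# on BACKUP, re-attach it to the peer (the current master).
# Wiring it as notify_master "/path/notify.py MASTER" and
# notify_backup "/path/notify.py BACKUP" is an assumed convention.
import sys

import redis

LOCAL = redis.Redis(host="127.0.0.1", port=6379)
PEER_HOST, PEER_PORT = "10.0.0.2", 6379          # the other machine (assumed)

def on_master():
    LOCAL.slaveof()                              # SLAVEOF NO ONE: start serving writes

def on_backup():
    LOCAL.slaveof(PEER_HOST, PEER_PORT)          # resync from the current master

if __name__ == "__main__":
    state = sys.argv[1].upper() if len(sys.argv) > 1 else ""
    {"MASTER": on_master, "BACKUP": on_backup}.get(state, lambda: None)()
```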

State when both master and slave machines are down:

1. If unplanned, this situation is generally not considered.

2. Planned restart: before restarting, operations staff save a dump of the master data; pay attention to the sequence (a sketch follows this list):

1. Shut down all redis instances on one of the machines so that the master role moves entirely to the other machine (with multi-instance deployment, masters and slaves coexist on a single machine), then shut that machine down.

2. Dump each master redis instance in turn.

3. Shut down the masters.

4. Start the master instances and wait for data loading to complete.

5. Start the slaves.

Delete the dump files (to avoid slow loading on a later restart).
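A rough sketch of the dump / shut down / reload part of that sequence with redis-py; hosts, ports, and the restart step are illustrative assumptions.

```python
# Sketch of the planned-restart sequence on the surviving machine:
# dump each master, stop it, restart the processes (outside this script),
# then wait for data loading to finish before starting the slaves.
# Hosts and ports are assumptions.
import time

import redis

MASTER_PORTS = [6379, 6380, 6381]            # assumed master instances

for port in MASTER_PORTS:
    r = redis.Redis(host="127.0.0.1", port=port)
    r.save()                                 # synchronous dump to disk
    r.shutdown()                             # stop the instance (dump already saved)

# ... restart the redis processes here (init script / by hand) ...

for port in MASTER_PORTS:
    r = redis.Redis(host="127.0.0.1", port=port)
    while r.info("persistence").get("loading"):   # wait for the data load to complete
        time.sleep(1)

# finally: start the slaves, then delete the dump files once replication is caught up.
```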

 

6. Use twemproxy to implement the cluster Solution

An open-source proxy written in C by Twitter that supports both memcached and redis; the latest version is 0.2.4 and it is under active development: https://github.com/twitter/twemproxy. Twitter uses it mainly to reduce the number of network connections between front-end services and the cache servers.

Features: fast and lightweight, reduces the number of connections to the backend cache servers, easy to configure, and supports common hash partitioning algorithms such as ketama, modula, and random.

Here, keepalived is used to make the proxy itself highly available as a master/backup pair, solving the proxy's single point of failure (SPOF) problem.

Advantages:

1. The redis cluster is transparent to clients, client code stays simple, and dynamic scaling is easier (see the client sketch after this list).

2. Because a single proxy performs the consistent hashing and node availability detection, there is no split-brain problem.

3. High performance but CPU-intensive; since the redis nodes have spare CPU capacity, the proxy can be deployed on the redis nodes themselves without extra hardware.
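Because twemproxy speaks the redis protocol, the client simply connects to the proxy's VIP instead of an individual redis node. A minimal sketch; the VIP address and the 22121 port are illustrative assumptions taken from twemproxy's example configuration.

```python
# Sketch: the application talks to the keepalived VIP in front of twemproxy
# exactly as if it were a single redis server; sharding happens in the proxy.
# Address and port are assumptions.
import redis

proxy = redis.Redis(host="10.0.0.100", port=22121)

proxy.set("user:1001:name", "alice")
print(proxy.get("user:1001:name"))
```

Note that twemproxy does not support every redis command (for example MULTI/EXEC and SELECT), so clients should stick to the supported subset.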

 

7. Consistent hash

Use zookeeper to implement consistent hashing.

When a redis service starts, it writes its routing information to ZooKeeper as an ephemeral node; clients read the available routing information through a ZooKeeper client.
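A minimal sketch of that register-and-discover pattern using the kazoo ZooKeeper client; the znode path, payload, and the simplified hash below are illustrative assumptions, not the implementation from the article referenced below.

```python
# Sketch: each redis server registers an ephemeral node under /redis/nodes;
# clients list the children and hash keys onto the registered nodes.
# Paths, payload format, and the simple hash are assumptions.
import hashlib

from kazoo.client import KazooClient

zk = KazooClient(hosts="127.0.0.1:2181")
zk.start()

# --- on the redis server side: register route info as an ephemeral node ---
zk.ensure_path("/redis/nodes")
zk.create("/redis/nodes/10.0.0.1:6379", b"10.0.0.1:6379", ephemeral=True)

# --- on the client side: read available nodes and pick one per key ---
def route(key):
    nodes = sorted(zk.get_children("/redis/nodes"))
    if not nodes:
        raise RuntimeError("no redis nodes registered")
    h = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return nodes[h % len(nodes)]           # simple modulo hash, not ketama

print(route("user:1001"))
```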

 

For the specific implementation, see my other article: redis consistent hash

8. monitoring tools

Query historical redis runtime data: CPU, memory, hit rate, request volume, master-slave switchovers, etc. (a polling sketch follows this list)

Real-time monitoring curves

SMS alerts
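Most of those metrics can be sampled from INFO on each instance. A minimal polling sketch with redis-py; the instance list and interval are assumptions, and the actual tool used here is the modified RedisLive linked below.

```python
# Sketch: periodically sample INFO from each instance and derive hit rate,
# memory use, and request volume; a real tool would store and graph these.
# The instance list is an assumption.
import time

import redis

INSTANCES = [("127.0.0.1", 6379), ("127.0.0.1", 6380)]

def sample(host, port):
    info = redis.Redis(host=host, port=port).info()
    hits, misses = info["keyspace_hits"], info["keyspace_misses"]
    hit_rate = hits / (hits + misses) if hits + misses else 0.0
    return {
        "memory_mb": info["used_memory"] / 1024 / 1024,
        "hit_rate": hit_rate,
        "ops_total": info["total_commands_processed"],
        "role": info["role"],                      # watch for master/slave flips
    }

while True:
    for host, port in INSTANCES:
        print(host, port, sample(host, port))
    time.sleep(10)
```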

We modified the open-source RedisLive tool to make batch monitoring of many instances easier; the basic functions are in place and the details will be improved gradually.

The source code is available at:

https://github.com/LittlePeng/redis-monitor
