How to efficiently insert large amounts of data into Redis (recommended)

Last Update:2017-01-18 Source: Internet

Author: User

Recently someone in our group chat asked: given a log file that stores IP addresses, one per line, how can these IPs be imported into Redis quickly?

My first suggestion was a shell script that calls the Redis client (redis-cli).

Today, while browsing the official Redis documentation (http://www.redis.io/documentation), I found that its first section includes a dedicated topic, "Redis Mass Insertion", and realized that my suggestion was a poor one.

The reasons given in the official documentation are as follows:

Using a normal Redis client to perform mass insertion is not a good idea for a few reasons: the naive approach of sending one command after the other is slow because you have to pay for the round trip time for every command. It is possible to use pipelining, but for mass insertion of many records you need to write new commands while you read replies at the same time to make sure you are inserting as fast as possible.

Only a small percentage of clients support non-blocking I/O, and not all the clients are able to parse the replies in an efficient way in order to maximize throughput. For all of these reasons the preferred way to mass import data into Redis is to generate a text file containing the Redis protocol, in raw format, in order to call the commands needed to insert the required data.

In short:

1> Every command sent by a Redis client pays a round-trip delay.

2> Only a subset of clients support non-blocking I/O.

As I understand it, there is a delay between issuing a Redis command and getting its result back, and it is hard to raise throughput even by running several Redis clients concurrently, because non-blocking I/O can only be applied to a limited number of connections.
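
To put the round-trip cost into rough numbers, here is a back-of-the-envelope sketch in Python. The function name, the 1 ms round-trip time and the batch size are all assumptions chosen for illustration, not measurements from this article, and the model ignores server and client CPU time entirely.

```python
import math

def estimated_wait_seconds(n_commands, rtt_seconds, commands_per_round_trip=1):
    """Time spent purely waiting on the network (hypothetical model)."""
    round_trips = math.ceil(n_commands / commands_per_round_trip)
    return round_trips * rtt_seconds

# Hypothetical figures: 100,000 commands over a link with a 1 ms round trip.
one_by_one = estimated_wait_seconds(100_000, 0.001)
batched = estimated_wait_seconds(100_000, 0.001, commands_per_round_trip=1_000)
print(f"one command per round trip: {one_by_one:.1f} s")
print(f"1,000 commands per round trip: {batched:.1f} s")
```

Under this simple model the waiting time scales with the number of round trips rather than the number of commands, which is why sending many commands without waiting for each reply helps so much.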

So how do we insert the data efficiently?

Redis 2.6 introduced a new feature for this, pipe mode (redis-cli --pipe): a text file containing commands in the Redis protocol is piped straight to the server.

That sounds abstract, so here are the concrete steps:

1. Create a text file containing the Redis commands

SET Key0 Value0
SET Key1 Value1
...
SET KeyN ValueN

If you already have the raw data, building this file is not hard; a shell or Python script will do.
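
For the IP log from the question at the top of this article, a minimal Python sketch might look like the following. The article does not say how the IPs should be stored, so the use of SADD, the set key `ips`, the function name and the sample lines are all assumptions made for illustration.

```python
def ip_lines_to_commands(lines, key="ips"):
    """Yield one SADD command per non-empty line of the log.

    `key` is a hypothetical set name; pick whatever suits your data model.
    """
    for line in lines:
        ip = line.strip()
        if ip:
            yield f"SADD {key} {ip}"

# Sample input standing in for the lines of a real log file.
sample_log = ["192.168.0.1\n", "\n", "10.0.0.7\n"]
for command in ip_lines_to_commands(sample_log):
    print(command)
```

With a real log you would pass an open file object instead of `sample_log` and redirect the output to the command file.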

2. Convert these commands into the Redis protocol.

This is needed because Redis pipe mode accepts the Redis protocol, not plain Redis commands.

For how to convert, see the shell script in the Redis Pipe section later in this article.
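
As a rough illustration of what the conversion involves, here is a minimal Python sketch that encodes one command line as a Redis protocol array of bulk strings. The function name `to_resp` is hypothetical. Like the shell script used later in this article, it splits on whitespace, so it cannot handle keys or values that contain spaces.

```python
def to_resp(command_line):
    """Encode a whitespace-separated command as a RESP array of bulk strings."""
    args = command_line.split()
    # Header: "*" followed by the number of arguments.
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg.encode("utf-8")
        # Each argument: "$" + length in bytes, then the argument itself.
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)

# repr() makes the \r\n terminators visible.
print(repr(to_resp("SET Key0 Value0")))
```

Note that the length after `$` counts bytes, not characters, which matters for non-ASCII values.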

3. Insert through the pipe

cat data.txt | redis-cli --pipe

Shell VS Redis Pipe

Here is a test comparing the efficiency of a bulk import through the shell with one through Redis pipe.

Test plan: insert the same 100,000 records into the database through a shell script and through Redis pipe, and compare the time each takes.

Shell

The script is as follows:

#!/bin/bash
# One redis-cli invocation per key; -x reads the value from stdin.
for ((i=0;i<100000;i++))
do
  echo -en "HelloWorld" | redis-cli -x set name$i >> redis.log
done

Every insert writes the value HelloWorld, but under a different key: name0, name1 ... name99999.

Redis Pipe

Redis pipe takes a little more work.

1> First, build a text file of Redis commands

Here I used Python.

#!/usr/bin/python
# Print one "set nameN HelloWorld" command per line (works on Python 2 and 3).
for i in range(100000):
    print('set name' + str(i) + ' HelloWorld')

# python 1.py > redis_commands.txt

# head -2 redis_commands.txt

set name0 HelloWorld
set name1 HelloWorld

2> Convert these commands into the Redis protocol

Here I used a shell script found on GitHub:

#!/bin/bash

while read CMD; do
  # each command begins with *{number of arguments in command}\r\n
  XS=($CMD); printf "*${#XS[@]}\r\n"
  # for each argument, we append ${length}\r\n{argument}\r\n
  for X in $CMD; do printf "\$${#X}\r\n$X\r\n"; done
done < redis_commands.txt

# sh 20.sh > redis_data.txt

# head -7 redis_data.txt

*3
$3
set
$5
name0
$10
HelloWorld
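
Because `head` does not show the `\r\n` terminators, it can be worth checking the generated data before importing it. The following Python sketch decodes protocol data back into commands and raises an error if a length prefix or terminator is wrong. The function name is hypothetical, and it only handles the subset of the protocol produced above (arrays of bulk strings).

```python
def parse_resp_commands(data):
    """Decode bytes made of RESP arrays of bulk strings into argument lists."""
    commands = []
    pos = 0
    while pos < len(data):
        if data[pos:pos + 1] != b"*":
            raise ValueError("expected '*' at offset %d" % pos)
        end = data.index(b"\r\n", pos)  # raises ValueError if CRLF is missing
        argc = int(data[pos + 1:end])
        pos = end + 2
        args = []
        for _ in range(argc):
            if data[pos:pos + 1] != b"$":
                raise ValueError("expected '$' at offset %d" % pos)
            end = data.index(b"\r\n", pos)
            length = int(data[pos + 1:end])
            start = end + 2
            arg = data[start:start + length]
            # The argument must have the declared length and end with CRLF.
            if len(arg) != length or data[start + length:start + length + 2] != b"\r\n":
                raise ValueError("bad bulk string at offset %d" % start)
            args.append(arg)
            pos = start + length + 2
        commands.append(args)
    return commands

sample = b"*3\r\n$3\r\nset\r\n$5\r\nname0\r\n$10\r\nHelloWorld\r\n"
print(parse_resp_commands(sample))
```

To check a real file, read it in binary mode, for example `parse_resp_commands(open("redis_data.txt", "rb").read())`.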

At this point, the data is ready.

Test results

[Timing output for the shell script and for redis-cli --pipe]

The two run times are not even of the same order of magnitude.

Finally, a look at how pipe mode works. The official documentation describes it like this:

redis-cli --pipe tries to send data as fast as possible to the server.
At the same time it reads data when available, trying to parse it.
Once there is no more data to read from stdin, it sends a special ECHO command with a random 20 byte string: we are sure this is the latest command sent, and we are sure we can match the reply checking if we receive the same 20 bytes as a bulk reply.
Once this special final command is sent, the code receiving replies starts to match replies with these 20 bytes. When the matching reply is reached it can exit with success.

That is, it sends the data to the Redis server as fast as possible while reading and parsing the server's replies as they arrive. Once the whole data file has been read, it sends an ECHO command with a random 20-byte string; when the same string comes back in a reply, it knows the server has processed everything sent before it.
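
The final ECHO step can be sketched in Python as follows. This is only an illustration of the idea, not how redis-cli is written: the function names are hypothetical, and a plain substring search stands in for real reply parsing.

```python
import os

def make_echo_sentinel():
    """Return a random 20-byte token and the ECHO command that carries it."""
    token = os.urandom(10).hex().encode("ascii")  # 10 random bytes -> 20 hex characters
    command = b"*2\r\n$4\r\nECHO\r\n$20\r\n" + token + b"\r\n"
    return token, command

def sentinel_seen(reply_buffer, token):
    """True once the token has come back as a bulk reply."""
    return b"$20\r\n" + token + b"\r\n" in reply_buffer

token, command = make_echo_sentinel()
# Simulated server output: two status replies, then the echoed token.
replies = b"+OK\r\n+OK\r\n" + b"$20\r\n" + token + b"\r\n"
print(sentinel_seen(b"+OK\r\n+OK\r\n", token), sentinel_seen(replies, token))
```

Because Redis answers commands on a connection in the order they were sent, seeing the echoed token means every earlier command has already been answered.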

Summary:

Some readers were later curious about how long it takes to build the Redis command file and to convert the commands into the protocol, so here are both timings:

[root@mysql-server1 ~]# time python 1.py > redis_commands.txt

real  0m0.110s
user  0m0.070s
sys   0m0.040s
[root@mysql-server1 ~]# time sh 20.sh > redis_data.txt

real  0m7.112s
user  0m5.861s
sys   0m1.255s

That is all for this article. I hope it helps with your learning, and I hope you will support the Yunqi Community.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
