How to efficiently insert large amounts of data into Redis

Source: Internet
Author: User
Tags: echo command, redis server

Recently, someone in a group asked: given a log file containing IP addresses (one per line), how can these IPs be quickly imported into Redis?

My first suggestion was to use a shell script with the redis-cli client.

Today, while reading the official Redis documentation, I discovered that the front page (http://www.redis.io/documentation) has a dedicated topic on "Redis Mass Insertion", and realized my suggestion was a poor one.

The official reasons are as follows:

Using a normal Redis client to perform mass insertion is not a good idea for a few reasons: the naive approach of sending one command after the other is slow because you have to pay for the round trip time for every command. It is possible to use pipelining, but for mass insertion of many records you need to write new commands while you read replies at the same time to make sure you are inserting as fast as possible.

Only a small percentage of clients support non-blocking I/O, and not all the clients are able to parse the replies in an efficient way in order to maximize throughput. For all these reasons the preferred way to mass import data into Redis is to generate a text file containing the Redis protocol, in raw format, in order to call the commands needed to insert the required data.

The gist is:

1> There is a round-trip delay for every Redis client command.

2> Only a subset of clients support non-blocking I/O.

My understanding is that each Redis command must wait for its reply, which incurs a round-trip delay; even with multiple concurrent client connections, throughput is hard to improve, because only a limited number of clients support non-blocking I/O.

So how to efficiently insert it?

Redis 2.6 officially introduced a new feature, pipe mode, which lets a text file in the Redis protocol be piped directly to the server.

That is a mouthful; the concrete steps are as follows:

1. Create a new text file containing the Redis command

SET Key0 Value0
SET Key1 Value1
...
SET KeyN ValueN

If you have the raw data, constructing such a file is not difficult; a shell or Python script will do.
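For instance, returning to the original question (a log with one IP per line), a minimal Python sketch could look like this. The file names, the `ip:<address>` key layout, and the value `1` are my own assumptions for illustration, not from the original post:

```python
# Sketch: turn raw IP lines into plain-text Redis SET commands.
# The key prefix "ip:" and the value "1" are illustrative assumptions.

def make_commands(lines):
    """Emit one 'SET ip:<address> 1' command per non-empty line."""
    return ["SET ip:%s 1" % ip.strip() for ip in lines if ip.strip()]

# In practice, read lines from the log file instead of this sample,
# and redirect stdout to a file such as redis_commands.txt.
sample_log = ["192.168.0.1\n", "10.0.0.2\n", "\n"]
for cmd in make_commands(sample_log):
    print(cmd)
```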

2. Convert these commands into the Redis protocol.

The pipe feature consumes the Redis protocol (RESP), not plain-text Redis commands.

For how to convert, refer to the script below.

3. Insert with pipe

cat data.txt | redis-cli --pipe

Shell vs. Redis Pipe

Here is a test comparing the efficiency of a shell batch import against the Redis pipe.

Test idea: insert 100,000 records of the same value using a shell script and the Redis pipe respectively, and measure how long each takes.

Shell

The script is as follows:

#!/bin/bash
for ((i=0;i<100000;i++))
do
    echo -en "HelloWorld" | redis-cli -x set name$i >> redis.log
done

Each inserted value is HelloWorld, but the keys differ: name0, name1, ..., name99999.

Redis Pipe

The Redis pipe approach is a little trickier.

1> First, construct a text file of Redis commands.

In this case, I chose Python.

#!/usr/bin/python
for i in range(100000):
    print 'set name'+str(i), 'helloworld'

# python 1.py > redis_commands.txt

# head -2 redis_commands.txt

set name0 helloworld
set name1 helloworld

2> Convert these commands into the Redis protocol.

Here, I used a shell script from GitHub:

#!/bin/bash
while read CMD; do
  # each command begins with *{number of arguments in command}\r\n
  XS=($CMD); printf "*${#XS[@]}\r\n"
  # for each argument, we append ${length}\r\n{argument}\r\n
  for X in $CMD; do printf "\$${#X}\r\n$X\r\n"; done
done < redis_commands.txt

# sh 20.sh > redis_data.txt

# head -7 redis_data.txt

*3
$3
set
$5
name0
$10
helloworld

At this point, the data is constructed.
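For reference, the same conversion can be sketched in Python (my own version, not from the original post; like the shell script above, it splits on whitespace, so it cannot handle values that contain spaces):

```python
# Encode one whitespace-separated command line into the Redis protocol (RESP):
# *<argc>\r\n followed by $<byte-length>\r\n<arg>\r\n for every argument.

def to_resp(line):
    args = line.split()
    parts = ["*%d\r\n" % len(args)]
    for arg in args:
        parts.append("$%d\r\n%s\r\n" % (len(arg), arg))
    return "".join(parts)

print(repr(to_resp("set name0 helloworld")))
# '*3\r\n$3\r\nset\r\n$5\r\nname0\r\n$10\r\nhelloworld\r\n'
```

Writing one `to_resp(line)` per input line into redis_data.txt produces the same file as the shell script.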

Test results

The time consumed by the two approaches is not even in the same order of magnitude: the pipe import finishes far faster than the shell loop.

Finally, let's look at how pipe mode is implemented:

    • redis-cli --pipe tries to send data as fast as possible to the server.
    • At the same time it reads data when available, trying to parse it.
    • Once there is no more data to read from stdin, it sends a special ECHO command with a random 20-byte string: we are sure this is the latest command sent, and we are sure we can match the reply checking if we receive the same 20 bytes as a bulk reply.
    • Once this special final command is sent, the code receiving replies starts to match replies with these 20 bytes. When the matching reply is reached it can exit with success.

In other words, redis-cli sends the data to the server as fast as possible while simultaneously reading and parsing any replies; once the data file has been fully read, it sends an ECHO command carrying a random 20-byte string, and the server's matching reply confirms that all the data has been processed.
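The sentinel trick can be illustrated with a short sketch (my assumptions: 20 random hex characters stand in for redis-cli's random bytes; this is not the actual redis-cli code):

```python
import binascii
import os

def sentinel_command():
    """Build a final ECHO command in RESP carrying a random 20-byte string."""
    magic = binascii.hexlify(os.urandom(10)).decode()  # 20 hex characters
    resp = "*2\r\n$4\r\nECHO\r\n$%d\r\n%s\r\n" % (len(magic), magic)
    return magic, resp

def is_final_reply(reply, magic):
    """True once the server echoes the magic string back as a bulk reply."""
    return reply == "$%d\r\n%s\r\n" % (len(magic), magic)

magic, resp = sentinel_command()
# The importer appends `resp` after all data commands; when a reply matching
# the magic string arrives, every earlier command has been processed.
assert is_final_reply("$20\r\n%s\r\n" % magic, magic)
```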

Summary:

For the curious, the time taken to construct the Redis commands and to convert them into the protocol is attached here:

[[email protected] ~]# time python 1.py > redis_commands.txt
real    0m0.110s
user    0m0.070s
sys     0m0.040s
[[email protected] ~]# time sh 20.sh > redis_data.txt
real    0m7.112s
user    0m5.861s
sys     0m1.255s

Reference Documentation:

1. http://www.redis.io/topics/mass-insert

2. https://gist.github.com/abtrout/432ce44fa77a9620c739

3. http://blog.chinaunix.net/uid-26284395-id-3124337.html

Original post: http://www.cnblogs.com/ivictor/p/5446503.html
