Recently, a friend in our group asked: given a log file with one IP address per line, how can those IPs be quickly imported into Redis?
My first suggestion was a shell script driving the Redis client.
Today, while browsing the official Redis documentation, I noticed that the front page (http://www.redis.io/documentation) has a dedicated topic called "Redis Mass Insertion", and only then realized how naive my suggestion was.
The official reasons are as follows:
Using a normal Redis client to perform mass insertion is not a good idea for a few reasons: the naive approach of sending one command after the other is slow because you have to pay for the round trip time for every command. It is possible to use pipelining, but for mass insertion of many records you need to write new commands while you read replies at the same time to make sure you are inserting as fast as possible.
Only a small percentage of clients support non-blocking I/O, and not all the clients are able to parse the replies in an efficient way in order to maximize throughput. For all these reasons the preferred way to mass import data into Redis is to generate a text file containing the Redis protocol, in raw format, in order to call the commands needed to insert the required data.
The gist is:
1> There is a round-trip delay for every command a Redis client sends.
2> Only a subset of clients support non-blocking I/O.
My personal understanding: every Redis command has a certain delay between being sent and its result coming back, and even with multiple Redis clients inserting concurrently it is hard to raise throughput, because only a limited number of clients support non-blocking I/O and can parse replies efficiently.
So how do we insert efficiently?
Redis 2.6 officially introduced a new feature, pipe mode, which lets you import a text file written in the Redis protocol into the server directly through a pipe.
That sounds like a mouthful; the concrete steps are as follows:
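To make the round-trip cost concrete, here is a minimal sketch (assuming the redis-py client and a local Redis server; neither the client choice nor the counts come from the original post) contrasting the naive one-command-per-round-trip loop with client-side pipelining:

import redis  # assumes the redis-py package is installed

r = redis.Redis(host='localhost', port=6379)

# Naive approach: one SET per round trip -- every call waits for its reply.
for i in range(1000):
    r.set('name%d' % i, 'helloworld')

# Client-side pipelining: commands are buffered and sent in one batch,
# so the round-trip delay is paid once per batch instead of once per command.
pipe = r.pipeline(transaction=False)
for i in range(1000):
    pipe.set('name%d' % i, 'helloworld')
pipe.execute()

Pipelining removes most of the round-trip cost, but as the official text above notes, for truly massive imports you would still need to read replies while writing new commands, which is exactly what pipe mode handles for you.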
1. Create a text file containing the Redis commands
SET Key0 Value0
SET Key1 Value1
...
SET KeyN ValueN
If you already have the raw data, constructing such a file is not hard; a shell or Python script will do, as sketched below.
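As a concrete illustration of step 1, going back to the IP-log question at the top, a rough Python sketch along these lines would work (the file name ip.log and the key pattern ip:<address> are placeholders chosen for this example, not names from the original question):

#!/usr/bin/python
# Read one IP address per line and emit one SET command per IP.
# Whether you SET each IP as its own key or SADD it into a set
# depends on your use case; a plain SET is shown here.
with open('ip.log') as f:
    for line in f:
        ip = line.strip()
        if ip:
            print 'SET ip:%s 1' % ip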
2. Convert these commands into the Redis protocol.
This step is needed because pipe mode consumes the raw Redis protocol, not plain Redis commands.
For how to do the conversion, refer to the script later in this post.
3. Insert with pipe
cat data.txt | redis-cli --pipe
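If everything goes well, redis-cli --pipe prints a summary at the end; for 100,000 commands the output should look roughly like the following (the exact wording may vary slightly between Redis versions):

All data transferred. Waiting for the last reply...
Last reply received from server.
errors: 0, replies: 100000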
Shell VS Redis Pipe
Here is a test comparing the efficiency of a shell batch import with that of Redis pipe.
Test idea: insert 100,000 records into the database with a shell script and with Redis pipe respectively, and see how long each takes.
Shell
The script is as follows:
#!/bin/bash
for ((i=0;i<100000;i++))
do
    echo -en "helloworld" | redis-cli -x set name$i >> redis.log
done
Each inserted value is helloworld, but the keys differ: name0, name1 ... name99999.
Redis Pipe
Redis pipe is a little more involved.
1> First, construct a text file of Redis commands
In this case, I chose Python.
#!/usr/bin/python
for i in range(100000):
    print 'set name'+str(i), 'helloworld'
# python 1.py > redis_commands.txt
# head -2 redis_commands.txt
set name0 helloworld
set name1 helloworld
2> Convert these commands into the Redis protocol
Here I used a shell script found on GitHub:
#!/bin/bash
while read CMD; do
  # each command begins with *{number of arguments in command}\r\n
  XS=($CMD); printf "*${#XS[@]}\r\n"
  # for each argument, we append ${length}\r\n{argument}\r\n
  for X in $CMD; do printf "\$${#X}\r\n$X\r\n"; done
done < redis_commands.txt
# sh 20.sh > redis_data.txt
# head -7 redis_data.txt
*3
$3
set
$5
name0
$10
helloworld
At this point, the data is constructed.
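If you prefer Python to shell, the same conversion can be sketched roughly as follows (this is not the script used above, just an equivalent written for illustration; it assumes redis_commands.txt holds one command per line with space-separated arguments):

#!/usr/bin/python
# Convert plain Redis commands into the raw Redis protocol (RESP):
# *<number of arguments>\r\n, then $<length>\r\n<argument>\r\n for each argument.
def to_protocol(args):
    parts = ['*%d\r\n' % len(args)]
    for arg in args:
        parts.append('$%d\r\n%s\r\n' % (len(arg), arg))
    return ''.join(parts)

with open('redis_commands.txt') as src, open('redis_data.txt', 'w') as dst:
    for line in src:
        args = line.split()  # like the shell script, this assumes no spaces inside values
        if args:
            dst.write(to_protocol(args))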
Test results
The time consumed by the two approaches is not even on the same order of magnitude; the pipe import is dramatically faster.
Finally, let's look at how pipe mode is implemented. The official description:
- redis-cli --pipe tries to send data as fast as possible to the server.
- At the same time it reads data when available, trying to parse it.
- Once there is no more data to read from stdin, it sends a special ECHO command with a random 20 byte string: we are sure this is the latest command sent, and we are sure we can match the reply checking if we receive the same 20 bytes as a bulk reply.
- Once this special final command is sent, the code receiving replies starts to match replies with these 20 bytes. When the matching reply is reached it can exit with success.
In other words, redis-cli sends the data to the Redis server as fast as possible while reading and parsing replies at the same time; once the data file has been read to the end, it sends an ECHO command carrying a random 20-byte string, and when that same 20-byte string comes back in a reply it knows that all of the preceding data has been processed by the server.
Summary:
For the curious, here are the time taken to construct the Redis commands and the time taken to convert those commands into the protocol:
# time python 1.py > redis_commands.txt
real    0m0.110s
user    0m0.070s
sys     0m0.040s
# time sh 20.sh > redis_data.txt
real    0m7.112s
user    0m5.861s
sys     0m1.255s
Reference Documentation:
1. http://www.redis.io/topics/mass-insert
2. https://gist.github.com/abtrout/432ce44fa77a9620c739
3. http://blog.chinaunix.net/uid-26284395-id-3124337.html
http://www.cnblogs.com/ivictor/p/5446503.html
How to efficiently insert large amounts of data into Redis (repost)