Recently someone in our group asked: given a log file that stores IP addresses, one per line, how can these IPs be imported into Redis quickly?
My first suggestion was a shell script that calls the Redis client.
Today, while reading the official Redis documentation, I found that the first part (http://www.redis.io/documentation) has a dedicated topic, "Redis Mass Insertion", and only then realized that my suggestion was a poor one.
The reasons given in the official documentation are as follows:
Using a normal Redis client to perform mass insertion is not a good idea for a few reasons: the naive approach of sending one command after the other is slow because you have to pay for the round trip time for every command. It is possible to use pipelining, but for mass insertion of many records you need to write new commands while you read replies at the same time to make sure you are inserting as fast as possible.
Only a small percentage of clients support non-blocking I/O, and not all the clients are able to parse the replies in an efficient way in order to maximize throughput. For all these reasons the preferred way to mass import data into Redis is to generate a text file containing the Redis protocol, in raw format, in order to call the commands needed to insert the required data.
In short:
1> Every command sent by a Redis client pays a round-trip delay.
2> Only a subset of clients support non-blocking I/O.
My own understanding is this: every Redis command incurs a delay between being sent and its result coming back, and even several Redis clients inserting concurrently will struggle to raise throughput, because without non-blocking I/O each connection can only keep a limited number of operations in flight.
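To make the round-trip argument concrete, here is a rough back-of-the-envelope model in Python. It only counts time spent waiting on the network and ignores server processing and process startup. The round-trip time and batch size are assumed values chosen for illustration, not measurements.

```python
def total_seconds(commands, rtt_seconds, batch_size=1):
    """Estimate network wait time when every batch costs one round trip."""
    round_trips = -(-commands // batch_size)  # ceiling division
    return round_trips * rtt_seconds


# Assumed figures, for illustration only.
COMMANDS = 100000
RTT = 0.0002  # 0.2 ms per round trip (hypothetical)

one_by_one = total_seconds(COMMANDS, RTT)                  # one command per round trip
pipelined = total_seconds(COMMANDS, RTT, batch_size=1000)  # 1000 commands per round trip
print(one_by_one, pipelined)
```

Under these assumptions the one-by-one approach waits on 100,000 round trips, while batching 1,000 commands at a time waits on only 100.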
So how can the data be inserted efficiently?
Redis 2.6 introduced a new feature, pipe mode (redis-cli --pipe), which sends a text file in Redis protocol format directly to the server through a pipe.
That is a bit of a mouthful, so here are the concrete steps:
1. Create a new text file containing the Redis commands
SET Key0 Value0
SET Key1 Value1
...
SET KeyN ValueN
If you have the raw data, constructing this file is not difficult; a shell or Python script will do.
2. Convert these commands into the Redis protocol.
This step is needed because pipe mode accepts the Redis protocol, not plain Redis commands.
For the conversion, you can refer to the script used in the test later in this article.
3. Insert through the pipe
cat data.txt | redis-cli --pipe
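For the original question (a log with one IP address per line), steps 1 and 2 can be folded into one small Python script. This is a minimal sketch, not the method from the Redis documentation: storing the IPs with SADD in a set named "ips" is my own assumption about how the data should be kept, and the function names are hypothetical.

```python
def encode_command(*args):
    """Encode one command as a Redis protocol array of bulk strings."""
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = str(arg).encode("utf-8")
        # The length prefix counts bytes, not characters.
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


def ips_to_protocol(lines, set_key="ips"):
    """Yield one encoded SADD command per non-empty input line."""
    for line in lines:
        ip = line.strip()
        if ip:
            yield encode_command("SADD", set_key, ip)


# In-memory sample standing in for the lines of the log file.
sample_lines = ["10.0.0.1\n", "\n", "10.0.0.2\n"]
for chunk in ips_to_protocol(sample_lines):
    print(chunk)
```

To use it on a real log, one option is to iterate over sys.stdin instead of the sample list, write each chunk to sys.stdout.buffer, and pipe the script's output into redis-cli --pipe.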
Shell vs. Redis pipe
Here is a test comparing the efficiency of a shell bulk import with the Redis pipe.
Test idea: insert the same 100,000 records into the database with a shell script and with the Redis pipe, and compare the time each takes.
Shell
The script is as follows:
#!/bin/bash
for ((i=0;i<100000;i++))
do
    # -x makes redis-cli read the last argument (the value) from stdin
    echo -en "HelloWorld" | redis-cli -x set name$i >> redis.log
done
Every inserted value is HelloWorld, but the keys differ: name0, name1, ..., name99999.
Redis Pipe
The Redis pipe approach takes a little more work.
1> First, construct a text file of Redis commands
Here, I chose Python.
#!/usr/bin/python
# Print one "set nameN HelloWorld" command per line (works on Python 2 and 3)
for i in range(100000):
    print('set name' + str(i) + ' HelloWorld')
# python 1.py > redis_commands.txt
# head -2 redis_commands.txt
set name0 HelloWorld
set name1 HelloWorld
2> Convert these commands into the Redis protocol
Here I used a shell script found on GitHub:
#!/bin/bash
while read CMD; do
  # each command begins with *{number of arguments in command}\r\n
  XS=($CMD); printf "*${#XS[@]}\r\n"
  # for each argument, we append ${length}\r\n{argument}\r\n
  for X in $CMD; do printf "\$${#X}\r\n$X\r\n"; done
done < redis_commands.txt
# sh 20.sh > redis_data.txt
# head -7 redis_data.txt
*3
$3
set
$5
name0
$10
HelloWorld
At this point, the data file is ready.
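Before piping a large file into the server, it can be worth checking that the generated protocol is well formed. The following is a small sketch of a decoder that turns protocol bytes back into argument lists and raises an error on malformed input. The function name parse_commands is my own, and it handles only the array-of-bulk-strings form produced above.

```python
def parse_commands(data):
    """Decode Redis protocol bytes into a list of argument lists."""
    commands = []
    pos = 0
    while pos < len(data):
        # Header line: *<number of arguments>\r\n
        end = data.index(b"\r\n", pos)
        if data[pos:pos + 1] != b"*":
            raise ValueError("expected '*' at byte %d" % pos)
        argc = int(data[pos + 1:end])
        pos = end + 2
        args = []
        for _ in range(argc):
            # Each argument: $<length>\r\n<bytes>\r\n
            end = data.index(b"\r\n", pos)
            if data[pos:pos + 1] != b"$":
                raise ValueError("expected '$' at byte %d" % pos)
            length = int(data[pos + 1:end])
            start = end + 2
            arg = data[start:start + length]
            terminator = data[start + length:start + length + 2]
            if len(arg) != length or terminator != b"\r\n":
                raise ValueError("bad bulk string at byte %d" % pos)
            args.append(arg)
            pos = start + length + 2
        commands.append(args)
    return commands


sample = b"*3\r\n$3\r\nset\r\n$5\r\nname0\r\n$10\r\nHelloWorld\r\n"
print(parse_commands(sample))  # [[b'set', b'name0', b'HelloWorld']]
```

To check a real file, read it in binary mode and pass the bytes to parse_commands; the number of decoded commands should match the number of lines in the command file.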
Test results
The time taken by the two approaches is not even in the same order of magnitude.
Finally, here is how pipe mode works, as described in the official documentation:
- redis-cli --pipe tries to send data as fast as possible to the server.
- At the same time it reads data when available, trying to parse it.
- Once there is no more data to read from stdin, it sends a special ECHO command with a random 20 bytes string: we are sure this is the latest command sent, and we are sure we can match the reply checking if we receive the same 20 bytes as a bulk reply.
- Once this special final command is sent, the code receiving replies starts to match replies with this 20 bytes. When the matching reply is reached it can exit with success.
That is, it sends the data to the Redis server as quickly as possible while reading and parsing the server's replies as they become available. Once the data file has been read completely, it sends an ECHO command with a random 20-byte string, and when that same string comes back as a reply, it knows that the server has processed everything sent before it.
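The final ECHO handshake can be illustrated with a short simulation. This is a sketch of the idea only, not redis-cli's actual code: the replies are simulated in memory rather than read from a server, the token is generated with os.urandom as an assumption, and a simple substring search stands in for proper reply parsing.

```python
import os


def make_sentinel():
    """Return (token, encoded ECHO command) for a random 20-byte token."""
    token = os.urandom(10).hex().encode("ascii")  # 20 ASCII bytes
    command = b"*2\r\n$4\r\nECHO\r\n$20\r\n" + token + b"\r\n"
    return token, command


def saw_final_reply(reply_stream, token):
    """True once the bulk reply carrying the token appears in the replies."""
    return b"$20\r\n" + token + b"\r\n" in reply_stream


token, command = make_sentinel()
# Simulated replies: two "+OK" status replies, then the echoed token.
replies = b"+OK\r\n+OK\r\n"
print(saw_final_reply(replies, token))  # False: ECHO reply not seen yet
replies += b"$20\r\n" + token + b"\r\n"
print(saw_final_reply(replies, token))  # True: all earlier commands were answered
```

Because the server answers commands in order, seeing the echoed token means every command sent before the ECHO has already been replied to.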
Summary:
Some readers later asked how long it takes to construct the Redis command file and to convert the commands into the protocol, so both timings are given here:
[root@mysql-server1 ~]# time python 1.py > redis_commands.txt
real 0m0.110s
user 0m0.070s
sys 0m0.040s
[root@mysql-server1 ~]# time sh 20.sh > redis_data.txt
real 0m7.112s
user 0m5.861s
sys 0m1.255s