How to efficiently insert large amounts of data into Redis (recommended)

Source: Internet
Author: User


Recently a buddy in the group asked: given a log file containing one IP address per line, how can these IPs be quickly imported into Redis?



My first suggestion was a shell loop around the redis-cli client.



Today, while reading the official Redis documentation (http://www.redis.io/documentation), I found that it has a dedicated topic, "Redis Mass Insertion", and realized my suggestion was naive.



The official documentation gives the following reasons:



Using a normal Redis client to perform mass insertion is not a good idea for a few reasons: the naive approach of sending one command after the other is slow because you have to pay for the round trip time for every command. It is possible to use pipelining, but for mass insertion of many records you need to write new commands while you read replies at the same time to make sure you are inserting as fast as possible.



Only a small percentage of clients support non-blocking I/O, and not all the clients are able to parse the replies in an efficient way in order to maximize throughput. For all these reasons the preferred way to mass import data into Redis is to generate a text file containing the Redis protocol, in raw format, in order to call the commands needed to insert the required data.



In short:



1> There is a round-trip delay between each Redis client command.



2> Only a subset of clients support non-blocking I/O.



My understanding: every Redis command pays some latency between being sent and its result coming back, so even multiple clients issuing single inserts concurrently have trouble raising throughput, because non-blocking I/O is available in only a limited number of client implementations.
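To make the round-trip cost concrete, here is a rough back-of-the-envelope sketch; the 0.1 ms RTT is my own illustrative assumption (typical for a local network), not a figure from the article:

```python
def round_trip_overhead(n_commands, rtt_ms=0.1):
    """Seconds spent waiting on network round trips alone for
    n sequential commands, ignoring server processing time."""
    return n_commands * rtt_ms / 1000.0

# 100,000 sequential SETs pay 100,000 round trips:
print(round_trip_overhead(100000))  # roughly 10 seconds of pure waiting
```

Pipelining or the pipe mode described below collapses most of those waits into a stream of back-to-back writes.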



So how can the data be inserted efficiently?


The official answer: version 2.6 introduced a new feature, pipe mode, which feeds a text file written in the Redis protocol directly to the server through a pipe.



In concrete terms, the steps are as follows:


1. Create a text file containing the Redis commands


SET key0 value0
SET key1 value1
...
SET keyN valueN


If you already have the raw data, constructing this file is not hard; a few lines of shell or Python will do.



2. Convert these commands into Redis Protocol.



The pipe feature consumes the Redis protocol (RESP) directly, not plain Redis commands, so the commands must be converted first.
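As a sketch of what that protocol looks like (my own example, not from the original article), a command such as `set name0 HelloWorld` becomes a RESP array of bulk strings. Note that the `$` prefixes carry byte counts, so non-ASCII values should be encoded to bytes before measuring:

```python
def to_resp(*args):
    """Encode one Redis command as a RESP array of bulk strings."""
    out = '*%d\r\n' % len(args)                    # argument count
    for arg in args:
        out += '$%d\r\n%s\r\n' % (len(arg), arg)   # length, then argument
    return out

print(to_resp('set', 'name0', 'HelloWorld'), end='')
# *3
# $3
# set
# $5
# name0
# $10
# HelloWorld
```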



For the conversion, you can refer to the script below.



3. Insert by pipe


cat data.txt | redis-cli --pipe


Shell VS Redis Pipe


Here's a test comparing the efficiency of shell bulk import against the Redis pipe.



Test idea: insert the same 100,000 records via a shell script and via the Redis pipe, and compare the time each takes.



Shell


The script is as follows:


#!/bin/bash
for ((i=0;i<100000;i++))
do
    echo -en "HelloWorld" | redis-cli -x set name$i >> redis.log
done


Each insertion uses the same value, HelloWorld, but a different key: name0, name1, ..., name99999.



Redis Pipe


The pipe approach takes a few more steps.



1> First, construct a text file of Redis commands



Here, I chose Python.


#!/usr/bin/python
for i in range(100000):
    print('set name%d HelloWorld' % i)


# python 1.py > redis_commands.txt



# head -2 redis_commands.txt


set name0 HelloWorld
set name1 HelloWorld


2> Convert these commands into the Redis protocol



Here, I used a shell script found on GitHub:


#!/bin/bash
while read CMD; do
  # each command begins with *{number of arguments in command}\r\n
  XS=($CMD); printf "*${#XS[@]}\r\n"
  # for each argument, we append ${length}\r\n{argument}\r\n
  for X in $CMD; do printf "\$${#X}\r\n$X\r\n"; done
done < redis_commands.txt


# sh 20.sh > redis_data.txt



# head -7 redis_data.txt


*3
$3
set
$5
name0
$10
HelloWorld


At this point, the data is constructed.
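Before feeding the file to redis-cli it is cheap to sanity-check the framing. Here is a minimal validator (my own sketch; `count_commands` is a name I made up, and it assumes the file was produced by a converter like the one above):

```python
def count_commands(resp_text):
    """Count commands in RESP-encoded text, validating the framing."""
    lines = resp_text.split('\r\n')
    i, n = 0, 0
    while i < len(lines) and lines[i]:
        assert lines[i].startswith('*'), 'expected array header'
        argc = int(lines[i][1:]); i += 1
        for _ in range(argc):
            assert lines[i].startswith('$'), 'expected bulk length'
            length = int(lines[i][1:])
            assert len(lines[i + 1]) == length, 'length mismatch'
            i += 2
        n += 1
    return n

data = '*3\r\n$3\r\nset\r\n$5\r\nname0\r\n$10\r\nHelloWorld\r\n'
print(count_commands(data))  # 1
```

A mismatch between a `$` length and the following line means the file would desynchronize the server's parser, so it is worth catching before the import.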



Test results



The time consumed by the two approaches is not even on the same order of magnitude: the pipe import finished far faster than the shell loop.



Finally, let's look at how pipe mode is implemented. The official documentation explains:


    • redis-cli --pipe tries to send data as fast as possible to the server.
    • At the same time it reads data when available, trying to parse it.
    • Once there is no more data to read from stdin, it sends a special ECHO command with a random 20 byte string: we are sure this is the latest command sent, and we are sure we can match the reply checking if we receive the same 20 bytes as a bulk reply.
    • Once this special final command is sent, the code receiving replies starts to match replies with these 20 bytes. When the matching reply is reached it can exit with success.


That is, redis-cli sends the data to the server as fast as possible while simultaneously reading and parsing replies; once the input is exhausted, it sends an ECHO command carrying a random 20-byte string, and when the matching bulk reply comes back it knows the server has processed every command sent before it.
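The final-ECHO handshake can be modeled in a few lines (a simplified sketch of the idea, not redis-cli's actual code; all names here are mine):

```python
import os

def make_sentinel():
    """A random 20-byte marker, like the one redis-cli sends in its final ECHO."""
    return os.urandom(20)

def encode_echo(payload):
    """RESP encoding of ECHO <payload> (bulk strings carry raw bytes)."""
    return (b'*2\r\n$4\r\nECHO\r\n$%d\r\n' % len(payload)) + payload + b'\r\n'

def is_final_reply(reply, sentinel):
    """The server echoes the payload back as a bulk reply; seeing it
    means every command sent before the ECHO has been processed."""
    return reply == (b'$%d\r\n' % len(sentinel)) + sentinel + b'\r\n'

s = make_sentinel()
# A well-behaved server answers ECHO with the same bytes as a bulk reply:
print(is_final_reply(b'$20\r\n' + s + b'\r\n', s))  # True
```

Because replies arrive in order, matching this one reply is enough to conclude the whole batch succeeded.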



Summary:



Afterwards, some readers were curious how long it takes to construct the Redis commands and to convert them into the protocol, so here are both timings:


[root@mysql-server1 ~]# time python 1.py > redis_commands.txt

real  0m0.110s
user  0m0.070s
sys   0m0.040s

[root@mysql-server1 ~]# time sh 20.sh > redis_data.txt

real  0m7.112s
user  0m5.861s
sys   0m1.255s


That's all for this article; I hope it helps you.

