Redis snapshot file (dump.rdb) parsing tool: redis-rdb-tools




Parse Redis dump.rdb files, analyze memory and export data to JSON


Rdbtools is a parser for Redis's dump.rdb files. The parser generates events similar to an XML SAX parser and is very memory-efficient.






In addition, Rdbtools provides utilities to:






Generate memory reports for data in all databases and keys



Convert dump file to JSON



Compare two dump files using standard diff tools



Rdbtools is written in Python, though there are similar projects in other languages. See the FAQ for more information.







Installing Rdbtools




Prerequisites:






redis-py is optional and is only needed to run the test cases.



To install from PyPI (recommended):



pip install rdbtools


To install from source:




git clone https://github.com/sripathikrishnan/redis-rdb-tools
cd redis-rdb-tools
sudo python setup.py install


Command Line Usage Examples






Every run of the rdb tool must specify a command that indicates what to do with the parsed RDB data. Valid commands are: json, diff, justkeys, justkeyvals, and protocol. The examples below also use the memory command.






JSON from a dump that contains two databases:



> rdb --command json /var/redis/6379/dump.rdb
[{
"user003": {"fname": "Ron", "sname": "Bumquist"},
"lizards": ["Bush anole", "Jackson's chameleon", "Komodo dragon", "Ground agama", "Bearded dragon"],
"user001": {"fname": "Raoul", "sname": "Duke"},
"user002": {"fname": "Gonzo", "sname": "Dr"},
"user_list": ["user003", "user002", "user001"]}, {
"baloon": {"helium": "birthdays", "medical": "angioplasty", "weather": "meteorology"},
"armadillo": ["chacoan naked-tailed", "giant", "Andean hairy", "nine-banded", "pink fairy"],
"aroma": {"pungent": "vinegar", "putrid": "rotten eggs", "floral": "roses"}}]


Filtering parsed output





Process only the keys that match the regular expression, and print only the keys and values:




> rdb --command justkeyvals --key "user.*" /var/redis/6379/dump.rdb
user003 fname Ron,sname Bumquist,
user001 fname Raoul,sname Duke,
user002 fname Gonzo,sname Dr,
user_list user003,user002,user001
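The way such filters narrow down a dump can be sketched in plain Python. This is only an illustration, not rdbtools code: the `RECORDS` list and the `select` function are hypothetical, and the sketch assumes the key pattern is matched from the start of the key name (as `re.match` does). The optional `db` and `type_` arguments stand in for the tool's --db and --type flags.

```python
import re

# Hypothetical (database, type, key) records that mirror the sample dump.
RECORDS = [
    (0, "hash", "user003"),
    (0, "list", "lizards"),
    (0, "hash", "user001"),
    (0, "hash", "user002"),
    (0, "list", "user_list"),
    (2, "hash", "baloon"),
    (2, "list", "armadillo"),
    (2, "hash", "aroma"),
]


def select(records, db=None, type_=None, key=None):
    """Return the keys of records that pass every filter that was supplied."""
    pattern = re.compile(key) if key is not None else None
    kept = []
    for record_db, record_type, record_key in records:
        if db is not None and record_db != db:
            continue
        if type_ is not None and record_type != type_:
            continue
        # Assumption: the pattern is anchored at the start of the key.
        if pattern is not None and not pattern.match(record_key):
            continue
        kept.append(record_key)
    return kept


print(select(RECORDS, key="user.*"))
print(select(RECORDS, db=2, type_="hash", key="a.*"))
```

A record is kept only when it passes all supplied filters, so adding a filter can only shrink the result.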


Process only hashes whose keys start with "a", in database 2:




> rdb -c json --db 2 --type hash --key "a.*" /var/redis/6379/dump.rdb
[{},{
"aroma":{"pungent":"vinegar","putrid":"rotten eggs","floral":"roses"}}]


Convert dump file to JSON






The json command outputs UTF-8 encoded JSON. By default, the callback tries to parse RDB data as UTF-8; characters that are not printable ASCII are escaped with \u, and bytes that cannot be decoded as UTF-8 are escaped with \x. Attempting to decode RDB data can corrupt binary data, which can be avoided by using the --escape raw option. Another option is -e base64, which Base64-encodes binary data.
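The difference between decoding as UTF-8, keeping raw bytes, and Base64 encoding can be reproduced with the Python standard library alone. The sketch below does not call rdbtools. It assumes that the raw mode behaves like mapping each byte to one code point (Latin-1), which is consistent with the sample outputs in this section, and the `binary_bytes` value is inferred from those samples.

```python
import base64
import json

# "Bättre sent än aldrig" as it would be stored in Redis: UTF-8 encoded bytes.
text_bytes = "B\u00e4ttre sent \u00e4n aldrig".encode("utf-8")
# Bytes behind the bin_data sample value (inferred, not read from a real dump).
binary_bytes = b"\xfe\x00\xc3\xa2\xf2"

# Decoding as UTF-8 recovers the characters; json.dumps writes them as \u escapes.
as_utf8 = json.dumps(text_bytes.decode("utf-8"))

# One code point per byte (Latin-1): each UTF-8 byte of "ä" gets its own escape.
as_raw = json.dumps(text_bytes.decode("latin-1"))

# Base64 keeps arbitrary bytes intact and reversible.
as_base64 = base64.b64encode(binary_bytes).decode("ascii")

print(as_utf8)
print(as_raw)
print(as_base64)
```

The Base64 form is the only one of the three that can always be turned back into the exact original bytes with a single `base64.b64decode` call.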






Parse the dump file and print the JSON on the standard output:




> rdb -c json /var/redis/6379/dump.rdb
[{
"Citat":["B\u00e4ttre sent \u00e4n aldrig","Bra karl reder sig sj\u00e4lv","Man ska inte k\u00f6pa grisen i s\u00e4cken"],
"bin_data":"\\xFE\u0000\u00e2\\xF2"}]


Parse the dump file into raw bytes and print the JSON on standard output:




> rdb -c json /var/redis/6379/dump.rdb --escape raw
[{
"Citat":["B\u00c3\u00a4ttre sent \u00c3\u00a4n aldrig","Bra karl reder sig sj\u00c3\u00a4lv","Man ska inte k\u00c3\u00b6pa grisen i s\u00c3\u00a4cken"],
"bin_data":"\u00fe\u0000\u00c3\u00a2\u00f2"}]


Generating Memory reports





Running with -c memory generates a CSV report with the approximate memory used by each key. --bytes C and --largest N can be used to limit the output to keys larger than C bytes, or to the N largest keys.




> rdb -c memory /var/redis/6379/dump.rdb --bytes 128 -f memory.csv
> cat memory.csv
database,type,key,size_in_bytes,encoding,num_elements,len_largest_element
0,list,lizards,241,quicklist,5,19
0,list,user_list,190,quicklist,3,7
2,hash,baloon,138,ziplist,3,11
2,list,armadillo,231,quicklist,5,20
2,hash,aroma,129,ziplist,3,11


The resulting CSV has the following columns: database number, data type, key, memory used in bytes, and RDB encoding type. As the header row above shows, it also includes the number of elements and the length of the largest element. Memory usage includes the key, the value, and any other overhead.
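Because the report is ordinary CSV, it can be post-processed with a few lines of Python. The following is a minimal sketch that embeds the sample report as a string; a real script would open memory.csv instead. The helper names (`load_rows`, `bytes_per_database`, `largest_keys`) are illustrative and not part of rdbtools.

```python
import csv
import io
from collections import defaultdict

# Copy of the sample report; replace with the contents of memory.csv.
REPORT = """database,type,key,size_in_bytes,encoding,num_elements,len_largest_element
0,list,lizards,241,quicklist,5,19
0,list,user_list,190,quicklist,3,7
2,hash,baloon,138,ziplist,3,11
2,list,armadillo,231,quicklist,5,20
2,hash,aroma,129,ziplist,3,11
"""


def load_rows(text):
    """Parse the CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def bytes_per_database(rows):
    """Sum size_in_bytes for each database number."""
    totals = defaultdict(int)
    for row in rows:
        totals[int(row["database"])] += int(row["size_in_bytes"])
    return dict(totals)


def largest_keys(rows, n):
    """Return the n rows with the highest size_in_bytes."""
    return sorted(rows, key=lambda r: int(r["size_in_bytes"]), reverse=True)[:n]


rows = load_rows(REPORT)
print(bytes_per_database(rows))
for row in largest_keys(rows, 2):
    print(row["key"], row["size_in_bytes"])
```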






Note that memory usage is approximate. In general, the actual memory used will be slightly higher than the reported memory.






You can filter the report by key, database number, or data type.






Memory reporting should help you detect memory leaks that are caused by application logic. It will also help you optimize Redis memory usage.






Find the memory used by a single key






Sometimes you just want to find the memory used by a particular key, and running the entire memory report on the dump file is time-consuming.






In this case, you can use the redis-memory-for-key command:




> redis-memory-for-key person:1
> redis-memory-for-key -s localhost -p 6379 -a mypassword person:1
Key                          person:1
Bytes                        111
Type                         hash
Encoding                     ziplist
Number of Elements           2
Length of Largest Element    8


Notes:



This command was added in redis-rdb-tools version 0.1.3.



This command depends on the redis-py package.



Compare RDB Files






First, use the --command diff option and pipe the output to the standard sort utility:




> rdb --command diff /var/redis/6379/dump1.rdb | sort > dump1.txt
> rdb --command diff /var/redis/6379/dump2.rdb | sort > dump2.txt


Then, run your favorite diff program:



> kdiff3 dump1.txt dump2.txt


To limit the size of the files, you can filter on keys using the --key option.
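If no graphical diff program is at hand, the two sorted text files can also be compared with Python's difflib module. This is a minimal sketch: the sample lines in `dump1` and `dump2` are hypothetical and do not reproduce the exact line format that the diff command writes.

```python
import difflib

# Hypothetical, already sorted contents of dump1.txt and dump2.txt.
dump1 = ["db=0 user001 -> Raoul", "db=0 user002 -> Gonzo"]
dump2 = ["db=0 user001 -> Raoul", "db=0 user002 -> Dr", "db=0 user003 -> Ron"]


def changed_lines(old, new):
    """Return only the removed (-) and added (+) lines of a unified diff."""
    diff = difflib.unified_diff(
        old, new, fromfile="dump1.txt", tofile="dump2.txt", lineterm=""
    )
    return [
        line
        for line in diff
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    ]


for line in changed_lines(dump1, dump2):
    print(line)
```

Sorting both files first matters here as well: without it, keys that merely appear in a different order would show up as changes.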






Emitting Redis Protocol






You can use the protocol command to convert an RDB file into a stream of Redis protocol.




> rdb -c protocol /var/redis/6379/dump.rdb
*4
$4
HSET
$9
users:123
$9
firstname
$8
Sripathi
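The lines above are one command encoded as a Redis protocol array of bulk strings: `*4` announces four arguments, and each argument is preceded by `$` and its length in bytes. On the wire every line ends with CRLF. A minimal sketch of that encoding follows; the `encode_command` helper is illustrative and not part of rdbtools.

```python
def encode_command(*args):
    """Encode one command as a Redis protocol array of bulk strings."""
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        # Accept bytes as-is; encode anything else as UTF-8 text.
        data = arg if isinstance(arg, bytes) else str(arg).encode("utf-8")
        # The length prefix counts bytes, not characters.
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


wire = encode_command("HSET", "users:123", "firstname", "Sripathi")
print(wire.decode("utf-8"))
```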


You can pipe the output to netcat and re-import a subset of the data. For example, if you want to split your data across two Redis instances, you can use the --key flag to select a subset of the data and then pipe the output to a running Redis instance to load it. Read about Redis Mass Insert for more information.






When printing protocol output, the --escape option can be set to printable or utf8 to avoid non-printable/control characters.






Using the parser



from rdbtools import RdbParser, RdbCallback
from rdbtools.encodehelpers import bytes_to_unicode


class MyCallback(RdbCallback):
    '''Simple example to show how callback works.
       See RdbCallback for all available callback methods.
       See JsonCallback for a concrete example.
    '''

    def __init__(self):
        super(MyCallback, self).__init__(string_escape=None)

    def encode_key(self, key):
        # Keys arrive as raw bytes; convert them to text for printing.
        return bytes_to_unicode(key, self._escape, skip_printable=True)

    def encode_value(self, val):
        return bytes_to_unicode(val, self._escape)

    # Each method below is called once per parsed item of that type.
    def set(self, key, value, expiry, info):
        print('%s = %s' % (self.encode_key(key), self.encode_value(value)))

    def hset(self, key, field, value):
        print('%s.%s = %s' % (self.encode_key(key), self.encode_key(field), self.encode_value(value)))

    def sadd(self, key, member):
        print('%s has {%s}' % (self.encode_key(key), self.encode_value(member)))

    def rpush(self, key, value):
        print('%s has [%s]' % (self.encode_key(key), self.encode_value(value)))

    def zadd(self, key, score, member):
        print('%s has {%s : %s}' % (self.encode_key(key), self.encode_value(member), score))


callback = MyCallback()
parser = RdbParser(callback)
parser.parse('/var/redis/6379/dump.rdb')




