Saving memory: Instagram's Redis practice

Source: Internet
Author: User

Note added by Zhj: This article translates only part of the original; for further analysis, refer to the English original.

English Original: Storing hundreds of millions of simple key-value pairs in Redis

Instagram is the forerunner of today's photo apps and one of the most popular. It hosts 300 million photos, and for each photo it needs to know who the author is. Here is how the Instagram team used Redis to solve this problem and optimize memory usage.

First, this application, which maps an image ID back to the user ID of its author, has the following requirements:

    • Queries must be fast enough
    • All data should fit in memory, preferably on a single EC2 high-memory instance (17GB or 34GB; 68GB would be too wasteful)
    • It should fit Instagram's existing architecture (Instagram already has some experience with Redis, such as this app)
    • It should support persistence, so that no warm-up is needed after a server restart

Instagram's developers first ruled out a database-backed solution. They follow the KISS principle (Keep It Simple and Stupid): this application has no need for database features such as updates, transactions, or join queries, so there is no point in maintaining a database for rarely used features.

So they chose Redis. Redis is a persistent in-memory database in which all data is stored in memory (leaving aside the VM feature), and the simplest implementation is to use the Redis string type as a plain key-value store, like this:

SET media:1155315 939
GET media:1155315
> 939

Here 1155315 is the image ID and 939 is the user ID: each image ID is the key and the user ID is the value of a key-value pair. They then tested this layout: 1,000,000 entries used 70MB of memory, so 300,000,000 photos would use 21GB. That still exceeds the 17GB budget.
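The extrapolation above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: it assumes memory grows linearly with the number of keys and uses 1GB = 1000MB, which is what the article's 21GB figure implies. The function and constant names are illustrative and not from the original.

```python
# Figures taken from the article: 70 MB measured for 1,000,000 keys.
MB_PER_MILLION_KEYS = 70
TOTAL_PHOTOS = 300_000_000
BUDGET_GB = 17  # smallest EC2 high-memory size listed in the requirements


def estimated_memory_gb(total_keys: int, mb_per_million: float) -> float:
    """Linearly extrapolate memory use (assumption: 1 GB = 1000 MB)."""
    return total_keys / 1_000_000 * mb_per_million / 1000


estimate = estimated_memory_gb(TOTAL_PHOTOS, MB_PER_MILLION_KEYS)
print(f"estimated: {estimate:.0f} GB, budget: {BUDGET_GB} GB, "
      f"fits: {estimate <= BUDGET_GB}")
```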

So the Instagram developers asked Pieter Noordhuis, one of the Redis developers, for an optimization, and his reply was to use the hash structure. The approach is to split the data into segments and store each segment in its own hash. When a hash holds no more than a certain number of elements, Redis stores it in a compressed form, which saves a lot of memory; the string structure above has no such optimization. The threshold is controlled by the hash-zipmap-max-entries parameter in the configuration file. After experimenting, the developers found that setting hash-zipmap-max-entries to 1000 performed well; above 1000, the HSET command caused CPU consumption to become very high.

So they changed the plan and stored the data in the following structure:

HSET "mediabucket:1155" "1155315" "939"
HGET "mediabucket:1155" "1155315"
> "939"

By taking the first four digits of the 7-digit image ID as the key of the hash, we guarantee that each hash covers only the remaining 3 digits, that is, at most 1000 entries.
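The bucketing rule can be sketched in Python as follows. This is an illustration, not Instagram's actual code: integer division by 1000 drops the last three digits, which for a 7-digit ID gives the first four. The names bucket_for, set_owner, get_owner and the FakeHashStore class are hypothetical; the class is an in-memory stand-in for Redis hashes so that the example runs without a server.

```python
from typing import Dict, Optional, Tuple

BUCKET_SIZE = 1000  # matches the hash-zipmap-max-entries value in the article


def bucket_for(media_id: int) -> Tuple[str, str]:
    """Return (hash key, field) for a media ID by dropping its last 3 digits."""
    return f"mediabucket:{media_id // BUCKET_SIZE}", str(media_id)


class FakeHashStore:
    """Hypothetical in-memory stand-in for Redis hashes (not a real client)."""

    def __init__(self) -> None:
        self._hashes: Dict[str, Dict[str, str]] = {}

    def hset(self, name: str, field: str, value: str) -> None:
        self._hashes.setdefault(name, {})[field] = value

    def hget(self, name: str, field: str) -> Optional[str]:
        return self._hashes.get(name, {}).get(field)


def set_owner(store: FakeHashStore, media_id: int, user_id: int) -> None:
    """Record which user owns a photo, mirroring the HSET command above."""
    key, field = bucket_for(media_id)
    store.hset(key, field, str(user_id))


def get_owner(store: FakeHashStore, media_id: int) -> Optional[str]:
    """Look up the owner of a photo, mirroring the HGET command above."""
    key, field = bucket_for(media_id)
    return store.hget(key, field)


store = FakeHashStore()
set_owner(store, 1155315, 939)
print(bucket_for(1155315), get_owner(store, 1155315))
```

Because every ID from 1155000 to 1155999 lands in "mediabucket:1155", no hash can grow beyond 1000 fields, which keeps each one within the compressed-encoding threshold described earlier.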
