Saving memory: Instagram's Redis practices

From: http://blog.nosqlfan.com/html/3379.html?ref=rediszt

Instagram is arguably one of the hottest photo-sharing apps today. It now holds 300 million photos, and for each photo Instagram needs to know who the author is. Below is how the Instagram team used Redis to solve this problem and how they optimized its memory usage.

First, this image ID lookup application has the following requirements:

    • Fast enough
    • Data should fit in memory, preferably on an EC2 High-Memory instance (17 GB or 34 GB; 68 GB would be too wasteful)
    • Fits Instagram's existing architecture (Instagram already has some experience with Redis, for example in this application)
    • Supports persistence, so that no warm-up is needed after a server restart

The Instagram developers first ruled out a database. They follow the KISS principle (Keep It Simple, Stupid): this application never uses a database's update, transaction, or join features, so there is no point in maintaining a database just for them.

So they chose Redis. Redis is a persistent in-memory database in which all data is held in memory (forget about the VM feature). The simplest implementation is to use the Redis string structure as a plain key-value store, like this:

 
    SET media:1155315 939
    GET media:1155315
    > 939

Here 1155315 is the image ID and 939 is the user ID. Each image ID is used as the key and the user ID as the value, stored as a key-value pair. They then tested this layout: 1,000,000 entries used 70 MB of memory, so 300,000,000 photos would use 21 GB. That exceeds the 17 GB budget.
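The 21 GB figure above is a straight linear extrapolation from the 1,000,000-key measurement. A minimal sketch of that arithmetic, assuming memory grows linearly with the number of keys and using decimal units (1 GB = 1000 MB) as the article's figures do; the function name is only illustrative:

```python
def projected_gb(mb_per_million_keys: float, total_keys: int) -> float:
    """Linearly scale a per-million-keys measurement to the full data set."""
    millions = total_keys / 1_000_000
    return mb_per_million_keys * millions / 1000  # MB -> GB (decimal)


# Figures quoted in the article for the plain string scheme.
string_scheme_gb = projected_gb(70, 300_000_000)
print(string_scheme_gb)        # 21.0
print(string_scheme_gb <= 17)  # False: over the 17 GB budget
```

The same function can be applied to the other per-million measurements quoted later in the article to check their totals.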

(Nosqlfan: There is already one possible optimization here. We can drop the "media:" prefix that every key shares and store only the number, which shortens each key and reduces its memory overhead. [Note: Redis does not convert a key from a string to a number, so all we save is the 6 bytes of "media:".] In an experiment, memory usage for 1,000,000 keys dropped to 50 MB and the projected total to 15 GB, which meets the requirement, but Instagram's later changes are still worthwhile.)

So the Instagram developers asked Pieter Noordhuis, one of the Redis developers, for optimization advice. His reply was to use the hash structure: split the data into segments and store each segment in its own hash. As long as a hash holds fewer than a certain number of fields, Redis stores it in a compact encoding, which saves a great deal of memory. Plain string keys get no such optimization. That threshold is controlled by the hash-zipmap-max-entries parameter. In the developers' experiments, a value of about 1000 worked best; setting it higher than 1000 made the HSET command consume a lot of CPU.

So they changed the scheme and saved the data into the following structure:

    HSET "mediabucket:1155" "1155315" "939"
    HGET "mediabucket:1155" "1155315"
    > "939"

The first four digits of the 7-digit image ID are used as the key of the hash. Within a hash, only the last three digits of the image ID vary, so each hash holds at most 1000 fields.
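The bucketing rule can be sketched as follows. This is an illustration under stated assumptions, not Instagram's code: it assumes "first four digits of a 7-digit ID" is computed as integer division by 1000, the helper name `bucket_for` is hypothetical, and a dictionary of dictionaries stands in for Redis hashes so the example runs without a server.

```python
from collections import defaultdict
from typing import Tuple


def bucket_for(media_id: int, bucket_size: int = 1000) -> Tuple[str, str]:
    """Return (hash key, field) for a media ID under the bucketed scheme."""
    bucket = media_id // bucket_size  # 1155315 -> 1155
    return f"mediabucket:{bucket}", str(media_id)


print(bucket_for(1155315))  # ('mediabucket:1155', '1155315')

# In-memory stand-in for Redis hashes: hash key -> {field: value}.
buckets = defaultdict(dict)
for media_id in range(1155000, 1157000):
    key, field = bucket_for(media_id)
    buckets[key][field] = "939"  # placeholder user ID

# 2000 consecutive IDs land in 2 hashes of 1000 fields each.
print(len(buckets), max(len(fields) for fields in buckets.values()))  # 2 1000
```

Because no hash ever exceeds 1000 fields, every hash stays at or below the threshold of 1000 discussed above.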

In a second experiment with this layout, every 1,000,000 keys consumed only 16 MB of memory. The projected total dropped to about 5 GB, which meets the application's requirements.

(Nosqlfan: Again, there is room for further optimization. First, we can change the key of the hash to a pure number, which shortens the key by 12 bytes. Second, we can change the field inside the hash to just the last three digits, which saves another 4 bytes, as shown below. In an experiment, memory usage for 1,000,000 keys dropped to 10 MB and the projected total to 3 GB.)

 
    HSET "1155" "315" "939"
    HGET "1155" "315"
    > "939"
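The compact variant and its byte savings can be checked with a short sketch. The helper name is hypothetical, and zero-padding the field to three digits is an assumption made here so that the field is always "three digits" as the note describes; the article itself only shows the ID 1155315.

```python
from typing import Tuple


def compact_key_and_field(media_id: int) -> Tuple[str, str]:
    """Split a media ID into a numeric hash key and a 3-digit field."""
    bucket, offset = divmod(media_id, 1000)
    return str(bucket), f"{offset:03d}"  # zero-pad so the field is always 3 digits


key, field = compact_key_and_field(1155315)
print(key, field)  # 1155 315

# Bytes saved per hash key and per field compared with the prefixed scheme.
key_bytes_saved = len("mediabucket:1155") - len(key)
field_bytes_saved = len("1155315") - len(field)
print(key_bytes_saved, field_bytes_saved)  # 12 4
```

The 12-byte saving applies once per hash, while the 4-byte saving applies to every stored photo.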

There is no end to optimization. We hope you will be just as careful with memory when you use storage products.

Source: instagram-engineering.tumblr.com
