Memory savings: Instagram's Redis practice (GO)

Source: Internet
Author: User

First, the question:

The database tables hold a large volume of data (thousands), and the server needs to respond to user requests more quickly.

Second, the solution:

    1. Cache database data in a high-speed server-side cache
    2. Use an in-memory database

Third, the main solution: cache and database comparison

From the above data, we can see that two technical solutions are the most feasible for our product:

    1. Memcached: an in-memory key-value cache
    2. Redis: an in-memory database

Fourth, saving memory: Instagram's Redis practice

Instagram is a pioneer among photo apps and one of the hottest photo apps today. It holds 300 million photos, and for each photo it needs to know who the author is. Here is how the Instagram team used Redis to solve this problem and optimize memory usage.

First of all, this reverse lookup from image ID to user UID has the following requirements:

    • Queries must be fast enough
    • All the data must fit in memory, preferably on a single EC2 high-memory instance (17GB or 34GB; 68GB would be too wasteful)
    • It must fit Instagram's existing architecture (Instagram already has some experience with Redis)
    • It must support persistence, so that no warm-up is needed after a server restart

Instagram's developers first rejected storing the data in a database. They follow the KISS principle (Keep It Simple, Stupid): this application needs no updates, transactions, relational queries, or similar database features, so there is no reason to maintain a database for features that would hardly be used.

So they chose Redis. Redis is a persistent in-memory database: all the data is stored in memory (forget about the VM feature), and the simplest implementation is to use the Redis string structure as a plain key-value store, like this:

SET media:1155315 939
GET media:1155315
> 939

Here 1155315 is the image ID and 939 is the user ID: each image ID is the key and the user UID is the value of a key-value pair. They then tested this layout. Stored as above, 1,000,000 entries use about 70MB of memory, so 300,000,000 photos would use about 21GB. That exceeds the 17GB budget.
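The extrapolation above is simple proportional scaling. A minimal sketch of that arithmetic, assuming memory grows linearly with the number of keys and 1GB = 1000MB (the function names are hypothetical, chosen for illustration):

```python
PHOTO_COUNT = 300_000_000  # photo count quoted in the article
BUDGET_GB = 17.0           # smallest EC2 high-memory size named in the requirements


def projected_gb(mb_per_million, items=PHOTO_COUNT):
    """Scale a measurement in MB per 1,000,000 keys to the whole data set, in GB.

    Assumes linear growth and 1 GB = 1000 MB, as the article's round numbers do.
    """
    return mb_per_million * (items / 1_000_000) / 1000


def fits_budget(mb_per_million, budget_gb=BUDGET_GB):
    """True if the projected total stays within the memory budget."""
    return projected_gb(mb_per_million) <= budget_gb


print(projected_gb(70))  # 21.0
print(fits_budget(70))   # False
```

The same helper can be reused for any other per-million measurement to see whether a layout fits the budget.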

(Nosqlfan: There is already one optimization available here: drop the identical "media:" prefix from every key and keep only the number, which shortens the key and reduces its memory overhead. [Note: Redis does not convert string keys to numbers, so all that is saved is the 6 bytes of "media:".] In our experiment, memory consumption dropped to 50MB per 1,000,000 entries, or 15GB in total, which meets the requirement, but Instagram's further improvement below is still worthwhile.)

So the Instagram developers asked Pieter Noordhuis, one of the Redis developers, for an optimization, and his reply was to use the hash structure. The approach is to split the data into segments and store each segment in its own hash. When a hash holds no more than a certain number of elements, Redis stores it in a compressed form, which saves a lot of memory; the string structure above gets no such compression. That threshold is controlled by the hash-zipmap-max-entries parameter in the configuration file. After experimenting, the developers found that setting hash-zipmap-max-entries to 1000 performs well; above 1000, HSET commands cause very high CPU consumption.

So they changed the plan and stored the data in the following structure:

HSET "mediabucket:1155" "1155315" "939"
HGET "mediabucket:1155" "1155315"
> "939"

By using the first four digits of the 7-digit image ID as the key of the hash, each hash is guaranteed to hold at most 1000 fields, one for each value of the remaining three digits.
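The bucketing rule can be sketched in a few lines of Python. This is an illustration only: `bucket_for` and `FakeHashStore` are hypothetical names, and the store is a dict-of-dicts stand-in for Redis hashes rather than a real Redis client, so it shows the key layout but none of the memory savings.

```python
from collections import defaultdict

BUCKET_SIZE = 1000  # same value the article uses for hash-zipmap-max-entries


def bucket_for(media_id):
    """Return (hash key, field) for a media ID.

    Integer division by 1000 keeps the leading digits, so a 7-digit ID
    lands in the bucket named after its first four digits.
    """
    return "mediabucket:%d" % (media_id // BUCKET_SIZE), str(media_id)


class FakeHashStore:
    """In-memory stand-in for Redis hashes (assumption: not a real client)."""

    def __init__(self):
        self._hashes = defaultdict(dict)

    def hset(self, key, field, value):
        self._hashes[key][field] = value

    def hget(self, key, field):
        return self._hashes.get(key, {}).get(field)

    def hlen(self, key):
        return len(self._hashes.get(key, {}))


store = FakeHashStore()
key, field = bucket_for(1155315)
store.hset(key, field, "939")
print(key, field, store.hget(key, field))  # mediabucket:1155 1155315 939
```

Because consecutive IDs share a bucket, no hash ever holds more than `BUCKET_SIZE` fields, which is what keeps each one under the compression threshold.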

Testing again, every 1,000,000 keys now consume only 16MB of memory. Total memory usage drops to about 5GB, which satisfies the application's requirements.

(Nosqlfan: Here, too, we can optimize further. First, make the key of the hash a pure number, which shortens the key by 12 bytes. Second, make the field inside the hash a three-digit number, which saves another 4 bytes, as shown below. In our experiment, memory consumption drops to 10MB per 1,000,000 keys, for a total of 3GB.)

HSET "1155" "315" "939"
HGET "1155" "315"
> "939"
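The compact layout amounts to splitting the ID with integer division and remainder. A minimal sketch, with hypothetical helper names, showing that nothing is lost because the original ID can be rebuilt from the key and the field:

```python
BUCKET_SIZE = 1000


def split_id(media_id, bucket_size=BUCKET_SIZE):
    """Split a media ID into (hash key, field) without any textual prefix.

    Assumption: the field is not zero-padded, so an ID ending in 005
    gets the field "5" rather than "005". Both forms are unambiguous.
    """
    bucket, offset = divmod(media_id, bucket_size)
    return str(bucket), str(offset)


def join_id(key, field, bucket_size=BUCKET_SIZE):
    """Rebuild the original media ID from a (hash key, field) pair."""
    return int(key) * bucket_size + int(field)


key, field = split_id(1155315)
print(key, field)            # 1155 315
print(join_id(key, field))   # 1155315
```

Dropping the padding deviates slightly from the "three-digit" wording above; pad with `str(offset).zfill(3)` if fixed-width fields are preferred.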

Optimization never ends, as long as you are willing to think things through. I hope you, too, can make good use of memory when working with storage products.

Reference:

    • http://blog.nosqlfan.com/html/3379.html
    • http://instagram-engineering.tumblr.com/post/12202313862/storing-hundreds-of-millions-of-simple-key-value
    • http://www.open-open.com/lib/view/open1409643182369.html
