Facebook's Storage Architecture for 6.5 Billion Photos


I have never used Facebook myself, but I am still interested in its solution for storing huge volumes of unstructured data. This article is my translation of an online webcast, so I cannot guarantee that it is entirely free of deviations from the original.

Facebook needs to store and read data efficiently within an enormous data store. It currently has 25 engineers dedicated to this area, and most of their work focuses on optimizing the performance of Facebook's databases and making them more scalable.

Facebook currently runs on thousands of servers (one news report cited in the webcast says it has reached 10,000, but Jason, the webcast speaker, says "thousands"). Most of these servers have a clear division of responsibilities. The stack: MySQL for the database; PHP running on Apache, with some extensions and wrappers developed in C; and memcached as a cache layer between Facebook's application tier and its database tier.

Tip: What is memcached?

Memcached is a general-purpose distributed memory caching system, originally developed by Danga Interactive for LiveJournal. It typically speeds up dynamic, database-driven web sites by caching objects and data in memory, reducing the number of database accesses. Memcached has no built-in authentication, so it must run behind a properly configured firewall. The memcached API presents a giant hash table distributed across multiple machines. When the table is full, newly inserted data evicts older data according to the LRU (Least Recently Used) algorithm. Sites such as YouTube, LiveJournal, Wikipedia, SourceForge, Facebook, and nytimes.com all use memcached.
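To make the application-tier/database interaction described above concrete, here is a minimal cache-aside sketch in Python. It is an illustration, not Facebook's code: it assumes the pymemcache package and a memcached instance on localhost:11211, and fetch_from_mysql is a hypothetical stand-in for a real SQL query.

```python
from pymemcache.client.base import Client  # assumes the pymemcache package

cache = Client(("localhost", 11211))  # assumes a local memcached instance

def fetch_from_mysql(user_id):
    # Hypothetical stand-in for the real MySQL query.
    return b"row bytes for user %d" % user_id

def get_user(user_id):
    key = "user:%d" % user_id
    row = cache.get(key)                 # 1. try the in-memory cache first
    if row is None:                      # 2. cache miss: hit the database
        row = fetch_from_mysql(user_id)
        cache.set(key, row, expire=300)  # 3. repopulate with a 5-minute TTL
    return row
```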

Facebook's Current Storage Architecture

Note: I call this the "current" storage architecture for two reasons. First, the webcast's slides label it the "State of the World", that is, the situation today. Second, the webcast never explicitly says that the new storage architecture, Haystack, has entered production. If this description is wrong, corrections are welcome.

Facebook's storage architecture has to handle two operations: reads and writes. Let's look at each in turn.

Write:

When a user uploads a photo, that is, writes to Facebook's store, Facebook assigns an upload server to the job. The upload server processes the uploaded photo, and each photo is stored in four to five different sizes.
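As a rough illustration of that step, here is what deriving several renditions of one upload might look like. This is a sketch, not Facebook's pipeline: it assumes the Pillow imaging library, and the size presets are hypothetical (the article only says four to five sizes are stored).

```python
from PIL import Image  # assumes the Pillow library is installed

# Hypothetical size presets for the 4-5 stored renditions.
SIZES = {"thumb": (75, 75), "small": (130, 130),
         "medium": (604, 604), "large": (2048, 2048)}

def store_renditions(upload_path):
    for name, box in SIZES.items():
        im = Image.open(upload_path).convert("RGB")
        im.thumbnail(box)  # shrink in place, preserving aspect ratio
        im.save(f"{upload_path}.{name}.jpg", "JPEG")
```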

Read:

When a user reads a photo, the path is slightly more complex:

When a user clicks on a photo, the request goes over HTTP to the CDN (Content Delivery Network). The CDN caches some photos; if the requested photo is there, it is returned to the user directly. If it is not, one of two things happens.

If an ordinary photo is requested, the request is passed on to a photo server, which reads the data from the backend NetApp storage.

If a profile photo is requested, the read goes to cachr, a lightweight caching server that holds profile photos. If the photo is not found in cachr either, it is read from the backend NetApp storage via a photo server.

At present, cachr's scalability and stability are good. Requests for profile photos run at about 200,000 per second, more than half of all photo requests. Facebook has about 40 cachr nodes, which have been running for more than four years without major problems. The photo servers use a file-handle cache to improve photo read performance.
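Putting the read path together, here is a minimal Python sketch of the dispatch logic described above. All three lookup functions are hypothetical stand-ins for the real services (the CDN, cachr, and a photo server), not Facebook APIs.

```python
def cdn_lookup(photo_id):
    """Stand-in for the CDN cache: returns bytes, or None on a miss."""
    return None  # pretend every request misses, to exercise the full path

def cachr_lookup(photo_id):
    """Stand-in for cachr, the lightweight profile-photo cache."""
    return None

def photo_server_read(photo_id):
    """Stand-in for a photo server reading from the NetApp backend."""
    return b"...photo bytes..."

def serve_photo(photo_id, is_profile_photo):
    data = cdn_lookup(photo_id)         # 1. the CDN answers most requests
    if data is not None:
        return data
    if is_profile_photo:                # 2. profile photos try cachr next
        data = cachr_lookup(photo_id)
        if data is not None:
            return data
    return photo_server_read(photo_id)  # 3. fall through to NetApp storage

serve_photo("p123", is_profile_photo=True)
```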

Architectural Issues

That is a brief tour of the entire Facebook storage architecture, but the architecture has some problems:

First, NetApp and similar file systems suffer from explosive growth in metadata. Each file carries its own metadata, and the file system imposes a hierarchical structure of directories, all of which adds content unrelated to the data actually needed: every directory level and the file's own inode must be resolved before the photo bytes themselves can be read. As a result, Facebook needs approximately 3 disk IOs to read one photo (it was 15 IOs at the beginning of the design, so 3 is already a heavily optimized result).

Second, precisely because of this increased cost of reading images from the file system, Facebook has to rely heavily on efficient caching to absorb the disk IO. At present, the CDN hit rate for profile photos is 99.8%, and the average hit rate across all images is 92%, which greatly reduces accesses to the file system.
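A quick back-of-envelope check, combining the hit rate here with the 200,000 requests/sec figure quoted earlier and treating that figure as the total profile-photo request rate (illustrative arithmetic only, not a claim about Facebook's actual traffic):

```python
profile_hit_rate = 0.998   # CDN hit rate for profile photos, from above
profile_rps      = 200_000 # profile-photo requests/sec, quoted earlier

misses_per_sec = profile_rps * (1 - profile_hit_rate)
print(misses_per_sec)  # ~400 requests/s fall through past the CDN
```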

Haystack

Facebook introduced Haystack to reduce the amount of metadata. In Haystack, many photos are bundled together into one large file that shares a single set of file metadata. How, then, can you find the photo you actually need inside such a file? Haystack uses a standalone index file to index the data: a key in the index points to the desired photo. Typically, 1 GB of photo data requires only about 1 MB of metadata cached in RAM. Because Haystack can keep the entire index resident in RAM, Facebook can guarantee that reading any photo costs at most one disk IO.
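Here is a minimal sketch of that idea (not Facebook's implementation): an in-RAM dictionary maps each photo key to its (offset, length) in one large store file, so a read costs a single positioned disk read. The file name, keys, and offsets are made up.

```python
import os

# Hypothetical in-RAM index: key -> (start offset, length) in bytes.
index = {
    "photo:1001": (0,      48_213),
    "photo:1002": (48_213, 91_006),
}

def read_photo(key, store_path="haystack.dat"):
    start, length = index[key]              # RAM lookup: no disk IO
    fd = os.open(store_path, os.O_RDONLY)
    try:
        return os.pread(fd, length, start)  # exactly one positioned read
    finally:
        os.close(fd)
```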

Haystack Storage Layout

On disk, Haystack consists of a series of repeating data blocks, each made up of two parts: a header and a data segment. The following figure shows how Facebook lays out Haystack storage.

The following figure shows the Haystack index:

Here, start is a photo's starting offset within the Haystack store file, and length is its size.
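To make the header + data-segment description concrete, here is one way such a block could be serialized in Python. The field names and widths are illustrative guesses, not Haystack's actual on-disk format.

```python
import struct
import zlib

# Hypothetical block layout: magic (4 bytes) | key (8 bytes) |
# data length (8 bytes) | data | CRC32 checksum (4 bytes).
HEADER = struct.Struct(">4s8sQ")

def pack_needle(key, data):
    header = HEADER.pack(b"NEED", key, len(data))  # '8s' null-pads the key
    return header + data + struct.pack(">I", zlib.crc32(data))

record = pack_needle(b"p1001", b"...jpeg bytes...")
print(len(record))  # 20-byte header + data + 4-byte checksum
```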

New Storage Architecture

If Facebook drops NetApp in favor of Haystack, the read and write paths change a little. Let's look at them again:

Write:
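A minimal sketch of what the new write path could look like under these assumptions: the upload server appends the photo bytes to one large store file and records the (offset, length) in the in-RAM index, so no per-photo file or directory metadata is created. The names are hypothetical, not Facebook's code.

```python
index = {}  # in-RAM index: key -> (start offset, length)

def write_photo(key, data, store_path="haystack.dat"):
    with open(store_path, "ab") as f:  # append to the single store file
        start = f.tell()               # append position = record offset
        f.write(data)
    index[key] = (start, len(data))    # one RAM entry, no extra disk IO

write_photo("photo:1003", b"...jpeg bytes...")
```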

Read:
