Facebook's picture storage system Haystack: small-file storage that merges many small files into one large file to reduce the number of disk IOs, with the metadata (offsets) kept in memory

Source: Internet
Author: User
Tags: posix, server, memory

Reposted from: http://yanyiwu.com/work/2015/01/04/Haystack.html

After reading the 14-page Facebook Haystack paper, my impression comes down to four points:

    • Because of "the drawbacks of traditional file systems"
    • Because "caching does not solve the long-tail problem"
    • So "multiple images (needles) are stored in the same file (a physical volume, which begins with a superblock)"
    • So "performance improves significantly"

Drawbacks of traditional file systems

The traditional POSIX file system is not well suited to high-performance picture storage. The main reason is that with file-system-based storage each image is stored as one file in a directory, so every read of a file costs several disk IOs. When a directory holds thousands of files, reading one file takes more than 10 disk IOs; even when a directory holds only hundreds of files, it still takes 3 disk IOs (1: read the directory metadata, 2: read the inode, 3: read the file contents).

Caching does not solve the long-tail problem

Application scenario for picture storage

A CDN sits in front of the photo storage. A CDN lives on its cache: popular images are cached well by the CDN, so the requests that actually reach the photo storage are mostly for unpopular images. In this scenario, improving the caching inside the photo storage obviously does not solve the problem. Caching can do essentially nothing about the long tail; if caching could solve it, it would not be called a long-tail problem.

Multiple images stored in the same file

Reading a picture takes multiple disk IOs because each picture is stored in its own file, and every time the file system reads a file it must first read that file's metadata. If instead many pictures are stored in the same file (which will of course be very large), and the offset and size of each picture within that file are kept in memory, then a picture can be read directly at its offset. Most of the time this takes only one disk IO, which significantly improves performance.

Based on this idea, the Haystack designers bypassed the POSIX file system and turned Haystack into a key-value file system (KV FS), or NoFS. Each image corresponds to a fid and is no longer stored as a separate file in the file system; instead, all pictures belonging to the same physical volume are written to a single file. The Volume Server maintains the mapping fid -> <volume machine, offset, size> in memory and keeps the file handle open, so reading a picture needs only one sequential read IO.
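The idea can be sketched in a few lines of Python. This is only a minimal illustration, not Haystack's actual on-disk format: the class name `Volume`, its methods, and the use of integer fids are assumptions made for the example, and it omits headers, checksums, deletion, and replication.

```python
import os


class Volume:
    """Toy append-only volume: one large file plus an in-memory fid index.

    Hypothetical names; the real needle layout is not modeled here.
    """

    def __init__(self, path):
        # Keep a single file handle open for the lifetime of the volume.
        self._file = open(path, "a+b")
        self._index = {}  # fid -> (offset, size), held only in memory

    def put(self, fid, data):
        # Append the image bytes and remember where they landed.
        self._file.seek(0, os.SEEK_END)
        offset = self._file.tell()
        self._file.write(data)
        self._file.flush()
        self._index[fid] = (offset, len(data))
        return offset

    def get(self, fid):
        # One seek plus one read; no per-image directory or inode lookup.
        offset, size = self._index[fid]
        self._file.seek(offset)
        return self._file.read(size)

    def close(self):
        self._file.close()
```

Because the index lookup happens in memory, the only disk access on the read path is the read of the image bytes themselves.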


Haystack architecture

The architecture is relatively simple and has three parts: Haystack Directory, Haystack Cache, and Haystack Store.

Directory: the so-called meta server

1. Generates fids, maintains the mapping between logical volumes and physical volumes, and handles load balancing for uploads.

2. Newly added Store servers register here.

3. Maintains the read-only flag of each logical volume; a read-only logical volume no longer accepts upload requests.

4. Decides whether a request should go to the CDN or to the internal Haystack Cache server.
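Responsibilities 1 to 3 above can be illustrated with a small sketch. Everything here is an assumption made for the example: the class and method names, the `"volume:key"` fid format, and the use of a random choice among writable volumes as the load-balancing policy. The source does not specify any of these.

```python
import itertools
import random


class Directory:
    """Toy meta server: volume mapping, read-only flags, upload assignment."""

    def __init__(self, seed=None):
        self._volumes = {}        # logical volume id -> list of Store hosts
        self._read_only = set()   # logical volumes closed to uploads
        self._next_key = itertools.count(1)
        self._rng = random.Random(seed)

    def register(self, logical_volume, store_hosts):
        # Called when Store servers holding this logical volume join.
        self._volumes[logical_volume] = list(store_hosts)

    def mark_read_only(self, logical_volume):
        self._read_only.add(logical_volume)

    def assign_upload(self):
        # Only write-enabled logical volumes may receive new pictures.
        writable = [v for v in self._volumes if v not in self._read_only]
        if not writable:
            raise RuntimeError("no writable logical volume")
        volume = self._rng.choice(writable)  # assumed balancing policy
        fid = f"{volume}:{next(self._next_key)}"  # hypothetical fid format
        return fid, self._volumes[volume]
```

An upload would call `assign_upload()` and then write the picture to every Store host returned for the chosen logical volume.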

Cache: the so-called internal CDN

1. Images are located by fid using a consistent hashing algorithm.

2. Caches only requests that come directly from users, not requests that come from the CDN.

3. Caches only images fetched from write-enabled Stores. Because uploads are ordered in time, this amounts to caching only the most recently created images. For example, a picture that a user has just uploaded may be placed in the Cache to warm it up.
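The three rules above can be sketched as follows. This is a generic consistent-hash ring with virtual nodes, not Haystack's actual code; `HashRing`, `should_cache`, the MD5-based hash, and the `replicas` count are all assumptions chosen for illustration.

```python
import bisect
import hashlib


def _hash(key):
    # Stable 64-bit hash of a string (MD5 is used only for spreading keys).
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Consistent-hash ring mapping a fid to one cache node."""

    def __init__(self, nodes, replicas=64):
        # Each node gets several virtual points to even out the load.
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(replicas)
        )
        self._points = [point for point, _ in self._ring]

    def node_for(self, fid):
        # First ring point clockwise from the key's hash, wrapping around.
        i = bisect.bisect_right(self._points, _hash(fid)) % len(self._ring)
        return self._ring[i][1]


def should_cache(from_cdn, store_is_writable):
    # Rules 2 and 3: only direct user requests, only write-enabled Stores.
    return (not from_cdn) and store_is_writable
```

With consistent hashing, removing one cache node only remaps the fids that lived on that node; fids on the remaining nodes stay where they were.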

Store: the storage service where data finally lands

1. Pictures are appended sequentially to a large file, and the Store maintains index information (the offset and size) for each picture in that file.

2. To make loading fast after a restart, the index information is also saved separately in an index file.
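A minimal sketch of such an index file is shown below. The fixed-size record layout (`fid`, `offset`, `size` packed as `<QQI`) and the function names are assumptions for illustration; the source does not describe the real index file format.

```python
import struct

# Hypothetical record layout: 8-byte fid, 8-byte offset, 4-byte size.
RECORD = struct.Struct("<QQI")


def append_index_record(index_path, fid, offset, size):
    # Written alongside each append to the large data file.
    with open(index_path, "ab") as f:
        f.write(RECORD.pack(fid, offset, size))


def load_index(index_path):
    # On restart, rebuild the in-memory map from the small index file
    # instead of scanning the whole data file.
    index = {}
    with open(index_path, "rb") as f:
        while True:
            chunk = f.read(RECORD.size)
            if len(chunk) < RECORD.size:
                break  # end of file; a torn final record is ignored
            fid, offset, size = RECORD.unpack(chunk)
            index[fid] = (offset, size)
    return index
```

Since each record is only 20 bytes in this layout, the index for millions of pictures can be read sequentially in a short time, whereas rebuilding it from the data file would mean reading every stored picture.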
