2012-12-31
Key-File Storage System Weedfs
Weedfs is a key-to-file storage system implemented in Go, based on the design described in a Facebook paper.
In the paper, Facebook faces a huge volume of photo storage. The data is written once, read frequently, never modified, and rarely deleted. The paper's analysis is that with a POSIX file system the main problem in this scenario is that metadata is stored on disk, so the disk I/O needed to read metadata becomes the performance bottleneck: the first read (possibly several) translates the file name into an inode number, the second reads the inode, and the third reads the data.
Design goals:
- High throughput and low latency: all metadata is kept in memory, avoiding multiple disk I/Os
- Fault tolerant
- Simple
Facebook's original design
The browser request is directed to the CDN. If the CDN has the image cached, it returns it directly; otherwise it queries the photo storage server. The photo storage servers were built on NFS. Facebook modified the kernel to add a file handle cache (open_by_filehandle) and used memcache to cache the opened file handles, avoiding multiple disk reads.
The problem is that caching does not work well. Photo request frequency has a "long tail": the CDN cache hits only part of the requests, and the long-tail requests, which consume a large share of bandwidth, fall through to the photo storage servers, where the cache brings little benefit. Even with a cache, the nature of a POSIX read does not change: it still requires multiple disk operations.
The improved design stores multiple photos in one large file, which reduces the number of files.
It also reduces the metadata kept per photo so that all of it fits in memory; metadata such as permissions is unnecessary in this application scenario.
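The two ideas above can be sketched in Go as follows. This is only an illustration under stated assumptions, not the layout used by Weedfs or by the paper: the names `Volume` and `needle` are hypothetical, the index holds just an offset and a size, and a `bytes.Buffer` stands in for the large on-disk file.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
)

// needle records where one photo lives inside the large file.
// Hypothetical layout: offset and size only, no permissions or timestamps.
type needle struct {
	offset int64
	size   int32
}

// Volume packs many photos into one append-only blob.
// The bytes.Buffer stands in for the on-disk file in this sketch.
type Volume struct {
	data  bytes.Buffer
	index map[uint64]needle // kept entirely in memory
}

// ErrNotFound is returned when a key is not in the index.
var ErrNotFound = errors.New("photo not found")

func NewVolume() *Volume {
	return &Volume{index: make(map[uint64]needle)}
}

// Write appends the photo and records its location in the in-memory index.
func (v *Volume) Write(key uint64, photo []byte) {
	off := int64(v.data.Len())
	v.data.Write(photo)
	v.index[key] = needle{offset: off, size: int32(len(photo))}
}

// Read finds the location in memory, so only the data itself has to be read.
func (v *Volume) Read(key uint64) ([]byte, error) {
	n, ok := v.index[key]
	if !ok {
		return nil, ErrNotFound
	}
	b := v.data.Bytes()
	// Return a copy so callers cannot modify the volume contents.
	return append([]byte(nil), b[n.offset:n.offset+int64(n.size)]...), nil
}

func main() {
	v := NewVolume()
	v.Write(1, []byte("first photo"))
	v.Write(2, []byte("second photo"))
	p, err := v.Read(2)
	fmt.Println(string(p), err)
}
```

With a real file in place of the buffer, a read would cost one lookup in memory plus one disk read, instead of the three or more disk reads of the POSIX path.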
Storage is organized by volume: physical volumes on different machines are grouped into logical volumes, and the directory maintains the logical-to-physical mapping.
The cache serves the same purpose as the original CDN, mainly caching and protecting against a single point of failure, but it is an internal system.
URLs are generated in the form http://<CDN>/<Cache>/<Machine id>/<logical volume, photo>
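As a rough illustration of that URL layout, the Go sketch below builds and parses such a URL. The type `PhotoURL`, the function `ParsePhotoURL`, the host and machine names, and the choice to encode the last segment as two decimal numbers separated by a comma are all assumptions made for the example; the source only gives the overall shape of the URL.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// PhotoURL holds the four parts of the URL (hypothetical type).
type PhotoURL struct {
	CDN           string
	Cache         string
	MachineID     string
	LogicalVolume uint32
	PhotoID       uint64
}

// String renders http://<CDN>/<Cache>/<Machine id>/<logical volume, photo>.
// Assumption: the last segment is "volume,photo" in decimal.
func (p PhotoURL) String() string {
	return fmt.Sprintf("http://%s/%s/%s/%d,%d",
		p.CDN, p.Cache, p.MachineID, p.LogicalVolume, p.PhotoID)
}

// ParsePhotoURL splits a URL produced by String back into its parts.
func ParsePhotoURL(s string) (PhotoURL, error) {
	if !strings.HasPrefix(s, "http://") {
		return PhotoURL{}, fmt.Errorf("missing http:// prefix: %q", s)
	}
	parts := strings.Split(strings.TrimPrefix(s, "http://"), "/")
	if len(parts) != 4 {
		return PhotoURL{}, fmt.Errorf("want 4 path parts, got %d", len(parts))
	}
	ids := strings.Split(parts[3], ",")
	if len(ids) != 2 {
		return PhotoURL{}, fmt.Errorf("want <volume>,<photo>, got %q", parts[3])
	}
	vol, err := strconv.ParseUint(ids[0], 10, 32)
	if err != nil {
		return PhotoURL{}, fmt.Errorf("bad logical volume: %w", err)
	}
	photo, err := strconv.ParseUint(ids[1], 10, 64)
	if err != nil {
		return PhotoURL{}, fmt.Errorf("bad photo id: %w", err)
	}
	return PhotoURL{
		CDN:           parts[0],
		Cache:         parts[1],
		MachineID:     parts[2],
		LogicalVolume: uint32(vol),
		PhotoID:       photo,
	}, nil
}

func main() {
	u := PhotoURL{CDN: "cdn.example.com", Cache: "cache1", MachineID: "m7",
		LogicalVolume: 42, PhotoID: 1001}
	fmt.Println(u) // http://cdn.example.com/cache1/m7/42,1001
	back, err := ParsePhotoURL(u.String())
	fmt.Println(back == u, err)
}
```

Because the URL itself names the cache, the machine, and the logical volume, each layer can locate the next one without an extra lookup.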
Directory role:
- Maintains the logical-to-physical mapping of volumes
- Balances read and write load across volumes
- Marks a volume as read-only when it is full
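The three directory responsibilities listed above could look roughly like this in Go. This is a minimal sketch, not the actual directory of Weedfs or of the paper: the type and method names are hypothetical, round-robin is an assumed balancing policy, and the machine names are placeholders.

```go
package main

import (
	"errors"
	"fmt"
)

var (
	ErrNoWritableVolume = errors.New("no writable volume")
	ErrUnknownVolume    = errors.New("unknown volume")
)

// logicalVolume maps one logical volume to the machines holding its
// physical volumes.
type logicalVolume struct {
	machines []string
	readOnly bool
	nextRead int // round-robin cursor for reads
}

// Directory keeps the logical-to-physical mapping (hypothetical type).
type Directory struct {
	volumes   map[uint32]*logicalVolume
	ids       []uint32 // insertion order, used for write balancing
	nextWrite int
}

func NewDirectory() *Directory {
	return &Directory{volumes: make(map[uint32]*logicalVolume)}
}

// AddVolume registers a logical volume and the machines that store it.
func (d *Directory) AddVolume(id uint32, machines ...string) {
	if _, exists := d.volumes[id]; !exists {
		d.ids = append(d.ids, id)
	}
	d.volumes[id] = &logicalVolume{machines: machines}
}

// PickForWrite rotates over volumes that are not read-only.
func (d *Directory) PickForWrite() (uint32, []string, error) {
	for i := 0; i < len(d.ids); i++ {
		id := d.ids[(d.nextWrite+i)%len(d.ids)]
		if v := d.volumes[id]; !v.readOnly {
			d.nextWrite = (d.nextWrite + i + 1) % len(d.ids)
			return id, v.machines, nil
		}
	}
	return 0, nil, ErrNoWritableVolume
}

// PickForRead rotates over the machines that hold the given volume.
func (d *Directory) PickForRead(id uint32) (string, error) {
	v, ok := d.volumes[id]
	if !ok || len(v.machines) == 0 {
		return "", ErrUnknownVolume
	}
	m := v.machines[v.nextRead%len(v.machines)]
	v.nextRead++
	return m, nil
}

// MarkFull flags a volume read-only so it receives no further writes.
func (d *Directory) MarkFull(id uint32) {
	if v, ok := d.volumes[id]; ok {
		v.readOnly = true
	}
}

func main() {
	d := NewDirectory()
	d.AddVolume(1, "machine-a", "machine-b")
	d.AddVolume(2, "machine-c", "machine-d")
	d.MarkFull(1)
	id, machines, err := d.PickForWrite()
	fmt.Println(id, machines, err)
	m, err := d.PickForRead(1)
	fmt.Println(m, err)
}
```

A full volume stays readable through `PickForRead`; it is only skipped when choosing where to write.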
The cache is implemented as a distributed hash table, with the photo ID as the key.
Only requests that come directly from users are cached, not requests from the CDN, because a request that missed in the CDN is unlikely to hit in the internal cache.
Only photos on write-enabled volumes are cached, because in this scenario a photo is generally accessed most heavily right after it is uploaded.
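The caching rules above can be expressed as a small admission check. The Go sketch below is illustrative only: the `Request` and `Cache` types are hypothetical, a plain map stands in for the distributed hash table, and the volume condition is modeled as a `VolumeWritable` flag on the assumption that it refers to volumes still accepting writes, which is where recently uploaded photos live.

```go
package main

import "fmt"

// Request describes one photo lookup arriving at the cache (hypothetical type).
type Request struct {
	PhotoID        uint64
	FromCDN        bool // true if the CDN forwarded the request after a miss
	VolumeWritable bool // assumption: the photo's volume still accepts writes
}

// Cache maps photo ID to photo bytes. A single in-process map stands in
// for the distributed hash table.
type Cache struct {
	photos map[uint64][]byte
}

func NewCache() *Cache {
	return &Cache{photos: make(map[uint64][]byte)}
}

// shouldCache applies both rules: the request came straight from a user,
// and the photo lives on a volume that is still being written.
func shouldCache(r Request) bool {
	return !r.FromCDN && r.VolumeWritable
}

// Get returns the photo and whether it was a cache hit. On a miss it calls
// fetch (a stand-in for the storage server) and stores the result only if
// the admission rules allow it.
func (c *Cache) Get(r Request, fetch func(uint64) []byte) ([]byte, bool) {
	if p, ok := c.photos[r.PhotoID]; ok {
		return p, true
	}
	p := fetch(r.PhotoID)
	if shouldCache(r) {
		c.photos[r.PhotoID] = p
	}
	return p, false
}

func main() {
	c := NewCache()
	fetch := func(id uint64) []byte { return []byte(fmt.Sprintf("photo-%d", id)) }

	direct := Request{PhotoID: 1, VolumeWritable: true}
	_, hit1 := c.Get(direct, fetch)
	_, hit2 := c.Get(direct, fetch)
	fmt.Println(hit1, hit2) // false true

	viaCDN := Request{PhotoID: 2, FromCDN: true, VolumeWritable: true}
	_, hit3 := c.Get(viaCDN, fetch)
	_, hit4 := c.Get(viaCDN, fetch)
	fmt.Println(hit3, hit4) // false false
}
```

Requests that fail the check are still served; they simply do not occupy cache space.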