Small file storage policy

Source: Internet
Author: User

As a website grows and stores more and more files, managing them becomes a new challenge. Keeping everything on a single large hard disk is not a good idea: the larger the data volume on one disk, the higher the risk. The files can be stored, but the failure rate rises accordingly, and rebuilding after a failure takes a long time. The better approach is to use distributed storage wherever possible and spread the files across multiple machines over the network.

As far as I know, distributed storage structures fall roughly into the following types:

1. Distributed file systems such as GoogleFS

Because GoogleFS is not open source, the distributed file systems available on the Internet are third-party implementations rather than Google's own code. The advantage of this approach is relatively high availability: almost any application that works against a hard disk can use it, so it applies to a wide range of cases. I have read some introductions to GFS, GFS2, OCFS2, FastDFS, and MogileFS.

First, documentation is relatively sparse, so you will run into many problems. Second, none of them can really be called a stable release, and I suspect that some of those that are usable charge a fee. Because disk storage is critical, do not deploy them lightly in important systems. If you do want to use one, test it thoroughly to confirm that it fully meets your needs, and keep a full backup on a traditional file system to avoid data loss.

Another thing worth mentioning is memcached, which provides distributed memory sharing and seems more stable than the distributed file systems above. It is entirely memory-based, so it is worth trying if the data volume is not large.

2. Manually distributing storage by file path

This structure is commonly used for static web files.

When there are many such files, you can partition them by file path so that requests for a given file go to one specific server or a small group of servers. For example:

1) Domain name distribution policy

For example, one group of files is served from a host name such as a.xxx.com. This policy hands the job of telling machines apart to the DNS server, which makes it easy to scale out. It has to be planned in the early stage of the web project; switching to a domain name policy later is relatively costly or even infeasible.
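A minimal sketch of how an application might choose the host name for each file under this policy. It assumes a hash-based mapping, which the article does not specify; `b.xxx.com` and `c.xxx.com` are hypothetical hosts added alongside the a.xxx.com mentioned later in the article, and `host_for` and `url_for` are illustrative names.

```python
import hashlib

# Hypothetical shard hosts; only a.xxx.com appears in the article.
SHARD_HOSTS = ["a.xxx.com", "b.xxx.com", "c.xxx.com"]

def host_for(path, hosts=SHARD_HOSTS):
    """Pick a host for a file path using a stable hash of the path."""
    digest = hashlib.md5(path.encode("utf-8")).digest()
    index = int.from_bytes(digest[:4], "big") % len(hosts)
    return hosts[index]

def url_for(path):
    """Build the public URL for a path that starts with '/'."""
    return f"http://{host_for(path)}{path}"

if __name__ == "__main__":
    for p in ["/img/logo.png", "/css/site.css", "/js/app.js"]:
        print(url_for(p))
```

With a plain modulo mapping like this one, changing the number of hosts moves most files to a different host, which is one concrete reason the host layout needs to be fixed early.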

2) Directory distribution policy

If the site was not planned around a domain name policy from the start, you can use a proxy server to split the files by directory. When a large number of files are stored, they are usually already divided into several levels of directories according to some rule, because of file system limits and efficiency. Assigning those directories to different machines is not difficult. The weak point of this architecture is the performance and reliability of the proxy server, which takes some effort to get right.
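The directory split can be sketched as follows. This is an illustration under assumptions, not the author's scheme: the two-level hash-prefix layout, the backend addresses, and the names `storage_path` and `backend_for` are all hypothetical.

```python
import hashlib

# Hypothetical backend addresses behind the proxy server.
BACKENDS = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"]

def storage_path(filename):
    """Place a file two directory levels deep, using its hash prefix."""
    h = hashlib.md5(filename.encode("utf-8")).hexdigest()
    return f"{h[:2]}/{h[2:4]}/{filename}"

def backend_for(path, backends=BACKENDS):
    """Route by the top-level directory, as a proxy rule would."""
    top = path.split("/", 1)[0]          # two hex characters, e.g. "3f"
    return backends[int(top, 16) % len(backends)]

if __name__ == "__main__":
    p = storage_path("photo.jpg")
    print(p, "->", backend_for(p))
```

Because routing depends only on the top-level directory, the same rule can be written as a handful of path-prefix rules in the proxy configuration.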

Both approaches above need a separate policy for distributing and synchronizing files. Transfers generally work in one of two ways: push or pull. For synchronization you can use log synchronization (record the data to be synchronized in a log, then transfer the corresponding files by reading the log), comparison synchronization (using synchronization software such as rsync), or instant synchronization (transfer immediately after each change).

To remove a single point of failure (SPOF), first choose a policy that stores each file on multiple nodes; for example, the files for a.xxx.com or directory a are also stored on node B and node C. Failover technology (LVS or nginx) can then handle node failures. With the domain name policy you can use LVS, at the cost of multiplying the number of machines, or put a first-level proxy server in front, at the cost of some performance. With the directory policy a proxy server is already in place, so failover is easy to implement as long as the files are stored on the right nodes.
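Two of the ideas in this passage, log synchronization and choosing among replicas, can be sketched roughly as below. Everything here is an assumption for illustration: the log format (one changed path per line), the node names, the `REPLICAS` table, and the functions `pending_paths` and `pick_node`. In practice LVS or nginx would perform the health checking and failover itself.

```python
# Hypothetical placement: each partition (a domain or a directory)
# is stored on several nodes, listed in order of preference.
REPLICAS = {
    "a": ["node-a", "node-b", "node-c"],
    "b": ["node-b", "node-c", "node-a"],
}

def pending_paths(log_lines, offset):
    """Return the paths logged after `offset` (deduplicated, in order)
    together with the new offset to remember for the next run."""
    paths = []
    for line in log_lines[offset:]:
        path = line.strip()
        if path and path not in paths:
            paths.append(path)
    return paths, len(log_lines)

def pick_node(partition, healthy):
    """Return the first healthy replica for a partition."""
    for node in REPLICAS[partition]:
        if node in healthy:
            return node
    raise RuntimeError(f"no healthy replica for partition {partition!r}")

if __name__ == "__main__":
    log = ["/a/1.jpg\n", "/a/2.jpg\n", "/a/1.jpg\n"]
    print(pending_paths(log, 0))
    print(pick_node("a", {"node-b", "node-c"}))
```

The synchronizer would transfer each returned path (by push or pull) and persist the returned offset, so a restart resumes where the previous run stopped.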

 

From: http://www.huohu666.cn/win/727.html
