Selection notes for a distributed file system (DFS)

Source: Internet
Author: User
Tags: nameserver, nginx server, docker hub

The requirements, in priority order:

1) Store 3 TB or more of small and medium files. Images dominate, averaging 500 to 700 KB and generally under 1 MB.

2) Cluster-based, with load balancing, high availability, and high performance. Use by some large enterprises is the best endorsement.

3) Provide a Java client for uploading files. The Java code must be debuggable on Windows.

4) It must be open source and still actively updated by its author.

5) It has O&M monitoring tools for quickly locating problematic servers.

6) (Bonus) Adding a storage server should not require changing the Nginx load balancer or the Java program configuration.
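To get a feel for what requirement 1 means in number of files, here is a back-of-the-envelope sketch. The class name `FileCountEstimate` is made up for illustration, and binary units (1 KB = 1024 bytes) are an assumption.

```java
/** Hypothetical helper: rough sizing for requirement 1. */
final class FileCountEstimate {
    static final long KIB = 1024L;
    static final long TIB = KIB * KIB * KIB * KIB;

    /** Number of files of the given average size that fit in the given capacity. */
    static long estimateFileCount(long capacityBytes, long averageFileBytes) {
        if (averageFileBytes <= 0) {
            throw new IllegalArgumentException("average file size must be positive");
        }
        return capacityBytes / averageFileBytes;
    }
}
```

Under these assumptions, 3 TB of images averaging 500 to 700 KB comes to roughly 4.6 to 6.4 million files, so the chosen system has to handle millions of small objects rather than a few large ones.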

 

After reading a lot of material, I found no perfect solution. Only three candidates remain:

Each framework is compared on: introduction, file storage method, high availability, capacity expansion, gzip support for browsers, browser cache, and program access.

FastDFS

I know that many Chinese startups use it. I have used it for a while and it is relatively stable. Community users have written plenty of Chinese-language material, but there is almost no official documentation. It is better to serve files through fastdfs-nginx-module than to reverse-proxy directly to a storage server.

O&M: when access fails, you have to check logs spread across several places (nginx > fastdfs-nginx-module > storage > tracker). Someone unfamiliar with the system may not be able to find the cause.

Home: https://github.com/happyfish100/fastdfs

Deployment guide: http://blog.csdn.net/xifeijian/article/details/38567839

Docker: https://hub.docker.com/r/hhland/fastdfs/, https://hub.docker.com/r/season/fastdfs/

A good reference solution: https://github.com/daniellitoc/xultimate-resource

File storage method: key-value storage.

There is no concept of an upload directory, so the development and test environments each need their own separately deployed file server.

Listing files is not supported, and neither is FUSE.

High availability: a cluster consists of multiple trackers and multiple storage servers.

The trackers do not form a cluster among themselves; the client handles failover.

Storage servers are organized into groups. All storage servers in the same group hold identical files, which provides load balancing and fault tolerance, similar to RAID 10 for hard disks.

Multiple data centers are not supported.
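Because the trackers do not coordinate with each other, failover has to live in the client. The following is a minimal sketch of that idea, not the FastDFS Java client API: `TrackerFailover` and its methods are hypothetical names, and a plain TCP connect is assumed to be a good enough health check.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

/** Hypothetical helper: picks the first tracker that passes a health check. */
final class TrackerFailover {

    /** Returns the first address accepted by the probe, in list order. */
    static Optional<InetSocketAddress> firstReachable(
            List<InetSocketAddress> trackers, Predicate<InetSocketAddress> probe) {
        for (InetSocketAddress tracker : trackers) {
            if (probe.test(tracker)) {
                return Optional.of(tracker);
            }
        }
        return Optional.empty();
    }

    /** Probe that treats a successful TCP connect within timeoutMillis as healthy. */
    static Predicate<InetSocketAddress> tcpProbe(int timeoutMillis) {
        return address -> {
            try (Socket socket = new Socket()) {
                socket.connect(address, timeoutMillis);
                return true;
            } catch (IOException e) {
                return false; // unreachable or refused: the caller tries the next tracker
            }
        };
    }
}
```

A caller would pass its configured tracker list, for example `TrackerFailover.firstReachable(trackers, TrackerFailover.tcpProbe(2000))`, and retry the upload against the returned address. Keeping the probe as a parameter makes the selection logic testable without a running tracker.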

Capacity expansion: a TB-level storage solution.

1. To add hard disks, specify multiple store_path entries in the configuration of a single storage server.

2. To add servers, increase capacity by adding groups.

3. The total capacity is the sum of the capacities of all groups.
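The capacity rules above can be checked with a little arithmetic. This sketch uses a hypothetical `ClusterCapacity` class and made-up disk sizes. It also assumes that a group's usable capacity is bounded by its smallest server, since every server in a group holds the same files.

```java
import java.util.List;

/** Hypothetical capacity estimator for a FastDFS-style cluster (sizes in GB). */
final class ClusterCapacity {

    /** A storage server's capacity is the sum of its store_path disks. */
    static long serverCapacityGb(List<Long> storePathsGb) {
        return storePathsGb.stream().mapToLong(Long::longValue).sum();
    }

    /** Assumption: servers in a group are mirrors, so the smallest one bounds the group. */
    static long groupCapacityGb(List<List<Long>> serversInGroup) {
        return serversInGroup.stream()
                .mapToLong(ClusterCapacity::serverCapacityGb)
                .min()
                .orElse(0L);
    }

    /** Total capacity is the sum over all groups. */
    static long totalCapacityGb(List<List<List<Long>>> groups) {
        return groups.stream().mapToLong(ClusterCapacity::groupCapacityGb).sum();
    }
}
```

For example, a group of two servers with 4000 GB each plus a second group whose servers have 1000 GB and 2000 GB gives 4000 + 1000 = 5000 GB in total under these assumptions, which is why mismatched servers inside one group waste space.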

Gzip support for browsers: files are compressed on the fly by the reverse proxy.

Browser cache: If-Modified-Since is supported.

Language SDK: Java is supported through a dedicated SDK.

REST interface: none.

HTTP file reading: use either the storage server's built-in HTTP service or nginx with fastdfs-nginx-module installed. The latter is recommended.

Baidu BFS
(To be studied)

Powerful, but little documentation is available online; a Baidu search turned up no useful articles. According to the project description, it is used across all of Baidu.

Home: https://github.com/baidu/bfs

Docker: a Dockerfile is provided, but no image is published on Docker Hub.

File storage method: directory storage.

Listing files and FUSE are both supported.

High availability: the cluster consists of NameServer, MetaServer, and ChunkServer.

NameServer uses the Raft algorithm and elects a Leader based on Nexus or ZooKeeper. When the Leader fails, a new Leader is elected automatically.

High availability of the ChunkServer: to be analyzed.

Multi-data-center support is the best of the three candidates.

Capacity expansion: a PB-level storage solution.

 

Gzip support for browsers: to be analyzed.

Browser cache: to be analyzed.

Language SDK: access goes through a dedicated SDK. Java is not supported, but access can be bridged through FUSE. On Windows, Cygwin would probably be required.

REST interface: none.

HTTP file reading: provided by the NameServer.

SeaweedFS

Powerful and seemingly very promising, though there is little Chinese-language material. The documentation says "Zhongtong Express" uses it.

Since I have never used it, it is hard to say how convenient it is for O&M.

Home: https://github.com/chrislusf/seaweedfs

Deployment and use instructions: http://blog.chinaunix.net/uid-25057421-id-5676348.html

Official Docker: https://hub.docker.com/r/chrislusf/seaweedfs/

File storage method: key-value storage.

Files can be uploaded to a specified directory, so the development and test environments can share one file server.

The filer supports listing files but does not support FUSE.

High availability: a cluster consists of multiple masters and multiple volume servers. Replication between volumes is governed by the replication policy.

Multiple data centers and multiple replication policies are supported.

Capacity expansion: a PB-level storage solution.

How volume server capacity is added depends on the replication policy.

Gzip support for browsers: supported by pre-compressing files into gzip format.

Browser cache: ETag and If-Modified-Since are supported.
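For readers unfamiliar with how these two cache validators work, here is a simplified sketch of the server-side decision between a full response and 304 Not Modified. `ConditionalGet` is a hypothetical name and this is not SeaweedFS code. It assumes a single strong ETag and ignores weak validators, ETag lists, and `*`.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;

/** Hypothetical server-side check for conditional GET requests. */
final class ConditionalGet {

    /**
     * Returns true when the server may answer 304 Not Modified.
     * ifNoneMatch and ifModifiedSince are the request header values, or null if absent.
     */
    static boolean notModified(String currentEtag, Instant lastModified,
                               String ifNoneMatch, Instant ifModifiedSince) {
        if (ifNoneMatch != null) {
            // When If-None-Match is present it takes precedence over If-Modified-Since.
            return ifNoneMatch.equals(currentEtag);
        }
        if (ifModifiedSince != null) {
            // HTTP dates have one-second resolution, so drop sub-second precision.
            return !lastModified.truncatedTo(ChronoUnit.SECONDS).isAfter(ifModifiedSince);
        }
        return false; // unconditional request: send the full body
    }
}
```

When the method returns true the browser reuses its cached copy, which matters for an image-heavy workload because repeat views then cost only a header exchange.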

Language SDK: all SDKs actually go through the REST interface. A Java version is available.

REST interface: the volume and filer servers provide interfaces at different levels. The volume server uses a key-value style, and the filer uses a directory-like style.

HTTP file reading: provided by the filer server.
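As an illustration of the key-value style of the volume interface, the sketch below handles the first step of an upload: the client asks the master for a file id, then posts the file to the volume server it was given. It assumes the master's `/dir/assign` endpoint returns JSON with `fid` and `url` fields, as in the SeaweedFS README example. `SeaweedAssign` is a hypothetical class and the regex parsing is a stand-in for a real JSON library.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical holder for the master's assign response (assumed fields: fid, url). */
final class SeaweedAssign {
    final String fid;
    final String url;

    SeaweedAssign(String fid, String url) {
        this.fid = fid;
        this.url = url;
    }

    /** Extracts fid and url from the assign response body. */
    static SeaweedAssign parse(String json) {
        return new SeaweedAssign(field(json, "fid"), field(json, "url"));
    }

    /** Naive lookup of a string-valued field; enough for this flat response. */
    private static String field(String json, String name) {
        Matcher m = Pattern
                .compile("\"" + Pattern.quote(name) + "\"\\s*:\\s*\"([^\"]*)\"")
                .matcher(json);
        if (!m.find()) {
            throw new IllegalArgumentException("missing field: " + name);
        }
        return m.group(1);
    }

    /** The file is then sent as a multipart POST to this volume server URL. */
    String uploadUrl() {
        return "http://" + url + "/" + fid;
    }
}
```

The same URL later serves the file back over HTTP GET, which is what makes the volume interface key-value: the fid is the key. The filer adds directory-like paths on top of this.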

 

 

 

In summary, the SeaweedFS ecosystem is fairly complete and its author keeps updating it. FastDFS is also a good choice.

 
