The working principle and workflow of the FastDFS tracker and storage

Source: Internet
Author: User

March 11, 2013, 09:22

FastDFS is an open-source, lightweight distributed file system. It manages files (storage, synchronization, and access, that is, upload and download) and addresses the problems of large-capacity storage and load balancing. It is especially suited to file-based online services such as photo album sites and video sites.
A FastDFS deployment has two server roles: the tracker and the storage node (storage). The tracker does the scheduling work and load-balances access.
Storage nodes store the files and carry out all file management functions: storage, synchronization, and providing the access interface. FastDFS also manages file metadata. Metadata consists of attributes of a file expressed as key-value pairs, for example width=1024, where the key is width and the value is 1024. A file's metadata is a list of attributes and can contain multiple key-value pairs.
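As a small illustration of metadata as a list of key-value pairs, the sketch below flattens a metadata dictionary into one string and parses it back. The function names and the two separator characters are assumptions chosen for this example, not a statement of the FastDFS wire format.

```python
# Hypothetical separators, chosen only for this illustration.
FIELD_SEP = "\x02"   # separates a key from its value
RECORD_SEP = "\x01"  # separates one key-value pair from the next


def encode_metadata(meta: dict) -> str:
    """Flatten key-value pairs into a single string."""
    return RECORD_SEP.join(f"{key}{FIELD_SEP}{value}" for key, value in meta.items())


def decode_metadata(blob: str) -> dict:
    """Parse the flattened string back into key-value pairs (values stay strings)."""
    if not blob:
        return {}
    return dict(record.split(FIELD_SEP, 1) for record in blob.split(RECORD_SEP))


meta = {"width": "1024", "height": "768"}
blob = encode_metadata(meta)
print(decode_metadata(blob))
```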
Both the tracker role and the storage node role can consist of one or more servers. Servers in either role can be added or taken offline at any time without affecting online service. All tracker servers are peers, and their number can be increased or decreased at any time according to load.
To support large capacity, storage servers are organized into volumes (also called groups). The storage system consists of one or more volumes. Files in different volumes are independent of each other, and the total file capacity of the system is the sum of the file capacities of all volumes. A volume consists of one or more storage servers, all of which hold the same files, so multiple storage servers in a volume provide redundant backup and load balancing.
When a server is added to a volume, the system synchronizes the existing files to it automatically, and once synchronization is complete the system automatically switches the new server into online service.
When storage space is low or about to run out, volumes can be added dynamically. You only need to add one or more servers and configure them as a new volume to increase the capacity of the storage system.
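The capacity rule can be sketched with a few lines of arithmetic. The sketch assumes that, because every server in a volume holds the same files, a volume can hold no more than its smallest server; the group names and sizes are invented for the example.

```python
def volume_capacity(server_capacities_gb):
    """Servers in a volume mirror each other, so the smallest one is the limit (assumption)."""
    return min(server_capacities_gb)


def system_capacity(volumes):
    """Total capacity is the sum over all volumes, since volumes hold independent files."""
    return sum(volume_capacity(servers) for servers in volumes.values())


# Hypothetical cluster: volume name -> capacities of its storage servers, in GB.
volumes = {"group1": [500, 500], "group2": [1000, 800, 1000]}
before = system_capacity(volumes)   # 500 + 800

# Adding a server to an existing volume adds redundancy, not capacity.
volumes["group1"].append(500)
same = system_capacity(volumes)

# Adding a new volume is what expands the system.
volumes["group3"] = [2000]
after = system_capacity(volumes)

print(before, same, after)
```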
A file ID in FastDFS has two parts, the volume name and the file name, and both are required.
FastDFS file upload
Interaction process for uploading a file:
1. The client asks the tracker which storage server to upload to; no additional parameters are needed;
2. The tracker returns an available storage server;
3. The client communicates directly with that storage server to complete the upload.
FastDFS file download
Interaction process for downloading a file:
1. The client asks the tracker which storage server to download the file from; the parameter is the file ID (volume name and file name);
2. The tracker returns an available storage server;
3. The client communicates directly with that storage server to complete the download.
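The upload and download interactions can be sketched as a toy in-memory model. The class and method names are hypothetical and are not the FastDFS client API. The model also assumes one server per group, round-robin selection for uploads, and a made-up file naming scheme.

```python
import itertools


class Storage:
    """A storage server that keeps files in a dictionary (illustration only)."""

    def __init__(self, group, addr):
        self.group = group
        self.addr = addr
        self.files = {}
        self._counter = itertools.count(1)

    def upload(self, data: bytes) -> str:
        # The storage server, not the client, decides the file name.
        name = f"M00/00/00/file{next(self._counter):04d}"
        self.files[name] = data
        return f"{self.group}/{name}"          # file ID = group name + file name

    def download(self, file_id: str) -> bytes:
        _, name = file_id.split("/", 1)
        return self.files[name]


class Tracker:
    """Hands out storage servers; it never touches file contents."""

    def __init__(self, storages):
        self.storages = list(storages)
        self._round_robin = itertools.cycle(self.storages)

    def query_storage_for_upload(self) -> Storage:
        return next(self._round_robin)          # no parameters needed

    def query_storage_for_download(self, file_id: str) -> Storage:
        group, _ = file_id.split("/", 1)        # the group name selects the volume
        for storage in self.storages:
            if storage.group == group:
                return storage
        raise LookupError(f"no storage server for {file_id}")


tracker = Tracker([Storage("group1", "10.0.0.1"), Storage("group2", "10.0.0.2")])
storage = tracker.query_storage_for_upload()            # upload steps 1-2
file_id = storage.upload(b"hello")                      # upload step 3
source = tracker.query_storage_for_download(file_id)    # download steps 1-2
print(file_id, source.download(file_id))                # download step 3
```

In both flows the tracker only answers "which server", and the file bytes travel directly between the client and the storage server.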

group0/m00/00/02/cs8b8lfjiiyah841aaabpqt7xvi4715674

Group name: group0
Disk: M00
Directory: 00/02
File name: cs8b8lfjiiyah841aaabpqt7xvi4715674

The file name is Base64-encoded and contains the following fields: the IP address of the source storage server, the file creation time, the file size, the file's CRC32 checksum, and a random number.
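Splitting a file ID into the parts listed above takes only a few lines. This is a minimal sketch: the `FileId` type and `parse_file_id` function are names invented here, and decoding the fields packed inside the Base64 file name is deliberately left out.

```python
from typing import NamedTuple


class FileId(NamedTuple):
    group: str
    disk: str
    directory: str
    file_name: str


def parse_file_id(file_id: str) -> FileId:
    """Split 'group/disk/dir1/dir2/name' into its four logical parts."""
    parts = file_id.split("/")
    if len(parts) != 5:
        raise ValueError(f"expected 5 path components, got {len(parts)}: {file_id!r}")
    group, disk, dir1, dir2, name = parts
    return FileId(group, disk, f"{dir1}/{dir2}", name)


# The example ID from the text, exactly as it appears there.
fid = parse_file_id("group0/m00/00/02/cs8b8lfjiiyah841aaabpqt7xvi4715674")
print(fid.group, fid.disk, fid.directory, fid.file_name)
```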

Note that the client is the caller of the FastDFS service. The client is itself expected to be a server, so its calls to the tracker and to storage are server-to-server calls.

Description of the FastDFS synchronization mechanism
The tracker server keeps the storage groups, and the storage servers in each group, in memory. It also saves the connected storage servers and their groups to a file, so that it can load this information directly from local disk the next time the service restarts. Each storage server keeps the list of all servers in its group in memory and also writes it to a file. Tracker servers and storage servers synchronize the storage server list with each other:

1. If a new storage server is added to a group, or the state of a storage server changes, the tracker server synchronizes the storage server list to all storage servers in that group. Take a newly added storage server as an example: because the new storage server actively connects to the tracker server, the tracker server discovers that a new storage server has joined. It returns all storage servers in the group to the new server and sends the updated list to the other storage servers in the group;
2. If a new tracker server is added, a storage server that connects to it finds that the storage server list returned by that tracker server is shorter than its local record, and it synchronizes the storage servers missing from the tracker server to that tracker server.

Storage servers within the same group are peers, and operations such as file upload and deletion can be performed on any of them. File synchronization takes place only between storage servers in the same group and uses a push model, in which the source server synchronizes to the target servers. Take file upload as an example: assume a group contains three storage servers A, B, and C, and file F is uploaded to server B. B then synchronizes file F to the other two servers, A and C. We call the upload of F to server B the source operation, and the copy of F on server B the source data. Synchronizing F to servers A and C is the backup operation, and the copies of F on A and C are backup data. The synchronization rules are summarized as follows:
1. Synchronization takes place only between storage servers in the same group;
2. Only source data needs to be synchronized. Backup data must not be synchronized again, otherwise a loop would form;
3. The exception to the second rule is that when a new storage server is added, an existing storage server synchronizes all existing data (both source data and backup data) to the new server.

A storage server has 7 states, as follows:
# FDFS_STORAGE_STATUS_INIT: initializing; the source server for synchronizing existing data has not yet been obtained
# FDFS_STORAGE_STATUS_WAIT_SYNC: waiting for synchronization; the source server for synchronizing existing data has been obtained
# FDFS_STORAGE_STATUS_SYNCING: synchronizing
# FDFS_STORAGE_STATUS_DELETED: deleted; the server has been removed from this group (note: the functionality of this state has not yet been implemented)
# FDFS_STORAGE_STATUS_OFFLINE: offline
# FDFS_STORAGE_STATUS_ONLINE: online, but not yet able to provide service
# FDFS_STORAGE_STATUS_ACTIVE: online and able to provide service
When a storage server's status is FDFS_STORAGE_STATUS_ONLINE and it sends a heartbeat to the tracker server, the tracker server changes its status to FDFS_STORAGE_STATUS_ACTIVE.
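The states and the heartbeat rule can be written down as an enumeration plus one transition function. This is a sketch for illustration: `StorageStatus` and `on_heartbeat` are names invented here, and the enum values are just the constant names as strings, not the numeric codes used by FastDFS.

```python
from enum import Enum


class StorageStatus(Enum):
    INIT = "FDFS_STORAGE_STATUS_INIT"
    WAIT_SYNC = "FDFS_STORAGE_STATUS_WAIT_SYNC"
    SYNCING = "FDFS_STORAGE_STATUS_SYNCING"
    DELETED = "FDFS_STORAGE_STATUS_DELETED"
    OFFLINE = "FDFS_STORAGE_STATUS_OFFLINE"
    ONLINE = "FDFS_STORAGE_STATUS_ONLINE"
    ACTIVE = "FDFS_STORAGE_STATUS_ACTIVE"


def on_heartbeat(status: StorageStatus) -> StorageStatus:
    """Tracker-side rule: a heartbeat promotes ONLINE to ACTIVE, other states are unchanged."""
    if status is StorageStatus.ONLINE:
        return StorageStatus.ACTIVE
    return status


print(on_heartbeat(StorageStatus.ONLINE).value)
print(on_heartbeat(StorageStatus.WAIT_SYNC).value)
```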
When a new storage server A is added to a group, the system synchronizes the existing data to it automatically. The processing logic is as follows:

1. Storage server A connects to the tracker server, and the tracker server sets A's status to FDFS_STORAGE_STATUS_INIT. Storage server A asks the tracker server for the source server for append synchronization and for the point in time up to which append synchronization applies. If the group contains only storage server A, or the number of files successfully uploaded in the group is 0, no data needs to be synchronized and A can provide online service, so the tracker server sets its status to FDFS_STORAGE_STATUS_ONLINE. Otherwise the tracker server sets its status to FDFS_STORAGE_STATUS_WAIT_SYNC and processing moves on to step 2;
2. Suppose the tracker server assigns storage server B as the source server for synchronizing existing data to storage server A. When the storage servers in the group learn from the tracker server that storage server A has been added, each starts a synchronization thread and asks the tracker server for the source server for append synchronization to A and the cutoff point in time. Storage server B synchronizes all data from before the cutoff to storage server A, while the other storage servers perform normal synchronization from the cutoff onward, sending only their source data to A. Once B has passed the cutoff, its synchronization to A switches from append synchronization to normal synchronization, which sends only source data;
3. After storage server B has synchronized all data to storage server A and has no more data to synchronize for the moment, B asks the tracker server to set A's status to FDFS_STORAGE_STATUS_ONLINE;
4. When storage server A sends a heartbeat to the tracker server, the tracker server changes A's status to FDFS_STORAGE_STATUS_ACTIVE.
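The synchronization rules and the join procedure can be sketched together as a small in-memory model. It is a simplification under stated assumptions. All names are hypothetical, statuses are shortened to their suffixes, and the first server in the group stands in for the tracker-assigned source. The model also assumes the new server passes through SYNCING while data is copied, and it ignores the cutoff-time handoff between append and normal synchronization.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    name: str
    status: str = "INIT"
    # file name -> "source" (uploaded here) or "backup" (received through sync)
    files: dict = field(default_factory=dict)


def upload(group, target, file_name):
    """Rules 1 and 2: a source operation on `target`, pushed once to its group peers."""
    target.files[file_name] = "source"
    for peer in group:
        if peer is not target:
            peer.files[file_name] = "backup"   # backup data is never pushed again


def join(group, new_server):
    """Rule 3 and steps 1-4: add `new_server` and return its status history."""
    history = [new_server.status]              # step 1: tracker sets INIT
    if any(server.files for server in group):
        source = group[0]                      # step 2: assumed tracker choice
        for status in ("WAIT_SYNC", "SYNCING"):
            new_server.status = status
            history.append(status)
        # The source pushes ALL of its data, source and backup alike.
        for file_name in source.files:
            new_server.files[file_name] = "backup"
    new_server.status = "ONLINE"               # step 3: nothing left to sync
    history.append("ONLINE")
    group.append(new_server)
    new_server.status = "ACTIVE"               # step 4: next heartbeat
    history.append("ACTIVE")
    return history


b, c = Server("B", "ACTIVE"), Server("C", "ACTIVE")
group = [b, c]
upload(group, b, "F")                          # F: source on B, backup on C
a = Server("A")
print(join(group, a), a.files)
```

Joining an empty group skips the synchronization states entirely, which matches the shortcut described in step 1.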
