Distributed Systems Reading Notes (12): Distributed File Systems

Source: Internet
Author: User

I. Introduction

A distributed file system is essentially a program that lets users store and access remote files just as they would local files, from any computer on the network. These notes introduce and analyze two major file systems, NFS and AFS, in detail.

1. Early file system designs were often built around a central server, which held a large number of file resources.

2. A file system can be divided into the following modules: (1) the directory module; (2) the file module; (3) the access control module; (4) the file access module; (5) the block module; (6) the device module, which mainly handles disk I/O and buffering.

3. The role of a file system mainly covers the organization, storage, naming, sharing, and protection of files.

4. A file consists of data and attributes. A directory is a special type of file: it holds no file data of its own, but provides a mapping from text names to internal file identifiers.

5. To manage all of its files effectively, a file system stores additional information about them in its metadata.

6. A distributed file system has many requirements: (1) transparency, including access transparency and scaling transparency; (2) control of concurrent updates; (3) file replication; (4) heterogeneity of hardware and operating systems; (5) fault tolerance; (6) data consistency; (7) security; (8) efficiency and performance.

7. A file service typically consists of a client module and a server module.
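Items 2 and 4 above can be pictured with a small in-memory model. This is only a sketch under stated assumptions: the class names (`File`, `Directory`, `TinyFileSystem`), the integer file identifiers, and the single `length` attribute are all hypothetical and are not taken from any real file system. It shows a file as data plus attributes, and a directory as nothing more than a mapping from text names to internal file identifiers.

```python
from dataclasses import dataclass, field


@dataclass
class File:
    """A file is data plus attributes (only one illustrative attribute here)."""
    data: bytes = b""
    attributes: dict = field(default_factory=dict)


@dataclass
class Directory:
    """A directory holds no file data, only a name -> file identifier mapping."""
    entries: dict = field(default_factory=dict)


class TinyFileSystem:
    """Hypothetical in-memory store keyed by integer file identifiers."""

    def __init__(self):
        self._next_id = 1
        self.files = {}          # file identifier -> File
        self.root = Directory()  # the only directory in this sketch

    def create(self, name, data):
        file_id = self._next_id
        self._next_id += 1
        self.files[file_id] = File(data=data, attributes={"length": len(data)})
        self.root.entries[name] = file_id
        return file_id

    def read(self, name):
        file_id = self.root.entries[name]  # directory module: name -> identifier
        return self.files[file_id].data    # file module: identifier -> contents


fs = TinyFileSystem()
fid = fs.create("notes.txt", b"hello")
print(fid, fs.read("notes.txt"), fs.files[fid].attributes)
```

Keeping names in the directory and contents under an identifier is what later allows the directory service and the file service to be separated.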

II. File Service Architecture

A file service is cleanly divided into three components: (1) the flat file service; (2) the directory service; (3) the client module. The three modules cooperate with one another; the flat file service and the directory service run on the server side.

1. The file service uses a unique file identifier (UFID) to identify each file.

2. The directory service provides a mapping from text names to UFIDs.

3. The client module only needs to interact with the server through the set of interfaces that the server defines.

4. Just as a set of interfaces is defined for files, a set of interfaces is defined for the directory service.

5. In the early model, every request on the file interface carried the user's ID, which the server checked each time; this scheme can later be replaced by Kerberos authentication.

6. The file system also has the concept of a file group, which is a collection of files. A file group is identified by a 32-bit IP address plus a 16-bit date. The IP address alone would not work as an identifier, because a file group can be moved to a different machine.
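The division of labor described in this section can be sketched as follows. Everything here is an illustrative assumption rather than a real API: the class and method names, the UFID layout `(file group id, file number)`, and the choice to keep directory entries in a plain dictionary instead of storing them in files. The sketch shows a flat file service that knows only UFIDs, a directory service that knows only names, a client module that resolves a path one component at a time, and a file group identifier packed from a 32-bit IPv4 address and a 16-bit date value.

```python
import ipaddress


def make_file_group_id(creator_ip: str, date16: int) -> int:
    """Pack a 32-bit IPv4 address and a 16-bit date value into a 48-bit id."""
    if not 0 <= date16 < 2 ** 16:
        raise ValueError("date16 must fit in 16 bits")
    return (int(ipaddress.IPv4Address(creator_ip)) << 16) | date16


class FlatFileService:
    """Server side: stores contents keyed by UFID only; it knows no names."""

    def __init__(self, group_id: int):
        self.group_id = group_id
        self._next_number = 0
        self._files = {}  # UFID -> bytearray

    def create(self):
        ufid = (self.group_id, self._next_number)  # hypothetical UFID layout
        self._next_number += 1
        self._files[ufid] = bytearray()
        return ufid

    def write(self, ufid, offset: int, data: bytes):
        buf = self._files[ufid]
        if offset > len(buf):  # pad with zero bytes when writing past the end
            buf.extend(b"\0" * (offset - len(buf)))
        buf[offset:offset + len(data)] = data

    def read(self, ufid, offset: int, n: int) -> bytes:
        return bytes(self._files[ufid][offset:offset + n])


class DirectoryService:
    """Server side: maps (directory UFID, text name) to a UFID."""

    def __init__(self):
        self._entries = {}  # directory UFID -> {name: UFID}

    def add_name(self, dir_ufid, name, ufid):
        self._entries.setdefault(dir_ufid, {})[name] = ufid

    def lookup(self, dir_ufid, name):
        return self._entries[dir_ufid][name]


class ClientModule:
    """Client side: turns a path into a UFID, then reads through the file service."""

    def __init__(self, files, dirs, root_ufid):
        self.files, self.dirs, self.root = files, dirs, root_ufid

    def read_path(self, path: str, n: int) -> bytes:
        ufid = self.root
        for part in path.strip("/").split("/"):
            ufid = self.dirs.lookup(ufid, part)  # one lookup per component
        return self.files.read(ufid, 0, n)


group = make_file_group_id("192.168.0.1", 0x1234)
files, dirs = FlatFileService(group), DirectoryService()
root, docs, report = files.create(), files.create(), files.create()
files.write(report, 0, b"quarterly numbers")
dirs.add_name(root, "docs", docs)
dirs.add_name(docs, "report.txt", report)
client = ClientModule(files, dirs, root)
print(hex(group), client.read_path("/docs/report.txt", 9))
```

Because the group identifier is fixed when the group is created, it stays the same if the group is later moved to a machine with a different address, which is the point of not using the current IP address alone.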

III. The NFS File System

NFS stands for Network File System; it lets a computer operate on files over the network.

1. In NFS, the client and server modules interact through RPC.

2. To achieve access transparency, NFS adds a virtual file system (VFS) layer; beneath the VFS layer sit the implementations of each concrete file system.

3. In NFS, each file is identified by a file handle.

4. Unlike a traditional UNIX file system, the NFS server is stateless: it does not keep a list of the files that clients have open. Access control therefore has to be checked on every request, using the user identity carried in that request.

5. An NFS client can mount a subtree of a server's file system at a directory in its local file system; path names under that mount point are then translated so that they resolve to the same files as on the server.

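6. A remote file system can be either soft-mounted or hard-mounted. With a hard mount, a request is retried until it completes; with a soft mount, a failure is returned to the caller after a limited number of retries.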

7. Automatic mounting. When the client references an empty mount point, the automounter sends a request to the candidate servers, and the file system of the first server to reply is mounted.

8. Server-side caching. A traditional UNIX system keeps a buffer cache in main memory and uses read-ahead and delayed-write; the NFS server relies on the same kind of cache, with extra care for writes so that acknowledged data is not lost if the server crashes. The client module likewise keeps the results of its most recent file operations to avoid calling the server again.

9. Data items cached locally by the client have an expiration time; once an item is older than that, it is refreshed. Caching file blocks on the client side greatly improves NFS performance.
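The expiration rule in item 9 can be sketched as a timestamp check. This is a minimal illustration, not NFS client code: `FakeServer`, `CachingClient`, `get_mtime`, `read_file`, and the `freshness_interval` parameter are hypothetical names, and a real client caches blocks and attributes rather than whole files. The check used is a common one: a cached entry is served if it was validated less than the freshness interval ago, or if the server's modification time still matches the one recorded with the entry.

```python
import time
from dataclasses import dataclass


@dataclass
class CacheEntry:
    data: bytes
    last_validated: float  # when the entry was last checked against the server
    server_mtime: float    # modification time recorded at that moment


class FakeServer:
    """Stand-in for a server; counts calls so the saved round trips are visible."""

    def __init__(self):
        self.files = {}  # file handle -> (data, mtime)
        self.calls = 0

    def get_mtime(self, handle):
        self.calls += 1
        return self.files[handle][1]

    def read_file(self, handle):
        self.calls += 1
        return self.files[handle]


class CachingClient:
    def __init__(self, server, freshness_interval, clock=time.monotonic):
        self.server = server
        self.t = freshness_interval
        self.clock = clock
        self.cache = {}  # file handle -> CacheEntry

    def read(self, handle):
        now = self.clock()
        entry = self.cache.get(handle)
        if entry is not None:
            if now - entry.last_validated < self.t:
                return entry.data           # fresh enough: no server call
            if self.server.get_mtime(handle) == entry.server_mtime:
                entry.last_validated = now  # unchanged on the server: revalidate
                return entry.data
        data, mtime = self.server.read_file(handle)  # miss or stale: refetch
        self.cache[handle] = CacheEntry(data, now, mtime)
        return data


now = [0.0]  # a fake clock keeps the example deterministic
server = FakeServer()
server.files["fh1"] = (b"v1", 100.0)
client = CachingClient(server, freshness_interval=3.0, clock=lambda: now[0])

first = client.read("fh1")            # cache miss, fetched from the server
server.files["fh1"] = (b"v2", 200.0)  # the file changes on the server
now[0] = 1.0
stale = client.read("fh1")            # inside the interval, served from cache
now[0] = 5.0
fresh = client.read("fh1")            # interval expired, change is detected
print(first, stale, fresh, server.calls)
```

The second read shows the trade-off: within the freshness interval the client can return data that has already changed on the server, in exchange for not contacting the server at all.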

IV. The AFS File System

AFS is also a distributed file system. Its initial design goal was to serve a large number of users and nodes, and high scalability is one of its defining features. One of its most effective strategies is to cache recently used files, on the order of hundreds of them, on the client.

1. AFS has two important components, Venus and Vice: the former runs on the client, the latter on the server. The file service is implemented in Vice and the directory structure is implemented in Venus. File identifiers in AFS are 96-bit FIDs, similar to the UFIDs of the file service model described earlier.

2. Cache consistency in AFS is based on a callback mechanism. When a file is updated on the server, the server sends a callback to every client that holds a copy of the file, so that those clients know their cached copy is out of date.
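The callback mechanism in item 2 can be sketched as follows. The names Vice and Venus come from the notes above; the classes `ViceServer` and `VenusClient` and the methods `fetch`, `store`, and `break_callback` are illustrative assumptions, and the sketch ignores failures, lost callbacks, and callback expiry. It shows the server remembering which clients hold a copy of a file and notifying them when another client stores a new version.

```python
class ViceServer:
    """Server side: holds the files and remembers who caches each one."""

    def __init__(self):
        self.files = {}      # fid -> bytes
        self.callbacks = {}  # fid -> set of clients holding a valid copy

    def fetch(self, client, fid):
        self.callbacks.setdefault(fid, set()).add(client)
        return self.files[fid]

    def store(self, writer, fid, data):
        self.files[fid] = data
        # Notify every other client whose cached copy is now out of date.
        for client in self.callbacks.get(fid, set()) - {writer}:
            client.break_callback(fid)
        self.callbacks[fid] = {writer}  # only the writer's copy is current


class VenusClient:
    """Client side: caches whole files together with a validity flag."""

    def __init__(self, name, server):
        self.name = name
        self.server = server
        self.cache = {}  # fid -> (data, valid)

    def open(self, fid):
        entry = self.cache.get(fid)
        if entry is not None and entry[1]:
            return entry[0]                  # valid copy: no server traffic
        data = self.server.fetch(self, fid)  # missing or invalidated: refetch
        self.cache[fid] = (data, True)
        return data

    def close(self, fid, data):
        self.cache[fid] = (data, True)
        self.server.store(self, fid, data)   # push the new version on close

    def break_callback(self, fid):
        if fid in self.cache:
            data, _ = self.cache[fid]
            self.cache[fid] = (data, False)  # keep the bytes, mark them stale


server = ViceServer()
server.files["fid-1"] = b"old"
a, b = VenusClient("a", server), VenusClient("b", server)
a.open("fid-1")
b.open("fid-1")
a.close("fid-1", b"new")                    # the server calls back to client b
b_valid_after_update = b.cache["fid-1"][1]
b_data = b.open("fid-1")                    # b notices the stale flag and refetches
print(b_valid_after_update, b_data)
```

Compared with periodic revalidation, the client contacts the server only when it has been told that something changed, which is part of why this approach scales to many clients.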

V. Development and Improvement of Distributed File Systems

Both AFS and NFS have met many of these requirements and improved their performance, but distributed file systems still have a long way to go; one example is bringing an AFS-style callback approach to update handling into NFS.


References: Distributed Systems: Concepts and Design, 5th edition, by George Coulouris, Jean Dollimore, Tim Kindberg, and Gordon Blair
