Distributed File System

If you reprint this, please credit the source:

http://blog.csdn.net/c602273091/article/details/78598699

The final exam for the storage systems course is coming up, so I am preparing to review. Prof Greig's lectures in this course are fascinating, and the material needs tidying up.

Outline: the basic client/server model; applying the client/server model; distributing work across servers; design issues to consider.

The basic client/server model

The top-level functions of a file system include: translating between (name, offset) and (partition, sector), that is, address space management, similar to an FTL; file caching and persistent storage; consistency rules for key data structures; access control; and so on.

The relationship between file systems and storage media: the file system sits on top of the storage media, and it should be coupled to the media as loosely as possible, abstracting the interface as much as possible, similar to VFS.

A distributed file system makes several things easier. Sharing data: users and computers can share data, multiple computers can access the same data remotely, storage space is separated from the clients, and long-distance operation is supported. Supervision: storage space can be conveniently reallocated among users. Management: centralized data is easier to back up, reliability improves, and disaster recovery becomes possible (for example, by mirroring data).

Design ideas:
When the client does more, performance improves, management is easier, and the server does not have to maintain state for too many clients. When the server does more, the system is simpler and more secure, the operation semantics are simpler, data is easier to share, and so on.

The simplest model: the server does everything. All client operations are carried out on the server: caching, file reads and writes, access control, file state, and so on are all maintained by the server. When a client makes a request, the request is packaged and sent, identifying which client, which file, the file pointer, and so on. How files are represented on the wire for RPC is not covered here.
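To make the request packaging concrete, here is a minimal Python sketch of the "server does everything" model. The names ReadRequest, Server, and handle_read are hypothetical and not taken from any real RPC library; a real system would serialize the request and send it over the network.

```python
from dataclasses import dataclass


@dataclass
class ReadRequest:
    """Hypothetical request message: everything the server needs to act."""
    client_id: str  # which client is asking
    path: str       # which file
    offset: int     # file pointer: where to start reading
    length: int     # how many bytes to read


class Server:
    """Keeps all files and performs every operation itself."""

    def __init__(self):
        self.files = {}  # path -> file contents as bytes

    def handle_read(self, req: ReadRequest) -> bytes:
        # The client keeps no cache; each read is one round trip.
        data = self.files.get(req.path, b"")
        return data[req.offset:req.offset + req.length]


server = Server()
server.files["/notes.txt"] = b"distributed file system"
reply = server.handle_read(ReadRequest("client-1", "/notes.txt", 12, 4))
print(reply)
```

Because the offset travels inside each request, this sketch does not need the server to remember an open-file position between calls.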

With the vnode layer, multiple file systems can run at the same time; every file system inside the kernel is reached through the VFS. Mount attaches a file system at a pathname, and unmount removes it. In a distributed file system, the client mounts the server's file system and then accesses the server's files as easily as local files. Of course, every access on the server involves access-control lookups against UID, NIS, LDAP, and so on.

Each time a client needs to know the files underneath a directory, it has to invoke an RPC, and the server sends back all the names under that directory. NFSv4.2 is in the process of addressing this issue.

Common distributed file system models

The server takes care of everything: all the load is on the server. Security and simplicity are very high, but the server load is too large, and the server must maintain the state of all clients.

Sprite (client-side caching, server control): to improve performance, add a cache on the client. This reduces the number of RPC calls and the load on the server, but it introduces cache consistency problems: for example, a client updates its data, the server is not updated, and other clients that read from the server get stale data. In Sprite, the client tells the server when it updates data, the server grants the client permission to cache, and the client notifies the server when it closes the file.

NFSv3 (stateless caching): the client writes updated data back to the server within 30 seconds of the write, or when the file is closed.

Original AFS: caching and callbacks. When a file on the server is modified, the server notifies the clients that hold a cached copy of that file. After a client changes its cached copy, the entire file is written back to the server.
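The callback idea can be sketched in a few lines of Python. This is only an in-process illustration under simplifying assumptions: CallbackServer, CachingClient, and their methods are hypothetical names, there is no network, and whole files are kept in memory.

```python
class CallbackServer:
    """Remembers which clients cache each file and calls them back on change."""

    def __init__(self):
        self.files = {}      # path -> bytes
        self.callbacks = {}  # path -> set of clients holding a cached copy

    def fetch(self, client, path):
        # Register a callback promise for this client, then hand over the file.
        self.callbacks.setdefault(path, set()).add(client)
        return self.files[path]

    def store(self, writer, path, data):
        # Whole-file write-back; every other caching client is told its copy is stale.
        self.files[path] = data
        for client in self.callbacks.get(path, set()) - {writer}:
            client.invalidate(path)
        self.callbacks[path] = {writer}


class CachingClient:
    def __init__(self, server):
        self.server = server
        self.cache = {}  # path -> bytes

    def read(self, path):
        if path not in self.cache:  # cache miss: fetch the whole file
            self.cache[path] = self.server.fetch(self, path)
        return self.cache[path]

    def write_and_close(self, path, data):
        self.cache[path] = data
        self.server.store(self, path, data)  # write the entire file back

    def invalidate(self, path):
        self.cache.pop(path, None)


server = CallbackServer()
server.files["/f"] = b"v1"
a, b = CachingClient(server), CachingClient(server)
a.read("/f")
b.read("/f")
a.write_and_close("/f", b"v2")
print(b.read("/f"))
```

After the write, client b's cached copy has been invalidated, so its next read goes back to the server and sees the new contents.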

A file server may or may not keep state about its clients, and servers are divided into stateless and stateful accordingly. A stateless server can run fast, uses few resources, is simple, and after a crash does not need to recover any state shared between client and server. A stateful server may be faster and can provide better operation semantics.

With a stateless server, how can files still be operated on as before? The server returns a file handle structure to the client, and the client presents this handle with every file operation. To preserve data consistency, the handle contains an inode number, which lets the server check whether the handle is still valid for the operation.
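A minimal Python sketch of handle checking on a stateless server follows. The inode number comes from the text above; the generation counter is an added assumption, borrowed from how NFS-style handles commonly detect a reused inode. All class and method names are hypothetical.

```python
from typing import NamedTuple


class FileHandle(NamedTuple):
    inode: int
    generation: int  # assumption: bumped each time the inode number is reused


class StaleHandle(Exception):
    pass


class StatelessServer:
    """Keeps no per-client state; every request carries a full handle."""

    def __init__(self):
        self.names = {}        # name -> inode number
        self.data = {}         # inode number -> bytes
        self.generations = {}  # inode number -> current generation

    def create(self, name, inode, data):
        self.generations[inode] = self.generations.get(inode, 0) + 1
        self.data[inode] = data
        self.names[name] = inode

    def remove(self, name):
        inode = self.names.pop(name)
        del self.data[inode]

    def lookup(self, name):
        inode = self.names[name]
        return FileHandle(inode, self.generations[inode])

    def read(self, handle, offset, length):
        # Validate the handle on every call instead of remembering opens.
        if (handle.inode not in self.data
                or self.generations[handle.inode] != handle.generation):
            raise StaleHandle(handle)
        return self.data[handle.inode][offset:offset + length]


srv = StatelessServer()
srv.create("a.txt", 7, b"hello")
handle = srv.lookup("a.txt")
print(srv.read(handle, 0, 5))
srv.remove("a.txt")
srv.create("b.txt", 7, b"other")  # inode 7 is reused for a different file
```

In this sketch, a read with the old handle after the inode is reused raises StaleHandle rather than silently returning the new file's data.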

Server execution models:
Process-based (Samba), thread-based (NFS server), and event-based (Node.js); resource usage decreases from left to right.

Multi-server storage

So far the distributed file system has had just one server; now we add more. Why add servers? For load balancing, and to coordinate updates. What does this look like?

In general there are two situations: servers that do the same thing on different data, and servers that do different things on the same data.

The general goals are to balance load, avoid bottlenecks, avoid frequent communication between servers, and avoid making things too complex for users.

Case one: same function, different data. This is the most common situation at present. How does a client find which server is serving it? Different file systems give different answers. For load balancing, file striping or pseudo-random distribution can be used to increase bandwidth and reduce bottlenecks.
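As an illustration of pseudo-random distribution and striping, the Python sketch below hashes a path to pick a starting server and then spreads consecutive stripes across servers. The server names, the 64 KiB stripe size, and both function names are assumptions made for the example, not details of any particular file system.

```python
import hashlib

SERVERS = ["server-a", "server-b", "server-c"]  # hypothetical server names
STRIPE_SIZE = 64 * 1024                         # assumed stripe size in bytes


def server_for_file(path, servers=SERVERS):
    """Pseudo-random distribution: a stable hash of the path picks a server."""
    digest = hashlib.sha256(path.encode("utf-8")).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]


def server_for_offset(path, offset, servers=SERVERS, stripe_size=STRIPE_SIZE):
    """Striping: consecutive stripes of one file go to consecutive servers."""
    start = servers.index(server_for_file(path, servers))
    stripe = offset // stripe_size
    return servers[(start + stripe) % len(servers)]


for i in range(3):
    print(i, server_for_offset("/data/big.bin", i * STRIPE_SIZE))
```

Any client can compute the same mapping locally, so no lookup service is needed, and a large sequential read is served by several servers at once.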

In NFS, each server serves independently and has its own namespace, and a client finds the right server from the mount point. Load is balanced by assigning each server a subtree of files, so that accesses to different files go to different servers.
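The mount-point lookup can be sketched as a longest-prefix match over a mount table. The table contents and host names below are made up for illustration, and server_for_path is a hypothetical helper, not part of any NFS client.

```python
# Hypothetical client mount table: mount point -> server exporting that subtree.
MOUNTS = {
    "/home": "nfs1.example.com",
    "/home/projects": "nfs2.example.com",
    "/scratch": "nfs3.example.com",
}


def server_for_path(path, mounts=MOUNTS):
    """Return the server for the longest mount point containing path, or None if local."""
    best = None
    for mount_point in mounts:
        inside = path == mount_point or path.startswith(mount_point + "/")
        if inside and (best is None or len(mount_point) > len(best)):
            best = mount_point
    return mounts[best] if best is not None else None


print(server_for_path("/home/alice/a.txt"))
print(server_for_path("/home/projects/fs/design.md"))
```

Matching on whole path components keeps a path such as /homework from being mistaken for something under /home.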

In AFS, the client first contacts the global AFS directory, and then locates the server that currently manages the file based on the volume ID.

For an HTTP request, DNS translates the host name into an IP address, and then a load balancer assigns the request to one of several servers.

Case two: different functions, same data.

For details, see: The Design and Implementation of the 4.4BSD Operating System (Chapter 9: The Network Filesystem)
