Introduction to File Systems for Linux Cluster Systems

This article describes file systems used in cluster systems, including Coda, the Global File System (GFS), xFS, and the MOSIX file system (MFS).

The growth of cluster computing requires file systems to develop as well. A cluster file system must not only provide parallel access to multiple files, but also keep caches consistent between processes that access the same file. Most traditional network file systems, such as NFS, AFS, and Coda, fall short for parallel processing because they all depend on a central file server. As more and more clients join, the server's CPU quickly becomes a performance bottleneck. To address this, more powerful servers have been built, and file system designers have tried to hand more of the work to the clients, but even so, server speed remains the limit on file system scalability. A newer generation of file systems, such as the Global File System (GFS), xFS, and Frangipani, is better suited to cluster systems. These systems distribute storage, caching, and control across the machines in the cluster, and provide solutions for parallel file access and cache consistency.

1. Coda File System

The Coda file system is designed for distributed network environments. Its development began in 1987 at Carnegie Mellon University, using AFS2 as a prototype. Linux Virtual Server uses the Coda file system. As a network file system, Coda provides the following features.

It supports disconnected operation for mobile clients.

It is free software.

It provides high availability through a persistent client-side cache.

It supports server replication.

It provides a security model for authentication, encryption, and access control.

It continues to operate when part of the network fails.

It adapts to the available network bandwidth.

It scales well.

It defines clear semantics for sharing, even when the network fails.

Both AFS and Coda place all files under a single directory: /afs for AFS and /coda for Coda. This means that all clients can use the same configuration and all users see the same file tree, which is very important for large installations. With NFS, a client needs an up-to-date list of servers, whereas with Coda it only needs to find the root directory /coda.

When a command such as "cat /coda/tmp/foo" is typed on the client, cat makes system calls to request service from the kernel. The kernel first looks up the file's inode and returns a file handle associated with the file. The inode contains information about the file, and the file handle is used to open it. The system call first enters the kernel's Virtual File System (VFS), which passes the request to the kernel's Coda file system module. This module keeps a cache of recently answered VFS requests and passes the remaining requests to Venus, the Coda cache manager. Venus locates the file by checking its on-disk cache and, if necessary, contacting the server. If the file is not in the on-disk cache, Venus requests it from the server through a remote procedure call and places the file it receives in the cache. There it is an ordinary file, so it can be read and written through the local file system. If the file is already in the on-disk cache, it is used directly. When a file has been modified and is closed, Venus sends the new version to the server to update the server's copy. Other operations that modify the file system, such as creating directories, deleting files, and removing symbolic links, are also propagated to the server.
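The open, cache, and write-back-on-close flow can be illustrated with a minimal Python sketch. It is not Coda's implementation: the FileServer and CacheManager classes, their method names, and the in-memory dictionaries standing in for the server and the on-disk cache are all assumptions made for illustration.

```python
class FileServer:
    """Stand-in for a file server: a dict of path -> bytes (hypothetical)."""

    def __init__(self, files):
        self.files = dict(files)
        self.fetches = 0  # counts how often a client had to contact the server

    def fetch(self, path):
        self.fetches += 1
        return self.files[path]

    def store(self, path, data):
        self.files[path] = data


class CacheManager:
    """Toy cache manager playing the role of Venus (hypothetical API)."""

    def __init__(self, server):
        self.server = server
        self.cache = {}     # stands in for the on-disk cache
        self.dirty = set()  # paths modified since they were cached

    def open(self, path):
        # Cache miss: fetch the whole file from the server and keep a copy.
        if path not in self.cache:
            self.cache[path] = self.server.fetch(path)
        return self.cache[path]

    def write(self, path, data):
        # Writes only touch the local copy.
        self.cache[path] = data
        self.dirty.add(path)

    def close(self, path):
        # A modified file is sent back to the server when it is closed.
        if path in self.dirty:
            self.server.store(path, self.cache[path])
            self.dirty.discard(path)


server = FileServer({"/coda/tmp/foo": b"hello"})
venus = CacheManager(server)
first = venus.open("/coda/tmp/foo")   # miss: fetched from the server
second = venus.open("/coda/tmp/foo")  # hit: served from the local cache
venus.write("/coda/tmp/foo", b"hello, world")
venus.close("/coda/tmp/foo")          # the new version goes to the server
```

In this sketch the server is contacted once for the two opens, and it sees the modified contents only after close is called.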

Because networks can fail, it is very important to keep files consistent. When Venus detects that the server is unavailable, it records updates made to files on the client in a modification log. When the server becomes available again, the corresponding files on the server are updated according to that log.
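The modification log can be sketched in the same spirit. The class and method names below are hypothetical, the server is again just a dictionary, and the sketch ignores the case where the server copy was changed by someone else in the meantime.

```python
class DisconnectedClient:
    """Toy client that logs updates while the server is unreachable."""

    def __init__(self, server_files):
        self.server_files = server_files  # dict standing in for the server
        self.connected = True
        self.cache = {}
        self.modification_log = []        # (path, data) updates made offline

    def store(self, path, data):
        self.cache[path] = data
        if self.connected:
            self.server_files[path] = data
        else:
            # Server unavailable: remember the update for later.
            self.modification_log.append((path, data))

    def reconnect(self):
        # Replay the logged updates in order, then empty the log.
        self.connected = True
        for path, data in self.modification_log:
            self.server_files[path] = data
        replayed = len(self.modification_log)
        self.modification_log.clear()
        return replayed


server_files = {}
client = DisconnectedClient(server_files)
client.connected = False                 # simulate a network failure
client.store("/coda/tmp/foo", b"v1")
client.store("/coda/tmp/foo", b"v2")
server_before = dict(server_files)       # still empty while disconnected
pending = len(client.modification_log)
replayed = client.reconnect()            # server now holds the latest version
```

Replaying the log in order means the server ends up with the last version written on the client.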

2. Global File System

The Global File System (GFS) allows multiple Linux machines to share storage devices over a network. Each machine can treat a shared network disk as a local disk, and GFS itself appears as a local file system. If one machine writes to a file, any machine that subsequently accesses the file reads the written result.

[Figure 1: GFS file system usage]

3. xFS File System

xFS (the serverless network file system, not the XFS journaling file system) tries to provide low-latency, high-bandwidth access to file system data by distributing server functions, such as maintaining cache consistency, locating data, and processing disk requests, among the clients.

To maintain cache consistency, xFS treats the memory of all clients as one large cache. This reduces the amount of duplicate data cached by clients and makes use of the memory of idle machines. This cooperative cache lowers read latency by reducing the number of requests that have to go to disk.
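A cooperative cache lookup can be sketched as follows. This is only an illustration of the idea: the Node class, the read_block function, and the linear search through peers are assumptions, and a real system would need a way to locate which peer holds a block.

```python
class Node:
    """A cluster machine whose memory holds cached blocks (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.memory = {}  # block id -> data


def read_block(block_id, local, peers, disk, stats):
    """Look in local memory, then in peers' memory, and only then on disk."""
    if block_id in local.memory:
        stats["local"] += 1
        return local.memory[block_id]
    for peer in peers:
        if block_id in peer.memory:
            # Served from another machine's memory; no second copy is kept,
            # so the combined cache holds less duplicate data.
            stats["peer"] += 1
            return peer.memory[block_id]
    # Nobody has the block cached: read it from disk and cache it locally.
    stats["disk"] += 1
    local.memory[block_id] = disk[block_id]
    return local.memory[block_id]


a, b = Node("a"), Node("b")
disk = {"blk7": b"data"}
stats = {"local": 0, "peer": 0, "disk": 0}
read_block("blk7", b, [a], disk, stats)  # disk read, now cached on b
read_block("blk7", a, [b], disk, stats)  # served from b's memory
read_block("blk7", b, [a], disk, stats)  # served from b's own memory
```

Of the three reads, only the first one reaches the disk.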

To distribute the data-location function across clients, xFS has each client handle requests for a subset of the files. File data is striped across multiple clients to provide higher bandwidth. The striped data includes parity information, which can be used to reconstruct a lost fragment when a machine fails. This ensures that no node is a single point of failure.
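How parity makes recovery possible can be shown with a small XOR example. The text does not say which parity scheme is used, so the single XOR parity fragment and the function names below are assumptions, chosen because XOR parity is the simplest scheme that survives the loss of one fragment.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equal-length byte strings."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def make_stripe(data, n_data_nodes):
    """Split data into n equal fragments (zero-padded) plus one parity fragment."""
    frag_len = -(-len(data) // n_data_nodes)  # ceiling division
    padded = data.ljust(frag_len * n_data_nodes, b"\0")
    fragments = [padded[i * frag_len:(i + 1) * frag_len]
                 for i in range(n_data_nodes)]
    return fragments + [xor_blocks(fragments)]


def rebuild(stripe, lost_index):
    """Reconstruct the fragment at lost_index from all the other fragments."""
    survivors = [frag for i, frag in enumerate(stripe) if i != lost_index]
    return xor_blocks(survivors)


stripe = make_stripe(b"cluster file data", 3)  # 3 data fragments + 1 parity
lost = stripe[1]                               # pretend this machine failed
recovered = rebuild(stripe, 1)
```

Because the XOR of all fragments, parity included, is zero, any single missing fragment equals the XOR of the remaining ones.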

4. MOSIX File System

A MOSIX cluster uses its own file system, the MOSIX File System (MFS). MFS treats all the file systems and directories in the cluster as a single file system and provides unified access to all file systems on all nodes. It also ensures cache consistency by using only one cache.

MFS is made up of many file subtrees on different nodes, so it allows parallel operations on multiple files while keeping caches consistent.

When a process is migrated in a MOSIX cluster, migration is very effective at improving system performance if the process is mainly CPU-bound. If the process performs a large number of I/O operations, however, migration works against it, because each I/O operation has to communicate with the node the process originally came from.

For this reason, MFS supports DFSA (Direct File System Access). The purpose of DFSA is to migrate processes that perform a large number of I/O operations to the remote node that holds the files involved in most of those operations. Most I/O can then be performed on the remote node, where the data is reached through local access. If a system call is node-independent, it is executed on the remote node; otherwise it is executed on the process's original node. MFS has an advantage over other network file systems in that it allows the use of local file systems, which reduces the communication overhead between processes and file servers.
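The two decisions described here, where to migrate an I/O-heavy process and where each system call runs, can be sketched as below. This is an illustration only: the function names, the threshold, and the set of calls treated as node-independent are assumptions, since the text does not specify them.

```python
# Assumed examples of node-independent calls; the real set is not given here.
NODE_INDEPENDENT = {"read", "write", "lseek"}


def choose_node(io_ops_by_node, io_threshold=10):
    """Pick the node holding the files that receive most of the process's I/O.

    Returns None when the process does too little I/O for file placement
    to matter (that is, when it is mostly CPU-bound).
    """
    if sum(io_ops_by_node.values()) < io_threshold:
        return None
    return max(io_ops_by_node, key=io_ops_by_node.get)


def execution_node(syscall, current_node, home_node):
    """Node-independent calls run where the process is; others go home."""
    return current_node if syscall in NODE_INDEPENDENT else home_node


target = choose_node({"node1": 3, "node2": 120, "node3": 7})
where_read = execution_node("read", current_node=target, home_node="node1")
where_time = execution_node("gettimeofday", current_node=target, home_node="node1")
```

In this example the process is sent to node2, its read calls run there, and a call outside the assumed node-independent set is sent back to node1.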
