Design concept for p2pfs, a P2P-based distributed file system

Last Update:2014-08-18 Source: Internet

Author: User


These notes were first kept in Evernote, later copied into the QQ Mail notepad and sent to my tutor, and published on the CSDN blog today.

A Distributed File System

Introduction

This paper proposes a P2P-based distributed file system. It borrows the bee-colony idea (inspired by "Out of Control"): make each individual node as intelligent as possible so that the storage cluster as a whole behaves intelligently. It supports unlimited scaling, dynamic addition and removal of nodes, automatic storage optimization, and strong disaster tolerance.

Main Ideas

A conventional distributed file system requires a namenode to maintain the file list. The client first communicates with the namenode to obtain the actual location of the file, and then communicates with the data server to obtain the file content. The client therefore depends on the namenode, because it needs a file's location before it can reach the data. The problem with this design is that if the namenode fails, the entire file system is paralyzed. The existing remedy is a hot-standby namenode, which sacrifices some performance and cannot guarantee correct behavior under special conditions.

This design therefore proposes a decentralized distributed file system composed only of data servers. Each data server maintains its own files and directories and supports the normal functions of a file system by communicating frequently with other servers, similar to a global route-forwarding system. The following describes how each file system function is implemented:

Figure: p2pfs example. Blue indicates an access request, green indicates request forwarding, and red indicates that the file has been found and the file transfer is being carried out.

Read:

The client can access the cluster from any location. It only needs to submit a file access request to the data server it is communicating with. The request only needs to contain the file name and the client address. If the data server has the file data locally, it sends the data directly to the client. If not, it forwards the access request, and the other data servers search their own local storage in turn and forward again if they do not have the file. Before forwarding, a server checks with the client to confirm that it is still listening. A server that has the file initiates a connection to the client, using the address in the request, and transmits the data.

Once the client receives a transfer connection, it closes its listener and starts receiving data. Other data servers may still be forwarding the request at that moment, but because each server checks the client before forwarding, the request stops propagating once the client has started the transfer.

After initiating a file access request, the client effectively acts as a server that can accept a data connection initiated by any data server. It sets a timeout for this wait. If no data server has connected by the time the timeout expires, the file is taken not to exist and the read fails.
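The read path above can be sketched as a small in-memory simulation. This is only an illustration under stated assumptions: the class names `Client` and `DataServer`, the `hops_left` limit, and the direct method calls standing in for network messages are all hypothetical and not part of the original design.

```python
from typing import Optional


class Client:
    """Stands in for a client that listens for one incoming transfer."""

    def __init__(self) -> None:
        self.listening = True
        self.received: Optional[bytes] = None

    def accept(self, data: bytes) -> bool:
        # The first server to connect wins; the listener then closes.
        if not self.listening:
            return False
        self.listening = False
        self.received = data
        return True


class DataServer:
    def __init__(self, name: str) -> None:
        self.name = name
        self.files = {}   # file name -> file content held locally
        self.peers = []   # servers this node forwards requests to

    def handle_read(self, filename: str, client: Client, hops_left: int = 8) -> None:
        if filename in self.files:
            client.accept(self.files[filename])
            return
        for peer in self.peers:
            # Before each forward, confirm the client is still listening.
            if hops_left == 0 or not client.listening:
                return
            peer.handle_read(filename, client, hops_left - 1)


a, b, c = DataServer("a"), DataServer("b"), DataServer("c")
a.peers, b.peers = [b], [a, c]
c.files["report.txt"] = b"hello"

client = Client()
a.handle_read("report.txt", client, hops_left=3)
print(client.received)
```

If no server holds the file, `client.listening` stays true and `client.received` stays `None`; in a real client that is the state in which the timeout would fire.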

To speed up repeated lookups, the client request has an optional parameter that records the node that served the last successful access. If this field is present, that host is searched first during file access. In addition, the server that receives the request caches successful connections, and the cached entries expire after a certain period of time. The purpose is to avoid searching again and again when a file is accessed frequently.

Each data server forwards a request to three hosts. After eight rounds of forwarding, the request can reach more than 6,500 hosts (3^8 = 6,561). However, the design must still address how to avoid duplicate requests.
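A cache of successful locations that expires after a while might look like the sketch below. The class name `LocationCache`, the `ttl_seconds` parameter, and the injectable `clock` are assumptions made for illustration; the original only says that cached entries become invalid after a certain period.

```python
import time


class LocationCache:
    """Remembers which node last served a file, for a limited time."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic) -> None:
        self.ttl = ttl_seconds
        self.clock = clock
        self._entries = {}  # file name -> (node name, expiry time)

    def remember(self, filename: str, node: str) -> None:
        self._entries[filename] = (node, self.clock() + self.ttl)

    def lookup(self, filename: str):
        entry = self._entries.get(filename)
        if entry is None:
            return None
        node, expires_at = entry
        if self.clock() >= expires_at:
            # The hint is stale: drop it and fall back to a normal search.
            del self._entries[filename]
            return None
        return node


now = [0.0]  # fake clock so the example does not need to sleep
cache = LocationCache(ttl_seconds=30, clock=lambda: now[0])
cache.remember("report.txt", "node-7")
hint_fresh = cache.lookup("report.txt")
now[0] = 31.0
hint_stale = cache.lookup("report.txt")
```

The same structure could serve both cases in the text: the client keeps one hint per file and attaches it to the request, while a server keeps a cache shared by all clients that enter through it.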
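The fan-out figure can be checked directly, and one possible answer to the duplicate-request question is for each server to remember the requests it has already handled. The request ID and the `ForwardingNode` class are assumptions for illustration; the original leaves the deduplication mechanism open.

```python
FANOUT = 3
ROUNDS = 8

# Hosts contacted in each round if no host is ever contacted twice.
per_round = [FANOUT ** r for r in range(1, ROUNDS + 1)]
last_round = per_round[-1]      # 6561
total_reached = sum(per_round)  # 9840


class ForwardingNode:
    """Drops a request it has already seen (hypothetical request IDs)."""

    def __init__(self) -> None:
        self.seen = set()

    def should_forward(self, request_id: str) -> bool:
        if request_id in self.seen:
            return False
        self.seen.add(request_id)
        return True


node = ForwardingNode()
first_time = node.should_forward("req-1")
second_time = node.should_forward("req-1")
```

In practice the `seen` set would need to be pruned, for example with the same expiry idea used for the connection cache, so that it does not grow without bound.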

Storage:

The client can contact any data server to write a file. It first sends a write request. The data server writes the file to its local storage and sends backup copies to different hosts on its uplink and downlink. The file description information is saved together with the storage locations of the file's backups. After at least three copies have been written, success is returned to the client.

Each data server regularly communicates with the servers in its data server list that hold backups of its files. When a node fails, the data server that detects the failure broadcasts the failed node. The nodes that receive the failure notification, including the detecting node, send the files whose backups are now missing to other nodes and update the backup information.
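A minimal sketch of the write path, assuming a `StorageNode` class with a `neighbours` list standing in for the "uplink and downlink" hosts. These names and the in-memory dictionaries are hypothetical; the original does not specify how neighbours are chosen.

```python
class StorageNode:
    def __init__(self, name: str) -> None:
        self.name = name
        self.store = {}       # file name -> (data, names of nodes holding copies)
        self.neighbours = []  # candidate hosts for backup copies

    def save(self, filename: str, data: bytes, replicas) -> None:
        # The description (here just the replica list) is kept with the data.
        self.store[filename] = (data, list(replicas))

    def write(self, filename: str, data: bytes, copies: int = 3) -> bool:
        targets = [self] + self.neighbours[: copies - 1]
        if len(targets) < copies:
            return False  # not enough hosts to reach the required copy count
        names = [node.name for node in targets]
        for node in targets:
            node.save(filename, data, names)
        return True       # success is reported only after all copies exist


n1, n2, n3 = StorageNode("n1"), StorageNode("n2"), StorageNode("n3")
n1.neighbours = [n2, n3]
ok = n1.write("notes.txt", b"draft")
```

Every copy carries the full replica list, which is what later allows any surviving holder to tell which backups were lost when a node fails.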
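The bookkeeping after a failure notification could be sketched as follows. The `records` mapping (file name to the list of nodes holding a copy) and both function names are assumptions for illustration.

```python
def files_missing_backup(records: dict, failed_node: str) -> list:
    """Return the files that lost a copy when failed_node went down."""
    return sorted(name for name, nodes in records.items() if failed_node in nodes)


def replace_replica(records: dict, filename: str, failed_node: str, new_node: str) -> None:
    """Record that new_node now holds the copy that failed_node used to hold."""
    records[filename] = [
        new_node if node == failed_node else node for node in records[filename]
    ]


records = {
    "a.txt": ["n1", "n2", "n3"],
    "b.txt": ["n1", "n4", "n5"],
}
affected = files_missing_backup(records, "n2")
for name in affected:
    # The actual data transfer to n6 is omitted; only the metadata is updated.
    replace_replica(records, name, "n2", "n6")
```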

Query:

To query a file's size, existence, owner, permissions, and other information, the client only needs to broadcast a file request. The node that holds the file replies with the file description information and does not return the file data.

This file system is designed mainly for access by file name. Folders can be implemented by adding separators to the file name. To list the files in a folder, the query uses a wildcard, and the matching file description information is returned.
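Because folders are only a naming convention, a folder listing reduces to pattern matching on file names. The sketch below uses Python's standard `fnmatch` module; the `descriptions` dictionary and the `query` function are hypothetical stand-ins for the description information a node keeps.

```python
from fnmatch import fnmatchcase


def query(descriptions: dict, pattern: str) -> dict:
    """Return description records whose file name matches the wildcard."""
    return {
        name: info
        for name, info in descriptions.items()
        if fnmatchcase(name, pattern)
    }


local = {
    "docs/a.txt": {"size": 10, "owner": "alice"},
    "docs/sub/b.txt": {"size": 20, "owner": "bob"},
    "img/c.png": {"size": 30, "owner": "alice"},
}
matches = sorted(query(local, "docs/*"))
```

Note that `*` in `fnmatch` also matches the separator, so `docs/*` returns files in nested folders as well. A listing restricted to direct children would need an extra check on the remainder of the name.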

Modify:

After the file is located using the same mechanism, the data server performs the requested operation (adding to, modifying, or deleting the file) and allocates or releases space according to the storage distribution of the file; the disk space is eventually reclaimed by a resident process. When all the data servers that hold a backup of the file report success, success is returned to the client.

Delete:

First, a query for the file's location is broadcast. The data server that responds marks the file as deleted and sends deletion notifications to the nodes listed in the file's backup record. The listener is closed when the first response is received, and success is returned once three responses have been received.

Load balancing:

When a node is under heavy traffic, it accepts only as many requests as it can carry. The rejected file access requests are automatically forwarded to other nodes that hold backups, and those nodes send the data to the client.
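One way to picture this rule is a node with a fixed capacity that hands excess requests to its backup nodes. The `LoadAwareNode` class, the `capacity` counter, and the `_tried` set used to avoid bouncing a request between two full nodes are all assumptions for illustration.

```python
class LoadAwareNode:
    def __init__(self, name: str, capacity: int) -> None:
        self.name = name
        self.capacity = capacity  # requests this node is willing to carry
        self.active = 0           # requests currently being served
        self.backups = []         # nodes holding copies of the same files

    def serve(self, filename: str, _tried=None):
        """Return the name of the node that serves the request, or None."""
        tried = _tried if _tried is not None else set()
        tried.add(self.name)
        if self.active < self.capacity:
            self.active += 1
            return self.name
        # Over capacity: pass the request to a backup that has not been asked.
        for backup in self.backups:
            if backup.name in tried:
                continue
            served_by = backup.serve(filename, tried)
            if served_by is not None:
                return served_by
        return None


primary, replica = LoadAwareNode("primary", 1), LoadAwareNode("replica", 1)
primary.backups, replica.backups = [replica], [primary]
results = [primary.serve("hot.bin") for _ in range(3)]
```

Here the third request returns `None` because every holder is full; the original does not say what happens in that case, so a real system would have to choose between queueing, retrying, or failing the read.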

Problem:

How can access to frequently accessed files be sped up, covering both frequent access by a fixed client and large-scale frequent access?

The client includes in its request the data server that served the last successful access, and the server receiving the request can use it to locate the file quickly. This provides fast access for a single client.

The data server that acts as the entry point for clients caches a list of successful connections. When a large number of clients enter through this data server, it can locate files quickly.

How are scan, rename, space calculation, and other operations required of a standard file system performed?

Wildcards are supported when accessing files, so the list of files in a folder can be obtained this way. Beyond that, these operations only require modifying the file size, storage block address (node address), file name, and other information in the file header.

How can we meet the storage requirements of large and small files?

Large files. A large file is divided into multiple blocks stored at different locations. To access the file, the client only needs to find the node that stores the file header information. After that node has sent its data, it notifies the next node that stores data for the file to continue sending. The principle is the same as sequential access to non-contiguous addresses in memory.

Small files. Small files are concatenated and stored as one large block, like a long string. On an access request, the large block is located first, and the file information saved in its header is used to find the requested file's data and send it to the client.
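The block chain for a large file can be sketched as a linked list spread across nodes. The round-robin placement, the `name#index` block keys, and the dictionary-based nodes are assumptions for illustration; the original only states that each node notifies the next one.

```python
def store_large_file(filename: str, data: bytes, block_size: int, nodes: dict) -> dict:
    """Split data into blocks, spread them over nodes, and return the header."""
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    names = list(nodes)
    locations = [
        (names[i % len(names)], f"{filename}#{i}") for i in range(len(blocks))
    ]
    for i, (node, key) in enumerate(locations):
        next_location = locations[i + 1] if i + 1 < len(locations) else None
        # Each block is stored with a pointer to the block that follows it.
        nodes[node][key] = (blocks[i], next_location)
    first = locations[0] if locations else None
    return {"name": filename, "size": len(data), "first": first}


def read_large_file(header: dict, nodes: dict) -> bytes:
    """Follow the chain from the header, as each node hands over to the next."""
    out = bytearray()
    location = header["first"]
    while location is not None:
        node, key = location
        block, location = nodes[node][key]
        out += block
    return bytes(out)


cluster = {"n1": {}, "n2": {}, "n3": {}}
payload = bytes(range(10))
header = store_large_file("video.bin", payload, block_size=4, nodes=cluster)
restored = read_large_file(header, cluster)
```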
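Packing small files into one large block amounts to keeping an offset table in the block header. In this sketch the header is a plain dictionary mapping each file name to `(offset, length)`; that layout and the function names are assumptions, not the original format.

```python
def pack_small_files(files: dict):
    """Concatenate small files; return (header index, packed block)."""
    index = {}
    block = bytearray()
    for name, data in files.items():
        index[name] = (len(block), len(data))  # where this file starts, how long
        block += data
    return index, bytes(block)


def read_small_file(index: dict, block: bytes, name: str) -> bytes:
    """Use the header index to cut one small file out of the packed block."""
    offset, length = index[name]
    return block[offset:offset + length]


index, block = pack_small_files({"a.cfg": b"x=1", "b.cfg": b"y=22", "c.cfg": b""})
b_cfg = read_small_file(index, block, "b.cfg")
```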

Implementation

Each data server runs the following processes independently:

Status monitoring process (status): monitors node status

Monitors the disk usage, CPU load, network throughput, and temperature of the data server. It reports abnormal conditions in the node's health and responds when a user queries the cluster status.
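Part of such a monitor can be sketched with the Python standard library alone. The function names and the threshold values are assumptions for illustration. Network throughput and temperature are left out because the standard library has no portable call for them.

```python
import os
import shutil


def collect_status(path: str = ".") -> dict:
    """Gather the metrics that the standard library can read portably."""
    usage = shutil.disk_usage(path)
    status = {
        "disk_used_fraction": usage.used / usage.total if usage.total else 0.0,
    }
    try:
        status["load_1min"] = os.getloadavg()[0]  # not available on every OS
    except (AttributeError, OSError):
        status["load_1min"] = None
    return status


def find_alerts(status: dict, disk_limit: float = 0.9, load_limit: float = 4.0) -> list:
    """Return the names of the metrics that exceed their (assumed) limits."""
    alerts = []
    if status["disk_used_fraction"] > disk_limit:
        alerts.append("disk")
    load = status.get("load_1min")
    if load is not None and load > load_limit:
        alerts.append("cpu")
    return alerts


current = collect_status()
alerts_now = find_alerts(current)
```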

File information maintenance process (filecontrolers): maintains the node's file directory information

Responds to file access requests sent from other nodes, and queries and updates the information of files stored on the local machine. It can run as multiple processes. The file information held by the data server is kept in memory and maintained by this process.

File data forwarding process (filetransfers): transfers data

For file communication tasks confirmed by the file information maintenance process, such as file reads, the node hands the transfer task to this process. The process initiates a connection with the client, transmits the data while ensuring data integrity, and disconnects when the transfer ends.

Heartbeat process (heartbeat): detects failed nodes

Many files are saved on the node, and each file is backed up on different nodes. This process regularly communicates with the nodes that hold those backups. When a node fails, it finds the files whose backups are now missing and transfers them to a new node to restore the backups.
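Failure detection itself can be as simple as tracking when each backup node was last heard from. The `HeartbeatMonitor` class and the `timeout` value are assumptions for illustration. Timestamps are passed in explicitly so that the example is deterministic.

```python
class HeartbeatMonitor:
    """Flags backup nodes that have been silent for longer than a timeout."""

    def __init__(self, timeout: float) -> None:
        self.timeout = timeout
        self.last_seen = {}  # node name -> time of the last heartbeat

    def beat(self, node: str, now: float) -> None:
        self.last_seen[node] = now

    def failed_nodes(self, now: float) -> list:
        return sorted(
            node for node, seen in self.last_seen.items()
            if now - seen > self.timeout
        )


monitor = HeartbeatMonitor(timeout=10.0)
monitor.beat("n2", now=0.0)
monitor.beat("n3", now=0.0)
monitor.beat("n3", now=8.0)  # n3 keeps answering, n2 goes silent
down = monitor.failed_nodes(now=12.0)
```

Each name in `down` would then trigger the recovery step described above: find the files backed up on that node and copy them to a new one.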

Disk space management process (diskmanager): manages the node's disk space

Files on the node are added and deleted dynamically, which leaves non-contiguous free space on the disk. When the system load is low, this process scans the files and consolidates the disk space.
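Consolidation can be pictured as sliding every file down to close the gaps left by deletions. The extent table (file name to `(offset, length)`) and the `compact` function are assumptions for illustration. A real implementation would also have to move the bytes on disk.

```python
def compact(extents: dict):
    """Repack files contiguously from offset 0, keeping their on-disk order.

    Returns the new extent table and the first free offset after packing.
    """
    packed = {}
    cursor = 0
    for name, (offset, length) in sorted(extents.items(), key=lambda kv: kv[1][0]):
        packed[name] = (cursor, length)
        cursor += length
    return packed, cursor


before = {"a": (0, 10), "c": (50, 20), "b": (30, 5)}
after, free_from = compact(before)
```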

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
