GFS is the cornerstone of Google's distributed storage. Other storage systems, such as Google's Bigtable, Megastore, and Percolator, are built directly or indirectly on GFS.
System Architecture
GFS master
The master maintains the system's metadata, including the file and chunk namespaces, the file-to-chunk mapping, and chunk location information. It is also responsible for global control of the entire system, such as chunk replication. The master periodically exchanges information with each CS through heartbeat messages.
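The three kinds of metadata can be pictured as three in-memory maps. The following is only a minimal sketch under that assumption; the class, field, and method names (`MasterMetadata`, `file_chunks`, `chunk_locations`, `heartbeat`) and the server addresses are hypothetical and are not taken from GFS itself.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class MasterMetadata:
    """Hypothetical in-memory view of the master's metadata."""
    # Namespace: the set of known file paths.
    namespace: Set[str] = field(default_factory=set)
    # File-to-chunk mapping: path -> ordered list of chunk handles.
    file_chunks: Dict[str, List[int]] = field(default_factory=dict)
    # Chunk locations: chunk handle -> chunkservers holding a replica.
    chunk_locations: Dict[int, Set[str]] = field(default_factory=dict)

    def create_file(self, path: str) -> None:
        self.namespace.add(path)
        self.file_chunks[path] = []

    def add_chunk(self, path: str, handle: int) -> None:
        self.file_chunks[path].append(handle)
        self.chunk_locations[handle] = set()

    def heartbeat(self, server: str, handles: List[int]) -> None:
        """Record which chunks a chunkserver reports holding."""
        for h in handles:
            self.chunk_locations.setdefault(h, set()).add(server)

    def locate(self, path: str, chunk_index: int) -> Tuple[int, Set[str]]:
        """Return the chunk handle and replica locations for one chunk."""
        handle = self.file_chunks[path][chunk_index]
        return handle, self.chunk_locations[handle]


meta = MasterMetadata()
meta.create_file("/logs/a")
meta.add_chunk("/logs/a", 1001)
meta.heartbeat("cs1:7000", [1001])
meta.heartbeat("cs2:7000", [1001])
```

In this sketch the location map is filled only from heartbeat reports, which mirrors the periodic exchange between the master and each CS described above.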
GFS chunkserver (CS, data block server)
Files are split into fixed-size 64 MB chunks. When a chunk is created, the master assigns it a globally unique 64-bit chunk handle. The CS stores each chunk on local disk as an ordinary Linux file.
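Because chunks have a fixed size, a byte offset in a file translates to a chunk index by integer division. A small sketch of that arithmetic follows; the function names `chunk_index` and `chunk_span` are made up for illustration.

```python
CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB, the chunk size stated above


def chunk_index(offset: int) -> int:
    """Index of the chunk that contains the given byte offset."""
    return offset // CHUNK_SIZE


def chunk_span(offset: int, length: int) -> range:
    """Chunk indexes touched by accessing `length` bytes starting at `offset`."""
    if length <= 0:
        return range(0)
    first = chunk_index(offset)
    last = chunk_index(offset + length - 1)
    return range(first, last + 1)
```

For example, a 2-byte read that starts on the last byte of chunk 0 touches chunks 0 and 1.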
GFS Client
The client is the access interface that GFS provides to applications. It is a dedicated interface delivered as a library. The GFS client does not cache file data; it caches only the metadata it obtains from the master.
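The client-side behaviour (cache metadata, never file data) might look roughly like the sketch below. Everything here is an assumption made for illustration: `ClientMetadataCache`, the `ask_master` callback, and the `fake_master` stand-in with its made-up return values are hypothetical, not the real client library.

```python
from typing import Callable, Dict, List, Tuple

Location = Tuple[int, List[str]]  # (chunk handle, replica addresses)


class ClientMetadataCache:
    """Caches chunk locations only; file data is never cached."""

    def __init__(self, ask_master: Callable[[str, int], Location]) -> None:
        self._ask_master = ask_master  # hypothetical request to the master
        self._cache: Dict[Tuple[str, int], Location] = {}
        self.master_calls = 0

    def lookup(self, path: str, chunk_index: int) -> Location:
        key = (path, chunk_index)
        if key not in self._cache:
            self.master_calls += 1
            self._cache[key] = self._ask_master(path, chunk_index)
        return self._cache[key]

    def invalidate(self, path: str, chunk_index: int) -> None:
        """Drop a cached entry, e.g. after a replica turns out to be unreachable."""
        self._cache.pop((path, chunk_index), None)


def fake_master(path: str, chunk_index: int) -> Location:
    # Stand-in for the master; the returned values are made up.
    return (1001 + chunk_index, ["cs1:7000", "cs2:7000", "cs3:7000"])


cache = ClientMetadataCache(fake_master)
cache.lookup("/logs/a", 0)
cache.lookup("/logs/a", 0)  # served from the cache, no second master request
```

Caching only metadata keeps the client simple and takes repeated lookups off the master, while the data itself is always read from a chunkserver.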
Key Issues
Lease Mechanism
GFS appends data in units of records. The master delegates authority over writes to a chunk to one chunkserver through the lease mechanism. The chunkserver holding the lease is the primary chunkserver, and the chunkservers holding the other replicas are the backup (secondary) chunkservers.
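A lease can be modelled as a (primary, expiry time) pair kept per chunk. The sketch below is a simplified illustration, not the actual master logic: `LeaseTable`, its methods, the `LEASE_SECONDS` value, and the rule of picking the first listed replica are all assumptions.

```python
from typing import Dict, List, Tuple

LEASE_SECONDS = 60.0  # assumed lease duration for this sketch


class LeaseTable:
    """Hypothetical master-side record of which replica holds each chunk's lease."""

    def __init__(self) -> None:
        # chunk handle -> (primary chunkserver, lease expiry time)
        self._leases: Dict[int, Tuple[str, float]] = {}

    def primary_for(self, handle: int, replicas: List[str], now: float) -> str:
        """Return the current primary, granting a new lease if none is valid."""
        lease = self._leases.get(handle)
        if lease is not None and lease[1] > now:
            return lease[0]
        primary = replicas[0]  # assumption: any live replica may be chosen
        self._leases[handle] = (primary, now + LEASE_SECONDS)
        return primary

    def backups_for(self, handle: int, replicas: List[str], now: float) -> List[str]:
        """The replicas that do not hold the lease."""
        primary = self.primary_for(handle, replicas, now)
        return [r for r in replicas if r != primary]


table = LeaseTable()
first = table.primary_for(7, ["cs1", "cs2", "cs3"], now=0.0)
again = table.primary_for(7, ["cs2", "cs1", "cs3"], now=30.0)  # lease still valid
later = table.primary_for(7, ["cs2", "cs1", "cs3"], now=61.0)  # expired, granted anew
```

While the lease is valid the same primary is returned regardless of the replica order, so all writers agree on who orders the mutations for that chunk.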
Consistency model: append consistency
Append process: the most complex part of GFS
The GFS append process has two notable features: pipelining, and the separation of data flow from control flow.
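The two features can be illustrated with a toy in-memory model: data is first pushed along a chain of replicas and buffered (data flow), and only afterwards does the primary choose the offset and tell every replica to apply the buffered data there (control flow). This is a simplified sketch with hypothetical names (`Replica`, `push`, `apply`, `append`); failures, retries, and networking are left out.

```python
from typing import Dict, List


class Replica:
    def __init__(self, name: str) -> None:
        self.name = name
        self.buffer: Dict[str, bytes] = {}  # data received but not yet applied
        self.chunk = bytearray()            # the chunk contents

    def push(self, data_id: str, data: bytes, rest: List["Replica"]) -> None:
        """Data flow: buffer the data, then forward it down the chain."""
        self.buffer[data_id] = data
        if rest:
            rest[0].push(data_id, data, rest[1:])

    def apply(self, data_id: str, offset: int) -> None:
        """Write buffered data at the offset chosen by the primary."""
        data = self.buffer.pop(data_id)
        if len(self.chunk) < offset:
            self.chunk.extend(b"\0" * (offset - len(self.chunk)))
        self.chunk[offset:offset + len(data)] = data


def append(primary: Replica, backups: List[Replica], data_id: str) -> int:
    """Control flow: the primary picks the offset, then all replicas apply it."""
    offset = len(primary.chunk)
    primary.apply(data_id, offset)
    for b in backups:
        b.apply(data_id, offset)
    return offset


p, s1, s2 = Replica("primary"), Replica("s1"), Replica("s2")
# The push chain need not start at the primary, e.g. nearest replica first.
s1.push("d1", b"hello", [p, s2])
off1 = append(p, [s1, s2], "d1")
s2.push("d2", b"world", [s1, p])
off2 = append(p, [s1, s2], "d2")
```

Because the order of the data push is independent of the order of the control messages, the data can follow whatever path uses the network best while the primary alone decides where each record lands.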
Fault Tolerance Mechanism
The master stores three types of metadata: the namespace, the file-to-chunk mapping, and chunk replica location information.
(1) Master fault tolerance: operation log + checkpoint, and shadow master
(2) Chunkserver fault tolerance:
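For the master side, the combination of operation log and checkpoint can be sketched as follows: every mutation is logged before it is applied, and recovery loads the latest checkpoint and replays the log written after it. This is a toy illustration; `LoggedMaster`, `apply_op`, and the two operation kinds are hypothetical, and plain Python objects stand in for durable storage.

```python
from typing import Dict, List, Tuple

Op = Tuple[str, str]  # (operation, path), e.g. ("create", "/a")


def apply_op(state: Dict[str, bool], op: Op) -> None:
    kind, path = op
    if kind == "create":
        state[path] = True
    elif kind == "delete":
        state.pop(path, None)


class LoggedMaster:
    def __init__(self) -> None:
        self.state: Dict[str, bool] = {}       # live namespace
        self.checkpoint: Dict[str, bool] = {}  # last saved snapshot of the state
        self.log: List[Op] = []                # operations since the checkpoint

    def mutate(self, op: Op) -> None:
        self.log.append(op)       # log first, so the change survives a crash
        apply_op(self.state, op)

    def take_checkpoint(self) -> None:
        self.checkpoint = dict(self.state)
        self.log.clear()          # older log entries are no longer needed

    def recover(self) -> Dict[str, bool]:
        """Rebuild the state from the checkpoint plus the log written after it."""
        state = dict(self.checkpoint)
        for op in self.log:
            apply_op(state, op)
        return state


m = LoggedMaster()
m.mutate(("create", "/a"))
m.mutate(("create", "/b"))
m.take_checkpoint()
m.mutate(("delete", "/a"))
m.mutate(("create", "/c"))
recovered = m.recover()
```

Checkpointing keeps the log short, so recovery only has to replay the operations recorded since the last checkpoint.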
Master Design
The chunkserver chosen for a new replica should have below-average disk utilization
Limit the number of "recent" chunk creations on each chunkserver
All replicas of a chunk must not be placed in the same rack
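The three placement rules above can be combined into a simple greedy filter. The sketch below is one possible reading, not the master's actual algorithm: the `Server` fields, the `pick_servers` function, the `max_recent` threshold, and the sample data are all assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Server:
    name: str
    rack: str
    disk_used: float       # fraction of disk in use, 0.0 to 1.0
    recent_creations: int  # chunks created on this server recently


def pick_servers(servers: List[Server], copies: int, max_recent: int) -> List[Server]:
    """Pick up to `copies` servers; may return fewer if the rules cannot be met."""
    avg = sum(s.disk_used for s in servers) / len(servers)
    # Rules 1 and 2: below-average disk use, not too many recent creations.
    candidates = [s for s in servers
                  if s.disk_used < avg and s.recent_creations < max_recent]
    candidates.sort(key=lambda s: s.disk_used)
    chosen: List[Server] = []
    for s in candidates:
        if len(chosen) == copies:
            break
        racks = {c.rack for c in chosen}
        last_slot = len(chosen) == copies - 1
        # Rule 3: the final replica must not leave every copy in one rack.
        if last_slot and copies > 1 and racks == {s.rack}:
            continue
        chosen.append(s)
    return chosen


cluster = [
    Server("a", "rackA", 0.20, 0),
    Server("b", "rackA", 0.30, 0),
    Server("c", "rackA", 0.35, 0),
    Server("d", "rackB", 0.36, 0),
    Server("e", "rackB", 0.90, 0),
    Server("f", "rackC", 0.10, 5),  # emptiest disk, but too many recent creations
]
picked = [s.name for s in pick_servers(cluster, copies=3, max_recent=3)]
```

Here server `c` is skipped for the third slot because it would put all three replicas in `rackA`, and `f` is skipped despite its empty disk because of its recent creation count.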
Master memory usage
Load balancing
Chunk replicas are created in three cases: chunk creation, chunk re-replication, and load balancing (rebalancing)
Garbage Collection
GFS uses a delayed (lazy) deletion mechanism
Snapshot
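Delayed deletion means that a delete request only hides the file, and the space is reclaimed later by a background pass. A minimal sketch of that idea follows; `LazyNamespace`, its methods, and the `GRACE_SECONDS` value are assumptions chosen for illustration.

```python
from typing import Dict, List, Set

GRACE_SECONDS = 3 * 24 * 3600  # assumed grace period for this sketch


class LazyNamespace:
    def __init__(self) -> None:
        self.live: Set[str] = set()
        self.hidden: Dict[str, float] = {}  # path -> time it was deleted

    def create(self, path: str) -> None:
        self.live.add(path)

    def delete(self, path: str, now: float) -> None:
        """Deletion only hides the file; nothing is reclaimed yet."""
        self.live.remove(path)
        self.hidden[path] = now

    def undelete(self, path: str) -> None:
        """A hidden file can still be restored before it is collected."""
        del self.hidden[path]
        self.live.add(path)

    def collect_garbage(self, now: float) -> List[str]:
        """Reclaim files that have been hidden for at least the grace period."""
        expired = sorted(p for p, t in self.hidden.items()
                         if now - t >= GRACE_SECONDS)
        for p in expired:
            del self.hidden[p]
        return expired


ns = LazyNamespace()
ns.create("/a")
ns.create("/b")
ns.delete("/a", now=0.0)
ns.delete("/b", now=100.0)
ns.undelete("/b")                                  # accidental delete rolled back
early = ns.collect_garbage(now=1000.0)             # nothing old enough yet
reclaimed = ns.collect_garbage(now=float(GRACE_SECONDS))
```

Deferring the real work makes deletion cheap and gives a window in which an accidental delete can be undone.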
Chunkserver Design
The chunkserver is a disk- and network-I/O-intensive application.
Summary
Differences between Linux GFS and Google GFS
In Linux GFS, GFS stands for Global File System. The problem it solves is letting multiple nodes read and write the same shared storage. For example:
[Figure: Linux GFS with a SAN. Image: http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/5/html/Global_File_System/images/overview/fig-gfs-with-san.png]
Linux GFS works like a concurrency lock manager: a node that obtains the exclusive lock can perform write operations.
Linux GFS therefore runs on shared storage devices, which come in two types: iSCSI and FC (Fibre Channel).
In Google GFS, GFS stands for Google File System, which is a distributed file system. The problem it solves is distributing a file system across many machines.
Besides the management node, it has data nodes that store the actual data.
Summary:
Linux GFS shares expensive storage among many nodes;
Google GFS pools cheap local storage from many nodes for everyone to use.
http://www.90rhca.com/?p=126
This article is from the "from the Heart" blog. Please be sure to keep this source: http://fuquanjun.blog.51cto.com/5820068/1429839