Google File System Reading Notes

Source: Internet
Author: User

GFS is the cornerstone of Google's distributed storage. Other Google storage systems, such as Bigtable, Megastore, and Percolator, are built directly or indirectly on GFS.


  • System Architecture

    • GFS master
      The master maintains the system's metadata: the file and chunk namespaces, the file-to-chunk mapping, and chunk location information. It is also responsible for global control of the entire system, such as chunk lease management, garbage collection of unused chunks, and chunk replication. The master periodically exchanges information with each CS through heartbeat messages.

    • GFS chunkserver (CS, data block server)
      Files are divided into fixed-size 64 MB chunks. When a chunk is created, the master assigns it a globally unique 64-bit chunk handle. The CS stores each chunk on its local disk as an ordinary Linux file.

    • GFS Client
      The client is the access interface that GFS provides to applications: a set of dedicated interfaces shipped as a library. The GFS client does not cache file data; it caches only metadata obtained from the master.
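
As a rough illustration of the three kinds of metadata and of how a byte offset in a file maps to a chunk, here is a minimal Python sketch. The class and method names (`MasterMetadata`, `create_chunk`, `lookup`) are hypothetical and are not the real GFS interface; the sequential handle counter is a stand-in for however the master actually generates unique handles. Only the 64 MB chunk size and the idea of a unique chunk handle come from the notes above.

```python
from dataclasses import dataclass, field

CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB, as stated in the notes


@dataclass
class MasterMetadata:
    """Hypothetical in-memory model of the master's metadata."""
    # file path -> ordered list of chunk handles
    # (covers the namespace and the file-to-chunk mapping)
    file_chunks: dict = field(default_factory=dict)
    # chunk handle -> chunkservers holding a replica (chunk locations)
    chunk_locations: dict = field(default_factory=dict)
    # simple counter standing in for unique 64-bit handle generation
    next_handle: int = 0

    def create_chunk(self, path, servers):
        """Append a new chunk to a file and record where its replicas live."""
        handle = self.next_handle
        self.next_handle += 1
        self.file_chunks.setdefault(path, []).append(handle)
        self.chunk_locations[handle] = list(servers)
        return handle

    def lookup(self, path, offset):
        """Translate (path, byte offset) into (handle, replicas, offset in chunk)."""
        index = offset // CHUNK_SIZE
        handle = self.file_chunks[path][index]
        return handle, self.chunk_locations[handle], offset % CHUNK_SIZE


master = MasterMetadata()
first = master.create_chunk("/logs/a", ["cs1", "cs2", "cs3"])
second = master.create_chunk("/logs/a", ["cs2", "cs3", "cs4"])
# An offset 10 bytes past the first 64 MB falls in the second chunk.
print(master.lookup("/logs/a", CHUNK_SIZE + 10))
```

A client in this model would cache the result of `lookup` (metadata only) and then read the file data directly from one of the returned chunkservers.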

Key Issues

  • Lease Mechanism
    GFS appends data in units of records. The master delegates authority over writes to a chunk to one chunkserver through the lease mechanism. The chunkserver holding the lease is the primary chunkserver, and the chunkservers holding the other replicas are the secondary (backup) chunkservers.

  • Consistency model: append consistency

  • Append process: the most complex part of GFS
    The GFS append process has two notable features: pipelining, and the separation of data flow from control flow.

  • Fault Tolerance Mechanism
    The master stores three types of metadata: the namespace, the file-to-chunk mapping, and chunk replica location information.
    (1) Master fault tolerance: operation log plus checkpoints, and a shadow master.
    (2) Chunkserver fault tolerance: each chunk is stored as multiple replicas on different chunkservers.
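
The lease mechanism described in this list can be sketched as follows. This is a simplified model under stated assumptions, not the actual GFS protocol: the `LeaseTable` class, its methods, the 60-second duration, and the rule of picking the first replica as primary are all illustrative choices made for this sketch.

```python
import time

LEASE_SECONDS = 60.0  # illustrative duration, an assumption of this sketch


class LeaseTable:
    """Hypothetical master-side record of which chunkserver holds each lease."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._leases = {}  # chunk handle -> (primary, expiry time)

    def primary_for(self, handle, replicas):
        """Return the current primary, granting a new lease if none is valid."""
        now = self._clock()
        holder = self._leases.get(handle)
        if holder is not None and holder[1] > now and holder[0] in replicas:
            return holder[0]
        primary = replicas[0]  # simplistic choice; any live replica would do
        self._leases[handle] = (primary, now + LEASE_SECONDS)
        return primary

    def secondaries_for(self, handle, replicas):
        """All replicas other than the lease holder act as backups."""
        primary = self.primary_for(handle, replicas)
        return [r for r in replicas if r != primary]


# A fake clock makes lease expiry easy to demonstrate.
now = [0.0]
table = LeaseTable(clock=lambda: now[0])
print(table.primary_for(7, ["cs1", "cs2", "cs3"]))   # lease granted to cs1
now[0] = 30.0
print(table.primary_for(7, ["cs2", "cs1", "cs3"]))   # lease still valid: cs1
now[0] = 61.0
print(table.primary_for(7, ["cs2", "cs1", "cs3"]))   # expired: new lease to cs2
```

Because only the lease holder orders writes to a chunk, the master stays out of the write path until the lease expires or the holder disappears from the replica list.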


Master Design

  • Place a new replica on a chunkserver whose disk utilization is below the average

  • Limit the number of "recent" chunk creations on each chunkserver

  • The replicas of a chunk must not all be in the same rack

  • Master memory usage

  • Load balancing
    Chunk replicas are created in three cases: chunk creation, re-replication, and rebalancing.

  • Garbage Collection
    GFS uses a delayed (lazy) deletion mechanism.

  • Snapshot
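
The three placement criteria at the top of this list (disk utilization, recent creations, rack diversity) could be combined roughly as in the sketch below. It is only one possible reading: the `ChunkServer` fields, the function name `choose_replica_servers`, and the defaults `replicas=3` and `max_recent=5` are assumptions made for illustration, not values taken from GFS.

```python
from dataclasses import dataclass


@dataclass
class ChunkServer:
    name: str
    rack: str
    disk_utilization: float   # fraction of disk in use, 0.0 to 1.0
    recent_creations: int     # chunks created on this server recently


def choose_replica_servers(servers, replicas=3, max_recent=5):
    """Pick servers for a new chunk using the three criteria from the notes.

    `replicas` and `max_recent` are illustrative defaults, not GFS constants.
    """
    average = sum(s.disk_utilization for s in servers) / len(servers)
    # Criteria 1 and 2: below-average disk use, not too many recent creations.
    candidates = [
        s for s in servers
        if s.disk_utilization < average and s.recent_creations < max_recent
    ]
    candidates.sort(key=lambda s: s.disk_utilization)

    chosen, racks = [], set()
    # First pass: at most one server per rack, to spread replicas across racks.
    for s in candidates:
        if len(chosen) < replicas and s.rack not in racks:
            chosen.append(s)
            racks.add(s.rack)
    # Second pass: fill remaining slots only once two racks are covered
    # (criterion 3: the replicas must not all share one rack).
    for s in candidates:
        if len(chosen) < replicas and s not in chosen and len(racks) >= 2:
            chosen.append(s)
    return chosen


servers = [
    ChunkServer("cs1", "r1", 0.20, 0),
    ChunkServer("cs2", "r1", 0.30, 1),
    ChunkServer("cs3", "r2", 0.35, 0),
    ChunkServer("cs4", "r2", 0.90, 0),   # excluded: disk nearly full
    ChunkServer("cs5", "r3", 0.10, 9),   # excluded: too many recent creations
]
print([s.name for s in choose_replica_servers(servers)])
```

Capping recent creations matters because a freshly created chunk is about to receive heavy write traffic, so an empty server should not absorb every new chunk at once.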

Chunkserver Design

The chunkserver is a disk- and network-I/O-intensive application.


Summary

  • GFS is a system with good scalability and can automatically handle various exceptions at the software level.

  • The design of a single master is feasible.



Differences between Linux GFS and Google GFS

In Linux GFS, GFS stands for Global File System. The problem it solves is reading and writing shared storage from multiple nodes. For example:

[Figure: Linux GFS with a SAN. Image source: http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/5/html/Global_File_System/images/overview/fig-gfs-with-san.png]

Linux GFS works like a concurrent lock manager: only a node that has obtained an exclusive lock can perform write operations.

Linux GFS therefore relies on shared storage devices, of which there are two types: iSCSI and FC (Fibre Channel).


In Google GFS, GFS stands for Google File System, which is a distributed file system. The problem it solves is distributing a file system across many machines.

Besides the management node (the master), there are data nodes (the chunkservers) that store the actual data.


Summary:

Linux GFS shares one expensive storage device among many nodes;

Google GFS pools the cheap local storage of many nodes for everyone to use.


http://www.90rhca.com/?p=126

This article is from the "from the Heart" blog; please be sure to keep this source: http://fuquanjun.blog.51cto.com/5820068/1429839
