Google Distributed System

Last Update:2018-12-05 Source: Internet

Author: User


Google's search service must process and store massive amounts of data and serve millions of search requests every day, so it is built on a powerful distributed system. Let's take a look at Google's distributed system.

1. Distributed facilities

A distributed infrastructure needs three essential building blocks: a distributed file system, a distributed lock mechanism, and a distributed communication mechanism. In Google's environment these are GFS, Chubby, and Protocol Buffer, respectively.

(1) GFS

GFS has two types of nodes. The first is the master node, which stores the metadata about data files rather than the chunks (data blocks) themselves. This metadata includes the table that maps each 64-bit tag to the location of its data block, the table of chunks that make up each file, the locations of each chunk's replicas, and which processes are currently reading or writing a particular chunk.

In addition, the master node periodically receives updates (heartbeats) from each chunk node, which keeps its metadata up to date.

The second is the chunk node, which stores the actual data. On each chunk node, data files are stored as chunks of 64 MB by default. Each chunk has a unique 64-bit tag and is replicated multiple times across the distributed system; the default number of replicas is 3.

[Figure: GFS architecture]
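The division of labor described above can be illustrated with a small sketch. This is a toy model, not the GFS implementation: the class name `ToyMaster`, its methods, the sequential tag allocation, and the file and node names are all hypothetical. It only shows the kind of bookkeeping the master does: which chunks form a file, where each chunk's replicas live, and how a byte offset maps to a chunk.

```python
from collections import defaultdict
from itertools import count

CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB default chunk size, as described above
REPLICATION = 3                # default number of replicas


class ToyMaster:
    """Holds metadata only; chunk contents would live on the chunk nodes."""

    def __init__(self):
        self._next_tag = count(1)                  # hypothetical tag allocation scheme
        self.file_chunks = defaultdict(list)       # file name -> ordered chunk tags
        self.chunk_locations = defaultdict(set)    # chunk tag -> chunk node ids

    def allocate_chunk(self, filename):
        # Keep the tag within 64 bits, matching the 64-bit tag in the text.
        tag = next(self._next_tag) & (2**64 - 1)
        self.file_chunks[filename].append(tag)
        return tag

    def heartbeat(self, node_id, tags):
        """A chunk node reports which chunks it currently holds."""
        for tag in tags:
            self.chunk_locations[tag].add(node_id)

    def locate(self, filename, byte_offset):
        """Map a file offset to (chunk tag, sorted replica locations)."""
        index = byte_offset // CHUNK_SIZE
        tag = self.file_chunks[filename][index]
        return tag, sorted(self.chunk_locations[tag])


master = ToyMaster()
first_tag = master.allocate_chunk("/logs/web.log")
second_tag = master.allocate_chunk("/logs/web.log")

nodes = ["node-a", "node-b", "node-c", "node-d"]
for node in nodes[:REPLICATION]:
    master.heartbeat(node, [first_tag, second_tag])

# Byte 70 MB falls in the second 64 MB chunk of the file.
tag, replicas = master.locate("/logs/web.log", 70 * 1024 * 1024)
print(tag == second_tag, replicas)
```

A client in this model would ask the master where a chunk lives and then read the data from one of the listed chunk nodes; the master itself never holds file contents.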

(2) Chubby

In short, Chubby is a distributed lock service. With Chubby, thousands of clients in a distributed system can "lock" or "unlock" a resource. It is often used for coordination within systems such as Bigtable and MapReduce. In terms of implementation, it realizes locking by creating files, and it relies on the well-known Paxos algorithm.

As for the implementation mechanism, Chubby is organized as a distributed file system that lets clients create files and perform some basic operations on them through the Chubby service.

So how does Chubby implement the "lock" function? A Chubby lock is a file, and creating the file is the locking operation: the server that successfully creates the file is the one that grabs the lock. Clients can open, close, and read files to obtain shared or exclusive locks, and updates are sent to clients through communication mechanisms.
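The idea that "creating the file is taking the lock" can be shown with an analogy on a local file system. This is only a sketch of the principle, not the Chubby client API: the helper names `try_lock` and `unlock` and the file name `resource.lock` are hypothetical, and a real lock service adds replication, sessions, and failure handling that are not modeled here.

```python
import os
import tempfile


def try_lock(path):
    """Return True if this caller created the lock file, False if it already exists."""
    try:
        # O_CREAT | O_EXCL makes creation fail if the file is already there,
        # so only one caller can succeed.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    os.close(fd)
    return True


def unlock(path):
    """Release the lock by deleting the file."""
    os.remove(path)


lock_dir = tempfile.mkdtemp()
lock_path = os.path.join(lock_dir, "resource.lock")

first = try_lock(lock_path)    # creates the file: lock acquired
second = try_lock(lock_path)   # file already exists: lock refused
unlock(lock_path)
third = try_lock(lock_path)    # free again after unlock

unlock(lock_path)
os.rmdir(lock_dir)
print(first, second, third)    # True False True
```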

A Chubby cluster consists of five machines, each holding a replica, one of which is elected as the master node. The replicas are equivalent to each other in structure and capability, and they use the Paxos protocol to keep their logs consistent. A replica may go offline and later come back; when it does, it must bring its data back into consistency with the other nodes. Clients access the cluster through the Chubby client library.
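Paxos-style protocols make decisions by majority, which is why a five-machine cluster can keep working while some replicas are offline. The arithmetic is sketched below; the function names are hypothetical and the sketch assumes a simple majority quorum, which is the usual arrangement but is not spelled out in the text above.

```python
def majority(replica_count):
    """Smallest number of replicas that forms a strict majority."""
    return replica_count // 2 + 1


def has_quorum(replica_count, live_replicas):
    """True if enough replicas are reachable to agree on a decision."""
    return live_replicas >= majority(replica_count)


CELL_SIZE = 5  # five machines per cluster, as described above
tolerated_failures = CELL_SIZE - majority(CELL_SIZE)

print(majority(CELL_SIZE))        # 3
print(tolerated_failures)         # 2
print(has_quorum(CELL_SIZE, 3))   # True
print(has_quorum(CELL_SIZE, 2))   # False
```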

Why use a lock service to solve the consistency problem instead of having each service implement a Paxos-like protocol itself? This approach has the following five benefits.

A. Most developers do not think about consistency when they first build a service, so consistency protocols are not used at the beginning. Only as the service matures is the issue taken seriously. With a lock service, the consistency problem can then be solved by adding a few simple statements while keeping the original program architecture and communication mechanism.

B. In many cases it is not enough just to elect a master node; the master's address must also be announced to others, or some piece of information must be saved. Here a Chubby file not only provides the lock but can also hold useful information (such as the master's address). As a result, many developers use Chubby to store metadata and configuration.

C. A lock-based development interface is more familiar to developers. Not all developers understand consistency protocols, but most have worked with locks.

D. Common consistency protocols generally need several replicas to reach agreement with high availability; the Paxos algorithm is the most obvious example. With Chubby, a service can make progress even with only a single client.

E. The last reason for using a lock service is that Chubby aims not only to solve the consistency problem but also to provide additional useful functions. In fact, many Google developers use Chubby as a naming service, and it works very well.
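Benefit B, a lock file that also carries the master's address, can be sketched with an in-memory stand-in for the lock service. Everything here is hypothetical and for illustration only: the class `ToyLockService`, its methods, the path `/ls/demo/master`, and the addresses. It is single-process and unreplicated, so it shows the usage pattern rather than how Chubby itself works.

```python
class ToyLockService:
    """In-memory stand-in: a file that exists is a lock that is held."""

    def __init__(self):
        self._files = {}

    def try_create(self, path, content):
        """Create the file and return True, or return False if it already exists."""
        if path in self._files:
            return False
        self._files[path] = content
        return True

    def read(self, path):
        """Return the file content, or None if the file does not exist."""
        return self._files.get(path)

    def delete(self, path):
        """Remove the file, releasing the lock."""
        self._files.pop(path, None)


service = ToyLockService()
lock_path = "/ls/demo/master"  # hypothetical path
candidates = ["10.0.0.1:8000", "10.0.0.2:8000", "10.0.0.3:8000"]

# Every candidate tries to create the same file; only the first one succeeds
# and thereby becomes the master, recording its own address in the file.
results = {addr: service.try_create(lock_path, addr) for addr in candidates}
winners = [addr for addr, won in results.items() if won]

# Everyone else learns who the master is by reading the file.
master_address = service.read(lock_path)
print(winners, master_address)
```

The same read path is what makes the file usable for lookups in general: a client that knows the path can find the current master without knowing anything about the election.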

(3) Protocol Buffer

Protocol Buffer is a language-neutral, platform-neutral, and extensible method used internally by Google to serialize structured data. It provides implementations for Java, C++, and Python (each implementation includes the compiler and library files for the corresponding language). Because it is a binary format, it is about 10 times faster than exchanging data with XML. One of its two main uses is RPC (Remote Procedure Call) communication, which can be used between distributed applications or across heterogeneous environments.
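To give a feel for why a binary format is more compact than XML, the sketch below encodes the same record both ways using only the Python standard library. It does not use Protocol Buffer or its wire format: the fixed layout built with `struct`, the record fields, and the function names are all assumptions made for this illustration, and it compares size only, not speed.

```python
import struct
import xml.etree.ElementTree as ET

record = {"id": 42, "score": 3.5, "name": "chunk-0007"}  # made-up sample data

# Little-endian, no padding: 4-byte int, 8-byte double, 2-byte name length.
HEADER = "<idH"


def to_binary(rec):
    name = rec["name"].encode("utf-8")
    return struct.pack(HEADER, rec["id"], rec["score"], len(name)) + name


def from_binary(data):
    rec_id, score, name_len = struct.unpack_from(HEADER, data, 0)
    start = struct.calcsize(HEADER)
    name = data[start:start + name_len].decode("utf-8")
    return {"id": rec_id, "score": score, "name": name}


def to_xml(rec):
    root = ET.Element("record")
    for key, value in rec.items():
        ET.SubElement(root, key).text = str(value)
    return ET.tostring(root)


binary_bytes = to_binary(record)
xml_bytes = to_xml(record)

print(len(binary_bytes), len(xml_bytes))
print(from_binary(binary_bytes) == record)
```

The binary form stores only values and a length prefix, while the XML form repeats every field name in an opening and closing tag; both sides of an RPC must agree on the binary layout in advance, which is the role a shared schema plays.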

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
