Paper Reading Notes - Spanner: Google's Globally-Distributed Database

Author: Liu Xuhui (Raymond). Please credit the source when reprinting.

Email: colorant at 163.com

Blog: http://blog.csdn.net/colorant/

More paper reading notes: http://blog.csdn.net/colorant/article/details/8256145


Keywords

Spanner, external consistency, cross-data center, TrueTime

 

= Target problem =

 

Provide a high-performance, globally distributed, synchronously replicated database.

 

= Core idea =

 

Spanner is designed as a high-performance database that scales to hundreds of data centers and millions of servers. Its focus is on providing high reliability and data consistency across data centers while keeping performance high.

 

Spanner achieves globally consistent reads and writes by using timestamps. The key to doing this efficiently in a global database is the implementation of the underlying TrueTime API.

 

TrueTime API
Built on GPS and atomic clocks, it ensures that the absolute time obtained by each server differs by no more than 1-7 milliseconds.
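
To make the bounded-uncertainty idea concrete, here is a minimal sketch of a clock that returns an interval instead of a single instant. The `now`, `after` and `before` operations follow the way the Spanner paper describes TrueTime; the class name `SimulatedTrueTime`, the fixed `epsilon_seconds` value and the injectable `clock` are assumptions made purely for illustration, not Google's implementation.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class TTInterval:
    earliest: float  # lower bound on the true time, in seconds
    latest: float    # upper bound on the true time, in seconds


class SimulatedTrueTime:
    """Toy stand-in for TrueTime: a local clock plus a fixed error bound."""

    def __init__(self, epsilon_seconds=0.007, clock=time.time):
        # epsilon_seconds is an assumed constant; the notes quote 1-7 ms.
        self.epsilon = epsilon_seconds
        self._clock = clock

    def now(self) -> TTInterval:
        # The true time is assumed to lie somewhere inside this interval.
        t = self._clock()
        return TTInterval(t - self.epsilon, t + self.epsilon)

    def after(self, t: float) -> bool:
        # True only if t has definitely passed.
        return self.now().earliest > t

    def before(self, t: float) -> bool:
        # True only if t has definitely not arrived yet.
        return self.now().latest < t
```

The point of the interval is that two servers can compare timestamps safely as long as they account for the uncertainty on both sides.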

 

Using the precise timestamps provided by the TrueTime API, Spanner has the leader elected through Paxos coordinate the absolute timestamps and commit order of two-phase commits, which guarantees read/write consistency.
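
One way a leader can turn bounded clock uncertainty into a safe commit order is the "commit wait" rule described in the Spanner paper: pick the upper bound of the current time interval as the commit timestamp, then wait until that timestamp has definitely passed before making the commit visible. The sketch below illustrates only that rule. `FakeClock`, `EPSILON_MS` and the `commit` helper are hypothetical names, time is kept in integer milliseconds for determinism, and Paxos and two-phase commit themselves are left out.

```python
EPSILON_MS = 4  # assumed clock uncertainty, in milliseconds


class FakeClock:
    """Deterministic clock so the example does not depend on wall time."""

    def __init__(self, start_ms=0):
        self.t = start_ms

    def now_ms(self):
        return self.t

    def sleep_ms(self, dt):
        self.t += dt


def tt_now(clock):
    # Interval (earliest, latest) that is assumed to contain the true time.
    t = clock.now_ms()
    return (t - EPSILON_MS, t + EPSILON_MS)


def commit(clock, log, payload):
    # 1. Choose a timestamp that no correct clock has reached yet.
    commit_ts = tt_now(clock)[1]
    # 2. Commit wait: block until commit_ts is definitely in the past.
    while tt_now(clock)[0] <= commit_ts:
        clock.sleep_ms(1)
    # 3. Only now make the write visible.
    log.append((commit_ts, payload))
    return commit_ts
```

Because each commit becomes visible only after its timestamp has passed everywhere, any transaction that starts later is assigned a strictly larger timestamp, at the cost of waiting roughly twice the uncertainty bound per commit.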

 

= Implementation =

 

A Spanner deployment is called a universe. Each universe is composed of multiple zones, and each zone is roughly analogous to a Bigtable cluster. A zone contains one zonemaster that manages data distribution, hundreds to thousands of spanservers that handle the actual data storage and queries, and several location proxies that route clients to the right spanserver. The universemaster only monitors performance data. Each zone is a unit of physical isolation, and the placement driver is responsible for replicating and migrating data between zones.
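
The hierarchy can be pictured as a few nested records. The sketch below is only a mental model: the class names mirror the terms in the notes, but the fields, the half-open key ranges and the `locate` method (standing in for what a location proxy does) are assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Spanserver:
    name: str
    # Hypothetical: half-open key ranges [start, end) served by this machine.
    key_ranges: list = field(default_factory=list)


@dataclass
class Zone:
    name: str
    spanservers: list = field(default_factory=list)

    def locate(self, key):
        """Play the role of a location proxy: find the server for a key."""
        for server in self.spanservers:
            for start, end in server.key_ranges:
                if start <= key < end:
                    return server
        return None


@dataclass
class Universe:
    name: str
    zones: list = field(default_factory=list)


universe = Universe("test", zones=[
    Zone("zone-a", spanservers=[
        Spanserver("ss-1", key_ranges=[("a", "m")]),
        Spanserver("ss-2", key_ranges=[("m", "z")]),
    ]),
])
```

With this toy layout, `universe.zones[0].locate("kiwi")` returns the `ss-1` record, since `"kiwi"` falls in the range `["a", "m")`.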

 

 

The internal data organization of a spanserver resembles Bigtable's tablets, although it does not actually seem to be related to Bigtable. Each spanserver manages between 100 and 1,000 tablets, each of which stores multi-versioned key -> value mappings. A Paxos state machine sits on top of each tablet to coordinate concurrent operations. The underlying file system is Colossus (said to be the next generation of GFS; I could not find any published literature on it ...)
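
A multi-versioned key -> value mapping can be sketched as a map from each key to a timestamp-ordered list of versions, where a read at timestamp `t` returns the newest version written at or before `t`. The `Tablet` class and its method names below are hypothetical, and the in-memory lists are a simplification that says nothing about how Spanner stores tablets on Colossus.

```python
import bisect


class Tablet:
    """Toy multi-version store: (key, timestamp) -> value."""

    def __init__(self):
        # key -> (sorted list of timestamps, values in the same order)
        self._versions = {}

    def write(self, key, timestamp, value):
        ts_list, val_list = self._versions.setdefault(key, ([], []))
        i = bisect.bisect_right(ts_list, timestamp)
        ts_list.insert(i, timestamp)
        val_list.insert(i, value)

    def read(self, key, timestamp):
        """Return the latest value with version timestamp <= `timestamp`."""
        ts_list, val_list = self._versions.get(key, ([], []))
        i = bisect.bisect_right(ts_list, timestamp)
        return val_list[i - 1] if i > 0 else None


tablet = Tablet()
tablet.write("user:1", 10, "alice")
tablet.write("user:1", 20, "alice-v2")
```

Reading `"user:1"` at timestamp 15 yields `"alice"`, while reading at timestamp 25 yields `"alice-v2"`. Keeping old versions around is what lets a read at a past timestamp see a consistent snapshot.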


 

 

All write operations must be initiated by the leader, while a read can be served directly by any server whose data is up to date enough for the timestamp being read. For operations that span tablets, the leaders of the Paxos groups involved coordinate with one another.
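
The split between leader-only writes and reads from any sufficiently up-to-date replica can be sketched as follows. `Replica`, `applied_through` and `replicas_for_read` are hypothetical names; real Spanner tracks a per-replica safe time with more machinery than the single counter used here.

```python
class Replica:
    def __init__(self, name, is_leader=False):
        self.name = name
        self.is_leader = is_leader
        self.applied_through = 0  # highest timestamp applied on this replica

    def accept_write(self, timestamp):
        # Writes may only enter the group through the leader.
        if not self.is_leader:
            raise PermissionError(f"{self.name} is not the leader")
        self.applied_through = timestamp

    def replicate(self, timestamp):
        # A follower catching up to the leader's log.
        self.applied_through = max(self.applied_through, timestamp)

    def can_serve_read(self, read_ts):
        # Safe to answer only if everything up to read_ts has been applied.
        return read_ts <= self.applied_through


def replicas_for_read(replicas, read_ts):
    return [r.name for r in replicas if r.can_serve_read(read_ts)]


leader = Replica("leader", is_leader=True)
fast = Replica("follower-fast")
slow = Replica("follower-slow")
group = [leader, fast, slow]

leader.accept_write(10)
fast.replicate(10)  # follower-slow has not caught up yet
```

Here a read at timestamp 10 can go to `leader` or `follower-fast`, but not to `follower-slow` until it has applied the write.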

= Related research and projects =

 

Spanner's design goals are very similar to those of Megastore. The problem with Megastore is that its throughput for concurrent writes can be poor. No detailed test data comparing the two on similar workloads is available, so I can only take the Spanner paper's word for it. Going by the general principles, the reasons Spanner can do better are:

  1. A finer-grained Paxos state machine (tablet vs. entity group) reduces the likelihood of conflicts
  1. Megastore is built on top of Bigtable, which adds communication overhead. Spanner manages tablets directly, which flattens the hierarchy.
  1. The TrueTime API provides the foundation for global consistency, which simplifies the logic for concurrent reads and writes (this should be easy to understand)
