Regions in Swift 1.8.0 (Grizzly) -> A globally distributed cluster

Source: Internet
Author: User
Tags: openstack, swift

The new features in Swift 1.8.0 (Grizzly) include support for a region tier above zones. It lets deployers organize zones into groups for management; such a group is a region.

While reading the ring source code of the Swift Grizzly release over the past two days, I also noticed that a region attribute has been added to the device dictionary. I therefore decided to look carefully at this hierarchical concept.

First, I recommend two articles. My understanding of regions has benefited a lot from them:

Article 1: written by SwiftStack, an important contributor to Swift and the contributor of the region feature. Title: A globally distributed OpenStack Swift cluster.

Article 2: it not only introduces the region concept, but also describes in detail the workflow of the proxy server at the region level. (Reading this article requires getting around the firewall.)

 

The ring without multi-regions

The standard Swift ring is a data structure that lets you divide storage devices into buckets, or zones. Before introducing the ring with regions, let's review the ring structure as it was before the region concept was added:

In the Essex release, the ring builder strictly ensures that the replicas of an object are placed in different zones; otherwise, ring creation cannot complete. A Swift deployer must therefore deploy at least as many zones as there are replicas in the cluster for the ring to be created successfully.
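The Essex-era constraint can be illustrated with a small check. This is only a minimal sketch, not Swift's ring builder code: the helper `can_build_strict_ring` and the sample device dictionaries are hypothetical, and the only assumption is that each device carries a 'zone' key.

```python
def can_build_strict_ring(devices, replicas):
    """Return True if there are at least as many distinct zones as replicas.

    Hypothetical helper illustrating the strict one-replica-per-zone rule;
    it is not part of Swift.
    """
    zones = {dev['zone'] for dev in devices}
    return len(zones) >= replicas


# Hypothetical cluster with two zones.
devices = [
    {'id': 0, 'zone': 1},
    {'id': 1, 'zone': 2},
]

print(can_build_strict_ring(devices, replicas=2))  # True
print(can_build_strict_ring(devices, replicas=3))  # False
```

With two zones, a two-replica ring satisfies the rule, while a three-replica ring does not.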

In the Folsom release, the structure of the ring file and the ring-balancing algorithm were changed, which greatly improved the efficiency of ring creation. In this release, the policy of strictly guaranteeing that replicas land in different zones was replaced with a more flexible algorithm, which organizes zones, nodes, and devices into a tier structure for allocation.

 

The ring with multi-tiers

Below is the code (common/ring/utils.py) that constructs the tier structure in Swift 1.7.6 (Folsom).

    def tiers_for_dev(dev):
        """
        Returns a tuple of tiers for a given device in ascending order by
        length.

        :returns: tuple of tiers
        """
        t1 = dev['zone']
        t2 = "{ip}:{port}".format(ip=dev.get('ip'), port=dev.get('port'))
        t3 = dev['id']

        return ((t1,),
                (t1, t2),
                (t1, t2, t3))
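To make the return value concrete, here is a small usage sketch of the Folsom-style function. The device entry is hypothetical and sets only the keys the function reads.

```python
def tiers_for_dev(dev):
    """Folsom-style tiers: zone, then ip:port, then device id."""
    t1 = dev['zone']
    t2 = "{ip}:{port}".format(ip=dev.get('ip'), port=dev.get('port'))
    t3 = dev['id']
    return ((t1,), (t1, t2), (t1, t2, t3))


# Hypothetical device entry, for illustration only.
dev = {'id': 0, 'zone': 1, 'ip': '192.168.101.1', 'port': 6000}

for tier in tiers_for_dev(dev):
    print(tier)
# (1,)
# (1, '192.168.101.1:6000')
# (1, '192.168.101.1:6000', 0)
```

Each tier is a prefix of the next one, so a device's position in the hierarchy can be read off from the longest tuple.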

In Folsom, the ring balancer places the replicas of an object as far apart as possible (as-far-as-possible, or as-unique-as-possible). The ideal is to place the replicas in different zones (the [zone] tier). If only one zone is available, it tries to place the replicas on different nodes (the [zone, ip:port] tier). Similarly, if only one node is available, it tries to place the replicas on different devices on that node (the [zone, ip:port, device] tier).
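The idea behind as-unique-as-possible placement can be sketched with a simple greedy selection. This is an illustration only and not Swift's ring-balancing algorithm, which works on partitions and device weights; the function `place_replicas` and the sample devices are hypothetical.

```python
def tiers_for_dev(dev):
    """Folsom-style tiers: zone, then ip:port, then device id."""
    node = "{ip}:{port}".format(ip=dev['ip'], port=dev['port'])
    return ((dev['zone'],),
            (dev['zone'], node),
            (dev['zone'], node, dev['id']))


def place_replicas(devices, replicas):
    """Greedily pick one device per replica, preferring unused tiers.

    For each candidate, count how many already-chosen devices share its
    zone tier, its node tier and its device tier, then take the candidate
    with the smallest counts (zone is compared first, then node, then
    device).
    """
    chosen = []
    for _ in range(replicas):
        def overlap(dev):
            return tuple(
                sum(1 for c in chosen if tier in tiers_for_dev(c))
                for tier in tiers_for_dev(dev)
            )
        chosen.append(min(devices, key=overlap))
    return [dev['id'] for dev in chosen]


# Hypothetical cluster: a single zone with two nodes.
devices = [
    {'id': 0, 'zone': 1, 'ip': '192.168.101.1', 'port': 6000},
    {'id': 1, 'zone': 1, 'ip': '192.168.101.1', 'port': 6000},
    {'id': 2, 'zone': 1, 'ip': '192.168.101.2', 'port': 6000},
]

print(place_replicas(devices, replicas=3))  # [0, 2, 1]
```

Because only one zone exists, the second replica goes to the other node (device 2), and only the third replica falls back to a second device on an already-used node.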

This as-unique-as-possible algorithm can support geographically distributed clusters, because each higher tier covers a wider geographic range, and it makes it easy to grow a small cluster into a large one. We can therefore add a region tier on top of this hierarchy to support geographic distribution. A region is essentially a group of zones that share the same location. Such a group of zones can be a rack or a data center.

Under the as-unique-as-possible policy, replica placement in Swift varies with the scale of the cluster. (In the original figure, a green dot represents a replica, an ellipse represents a disk, and a cube represents a node.)

 

The ring with multi-regions -> A globally distributed cluster

With the multi-tier ring introduced above, the region concept is easy to understand.

To make global clusters possible, SwiftStack added the region concept to Swift. A region extends the tier hierarchy: it is a level above the zone, and a group of zones makes up a region.

A multi-replica global cluster can be created by deploying storage nodes in different regions. Latency between zones within the same region is relatively low. Proxy nodes have an affinity (local affinity) for the region closest to them: depending on the region in which each storage node is located, data is written optimistically to the storage nodes of the nearest region. If needed, a client can use an option to perform a cross-region read or write that ignores local affinity.
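Local affinity can be pictured as ordering the candidate storage nodes so that those in the proxy's own region are tried first. The sketch below is hypothetical: `sort_by_region_affinity` is not a Swift function, and it assumes only that each node dictionary carries a 'region' key.

```python
def sort_by_region_affinity(nodes, local_region):
    """Order nodes so those in the proxy's own region come first.

    sorted() is stable, so nodes keep their original relative order
    within the local group and within the remote group.
    """
    return sorted(nodes, key=lambda node: node['region'] != local_region)


# Hypothetical storage nodes in two regions.
nodes = [
    {'ip': '192.168.201.1', 'region': 2},
    {'ip': '192.168.101.1', 'region': 1},
    {'ip': '192.168.101.2', 'region': 1},
]

ordered = sort_by_region_affinity(nodes, local_region=1)
print([n['ip'] for n in ordered])
# ['192.168.101.1', '192.168.101.2', '192.168.201.1']
```

A proxy in region 1 would contact the two region 1 nodes before the remote one; a cross-region request that ignores affinity would simply skip this ordering step.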

The following code (common/ring/utils.py) constructs the tier structure in Swift 1.8.0.

    def tiers_for_dev(dev):
        """
        Returns a tuple of tiers for a given device in ascending order by
        length.

        :returns: tuple of tiers
        """
        t1 = dev['region']
        t2 = dev['zone']
        t3 = "{ip}:{port}".format(ip=dev.get('ip'), port=dev.get('port'))
        t4 = dev['id']

        return ((t1,),
                (t1, t2),
                (t1, t2, t3),
                (t1, t2, t3, t4))
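One consequence of putting region first is that zone numbers only need to be unique within a region. The following sketch shows this with two hypothetical devices that both sit in "zone 1" but in different regions.

```python
def tiers_for_dev(dev):
    """Grizzly-style tiers: region, zone, ip:port, device id."""
    t1 = dev['region']
    t2 = dev['zone']
    t3 = "{ip}:{port}".format(ip=dev.get('ip'), port=dev.get('port'))
    t4 = dev['id']
    return ((t1,), (t1, t2), (t1, t2, t3), (t1, t2, t3, t4))


# Hypothetical devices: same zone number, different regions.
dev_a = {'id': 0, 'region': 1, 'zone': 1,
         'ip': '192.168.101.1', 'port': 6000}
dev_b = {'id': 12, 'region': 2, 'zone': 1,
         'ip': '192.168.201.1', 'port': 6000}

zone_tier_a = tiers_for_dev(dev_a)[1]
zone_tier_b = tiers_for_dev(dev_b)[1]
print(zone_tier_a, zone_tier_b)    # (1, 1) (2, 1)
print(zone_tier_a == zone_tier_b)  # False
```

The zone tier is the pair (region, zone), so the two zones are treated as distinct even though they share a number.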

From the code above, we can clearly see that region has been added as the top tier. As a new property of the device, it is implemented as a new key-value pair in the dev dictionary.

By default, the region of a Swift cluster is 1, which ensures that at least one region exists in the cluster, and the empty tuple "()" serves as the synthetic root of the tier tree, above the regions. The following code builds the tier tree. Its docstring contains a clear description of the tier tree structure with the region level. For ease of understanding, I have pasted it together with the code.

    from collections import defaultdict


    def build_tier_tree(devices):
        """
        Construct the tier tree from the zone layout.

        The tier tree is a dictionary that maps tiers to their child tiers.
        A synthetic root node of () is generated so that there's one tree,
        not a forest.

        Example:

        region 1 -+---- zone 1 -+---- 192.168.101.1:6000 -+---- device id 0
                  |             |                         |
                  |             |                         +---- device id 1
                  |             |                         |
                  |             |                         +---- device id 2
                  |             |
                  |             +---- 192.168.101.2:6000 -+---- device id 3
                  |                                       |
                  |                                       +---- device id 4
                  |                                       |
                  |                                       +---- device id 5
                  |
                  +---- zone 2 -+---- 192.168.102.1:6000 -+---- device id 6
                                |                         |
                                |                         +---- device id 7
                                |                         |
                                |                         +---- device id 8
                                |
                                +---- 192.168.102.2:6000 -+---- device id 9
                                                          |
                                                          +---- device id 10


        region 2 -+---- zone 1 -+---- 192.168.201.1:6000 -+---- device id 12
                                |                         |
                                |                         +---- device id 13
                                |                         |
                                |                         +---- device id 14
                                |
                                +---- 192.168.201.2:6000 -+---- device id 15
                                                          |
                                                          +---- device id 16
                                                          |
                                                          +---- device id 17

        The tier tree would look like:
        {
          (): [(1,), (2,)],

          (1,): [(1, 1), (1, 2)],
          (2,): [(2, 1)],

          (1, 1): [(1, 1, 192.168.101.1:6000),
                   (1, 1, 192.168.101.2:6000)],
          (1, 2): [(1, 2, 192.168.102.1:6000),
                   (1, 2, 192.168.102.2:6000)],
          (2, 1): [(2, 1, 192.168.201.1:6000),
                   (2, 1, 192.168.201.2:6000)],

          (1, 1, 192.168.101.1:6000): [(1, 1, 192.168.101.1:6000, 0),
                                       (1, 1, 192.168.101.1:6000, 1),
                                       (1, 1, 192.168.101.1:6000, 2)],
          (1, 1, 192.168.101.2:6000): [(1, 1, 192.168.101.2:6000, 3),
                                       (1, 1, 192.168.101.2:6000, 4),
                                       (1, 1, 192.168.101.2:6000, 5)],
          (1, 2, 192.168.102.1:6000): [(1, 2, 192.168.102.1:6000, 6),
                                       (1, 2, 192.168.102.1:6000, 7),
                                       (1, 2, 192.168.102.1:6000, 8)],
          (1, 2, 192.168.102.2:6000): [(1, 2, 192.168.102.2:6000, 9),
                                       (1, 2, 192.168.102.2:6000, 10)],
          (2, 1, 192.168.201.1:6000): [(2, 1, 192.168.201.1:6000, 12),
                                       (2, 1, 192.168.201.1:6000, 13),
                                       (2, 1, 192.168.201.1:6000, 14)],
          (2, 1, 192.168.201.2:6000): [(2, 1, 192.168.201.2:6000, 15),
                                       (2, 1, 192.168.201.2:6000, 16),
                                       (2, 1, 192.168.201.2:6000, 17)],
        }

        :devices: device dicts from which to generate the tree
        :returns: tier tree
        """
        tier2children = defaultdict(set)
        for dev in devices:
            for tier in tiers_for_dev(dev):
                if len(tier) > 1:
                    # Register this tier as a child of its parent tier.
                    tier2children[tier[0:-1]].add(tier)
                else:
                    # Region tiers hang directly off the synthetic root ().
                    tier2children[()].add(tier)
        return tier2children
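The tree can be explored by starting at the synthetic root "()" and following child tiers downward. The sketch below repeats the two functions so that it runs on its own; the four-device cluster and the helper `count_devices` are hypothetical and serve only as illustration.

```python
from collections import defaultdict


def tiers_for_dev(dev):
    """Grizzly-style tiers: region, zone, ip:port, device id."""
    t1 = dev['region']
    t2 = dev['zone']
    t3 = "{ip}:{port}".format(ip=dev.get('ip'), port=dev.get('port'))
    t4 = dev['id']
    return ((t1,), (t1, t2), (t1, t2, t3), (t1, t2, t3, t4))


def build_tier_tree(devices):
    """Map every tier to the set of its child tiers; () is the root."""
    tier2children = defaultdict(set)
    for dev in devices:
        for tier in tiers_for_dev(dev):
            if len(tier) > 1:
                tier2children[tier[0:-1]].add(tier)
            else:
                tier2children[()].add(tier)
    return tier2children


# Hypothetical four-device cluster spanning two regions.
devices = [
    {'id': 0, 'region': 1, 'zone': 1, 'ip': '192.168.101.1', 'port': 6000},
    {'id': 1, 'region': 1, 'zone': 1, 'ip': '192.168.101.1', 'port': 6000},
    {'id': 6, 'region': 1, 'zone': 2, 'ip': '192.168.102.1', 'port': 6000},
    {'id': 12, 'region': 2, 'zone': 1, 'ip': '192.168.201.1', 'port': 6000},
]

tree = build_tier_tree(devices)
print(sorted(tree[()]))    # [(1,), (2,)]
print(sorted(tree[(1,)]))  # [(1, 1), (1, 2)]
print(sorted(tree[(2,)]))  # [(2, 1)]


def count_devices(tree, tier=()):
    """Count the device-level (length 4) tiers below the given tier."""
    if len(tier) == 4:
        return 1
    return sum(count_devices(tree, child) for child in tree.get(tier, ()))


print(count_devices(tree))        # 4
print(count_devices(tree, (1,)))  # 3
```

Using a set for the children means that devices sharing a node or zone contribute that parent tier only once.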

 

For more details about the algorithms used for each operation in Swift with multiple regions, refer to the two articles I recommended at the beginning of this post. They contain very detailed, illustrated descriptions, so I will not reinvent the wheel here.

 
