Tair source code analysis: comparison table creation process

Source: Internet
Author: User
Tags: comparison table

Table creation in Tair is divided into six steps:


1) Calculate the number of buckets (both master and slave buckets) stored on each alive node, based on the current m_hash_table. With nodes A, B, C, and D, this produces a map from each node to the number of buckets it stores. While the master-node column is scanned, the builder also checks whether any master node is down; if one is, Step 5 (quick table creation) must be performed.


Note:

When checking the list of available nodes under the load balancing policy, the group collects the currently available nodes and uses that list, together with m_hash_table, as the input to the algorithm. The only constraint on the list is that every node in it is alive. Under the location-first policy the constraint is stricter: in addition to every node being alive, the difference in the number of dataservers between the two data centers must not exceed a threshold. (If these constraints are not met, the table cannot be built under this policy.)


For example, suppose there are two data centers, one with 70 machines and one with 30, and the threshold is 0.5. The relative difference is (70 - 30) / 70 ≈ 0.57 > 0.5, so table creation fails because the policy constraint is not met.
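The check in this example can be sketched as follows. This is a minimal illustration, not Tair's code: the function name is hypothetical, and dividing the difference by the larger data center's machine count is an assumption inferred from the numbers in the example.

```python
def location_constraint_ok(count_a: int, count_b: int, threshold: float) -> bool:
    """Return True if the two data centers are balanced enough to build a table.

    Assumption: the difference is measured relative to the larger data center.
    """
    larger = max(count_a, count_b)
    if larger == 0:
        return False  # no alive dataservers in either data center
    diff_ratio = abs(count_a - count_b) / larger
    return diff_ratio <= threshold


# 70 vs 30 machines with threshold 0.5: (70 - 30) / 70 is about 0.57, so the check fails.
print(location_constraint_ok(70, 30, 0.5))  # False
```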


XA denotes the number of buckets stored on node A; XB, XC, and XD are defined in the same way for nodes B, C, and D. At startup, XA = XB = XC = XD = 0.
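A minimal sketch of this counting step is shown below. It assumes, purely for illustration, that the table is a list of rows (row 0 for master buckets, later rows for slave buckets) with one node name per bucket; the function name and the data layout are hypothetical and do not reproduce Tair's actual structures.

```python
def count_buckets(hash_table, alive_nodes):
    """Count buckets per alive node and report whether any master node is down.

    hash_table[copy][bucket] holds the node storing that copy of the bucket
    (assumed layout: row 0 is the master row).
    """
    counts = {node: 0 for node in alive_nodes}
    master_down = False
    for copy_index, row in enumerate(hash_table):
        for node in row:
            if node in counts:
                counts[node] += 1
            elif copy_index == 0:
                master_down = True  # a master bucket sits on a node that is not alive
    return counts, master_down


table = [["A", "B", "C", "D"],   # master row
         ["B", "C", "D", "A"]]   # slave row
counts, master_down = count_buckets(table, ["A", "B", "C"])  # node D is down
print(counts, master_down)
```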

2) Create an index map from the map generated in 1): nodes holding the same number of buckets are stored in the same list, keyed by that number. For example, all nodes that hold 30 buckets appear together in the list keyed by 30.
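The index map can be sketched like this. The function name and the sample counts are hypothetical; the only idea taken from the text is grouping nodes by the number of buckets they hold.

```python
from collections import defaultdict


def build_index_map(counts):
    """Group nodes by bucket count: {bucket count: [nodes holding that many]}."""
    index = defaultdict(list)
    for node, held in sorted(counts.items()):
        index[held].append(node)
    # Return a plain dict ordered by ascending bucket count.
    return dict(sorted(index.items()))


print(build_index_map({"A": 30, "B": 31, "C": 30, "D": 31}))
# {30: ['A', 'C'], 31: ['B', 'D']}
```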


3) Determine the number of buckets for each node. This step decides only how many buckets each node can store, not which buckets it stores. The step differs between table creation policies.

When assigning buckets to nodes, suppose a node has previously been allocated B buckets, and N nodes are now responsible for X buckets in total. If B < X/N, the node is responsible for X/N buckets; otherwise it is responsible for X/N + 1 buckets. The aim is to reduce data migration while keeping the load balanced. Because data can be kept in multiple copies, each bucket may be stored copycount times. That is, copycount buckets have identical content; one of them is called the master bucket and the others are called slave buckets. Master buckets clearly must not be concentrated on a few nodes and have to be spread evenly across all nodes. A similar map is therefore constructed for them:


The values 30 and 31 in the second column are the total number of buckets each node is to store.

The process for calculating the bucket counts is as follows:
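A minimal sketch of the arithmetic, assuming X total buckets (counting every copy) spread over N nodes so that each node holds either the lower value or one more. The function name and the sample numbers (122 buckets, 4 nodes) are hypothetical, chosen only so that the result matches the 30 and 31 mentioned above.

```python
def split_bucket_counts(total_buckets: int, node_count: int):
    """Return (low, nodes_with_low, nodes_with_high), where high = low + 1."""
    low = total_buckets // node_count
    nodes_with_high = total_buckets - low * node_count  # leftover buckets, one per node
    nodes_with_low = node_count - nodes_with_high
    return low, nodes_with_low, nodes_with_high


low, n_low, n_high = split_bucket_counts(122, 4)
print(low, n_low, n_high)  # 30 2 2 -> two nodes hold 30 buckets, two hold 31
```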


Note: Under the location security priority policy, nodes are divided into two data centers according to the configured mask pos_mask, and the number of available storage nodes in each data center is counted. Once the two data centers are determined, three values are calculated for each in turn: the average number of buckets per storage node (denoted B), the number of nodes holding B buckets, and the number of nodes holding B + 1 buckets. The number of buckets assigned to a storage node then depends on the values for the data center in which the node is located (the three values may differ between data centers).
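The grouping by pos_mask might look like the sketch below. It assumes that node identifiers are integers and that a node's data center is obtained by a bitwise AND with pos_mask; the identifiers, the mask value, and the function name are all hypothetical.

```python
def count_nodes_per_data_center(node_ids, pos_mask):
    """Group alive nodes by (node_id & pos_mask) and count each group."""
    counts = {}
    for node_id in node_ids:
        position = node_id & pos_mask  # assumed: the masked bits identify the data center
        counts[position] = counts.get(position, 0) + 1
    return counts


# Hypothetical ids: two nodes share the 0x0A00 prefix, one has 0x0A01.
nodes = [0x0A000001, 0x0A000002, 0x0A010001]
print(count_nodes_per_data_center(nodes, 0xFFFF0000))
```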

4) Calculate the expected number of buckets that each node can store. With the data from steps 1) to 3), this number can now be calculated for every node.
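One way to turn the previous steps into a per-node capacity is sketched below. It is an assumption-laden illustration rather than Tair's implementation: nodes that already hold more than the lower value keep the higher value while that quota lasts (to limit migration), and any remaining quota is handed out in node-name order. The function name and sample data are hypothetical.

```python
def expected_capacity(current_counts, total_buckets):
    """Return {node: expected bucket count}, each either low or low + 1."""
    node_count = len(current_counts)
    low = total_buckets // node_count
    high_quota = total_buckets - low * node_count  # nodes allowed to hold low + 1
    capacity = {}
    # Pass 1: heavily loaded nodes keep low + 1 so fewer buckets have to move.
    for node, held in sorted(current_counts.items(), key=lambda kv: -kv[1]):
        if held > low and high_quota > 0:
            capacity[node] = low + 1
            high_quota -= 1
        else:
            capacity[node] = low
    # Pass 2: hand out any remaining low + 1 slots in node-name order.
    for node in sorted(capacity):
        if high_quota == 0:
            break
        if capacity[node] == low:
            capacity[node] = low + 1
            high_quota -= 1
    return capacity


# Node D has just joined; A, B, and C previously held all 122 buckets.
print(expected_capacity({"A": 41, "B": 41, "C": 40, "D": 0}, 122))
# {'A': 31, 'B': 31, 'C': 30, 'D': 30}
```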



5) Quickly create a table to determine the storage location of each bucket.


During initialization, storage nodes are arranged for the master buckets first: the master-node column of the comparison table is scanned in order, followed by the columns for backup node 1 and backup node 2.

Scanning the master node: when the master node is unavailable, the node holding a backup bucket is promoted to master. If the nodes holding the backup buckets are also unavailable, table creation fails.

Scanning a backup node: when the backup node is unavailable, its entry is set to 0.
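The two scanning rules can be sketched as follows. The layout (one row per copy, row 0 for masters, 0 meaning "no node") and the function name are assumptions for illustration; clearing the slot of a promoted backup is likewise an assumption, since the text does not say what happens to it.

```python
def build_quick_table(table, alive):
    """Return a repaired copy of table, or None if some bucket has no alive copy.

    table[copy][bucket] is a node name, or 0 for an empty slot (assumed layout).
    """
    result = [row[:] for row in table]
    copies = len(result)
    for bucket in range(len(result[0])):
        if result[0][bucket] not in alive:
            # Master unavailable: promote the first alive backup.
            for copy in range(1, copies):
                if result[copy][bucket] in alive:
                    result[0][bucket] = result[copy][bucket]
                    result[copy][bucket] = 0  # assumed: the promoted slot is cleared
                    break
            else:
                return None  # master and every backup are unavailable
        for copy in range(1, copies):
            if result[copy][bucket] not in alive:
                result[copy][bucket] = 0  # unavailable backup is set to 0
    return result


table = [["A", "B", "C"],   # master column
         ["B", "C", "A"],   # backup node 1
         ["C", "A", "B"]]   # backup node 2
print(build_quick_table(table, {"A", "C"}))  # node B is down
# [['A', 'C', 'C'], [0, 0, 'A'], ['C', 'A', 0]]
```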

Once quick table creation has completed, step 6 does not need to be performed.

6) Determine whether the bucket is suitable for storage on the node.

This is implemented by the get_suitable_node function, which calls the is_this_node_OK function. The get_suitable_node function is described here. As its name suggests, it selects a suitable node. Although configserver provides two different table creation policies, the framework for selecting a suitable node is the same. The main difference lies in the constraints that decide whether a node is suitable for storing a particular bucket, which differ between the policies. The description here uses the number of buckets that each node can store, from Step 4:


Nodes are selected in ascending order of key value.

1. Under the load balancing policy, whether a node is suitable for storing a bucket is checked against the following constraints:

1) the number of master buckets stored on the node;

2) the total number of buckets stored on the node;

3) the total number of nodes that hold the maximum number of buckets;

4) the bucket to be stored and its backup bucket cannot be on the same node.

The selection of a suitable node uses four constraint levels. According to the constraints they apply, they are named all, pos, base, and force, as shown in:


If a node does not meet the constraints, the other nodes are tried in sequence. In the previous example, node E is selected for verification first; if node E does not meet the constraints, the next node in sequence is checked; if it does, node E's data in the list is modified.
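The four load balancing constraints could be expressed as separate checks, as in the sketch below. Every name here is hypothetical, and the sketch deliberately does not say which constraint level enforces which check, because that mapping is not given in the text. It only reports which of the four constraints a candidate node would violate.

```python
from dataclasses import dataclass


@dataclass
class NodeState:
    masters_held: int     # master buckets currently on the node
    total_held: int       # master + slave buckets currently on the node
    master_capacity: int  # expected number of master buckets
    total_capacity: int   # expected total number of buckets (from step 4)


def failed_constraints(node, state, copy_holders, is_master,
                       nodes_at_max, max_nodes_allowed, max_per_node):
    """Return the names of the constraints that placing one more bucket would break."""
    failed = []
    if is_master and state.masters_held >= state.master_capacity:
        failed.append("master_count")
    if state.total_held >= state.total_capacity:
        failed.append("total_count")
    # Taking this bucket would make the node one more "maximum" node.
    if state.total_held + 1 == max_per_node and nodes_at_max >= max_nodes_allowed:
        failed.append("max_node_count")
    if node in copy_holders:
        failed.append("same_node")
    return failed


state = NodeState(masters_held=10, total_held=30, master_capacity=10, total_capacity=31)
print(failed_constraints("A", state, {"A"}, True, 2, 2, 31))
# ['master_count', 'max_node_count', 'same_node']
```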

2. Under the location-first policy, the check is also implemented through a set of constraints. The location security priority policy classifies all nodes into two categories by location. It differs from the load balancing policy in three ways. First, when determining whether a node can store a bucket, the node's location and category are considered instead of the global load balancing information (for example, for node N in data center A, the number of buckets it stores is based on the average load of the nodes in that data center rather than on all data centers). Second, the constraints applied at each constraint level are different. Third, no random disturbance factor is added. The constraints are as follows:

1) the number of master buckets stored on the node within its data center;

2) the total number of buckets stored on the node within its data center;

3) the total number of nodes that hold the maximum number of buckets within the data center;

4) the bucket to be stored and its backup bucket cannot be on the same node;

5) the bucket to be stored and its backup bucket cannot be in the same data center.

The selection of a suitable node again uses four constraint levels. According to the constraints they apply, they are named all, pos, base, and force, as shown in:


From the data in the tables, we can see that the load balancing policy and the location security priority policy differ considerably in how they decide whether a bucket is suitable for storing on a node.

With this background, we can describe how the get_suitable_node function works. The function selects a suitable node by lowering the constraint level step by step. For example, if no suitable node can be found at the all level, the constraint level is lowered to the pos level, and so on until the force level is reached. At each constraint level, the function traverses the map of node bucket capacity (this map is created in step 1 of the table creation process) and selects a node.
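The level-lowering loop can be sketched as below. This is a simplified illustration and not the actual Tair function signature: the shape of the capacity map, the callback is_node_ok, and the sample predicate (which relaxes a "room left" requirement only at the force level) are all hypothetical.

```python
LEVELS = ["all", "pos", "base", "force"]  # strictest to loosest


def get_suitable_node(capacity_map, bucket, is_node_ok):
    """Try each constraint level in turn; within a level, visit keys in ascending order.

    capacity_map: {key: [nodes]} (assumed shape of the bucket capacity map).
    is_node_ok(node, bucket, level): caller-supplied constraint check.
    """
    for level in LEVELS:
        for key in sorted(capacity_map):
            for node in capacity_map[key]:
                if is_node_ok(node, bucket, level):
                    return node, level
    return None, None


# Hypothetical predicate: a node already holding a copy is never acceptable;
# other nodes need spare room, except at the force level.
holders = {"A"}
room = {"A": 1, "B": 0, "C": 0}


def is_node_ok(node, bucket, level):
    if node in holders:
        return False
    return level == "force" or room[node] > 0


print(get_suitable_node({0: ["A"], 1: ["B"], 2: ["C"]}, 7, is_node_ok))
# ('B', 'force')
```

Passing the constraint check in as a callback keeps the selection framework identical for both policies, which mirrors the point made above that only the constraints differ.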

Through the above six steps, we have initially established a Tair table.

