SequoiaDB Series Four: A Brief Analysis of the Architecture

Source: Internet
Author: User

The first installment of this series briefly described how to install SequoiaDB and deploy a (pseudo) cluster.

The second and third installments covered basic operations on a SequoiaDB cluster.

In this article, we will make a simple analysis of SequoiaDB's architecture.

My own ability is limited, and architecture is a lofty topic that I would not dare to speak about lightly. This article therefore quotes SequoiaDB's official descriptions and adds my own understanding, in the hope that we can learn together.

Before the analysis, here is a brief description of the CAP theorem for distributed systems:

C: Consistency. At any given moment, the data on every node in the distributed system should be the same;

A: Availability. When one node in the distributed system goes down, the other nodes in the same cluster can still handle remote requests;

P: Partition tolerance. When a node in the system cannot be reached (for example, because of a malfunction), the other nodes in the same cluster can continue to provide service.

For reasons such as poor network conditions and insufficient disaster preparedness, no distributed product fully implements all three of C, A, and P. Current distributed systems therefore have to make a trade-off between C and A.

If you favor C, you must keep multiple backups of each node, and a write returns only after the other backup nodes have been written synchronously. Under heavy concurrency, requests cannot be answered in time, which results in unavailability.

If you favor A, there is no guarantee that the data has actually been written to multiple backup nodes, and a request served by a standby node may return out-of-date data.

SequoiaDB provides eventual consistency of data in a large-scale distributed environment, and it offers options to suit the user's business scenario, so that the database system can have both a certain degree of data safety and a certain degree of availability.

    • The write operation returns after executing on only some of the nodes, while the other backup nodes synchronize with the master node in the background. This mode gives higher availability;
    • The write operation executes on all nodes before returning. In this mode the data is safer and consistency is satisfied.

Users can weigh these two modes against their own application scenarios and set the replsize value appropriately when creating a collection (CL).
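The two write modes above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not SequoiaDB's implementation or API: the `Node` and `ReplicaGroup` classes are hypothetical, and `replsize` is modeled simply as the number of nodes that must apply a write before the call returns.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    data: dict = field(default_factory=dict)


class ReplicaGroup:
    """Toy replication group (hypothetical): nodes[0] is the master."""

    def __init__(self, names, replsize):
        if not 1 <= replsize <= len(names):
            raise ValueError("replsize must be between 1 and the node count")
        self.nodes = [Node(n) for n in names]
        self.replsize = replsize
        self.pending = []  # (node, key, value) writes not yet applied

    def write(self, key, value):
        """Apply the write on `replsize` nodes, queue it for the others."""
        for i, node in enumerate(self.nodes):
            if i < self.replsize:
                node.data[key] = value
            else:
                self.pending.append((node, key, value))
        return self.replsize  # number of nodes that acknowledged

    def sync(self):
        """Stand-in for background synchronization: drain queued writes."""
        for node, key, value in self.pending:
            node.data[key] = value
        self.pending.clear()

    def read(self, index, key):
        """Read from the node at `index`; None if the key is not there yet."""
        return self.nodes[index].data.get(key)


if __name__ == "__main__":
    group = ReplicaGroup(["master", "standby1", "standby2"], replsize=1)
    group.write("k", "v1")
    print(group.read(0, "k"))  # the master already has the value
    print(group.read(2, "k"))  # a standby has not caught up yet
    group.sync()
    print(group.read(2, "k"))  # after synchronization the standby agrees
```

With `replsize=1` the write returns quickly but a standby may briefly serve stale data; with `replsize` equal to the node count every node has the value before the write returns, at the cost of waiting for all of them.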

The overall architecture diagram for SequoiaDB is as follows:

The database system provides a set of coordination (coord) nodes for client programs to access. You can think of them as a proxy for the database.

Since it is a proxy, the coordination node only maintains client connections and, acting as a request-dispatching node, distributes user requests to the appropriate data nodes. It does not store any user data.

Behind the coordination nodes are the data node groups and the catalog node group.

The catalog nodes hold the system's metadata, such as which users the database has, which collection spaces (CS) exist, which collections (CL) each CS contains, and the option values given when each CL was created. The coordination node communicates with the catalog nodes to learn how data is actually distributed across the data nodes. One or more catalog nodes can form a replication group.

User data is stored on the data nodes. One or more data nodes can form a data node group (officially called a partition group or a replication group). Each data node in a group stores a complete copy of that group's data, also known as a replication group instance (or partition group instance). Data nodes within a replication group synchronize with eventual consistency, and the data stored in different replication groups does not overlap.
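The division of labor among coordination, catalog, and data nodes can be sketched roughly as follows. All class and method names here are hypothetical and chosen only for illustration. As a simplification, each collection is assumed to live in exactly one data group, whereas a real deployment may spread a collection across several groups.

```python
class Catalog:
    """Hypothetical catalog: maps 'cs.cl' names to the data group storing them."""

    def __init__(self):
        self._placement = {}

    def register(self, full_name, group_name):
        self._placement[full_name] = group_name

    def locate(self, full_name):
        return self._placement[full_name]


class DataGroup:
    """Hypothetical data node group: the only place user records are kept."""

    def __init__(self, name):
        self.name = name
        self.records = {}  # 'cs.cl' -> list of records

    def insert(self, full_name, record):
        self.records.setdefault(full_name, []).append(record)

    def query(self, full_name):
        return list(self.records.get(full_name, []))


class Coordinator:
    """Holds no user data: looks up placement, then forwards the request."""

    def __init__(self, catalog, groups):
        self.catalog = catalog
        self.groups = {g.name: g for g in groups}

    def insert(self, full_name, record):
        group = self.groups[self.catalog.locate(full_name)]
        group.insert(full_name, record)
        return group.name  # which group handled the request

    def query(self, full_name):
        return self.groups[self.catalog.locate(full_name)].query(full_name)


if __name__ == "__main__":
    catalog = Catalog()
    catalog.register("shop.orders", "group1")
    catalog.register("shop.users", "group2")
    coord = Coordinator(catalog, [DataGroup("group1"), DataGroup("group2")])
    print(coord.insert("shop.orders", {"id": 1}))
    print(coord.query("shop.orders"))
```

The point of the sketch is that the coordinator keeps only connections and routing logic, the catalog keeps only metadata, and the records themselves exist nowhere but in the data groups.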

A data node group must have a master (primary) node to handle users' write requests. In scenarios that favor higher availability, a user's write is applied on the primary node; then, through the synchronization mechanism, the standby nodes obtain the operation and its data from the primary and update their own stored data. A user's read can be served by any node in the data node group.

In fact, in a clustered environment, all read and write requests are dispatched by the coord node (except when a client connects directly to a node in a data node group). The description above simply leaves the coordination node out. (Don't believe it? Take a look at the source of the PMD and RTN modules.)

When the primary node in the cluster fails and the other nodes detect this through the cluster's communication mechanism, an election is initiated: a new master is chosen from the standby nodes, and client requests continue to be accepted and processed. For details on the election, you can search for the keyword "second election", which is described in more detail on Wikipedia :)

An election is conditional. In SequoiaDB, the number of nodes still alive (all nodes minus the failed ones) must be more than half of the total number of nodes in the cluster. That may not be a very precise description, so let me express it as a formula.

Assume the number of nodes in the cluster is T (total) and the number of failed nodes is D (down). An election can be initiated when (T - D) > T / 2 is true. Of course, under normal circumstances the cluster may also spontaneously switch its primary, choosing the best node as the master.
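The election condition is simple enough to check in a few lines. The function name `can_elect` is made up for this illustration; the comparison is done in integers to avoid floating-point division.

```python
def can_elect(total, down):
    """Return True when the surviving nodes are a strict majority.

    Implements (T - D) > T / 2, rewritten as 2 * (T - D) > T.
    """
    if total <= 0 or not 0 <= down <= total:
        raise ValueError("need total > 0 and 0 <= down <= total")
    return 2 * (total - down) > total


if __name__ == "__main__":
    print(can_elect(3, 1))  # True: 2 of 3 nodes survive
    print(can_elect(3, 2))  # False: 1 of 3 nodes survives
    print(can_elect(4, 2))  # False: exactly half is not more than half
```

Note the even-sized case: with 4 nodes and 2 down, the survivors are exactly half, which does not satisfy the condition.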

When a node fails, the DBA handles the failure and recovers the failed node. Once restarted, the recovered node acts as a standby node and pulls data from the master node to bring its own copy up to date.
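One way to picture this catch-up step is a standby that remembers how far it has applied the master's operation log and pulls only what it missed. This is a sketch under that assumption; the text above says only that the recovered node pulls data from the master, and the `Master` and `Standby` classes are hypothetical.

```python
class Master:
    """Hypothetical master: applies writes and records them in an ordered log."""

    def __init__(self):
        self.log = []   # ordered (key, value) operations
        self.data = {}

    def write(self, key, value):
        self.log.append((key, value))
        self.data[key] = value

    def entries_since(self, position):
        """Return the log entries a standby at `position` has not seen."""
        return self.log[position:]


class Standby:
    """Hypothetical standby: tracks how many log entries it has applied."""

    def __init__(self):
        self.applied = 0
        self.data = {}

    def catch_up(self, master):
        """Pull and apply missed entries; return how many were applied."""
        entries = master.entries_since(self.applied)
        for key, value in entries:
            self.data[key] = value
        self.applied += len(entries)
        return len(entries)


if __name__ == "__main__":
    master, standby = Master(), Standby()
    master.write("a", 1)
    standby.catch_up(master)
    # The standby is "down" while these writes happen on the master.
    master.write("b", 2)
    master.write("a", 3)
    # After recovery it pulls only the two entries it missed.
    print(standby.catch_up(master))
    print(standby.data == master.data)
```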

This is illustrated below:

SequoiaDB's overall architecture is based on the mechanisms above, but how they are actually implemented can only be learned by slowly working through the source code. For more information, please visit the SequoiaDB website.

Alas, after spending a long time on this, I still feel it is not clearly explained, and it covers only the business-level structure without touching the technical side of the architecture.

After writing the previous article on advanced features, I realized that I had not covered SequoiaDB's transaction handling.

That was a real oversight.

I'll find time to try out how SequoiaDB's transaction function is used. MongoDB reportedly does not provide transactional functionality in its current version. It seems that domestic software does not necessarily lose out to foreign software.

Please forgive any shortcomings, and please point out any mistakes in this article. Thank you for having the patience to read this far.

Next, I may write a new article introducing how to use SequoiaDB's features, or I may move on to the next topic and start analyzing SequoiaDB's source code. Please stay tuned.

PS: Speaking of domestic software, I can't help thinking of the computers my friends and I use: the CPU, hard disk, memory, and other core components are, incredibly, all foreign-made. Godson, after a burst of popularity, also slowly faded from view. On the software side, game engines, databases, and similar products also mostly come from abroad. Now that NoSQL is slowly heating up, a database like SequoiaDB has emerged that can compete with an industry leader like MongoDB, and as a software practitioner I feel very proud. I sincerely hope that more domestic software products will appear and win glory for our people.

=====>the end<=====
