NoSQL (III): Distributed Data Model

Last Update:2015-04-16 Source: Internet

Author: User


Reading notes on "NoSQL Essence". If you reproduce them, please credit the source: "Jiq Technical Blog".

The main reason NoSQL emerged is the need for databases that can run on large clusters.

Aggregate-oriented databases are well suited to scale-out cluster architectures, because the aggregate is a natural unit of data distribution. There are two main approaches to distributing data: "replication" and "sharding". Replication copies the same data to multiple nodes; sharding places different data on different nodes.

1 Single Server

If you use a NoSQL database mainly because its data model suits your aggregates, consider deploying it on a single server. In other words, the "single server" setup is the preferred choice whenever data distribution is not needed.

2 Sharding

A database is often busy because different users are accessing different parts of the data set. In that case you can scale horizontally through sharding, which stores different parts of the data on different servers.

Idea: To spread the requests of different users evenly across servers, the key decision is how the data is placed. Aggregates are designed to hold data that is usually accessed together, so the aggregate can serve as the unit of distribution. Distributing aggregates evenly across machines sometimes requires "domain-specific rules". Many NoSQL databases also provide "auto-sharding", in which the database itself takes responsibility for assigning data to shards and for directing each data access request to the appropriate shard.
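
The routing idea can be illustrated with a minimal sketch. It assumes a fixed list of shards and picks one by hashing the aggregate's key; the class names, shard names, and the hash-modulo rule are all hypothetical choices for illustration, not the scheme of any particular database.

```python
import hashlib


class ShardRouter:
    """Maps an aggregate key to one of a fixed list of shards (hypothetical names)."""

    def __init__(self, shard_names):
        self.shard_names = list(shard_names)

    def shard_for(self, aggregate_key: str) -> str:
        # Use a stable hash (not Python's per-process randomized hash())
        # so the same key always maps to the same shard.
        digest = hashlib.sha256(aggregate_key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.shard_names)
        return self.shard_names[index]


class ShardedStore:
    """Keeps each whole aggregate on exactly one shard."""

    def __init__(self, shard_names):
        self.router = ShardRouter(shard_names)
        # One dict stands in for each server.
        self.shards = {name: {} for name in shard_names}

    def put(self, key, aggregate):
        self.shards[self.router.shard_for(key)][key] = aggregate

    def get(self, key):
        # The request is directed to the one shard that owns the key.
        return self.shards[self.router.shard_for(key)].get(key)


store = ShardedStore(["shard-a", "shard-b", "shard-c"])
store.put("customer:42", {"name": "Ann", "orders": [1001, 1002]})
print(store.get("customer:42"))
```

A plain hash-modulo rule moves most keys when the number of shards changes, which is one reason real systems tend to use more elaborate placement schemes.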

Advantages: Sharding is especially valuable for performance because it can improve both read and write efficiency. Replication, especially when combined with caching, can greatly improve read performance, but it does little for workloads with frequent writes. Sharding provides a way to scale write capacity horizontally.

Disadvantages: Sharding does little to improve the database's "resilience". As with a "single server", the data on a failed node becomes unavailable, and sharding may even reduce the database's overall tolerance of failures.
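
One way to see why resilience can drop is a back-of-the-envelope calculation. The sketch below assumes that every node fails independently with the same probability, which is a simplification made here for illustration and is not stated in the notes; the function name is hypothetical.

```python
def prob_some_shard_down(p_node_down: float, shard_count: int) -> float:
    """Probability that at least one of the shards is unavailable,
    assuming independent node failures with equal probability."""
    return 1.0 - (1.0 - p_node_down) ** shard_count


# With more shards, it becomes more likely that some part of the data is offline.
for n in (1, 4, 16):
    print(n, prob_some_shard_down(0.01, n))
```

Under that assumption a failure affects only the users of the data on the failed shard, but the chance that some shard is down grows with the number of shards.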

3 Replication

3.1 Master-Slave Replication

Idea: A master-slave structure has one master node and multiple slave nodes. The master replicates its data to the slaves and keeps every slave synchronized with it. Reads can be served by the master or by any slave. Even if the master fails, the slaves can still handle read requests, and one of them can be appointed as the new master. Read capacity is easy to scale horizontally by adding slave nodes. This not only "significantly improves read performance" but also "ensures the resilience of read operations". Writes fare worse: every update must be sent to the master, which then propagates it to the slaves. On the one hand, write performance is limited, so the scheme is a poor fit for write-heavy workloads. On the other hand, if the master fails, update requests cannot be processed until it recovers or a new master is appointed.
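
The read and write paths described above can be sketched as follows. This is a simplified in-memory model with hypothetical class and method names: propagation to the slaves is done synchronously inside `write`, and promotion is triggered by hand, whereas real systems handle both in their own ways.

```python
import random


class Node:
    def __init__(self, name):
        self.name = name
        self.data = {}
        self.alive = True


class MasterSlaveCluster:
    def __init__(self, master, slaves):
        self.master = master
        self.slaves = list(slaves)

    def write(self, key, value):
        # All updates must go through the master.
        if not self.master.alive:
            raise RuntimeError("master is down: writes are unavailable")
        self.master.data[key] = value
        # The master pushes the update to every live slave.
        for slave in self.slaves:
            if slave.alive:
                slave.data[key] = value

    def read(self, key):
        # Any live node can serve a read, so reads survive a master failure.
        candidates = [n for n in [self.master, *self.slaves] if n.alive]
        if not candidates:
            raise RuntimeError("no live nodes")
        return random.choice(candidates).data.get(key)

    def promote_slave(self):
        # Appoint the first live slave as the new master.
        new_master = next((s for s in self.slaves if s.alive), None)
        if new_master is None:
            raise RuntimeError("no live slave to promote")
        self.slaves.remove(new_master)
        self.master = new_master
        return new_master


cluster = MasterSlaveCluster(Node("m"), [Node("s1"), Node("s2")])
cluster.write("k", 1)
cluster.master.alive = False   # simulate a master failure
print(cluster.read("k"))       # still answered by a slave
cluster.promote_slave()
cluster.write("k", 2)          # writes work again after promotion
print(cluster.read("k"))
```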

Advantages: Read requests are handled with high performance and good failure recovery. Master-slave replication is useful even if you do not need to scale out: the master can handle all read and write operations while the slaves act as an "instant backup".

Disadvantages: First, the master is both the bottleneck and the weak point of the system, so write performance and recovery from master failures are unsatisfactory. Second, there is a serious flaw: data inconsistency. If the master has processed an update that has not yet been propagated to all slaves, clients reading from different slaves may see different values.
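
The inconsistency window can be reproduced with a small sketch. It assumes the master queues updates per slave and delivers them later; the class, its methods, and the hotel-room key are hypothetical and exist only to make the timing visible.

```python
from collections import deque


class AsyncMaster:
    def __init__(self, slave_count):
        self.data = {}
        self.slaves = [{} for _ in range(slave_count)]
        # One queue of not-yet-delivered updates per slave.
        self.pending = [deque() for _ in range(slave_count)]

    def write(self, key, value):
        # The master applies the update at once and queues it for each slave.
        self.data[key] = value
        for queue in self.pending:
            queue.append((key, value))

    def deliver(self, slave_index):
        # Apply every queued update to one slave, in order.
        queue = self.pending[slave_index]
        while queue:
            key, value = queue.popleft()
            self.slaves[slave_index][key] = value

    def read_from_slave(self, slave_index, key):
        return self.slaves[slave_index].get(key)


m = AsyncMaster(2)
m.write("room:101", "free")
m.deliver(0)
m.deliver(1)
m.write("room:101", "booked")
m.deliver(0)  # slave 1 has not received the booking yet
print(m.read_from_slave(0, "room:101"), m.read_from_slave(1, "room:101"))
```

Until `deliver(1)` runs, a client reading from slave 1 still sees the old value while a client reading from slave 0 sees the new one.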

3.2 Peer-to-Peer Replication

Idea: To remove the master node as the bottleneck and weak point of master-slave replication, drop the concept of a master altogether, so that every node can accept write requests.

Advantages: This eliminates the problem in master-slave replication where the master is the bottleneck and weak point for write operations.

Disadvantages: Consistency is still a problem. Because two different nodes can handle write requests at the same time, a "write conflict" occurs when both try to update the same data simultaneously, and the read-consistency problem remains as well. There are two extremes for dealing with this: at one end, coordinate the replicas before each write is actually performed so that no conflict can occur; at the other, let any replica accept writes and merge conflicting writes afterwards.
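
The "merge afterwards" extreme can be sketched as follows. Detecting concurrent writes with per-replica counters (a version vector) and merging shopping carts by set union are choices made for this illustration, not something the notes prescribe; all names are hypothetical, and `sync` assumes both replicas already hold the key.

```python
class Replica:
    def __init__(self, name):
        self.name = name
        self.store = {}  # key -> (value, version_vector)

    def write(self, key, value):
        # Any replica accepts writes and bumps its own counter for the key.
        _, clock = self.store.get(key, (None, {}))
        clock = dict(clock)
        clock[self.name] = clock.get(self.name, 0) + 1
        self.store[key] = (value, clock)


def dominates(a, b):
    """True if version vector a has seen every update that b has seen."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def sync(key, left, right, merge):
    """Reconcile one key between two replicas, merging concurrent writes."""
    left_value, left_clock = left.store[key]
    right_value, right_clock = right.store[key]
    if dominates(left_clock, right_clock):
        value, clock = left_value, left_clock
    elif dominates(right_clock, left_clock):
        value, clock = right_value, right_clock
    else:
        # Neither side has seen the other's update: a write conflict.
        value = merge(left_value, right_value)
        nodes = set(left_clock) | set(right_clock)
        clock = {n: max(left_clock.get(n, 0), right_clock.get(n, 0)) for n in nodes}
    left.store[key] = (value, dict(clock))
    right.store[key] = (value, dict(clock))
    return value


a, b = Replica("a"), Replica("b")
a.write("cart:7", {"book"})
b.store["cart:7"] = a.store["cart:7"]   # initial replication to b
a.write("cart:7", {"book", "pen"})      # concurrent update on a
b.write("cart:7", {"book", "lamp"})     # concurrent update on b
merged = sync("cart:7", a, b, merge=lambda x, y: x | y)
print(sorted(merged))
```

Whether a merge rule like set union is acceptable depends entirely on the application, which is why this end of the spectrum pushes the conflict policy onto the domain.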

4 Combining Sharding and Replication

Idea: The data is first split into shards, and each shard is then maintained with "master-slave replication". The system as a whole therefore has multiple master nodes, but each data item has exactly one master responsible for it. The "column-family database" is one such example.
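
The combination can be sketched by giving every shard its own master and slaves. As before, this is an in-memory illustration under assumptions: the class names, the hash-based placement, and the synchronous copy to the slaves are hypothetical simplifications.

```python
import hashlib


class ReplicatedShard:
    """One shard: a single master plus its own slaves."""

    def __init__(self, name, slave_count):
        self.name = name
        self.master = {}
        self.slaves = [{} for _ in range(slave_count)]

    def write(self, key, value):
        # Only this shard's master accepts the write, then copies it to its slaves.
        self.master[key] = value
        for slave in self.slaves:
            slave[key] = value

    def read(self, key, replica=0):
        # Reads can be served by any slave of the owning shard.
        return self.slaves[replica].get(key)


class Cluster:
    def __init__(self, shard_count, slaves_per_shard):
        self.shards = [
            ReplicatedShard(f"shard-{i}", slaves_per_shard)
            for i in range(shard_count)
        ]

    def shard_for(self, key):
        # Each key is owned by exactly one shard, hence by exactly one master.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return self.shards[int.from_bytes(digest[:8], "big") % len(self.shards)]

    def write(self, key, value):
        self.shard_for(key).write(key, value)

    def read(self, key):
        return self.shard_for(key).read(key)


cluster = Cluster(shard_count=3, slaves_per_shard=2)
cluster.write("order:1001", {"total": 25})
print(cluster.shard_for("order:1001").name, cluster.read("order:1001"))
```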
