NoSQL learning notes (I): Overview

Last Update: 2018-12-04 Source: Internet

Author: User

Tags: Cassandra


1. Summary

NoSQL databases arose as a challenge to traditional SQL databases. As the data volumes of enterprises and Internet applications grew, traditional SQL databases struggled to support distributed storage and high-speed reads and writes of such massive data, so NoSQL came into being. NoSQL improves database performance through simple, efficient storage models such as key-value.

2. Theory

CAP, BASE, and eventual consistency are the three theoretical cornerstones of NoSQL databases. Each is described below.

2.1 CAP Theory

C: Consistency (all users see the same data after a read/write change)

A: Availability (data can be obtained quickly)

P: Partition tolerance (the system keeps working despite network partitions; distributed reliability)

The CAP theory was proposed by Professor Eric Brewer. Its core claim is that a distributed system cannot satisfy consistency, availability, and partition tolerance all at once; it can satisfy at most two of the three.

See: http://www.julianbrowne.com/article/viewer/brewers-cap-theorem

2.2 BASE Theory

Basically Available: basic availability (tolerates partition failures)

Soft state: soft state/flexible transactions (stateless connections, asynchronous support)

Eventual consistency: eventual consistency (strong consistency is not required; the data only has to become consistent in the end)

The core of the BASE theory is to sacrifice strong consistency in exchange for availability or reliability.

See: http://www.jdon.com/jivejdon/thread/37625

2.3 Eventual Consistency Theory

(1) Strong Consistency

Strong consistency (immediate consistency): if A first writes a value to the storage system, the storage system guarantees that subsequent reads by A, B, and C all return the latest value.

(2) Weak Consistency

If A writes a value to the storage system, the storage system cannot guarantee that subsequent reads by A, B, and C return the latest value. This gives rise to the concept of the "inconsistency window": the period from when A writes a value until A, B, and C can all read the latest value.

(3) Eventual Consistency

Eventual consistency is a special case of weak consistency. If A writes a value to the storage system and that value is not updated again before A, B, and C read it, the storage system guarantees that all reads will eventually return the latest written value. If no failure occurs, the size of the "inconsistency window" depends on the following factors: communication latency, system load, and the number of replicas in the replication scheme (this can be understood as the number of slaves in master/slave mode). The best-known eventually consistent system is DNS: after the IP address of a domain name is updated, all clients eventually see the latest value, at a pace determined by the configuration and cache control policies.
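The inconsistency window can be illustrated with a small simulation. This is only a sketch under stated assumptions: the `Store` and `Replica` classes, the single primary, and the in-memory delivery queue are hypothetical and are not taken from any real database.

```python
from collections import deque


class Replica:
    """A replica that changes only when an update is delivered to it."""

    def __init__(self):
        self.value = None
        self.version = 0

    def apply(self, version, value):
        # Ignore stale updates so that a replica never moves backwards.
        if version > self.version:
            self.version, self.value = version, value


class Store:
    """Hypothetical store: one primary plus asynchronously updated replicas."""

    def __init__(self, replica_count):
        self.primary = Replica()
        self.replicas = [Replica() for _ in range(replica_count)]
        self.pending = deque()  # updates not yet delivered to replicas

    def write(self, value):
        version = self.primary.version + 1
        self.primary.apply(version, value)
        for replica in self.replicas:
            self.pending.append((replica, version, value))

    def deliver_one(self):
        """Deliver one queued update; return False when nothing is pending."""
        if not self.pending:
            return False
        replica, version, value = self.pending.popleft()
        replica.apply(version, value)
        return True

    def read(self, index):
        return self.replicas[index].value


store = Store(replica_count=3)
store.write("10.0.0.1")
stale_reads = [store.read(i) for i in range(3)]  # inside the window
while store.deliver_one():
    pass
fresh_reads = [store.read(i) for i in range(3)]  # window has closed
print(stale_reads, fresh_reads)
```

Reads issued before the queue drains return the old value; once every update has been delivered and no new write arrives, all replicas return the latest value.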

See: http://www.allthingsdistributed.com/2008/12/eventually_consistent.html

3. Technology

3.1 Distributed Storage

(1) Master/Slave

Advantage: mature and stable

Disadvantages: the master is a single point of failure for writes, and slaves lag behind the master

(2) Multi-Master

Advantage: multiple masters eliminate the single point of failure (SPOF)

Disadvantage: consistency is not easy to achieve

(3) Two phase commit

Advantage: a simple consistency algorithm

Disadvantage: no fault tolerance

(4) Three phase commit

Advantage: an agreement can be reached after a single point of failure occurs.
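The two phase commit flow listed above can be sketched as follows. This is a minimal in-process illustration, not a production protocol: the `Participant` class and `two_phase_commit` function are hypothetical names, and the sketch deliberately has no timeouts, logging, or crash recovery, which is exactly the missing fault tolerance noted above.

```python
class Participant:
    """Hypothetical participant that votes in phase 1 and acts in phase 2."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "init"

    def prepare(self):
        # Phase 1: vote yes only if the local transaction can be committed.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Return True if every participant committed, False if all aborted."""
    # Phase 1 (voting): collect a vote from every participant.
    votes = [p.prepare() for p in participants]
    # Phase 2 (completion): commit only on a unanimous yes.
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.abort()
    return False


ok_group = [Participant("a"), Participant("b")]
ok = two_phase_commit(ok_group)

failed_group = [Participant("a"), Participant("b", can_commit=False)]
failed = two_phase_commit(failed_group)
print(ok, failed)
```

A single "no" vote aborts the whole group. If the coordinator itself stopped between the two phases, prepared participants would be left waiting, which is the gap three phase commit tries to close.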

See: http://sebug.net/paper/databases/nosql/Nosql.html#_08464202471077442_91161458194

3.2 Consistent Hash

Consistent hashing is a clever hashing algorithm that effectively addresses load balancing in distributed systems.
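A minimal consistent hash ring might look like the sketch below. The `HashRing` class, the use of MD5, and the choice of 100 virtual nodes per server are assumptions made for illustration; real systems differ in hash function and placement details.

```python
import bisect
import hashlib


def _hash(key):
    # Map a string to a point on the ring using MD5 (any stable hash works).
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes
        self._points = []  # sorted hash positions on the ring
        self._owner = {}   # hash position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node):
        # Each node is placed at several points to even out the load.
        for i in range(self.vnodes):
            point = _hash(f"{node}#{i}")
            self._owner[point] = node
            bisect.insort(self._points, point)

    def remove(self, node):
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def lookup(self, key):
        # First point clockwise from the key's hash, wrapping at the end.
        idx = bisect.bisect(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


ring = HashRing(["node-a", "node-b", "node-c"])
keys = [f"user:{i}" for i in range(1000)]
before = {k: ring.lookup(k) for k in keys}
ring.remove("node-c")
after = {k: ring.lookup(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
print(len(moved), "of", len(keys), "keys moved")
```

Only the keys that lived on the removed node are reassigned; keys on the other nodes stay where they were. That is the property that makes the scheme attractive for rebalancing.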

See: http://www.cnblogs.com/leoo2sk/archive/2011/08/11/consistent-hashing-intro.html

3.3 Quorum NRW

N: number of replica nodes

R: minimum number of nodes that must respond for a read to succeed

W: minimum number of nodes that must acknowledge for a write to succeed

W + R > N: strong consistency

W + R <= N: eventual consistency
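The rule above works because, when W + R > N, every set of R nodes read must share at least one node with every set of W nodes written. The sketch below checks this by brute force; the function names are hypothetical and chosen only for this illustration.

```python
from itertools import combinations


def quorums_always_overlap(n, r, w):
    """Check by brute force that every read set shares a node with every write set."""
    nodes = range(n)
    return all(
        set(read_set) & set(write_set)
        for write_set in combinations(nodes, w)
        for read_set in combinations(nodes, r)
    )


def consistency_level(n, r, w):
    """Classify a configuration using the rule W + R > N."""
    return "strong" if w + r > n else "eventual"


print(consistency_level(3, 2, 2), quorums_always_overlap(3, 2, 2))
print(consistency_level(3, 1, 1), quorums_always_overlap(3, 1, 1))
```

With N = 3, R = 2, W = 2, any read touches at least one node that holds the latest write. With R = 1, W = 1, a read can land on a node the write never reached.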

See: http://sebug.net/paper/databases/nosql/Nosql.html#NRW_012323816604251636_2127662_10272764961707637

3.4 Vector Clock

If W = 1 and R = N, complicated merge problems (conflicting versions) may occur; the vector clock method can be used to handle this case. If the system does not need that much flexibility, setting W = N simplifies the design.
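A vector clock can be sketched as a mapping from node name to counter. The helper names below (`increment`, `merge`, `compare`) are assumptions for illustration; they show how two versions are recognized as conflicting and then reconciled.

```python
def increment(clock, node):
    """Return a copy of the clock with this node's counter advanced by one."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def merge(a, b):
    """Element-wise maximum of two clocks."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in a.keys() | b.keys()}


def compare(a, b):
    """Return 'before', 'after', 'equal', or 'concurrent'."""
    nodes = a.keys() | b.keys()
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"


v1 = increment({}, "A")   # first write, handled by node A
v2 = increment(v1, "B")   # update accepted by node B
v3 = increment(v1, "C")   # independent update accepted by node C
merged = increment(merge(v2, v3), "A")  # node A reconciles both versions
print(compare(v1, v2), compare(v2, v3), compare(v2, merged))
```

Here `v2` and `v3` both descend from `v1` but neither descends from the other, so they are reported as concurrent and the client or server has to merge them.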

See: http://en.wikipedia.org/wiki/Vector_clock

3.5 Gossip

Gossip is an epidemic-style (virus-like) propagation protocol. Each node maintains a vector clock and a state version tree. It is currently used by Cassandra.
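The epidemic spreading idea can be illustrated with a toy simulation. This sketch models only push-style rumor spreading among numbered nodes; the `spread` function and its parameters are hypothetical, and it does not reproduce the vector clocks, version trees, or Cassandra's actual implementation.

```python
import random


def spread(node_count, fanout=1, seed=0, max_rounds=1000):
    """Push-style gossip: return (rounds taken, number of informed nodes)."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    informed = {0}             # node 0 starts with the update
    rounds = 0
    while len(informed) < node_count and rounds < max_rounds:
        newly_informed = set()
        for node in informed:
            # Each informed node tells `fanout` randomly chosen peers.
            peers = [p for p in range(node_count) if p != node]
            for peer in rng.sample(peers, min(fanout, len(peers))):
                newly_informed.add(peer)
        informed |= newly_informed
        rounds += 1
    return rounds, len(informed)


rounds, reached = spread(node_count=64)
print(f"{reached} nodes informed after {rounds} rounds")
```

With a fanout of 1 the informed set can at most double each round, so 64 nodes need at least 6 rounds; in practice a few more are needed because some messages go to nodes that already know the update.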

See: http://sebug.net/paper/databases/nosql/Nosql.html#gossip_34187653195112944_16061_08507828080528557

4. Mainstream NoSQL Products

(1) Bigtable (Google)

(2) Dynamo (Amazon)

(3) HBase (Apache)

(4) Cassandra (Facebook)

(5) CouchDB (Apache)

(6) MongoDB

(7) Redis

(8) Riak

This article is an English version of an article originally written in Chinese on aliyun.com.
