NoSQL learning notes (I): Overview

Source: Internet
Author: User
Tags: cassandra

1. Summary

NoSQL databases emerged as a challenge to traditional SQL databases. As the data volumes of enterprise and Internet applications grew, traditional SQL databases struggled to provide distributed storage and high-speed reads and writes at that scale, and NoSQL arose in response. NoSQL systems improve database performance by using simple, efficient storage models such as key-value.

2. Theory

CAP, BASE, and eventual consistency are the three theoretical cornerstones of NoSQL databases. Each is described below.

2.1 CAP theory

C: Consistency (all users see the same data after a read or write changes it)

A: Availability (data can be obtained quickly)

P: Partition tolerance (the distributed system stays reliable when the network partitions)

The CAP theory was proposed by Professor Eric Brewer. Its core claim is that a distributed system cannot satisfy consistency, availability, and partition tolerance all at the same time; it can satisfy at most two of the three.

See: http://www.julianbrowne.com/article/viewer/brewers-cap-theorem

 

2.2 BASE theory

Basically Available: basic availability (the system stays available when partitions fail)

Soft state: soft state / flexible transactions (stateless connections, asynchronous support)

Eventually consistent: strong consistency is not required at every moment, only that the data becomes consistent in the end

The core of the BASE theory is to sacrifice strong consistency in exchange for availability or reliability.

See: http://www.jdon.com/jivejdon/thread/37625

 

2.3 Eventual consistency theory

(1) Strong consistency

Strong consistency (immediate consistency): if process A first writes a value to the storage system, the storage system guarantees that subsequent reads by A, B, and C all return the latest value.

(2) Weak consistency

If A writes a value to the storage system, the storage system cannot guarantee that subsequent reads by A, B, and C return the latest value. This gives rise to the concept of the "inconsistency window": the period between A's write and the moment when reads by A, B, and C are all guaranteed to return the latest value.

(3) Eventual consistency

Eventual consistency is a special case of weak consistency. If A writes a value to the storage system and no further updates are made to that value, the storage system guarantees that all reads by A, B, and C will eventually return the latest written value. If no failures occur, the size of the inconsistency window depends on communication latency, system load, and the number of replicas in the replication scheme (which can be understood as the number of slaves in a master/slave setup). The best-known eventually consistent system is DNS: after the IP address of a domain name is updated, all clients eventually see the new value, with the timing determined by the configured propagation policy and the cache control policy.

See: http://www.allthingsdistributed.com/2008/12/eventually_consistent.html
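
The inconsistency window described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the `ReplicatedStore` class, its `lag_ticks` parameter, and the logical clock are hypothetical names invented for illustration and do not model any particular product.

```python
class ReplicatedStore:
    """Toy primary/replica store with delayed, asynchronous replication."""

    def __init__(self, replica_count, lag_ticks):
        self.primary = {}
        self.replicas = [{} for _ in range(replica_count)]
        self.lag_ticks = lag_ticks   # assumed fixed replication delay
        self.now = 0                 # logical clock, advanced by tick()
        self.pending = []            # (apply_at_tick, key, value)

    def write(self, key, value):
        # The primary applies the write immediately; replicas see it later.
        self.primary[key] = value
        self.pending.append((self.now + self.lag_ticks, key, value))

    def tick(self):
        # Advance logical time and apply every update that is now due.
        self.now += 1
        due = [p for p in self.pending if p[0] <= self.now]
        self.pending = [p for p in self.pending if p[0] > self.now]
        for _, key, value in due:
            for replica in self.replicas:
                replica[key] = value

    def read(self, replica_index, key):
        return self.replicas[replica_index].get(key)


store = ReplicatedStore(replica_count=2, lag_ticks=3)
store.write("ip", "10.0.0.2")
stale = store.read(0, "ip")   # read inside the inconsistency window
for _ in range(3):
    store.tick()
fresh = store.read(0, "ip")   # window closed, replicas have converged
```

Here the window is exactly `lag_ticks` long because the delay is fixed; in a real system it would vary with latency, load, and the number of replicas, as noted above.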

 

3. Technology

3.1 Distributed storage

(1) Master/slave

Advantage: mature and stable

Disadvantages: the master is a single point of failure for writes, and slaves lag behind the master

(2) Multi-master

Advantage: multiple masters remove the single point of failure (SPOF)

Disadvantage: consistency is hard to achieve

(3) Two-phase commit

Advantage: a simple consistency algorithm

Disadvantage: no fault tolerance

(4) Three-phase commit

Advantage: agreement can still be reached after a single point of failure occurs

See: http://sebug.net/paper/databases/nosql/Nosql.html#_08464202471077442_91161458194
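
To make the two-phase commit entry in the list above more concrete, here is a minimal in-memory sketch. The `Participant` class and `two_phase_commit` function are hypothetical names for illustration. The sketch deliberately has no timeouts, no logging, and no handling of a coordinator crash, which is the missing fault tolerance noted above.

```python
class Participant:
    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "init"

    def prepare(self):
        # Phase 1: vote yes only if the local transaction can be committed.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def two_phase_commit(participants):
    # Phase 1 (voting): ask every participant to prepare.
    votes = [p.prepare() for p in participants]
    # Phase 2 (completion): commit only on a unanimous yes, otherwise abort all.
    if all(votes):
        for p in participants:
            p.commit()
        return "committed"
    for p in participants:
        p.abort()
    return "aborted"


ok_group = [Participant("a"), Participant("b")]
ok_result = two_phase_commit(ok_group)

bad_group = [Participant("a"), Participant("b", can_commit=False)]
bad_result = two_phase_commit(bad_group)
```

A single "no" vote aborts the whole transaction, which is what keeps the algorithm simple. If the coordinator stopped between the two phases, prepared participants in this sketch would have no way to decide on their own.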

 

3.2 Consistent hashing

Consistent hashing is a hashing scheme that is effective for load balancing in distributed systems: when nodes are added or removed, only a small share of the keys has to move.

See: http://www.cnblogs.com/leoo2sk/archive/2011/08/11/consistent-hashing-intro.html
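
A minimal hash ring might look like the following sketch. The `HashRing` class, the choice of MD5, and the default of 100 virtual nodes per server are all assumptions made for illustration, not the scheme used by any particular product.

```python
import bisect
import hashlib


def _hash(key):
    # Map a string to a point on the ring using MD5 (any stable hash would do).
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes   # virtual nodes per server, smooths the load
        self._points = []      # sorted hash positions on the ring
        self._owner = {}       # hash position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        for i in range(self.vnodes):
            point = _hash(f"{node}#{i}")
            self._owner[point] = node
            bisect.insort(self._points, point)

    def remove_node(self, node):
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def get_node(self, key):
        if not self._points:
            raise ValueError("ring is empty")
        # First point clockwise from the key's hash, wrapping around at the end.
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


ring = HashRing(["node-a", "node-b", "node-c"])
keys = [f"user:{i}" for i in range(1000)]
before = {k: ring.get_node(k) for k in keys}
ring.add_node("node-d")
after = {k: ring.get_node(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
```

Every key in `moved` now belongs to `node-d`; keys never shuffle between the existing nodes. With a plain `hash(key) % node_count` scheme, by contrast, most keys would change owner when the node count changes.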

 

3.3 Quorum NRW

N: number of replica nodes

R: minimum number of nodes that must respond for a read to succeed

W: minimum number of nodes that must acknowledge for a write to succeed

W + R > N: strong consistency

W + R <= N: eventual consistency

See: http://sebug.net/paper/databases/nosql/Nosql.html#NRW_012323816604251636_2127662_10272764961707637
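
The reason W + R > N gives strong consistency is that every read set must then share at least one node with every write set, so a read always reaches a replica that holds the latest write. The sketch below checks this by brute force; the function names are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def consistency_level(n, r, w):
    """Classify an (N, R, W) setting using the rule W + R > N."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("R and W must be between 1 and N")
    return "strong" if w + r > n else "eventual"


def quorums_always_overlap(n, r, w):
    """Brute-force check: does every read set share a node with every write set?"""
    nodes = range(n)
    return all(
        set(read_set) & set(write_set)
        for write_set in combinations(nodes, w)
        for read_set in combinations(nodes, r)
    )


strong = consistency_level(3, 2, 2)          # 2 + 2 > 3
eventual = consistency_level(3, 1, 1)        # 1 + 1 <= 3
overlap = quorums_always_overlap(3, 2, 2)    # every pair of sets intersects
no_overlap = quorums_always_overlap(3, 1, 2) # a read can miss the write set
```

With N = 3, R = 1, W = 2, a write acknowledged by nodes 0 and 1 can be missed by a read that only contacts node 2, which is why that setting is only eventually consistent.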

 

 

3.4 Vector clock

If W = 1 and R = N, complicated merge problems can arise, because different replicas may hold different versions of the same value. Vector clocks can be used in this case. If the system does not need that much flexibility, setting W = N simplifies the design.

See: http://en.wikipedia.org/wiki/Vector_clock
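
As a rough sketch of how vector clocks expose the versions that need merging, the helpers below represent a clock as a dict from node name to counter. The function names and the dict representation are assumptions for illustration only.

```python
def increment(clock, node):
    """Return a copy of clock with node's counter advanced by one."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def merge(a, b):
    """Element-wise maximum of two clocks."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in a.keys() | b.keys()}


def compare(a, b):
    """Return 'before', 'after', 'equal', or 'concurrent'."""
    nodes = a.keys() | b.keys()
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"


v1 = increment({}, "A")    # first write, handled by node A
v2 = increment(v1, "B")    # update of v1 handled by node B
v3 = increment(v1, "C")    # another update of v1 handled by node C
merged = merge(v2, v3)     # clock covering both branches
```

`compare(v1, v2)` reports that v1 happened before v2, so v2 can safely replace it. `compare(v2, v3)` reports a concurrent pair: neither descends from the other, and this is the conflict the application has to merge.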

 

3.5 Gossip

Gossip is an epidemic protocol: information spreads from node to node the way a virus does. Each node maintains a vector clock and a state version tree. Cassandra currently uses this approach.

See: http://sebug.net/paper/databases/nosql/Nosql.html#gossip_34187653195112944_16061_08507828080528557
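
The virus-like spread can be sketched with a generic push-pull simulation. This is not Cassandra's actual protocol: as a simplifying assumption, each entry carries a plain integer version instead of a vector clock and version tree, and the function names are hypothetical.

```python
import random


def gossip_round(states, rng):
    """One push-pull round: every node syncs with one random peer."""
    nodes = list(states)
    for node in nodes:
        peer = rng.choice([n for n in nodes if n != node])
        # Keep the entry with the higher version for each key on both sides.
        merged = dict(states[node])
        for key, (version, value) in states[peer].items():
            if key not in merged or merged[key][0] < version:
                merged[key] = (version, value)
        states[node] = dict(merged)
        states[peer] = dict(merged)


def rounds_until_converged(states, rng, max_rounds=100):
    """Run rounds until all nodes hold identical state; None if they never do."""
    for round_number in range(1, max_rounds + 1):
        gossip_round(states, rng)
        first = next(iter(states.values()))
        if all(state == first for state in states.values()):
            return round_number
    return None


rng = random.Random(42)                     # fixed seed for repeatable runs
states = {f"n{i}": {} for i in range(8)}    # eight nodes with empty state
states["n0"]["schema"] = (1, "v1")          # only n0 knows the update at first
rounds = rounds_until_converged(states, rng)
```

No node ever contacts the whole cluster, yet the update reaches everyone after a handful of rounds, because the number of informed nodes grows quickly once a few nodes have it.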

 

4. Mainstream NoSQL products

(1) Bigtable (Google)

(2) Dynamo (Amazon)

(3) HBase (Apache)

(4) Cassandra (Facebook)

(5) CouchDB (Apache)

(6) MongoDB

(7) Redis

(8) Riak
