NoSQL Database Overview
Characteristics
Schema freedom
You do not need to define a table structure in advance, and each record in a table may have different attributes and formats.
Denormalization
Normal-form requirements are not followed; integrity constraints are removed to reduce dependencies between tables.
Multi-Partition Storage
Data is partitioned so that records are spread across multiple nodes.
Elastic scalability
Nodes can be added and removed dynamically while the system is running, and data is automatically rebalanced and migrated.
Multiple replicas
Data is written quickly to one node, and the remaining nodes replicate it asynchronously by reading that node's log.
Soft transactions
The ACID properties of transactions cannot be fully satisfied; instead, eventual consistency is guaranteed.
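As a rough illustration of schema freedom and multi-partition storage, the sketch below places records with differing fields onto nodes by hashing their keys. The node names, the record contents, and the helpers `node_for` and `partition` are hypothetical and are not taken from any particular database.

```python
import hashlib

NODES = ["node-0", "node-1", "node-2"]  # hypothetical node names


def node_for(key, nodes):
    """Pick a node by hashing the key (simple modulo partitioning)."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return nodes[int(digest, 16) % len(nodes)]


def partition(records, nodes):
    """Group records by the node that owns each key."""
    placement = {node: {} for node in nodes}
    for key, record in records.items():
        placement[node_for(key, nodes)][key] = record
    return placement


# Schema-free records: each one carries its own set of fields.
records = {
    "user:1": {"name": "Ann", "email": "ann@example.com"},
    "user:2": {"name": "Bob", "tags": ["admin"], "age": 41},
    "user:3": {"nickname": "cy"},
}

placement = partition(records, NODES)
```

Modulo placement reshuffles most keys whenever the node count changes, so systems that rebalance automatically usually rely on a scheme such as consistent hashing instead.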
ACID:
Relational databases follow the ACID rules.
A database transaction is similar to a real-world transaction and has the following four properties:
1. A (Atomicity)
Atomicity means that the operations in a transaction are either all performed or not performed at all. The transaction succeeds only if every operation in it succeeds; if any one operation fails, the entire transaction fails and must be rolled back.
For example, a bank transfer of 100 yuan from account A to account B has two steps: 1) withdraw 100 yuan from account A, and 2) deposit 100 yuan into account B. The two steps must either both complete or both not complete. If only the first step completed and the second failed, 100 yuan would inexplicably disappear.
2. C (Consistency)
Consistency means that the database is always in a consistent state: a transaction must not violate the database's existing consistency constraints.
For example, given the integrity constraint a+b=10, a transaction that changes a must also change b so that a+b=10 still holds after the transaction; otherwise the transaction fails.
3. I (Isolation)
Isolation means that concurrent transactions do not affect each other. If the data one transaction accesses is being modified by another transaction, then as long as the other transaction has not committed, the data the first transaction sees is unaffected by the uncommitted changes.
For example, suppose a transaction is transferring 100 yuan from account A to account B. If B's owner queries the account before this transaction completes, the additional 100 yuan is not visible.
4. D (Durability)
Durability means that once a transaction commits, its changes are permanently stored in the database, even if an outage occurs afterwards.
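The a+b=10 consistency example can be sketched with Python's built-in `sqlite3` module. The table name `pair` and the two helper functions are made up for this illustration; the constraint is expressed as a SQL `CHECK`, which SQLite evaluates for each statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pair (a INTEGER, b INTEGER, CHECK (a + b = 10))")
conn.execute("INSERT INTO pair VALUES (4, 6)")
conn.commit()


def change_a_only(conn):
    """Try to change a without adjusting b; the constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on exception
            conn.execute("UPDATE pair SET a = a + 1")
        return True
    except sqlite3.IntegrityError:
        return False


def change_both(conn):
    """Change a and b together so that a + b = 10 still holds."""
    with conn:
        conn.execute("UPDATE pair SET a = a + 1, b = b - 1")
    return True


a_only_ok = change_a_only(conn)   # rejected by the CHECK constraint
both_ok = change_both(conn)       # accepted, the invariant is preserved
final_row = conn.execute("SELECT a, b FROM pair").fetchone()
```

Only the update that keeps the invariant is committed; the rejected one leaves the row untouched.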
SQL vs NoSQL
SQL
- Highly organized structured data
- Structured Query Language (SQL)
- Data and relationships are stored in separate tables.
- Data manipulation language, data definition language
- Strict consistency
- Basic transactions
NoSQL
- Stands for "Not only SQL"
- No declarative query language
- No predefined schema
- Key-value stores, column stores, document stores, graph databases
- Eventual consistency rather than ACID properties
- Unstructured and unpredictable data
- CAP theorem
- High performance, high availability, and scalability
A Brief History of NoSQL
The term NoSQL first appeared in 1998 as the name of a lightweight, open source relational database developed by Carlo Strozzi that did not provide an SQL interface.
In 2009, Johan Oskarsson of Last.fm organized a discussion of distributed open source databases [2], and Eric Evans of Rackspace reintroduced the term NoSQL. At that point, NoSQL mainly referred to database designs that are non-relational, distributed, and do not provide ACID guarantees.
The no:sql(east) conference, held in Atlanta in 2009, was a milestone. Its slogan was "select fun, profit from real_world where relational=false;". Therefore, the most common interpretation of NoSQL is "non-relational", emphasizing the advantages of key-value stores and document databases rather than simply opposing the RDBMS.
CAP theorem
In computer science, the CAP theorem, also known as Brewer's theorem, states that a distributed computing system cannot simultaneously guarantee all three of the following:
- Consistency (all nodes see the same data at the same time)
- Availability (every request receives a response, whether it succeeds or fails)
- Partition tolerance (the system continues to operate even if messages between parts of the system are lost or fail)
The core of the CAP theorem is that a distributed system cannot satisfy consistency, availability, and partition tolerance all at once; it can satisfy at most two of them at the same time.
Therefore, according to the CAP theorem, NoSQL databases fall into three categories: those that satisfy CA, those that satisfy CP, and those that satisfy AP.
- CA: single-site clusters that satisfy consistency and availability; they are usually not very scalable.
- CP: systems that satisfy consistency and partition tolerance; their performance is usually not particularly high.
- AP: systems that satisfy availability and partition tolerance; they usually accept weaker consistency.
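The difference between the CP and AP categories can be sketched with a toy two-replica cluster. This is a simplified model with hypothetical `Replica` and `Cluster` classes, not the behavior of any specific database: during a partition, the CP cluster refuses writes to stay consistent, while the AP cluster accepts them and lets the replicas diverge.

```python
class Replica:
    def __init__(self):
        self.data = {}


class Cluster:
    """Two replicas with a switchable network partition between them."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.replicas = [Replica(), Replica()]
        self.partitioned = False

    def write(self, replica_index, key, value):
        """Return True if the write was accepted."""
        if not self.partitioned:
            for replica in self.replicas:  # replicate to every node
                replica.data[key] = value
            return True
        if self.mode == "CP":
            return False  # refuse rather than let replicas diverge
        self.replicas[replica_index].data[key] = value  # AP: accept locally
        return True

    def read(self, replica_index, key):
        return self.replicas[replica_index].data.get(key)


cp = Cluster("CP")
ap = Cluster("AP")
for cluster in (cp, ap):
    cluster.write(0, "x", 1)      # healthy network: both replicas get x=1
    cluster.partitioned = True    # now cut the link between the replicas

cp_accepted = cp.write(0, "x", 2)  # CP gives up availability
ap_accepted = ap.write(0, "x", 2)  # AP gives up consistency
```

After the partition, both CP replicas still return 1 for `x`, while the AP replicas disagree (2 on the replica that took the write, 1 on the other).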
BASE
BASE: Basically Available, Soft state, Eventually consistent. Defined by Eric Brewer.
BASE describes the relaxed requirements that NoSQL databases typically place on availability and consistency:
- Basically Available: the system remains basically available.
- Soft state: also called flexible transactions. "Soft state" can be understood as "connectionless", while "hard state" is "connection-oriented".
- Eventual consistency: the data eventually becomes consistent, which is also the ultimate goal of ACID.
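Eventual consistency can be illustrated with a minimal log-based replication sketch. The `Primary` and `Follower` classes are hypothetical and synchronization is triggered by hand here, whereas a real system would replicate in the background.

```python
class Primary:
    def __init__(self):
        self.data = {}
        self.log = []  # append-only write log

    def write(self, key, value):
        self.data[key] = value
        self.log.append((key, value))


class Follower:
    def __init__(self):
        self.data = {}
        self.applied = 0  # number of log entries already applied

    def sync(self, primary):
        """Apply every log entry this follower has not seen yet."""
        for key, value in primary.log[self.applied:]:
            self.data[key] = value
        self.applied = len(primary.log)


primary = Primary()
follower = Follower()

primary.write("balance", 100)
primary.write("balance", 80)
stale_read = follower.data.get("balance")  # replication has not run yet

follower.sync(primary)
fresh_read = follower.data.get("balance")  # follower has caught up
```

Between the write and the sync, a read from the follower misses the update; once the log has been applied, both copies agree.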
ACID vs BASE
ACID | BASE
Atomicity | Basically Available
Consistency | Soft state / flexible transactions
Isolation | Eventual consistency
Durability |
What is atomicity?
As an example:
A wants to transfer 1000 dollars from their account to B's account. The whole process, from the start of the transfer to its end, is called a transaction. This transaction does the following:
- Subtract 1000 dollars from A's account. If A's account originally held 3000 dollars, it now holds 2000 dollars.
- Add 1000 dollars to B's account. If B's account originally held 2000 dollars, it now holds 3000 dollars.
Suppose that after A's account has been reduced by 1000 dollars, an accident such as a power outage terminates the transfer unexpectedly, before B's account has been increased by 1000 dollars. The operation is then considered failed and must be rolled back. A rollback returns to the state before the transaction started: A's account is restored to its state before the 1000 dollars were deducted, and B's account stays in its original state. At this point A's account still holds 3000 dollars and B's account still holds 2000 dollars.
An operation that either succeeds as a whole (A's account is reduced by 1000 dollars and B's account is increased by 1000 dollars) or fails as a whole (both accounts return to their original state) is called an atomic operation.
If a transaction is thought of as a program, it is either executed completely or not executed at all. This property is called atomicity.
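The transfer above can be sketched with Python's built-in `sqlite3` module. The table layout and the `fail_midway` flag, which simulates the power outage, are assumptions made for this illustration.

```python
import sqlite3


def make_bank():
    """Create an in-memory database with the two accounts from the example."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
    conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("A", 3000), ("B", 2000)])
    conn.commit()
    return conn


def transfer(conn, src, dst, amount, fail_midway=False):
    """Move amount from src to dst as one transaction."""
    try:
        with conn:  # commit if the block finishes, roll back if it raises
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            if fail_midway:
                raise RuntimeError("simulated power failure")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
        return True
    except RuntimeError:
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))


conn = make_bank()
first_ok = transfer(conn, "A", "B", 1000, fail_midway=True)
after_failure = balances(conn)   # the deduction from A was rolled back
second_ok = transfer(conn, "A", "B", 1000)
after_success = balances(conn)   # both updates were committed together
```

The interrupted transfer leaves both balances unchanged, and the completed one moves the full amount, so no intermediate state is ever committed.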
NoSQL Database Classification
Type | Examples | Features
Column store | HBase, Cassandra, Hypertable | As the name implies, stores data by column. Its main strengths are convenient storage of structured and semi-structured data, easy data compression, and a large I/O advantage for queries on one or a few columns.
Document store | MongoDB, CouchDB | Typically stores data in a JSON-like format, with the document as the unit of storage. This makes it possible to index certain fields and implement some features of a relational database.
Key-value store | Tokyo Cabinet/Tyrant, Berkeley DB, MemcacheDB, Redis | Quickly looks up a value by its key. In general, the store accepts the value as-is, regardless of its format. (Redis includes additional features.)
Graph store | Neo4j, FlockDB | Optimized for storing graph relationships. Handling such data in a traditional relational database performs poorly and is awkward to design and use.
Object store | db4o, Versant | Manipulates the database with syntax similar to an object-oriented language and accesses data as objects.
XML database | Berkeley DB XML, BaseX | Stores XML data efficiently and supports XML query languages such as XQuery and XPath.
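To make the first three categories more concrete, the sketch below lays out the same two records as a key-value store, a document store, and a column store, using plain Python dictionaries. The records and variable names are invented for illustration and do not reflect the storage format of any listed product.

```python
import json

# Two logical records, used for all three layouts below.
users = [
    {"id": "42", "name": "Ann", "city": "Oslo"},
    {"id": "43", "name": "Bob", "city": "Rome"},
]

# Key-value: the store sees only an opaque value behind each key.
kv_store = {f"user:{u['id']}": json.dumps(u) for u in users}

# Document: the value stays structured, so a field can be indexed.
doc_store = {u["id"]: u for u in users}
city_index = {}
for doc_id, doc in doc_store.items():
    city_index.setdefault(doc["city"], []).append(doc_id)

# Column-oriented: values of one column are stored together,
# so reading a single column does not touch the others.
column_store = {
    field: [u[field] for u in users] for field in ("id", "name", "city")
}

names = column_store["name"]
```

The key-value layout must decode a whole value to inspect any field, the document layout can answer "who lives in Rome" from `city_index`, and the column layout reads all names without touching the other fields.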
NoSQL Database Related Concepts