Excerpt from "Baidu Encyclopedia".
NoSQL refers to non-relational databases. With the rise of Web 2.0 websites, traditional relational databases struggled to cope, especially with very large-scale, high-concurrency, purely dynamic Web 2.0 sites of the SNS (social networking) type, and exposed many problems that were hard to overcome. Non-relational databases, by contrast, developed very rapidly thanks to their own characteristics. NoSQL databases were created to address the challenges posed by large-scale data sets with many data types, and by big data applications in particular.
Although the NoSQL buzzword has been hot for only about a year, there is no denying that a second generation of the movement has already begun. The early code could only be regarded as experimental, whereas today's systems are more mature and stable. There is one sobering fact, however: the technology is maturing to the point that even good NoSQL data stores have had to be rewritten, and a few people refer to this as version 2.0. The better-known tools among them can be used to build fast, scalable repositories for big data.
Chinese name: Non-relational database
Foreign name: NoSQL = Not Only SQL
Full name: Not Only SQL
Category: Non-relational database
Application areas: computer, software, database
Contents
- 1 Basic meanings
- 2 Four categories of NoSQL databases
- 3 Analysis of the four categories of NoSQL databases
- 4 Common characteristics
- 5 Applicable scenarios
- 6 Development status
- 7 Challenges
1 Basic meanings
NoSQL (NoSQL = Not Only SQL), meaning "not just SQL," is a revolutionary movement in the database field. It was proposed early on, and the trend gathered further momentum in 2009. NoSQL advocates promote the use of non-relational data storage, which, set against the overwhelming dominance of relational databases, undoubtedly brings a new way of thinking.
2 Four categories of NoSQL databases
Key-value storage database
This type of database primarily uses a hash table in which a specific key holds a pointer to a specific piece of data. For IT systems, the advantage of the key-value model is its simplicity and ease of deployment. However, if the DBA needs to query or update only part of a value, the key-value model becomes inefficient. [3] Examples include Tokyo Cabinet/Tyrant, Redis, Voldemort, and Oracle BDB.
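The hash-table model described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation of any product named above: the class `KeyValueStore` and the helper `update_field` are hypothetical names. The sketch shows why a partial update is costly: the store treats every value as opaque bytes, so the client has to read, decode, modify, and rewrite the whole value.

```python
import json


class KeyValueStore:
    """Toy in-memory key-value store (hypothetical, for illustration only)."""

    def __init__(self):
        # A Python dict is a hash table: key -> opaque value.
        self._data = {}

    def put(self, key, value):
        if not isinstance(value, bytes):
            raise TypeError("values are stored as opaque bytes")
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)

    def delete(self, key):
        self._data.pop(key, None)


def update_field(store, key, field, new_value):
    """Change one field of a stored record.

    The store cannot look inside the value, so the whole value is
    read, decoded, modified, re-encoded, and written back.
    """
    record = json.loads(store.get(key).decode("utf-8"))
    record[field] = new_value
    store.put(key, json.dumps(record).encode("utf-8"))


store = KeyValueStore()
store.put("user:1", json.dumps({"name": "Ann", "city": "Beijing"}).encode("utf-8"))
update_field(store, "user:1", "city", "Shanghai")
user = json.loads(store.get("user:1").decode("utf-8"))
```

Lookups by key stay cheap, but any operation that depends on the contents of a value has to be done by the application.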
Column-store database
This type of database is usually used to handle massive amounts of data in distributed storage. Keys still exist, but each key points to multiple columns, and these columns are organized into column families. Examples include Cassandra, HBase, and Riak.
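A rough way to picture a key that points to multiple columns grouped into column families is a nested mapping. The sketch below is an assumption-laden toy, not how Cassandra or HBase are implemented; `ColumnFamilyTable` and its methods are hypothetical names chosen for the example.

```python
from collections import defaultdict


class ColumnFamilyTable:
    """Toy column-family table: family -> row key -> column -> value."""

    def __init__(self, families):
        self.families = set(families)
        # Data is grouped by column family first, so one family can be
        # scanned without touching the others.
        self._store = {family: defaultdict(dict) for family in self.families}

    def put(self, row_key, family, column, value):
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        self._store[family][row_key][column] = value

    def get_row(self, row_key):
        """Collect every column stored under one row key."""
        return {
            family: dict(rows[row_key])
            for family, rows in self._store.items()
            if row_key in rows
        }

    def scan_family(self, family):
        """Read a single column family across all rows."""
        return {row: dict(cols) for row, cols in self._store[family].items()}


table = ColumnFamilyTable(["profile", "activity"])
table.put("user:1", "profile", "name", "Ann")
table.put("user:1", "activity", "last_login", "2015-01-07")
table.put("user:2", "profile", "name", "Bob")
row = table.get_row("user:1")
names = table.scan_family("profile")
```

Note that the two rows do not hold the same set of columns; only the column families are fixed in advance in this sketch.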
Document database
The document database was inspired by the Lotus Notes office software and is similar to the first category, the key-value store. Its data model is the versioned document: semi-structured documents stored in a specific format such as JSON. A document database can be regarded as an upgraded key-value database that allows key-value pairs to be nested, and it supports queries more efficiently than a key-value database does. Examples include CouchDB and MongoDB. In China there is also the document database SequoiaDB, which has been open-sourced.
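Because the database understands the structure of each document, it can answer queries on nested fields directly. The following sketch imitates that idea with plain Python dictionaries parsed from JSON. The helpers `get_path` and `find` are hypothetical and are not the query API of CouchDB, MongoDB, or SequoiaDB.

```python
import json


def get_path(document, path):
    """Follow a dotted path such as "address.city" into nested dicts."""
    current = document
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def find(collection, path, expected):
    """Return the documents whose value at `path` equals `expected`."""
    return [doc for doc in collection if get_path(doc, path) == expected]


# Documents in one collection need not share the same fields.
collection = [
    json.loads('{"_id": 1, "name": "Ann", "address": {"city": "Beijing"}}'),
    json.loads('{"_id": 2, "name": "Bob", "tags": ["admin"]}'),
    json.loads('{"_id": 3, "name": "Cai", "address": {"city": "Beijing", "zip": "100000"}}'),
]

in_beijing = find(collection, "address.city", "Beijing")
ids = [doc["_id"] for doc in in_beijing]
```

A real document database would normally use indexes rather than scanning every document as this toy does.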
Graph database
The structure of a graph database differs from the rigid rows-and-columns structure of an SQL database: it uses a flexible graph model and can be extended across multiple servers. Examples include Neo4j, InfoGrid, and InfiniteGraph.
NoSQL databases have no standard query language comparable to SQL, so queries must be written against each database's data model. Many NoSQL databases provide REST-style data interfaces or query APIs. [2]
In summary, NoSQL databases are a better fit in the following cases: 1. the data model is relatively simple; 2. a more flexible IT system is needed; 3. database performance requirements are high; 4. a high degree of data consistency is not required; 5. for a given key, it is relatively easy to map complex values.
3 Analysis of the four categories of NoSQL databases
| Category | Examples | Typical application scenarios | Data model | Advantages | Disadvantages |
| --- | --- | --- | --- | --- | --- |
| Key-value [3] | Tokyo Cabinet/Tyrant, Redis, Voldemort, Oracle BDB | Content caching, mainly for handling high access loads on large amounts of data; also used in some logging systems. [3] | Key-value pairs in which the key points to the value, usually implemented with a hash table [3] | Fast lookups | Data is unstructured and is usually treated only as strings or binary data [3] |
| Column-store database [3] | Cassandra, HBase, Riak | Distributed file systems | Stores data from the same column together, in column families | Fast lookups, highly scalable, easy to distribute | Relatively limited functionality |
| Document database [3] | CouchDB, MongoDB | Web applications (similar to key-value, but the value is structured and the database can understand its contents) | Key-value pairs in which the value is structured data | Loose data-structure requirements; the table structure can vary and does not need to be predefined as in a relational database | Query performance is not high, and there is no uniform query syntax |
| Graph database [3] | Neo4j, InfoGrid, InfiniteGraph | Social networks, recommendation systems, and so on; focused on building relationship graphs | Graph structure | Can use graph algorithms such as shortest-path search and N-degree relationship lookup | The entire graph often has to be computed over to obtain the needed information, and the structure is not well suited to distributed cluster deployments [3] |
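The two graph algorithms named in the table, shortest-path search and N-degree relationship lookup, can both be expressed as breadth-first traversals. The sketch below uses a plain adjacency dictionary. The function names and the sample `friends` graph are made up for illustration and do not correspond to the API of any graph database listed above.

```python
from collections import deque


def shortest_path(graph, start, goal):
    """Breadth-first search; returns the shortest path as a list, or None."""
    if start == goal:
        return [start]
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor in previous:
                continue
            previous[neighbor] = node
            if neighbor == goal:
                # Walk the predecessor links back to the start.
                path = [goal]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None


def within_n_degrees(graph, start, n):
    """Every node reachable from `start` in at most n hops, excluding start."""
    seen = {start}
    frontier = {start}
    for _ in range(n):
        frontier = {nb for node in frontier for nb in graph.get(node, ())} - seen
        seen |= frontier
    return seen - {start}


friends = {
    "ann": ["bob", "cai"],
    "bob": ["ann", "dee"],
    "cai": ["ann", "dee"],
    "dee": ["bob", "cai", "eve"],
    "eve": ["dee"],
}
path = shortest_path(friends, "ann", "eve")
second_degree = within_n_degrees(friends, "ann", 2)
```

Both functions follow edges wherever they lead, which hints at why the table lists distribution as a weakness: once the graph is split across servers, a traversal may have to cross machine boundaries at every hop.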
4 Common characteristics
NoSQL has no clear scope or definition, but NoSQL databases share some common features:
- No predefined schema: there is no need to define a data schema or table structure in advance. Each record may have different attributes and formats, and no schema has to be declared before data is inserted.
- Shared-nothing architecture: in contrast to a shared-everything architecture that keeps all data in a storage area network, NoSQL typically partitions the data and stores it on the local disks of individual servers. Because reading from a local disk tends to be faster than reading over a network, this improves system performance.
- Elastic scalability: nodes can be added or removed dynamically while the system is running, without downtime for maintenance, and data is migrated automatically.
- Partitioning: rather than storing all data on a single node, a NoSQL database partitions the data and spreads the records across multiple nodes, usually replicating them at the same time. This improves parallel performance and ensures that there is no single point of failure.
- Asynchronous replication: unlike a RAID storage system, replication in NoSQL is usually log-based and asynchronous. Data can therefore be written to a node as quickly as possible without being delayed by network transmission. The drawback is that consistency is not always guaranteed, and a small amount of data may be lost if a failure occurs.
- BASE: in contrast to the strict ACID properties of transactions, NoSQL databases guarantee BASE properties (Basically Available, Soft state, Eventually consistent), that is, eventual consistency and soft transactions.
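Partitioning, replication, and log-based asynchronous replication can be combined in one small sketch. Everything here is an assumption made for illustration: the `Cluster` class is hypothetical, keys are placed by hashing modulo the node count, copies go to the next nodes in order, and the "background" replication is an explicit `replicate()` call so that the stale-read window is visible.

```python
import hashlib


class Cluster:
    """Toy partitioned store with log-based asynchronous replication."""

    def __init__(self, node_count, replicas=2):
        if not 1 <= replicas <= node_count:
            raise ValueError("replicas must be between 1 and node_count")
        self.nodes = [dict() for _ in range(node_count)]
        self.replicas = replicas
        self.log = []  # pending (node_index, key, value) entries

    def owners(self, key):
        """Primary node first, then the nodes holding the copies."""
        # hashlib gives a placement that is stable across processes,
        # unlike Python's built-in hash() for strings.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        primary = int.from_bytes(digest[:8], "big") % len(self.nodes)
        return [(primary + i) % len(self.nodes) for i in range(self.replicas)]

    def put(self, key, value):
        primary, *followers = self.owners(key)
        # The write returns as soon as the primary has stored it ...
        self.nodes[primary][key] = value
        # ... and the copies are only queued in the replication log.
        for node in followers:
            self.log.append((node, key, value))

    def replicate(self):
        """Apply the pending log entries; stands in for a background task."""
        for node, key, value in self.log:
            self.nodes[node][key] = value
        self.log.clear()

    def get(self, key, replica=0):
        """Read from the primary (replica=0) or from one of the copies."""
        return self.nodes[self.owners(key)[replica]].get(key)


cluster = Cluster(node_count=4, replicas=2)
cluster.put("user:1", "Ann")
before = cluster.get("user:1", replica=1)  # the copy has not been updated yet
cluster.replicate()
after = cluster.get("user:1", replica=1)   # the copies have converged
```

The gap between `before` and `after` is the eventual-consistency window that BASE accepts. In a real system, a crash of the primary before its log has been shipped is one way the small data loss mentioned above can occur.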
NoSQL databases have no unified architecture, and the differences between two NoSQL databases can far exceed those between two relational databases. Each NoSQL database has its own strengths: a successful one is usually particularly well suited to certain situations or applications, where it far outperforms relational databases and other NoSQL databases.
5 Applicable scenarios
NoSQL databases are a better fit in the following cases: 1. the data model is relatively simple; 2. a more flexible IT system is needed; 3. database performance requirements are high; 4. a high degree of data consistency is not required; 5. for a given key, it is relatively easy to map complex values.
6 Development status
Modern computing architectures demand enormous scalability in data storage, and NoSQL aims to meet that demand. Google's BigTable and Amazon's Dynamo are both NoSQL databases. NoSQL projects have very different names, but they usually have one thing in common: they can handle very large amounts of data.
The revolution still has to wait, though. NoSQL is admittedly not yet mainstream in large enterprises, but that is likely to change within a year or two. At a recent NoSQL event, 150 people from around the world packed a conference room at CBS Interactive to share their experience of overthrowing the tyranny of slow, expensive relational databases and of managing data in more efficient and less costly ways.
The advocates' argument runs: "Relational databases impose too much on you. They force you to twist your object data to fit the needs of an RDBMS (relational database management system)," whereas NoSQL-based alternatives "just give you what you need."
Horizontal scalability refers to the ability to connect multiple hardware and software units so that multiple servers can logically be treated as a single entity.
7 Challenges
Although most NoSQL data storage systems have been deployed in real-world applications, the current state of research still leaves many challenging problems open:
- Key-value database products are mostly built independently for specific applications and lack generality;
- The products support only limited functionality (for example, no transactional features), which restricts their range of application;
- Some research results and improved NoSQL data storage systems already exist, but they are solutions tailored to particular application requirements, such as support for intra-group transactions or elastic transactions; they seldom consider the generality of the system from a global perspective, and they have not formed a systematic body of research results;
- NoSQL lacks the strong theoretical support (such as the Armstrong axiom system), technical support (such as mature heuristic-based optimization strategies and the two-phase locking protocol), and standards support (such as the SQL language) that relational databases enjoy;
- At present, HBase is one of the NoSQL database products with the most complete security features, whereas most other NoSQL databases provide no built-in security. As NoSQL develops, more and more people are recognizing the importance of security, and some NoSQL products have gradually begun to provide security support.
With the development of cloud computing, the Internet, and related technologies, big data has become pervasive, and many new applications have appeared in cloud environments, such as social networks, mobile services, and collaborative editing. These new applications place new requirements on massive-data management (cloud data management) systems, such as transaction support and system elasticity. The design goals of massive-data management systems in the cloud computing era are scalability, elasticity, fault tolerance, self-management, and "strong consistency." Current systems achieve scalability by allowing nodes to be added and removed at will, ensure fault tolerance through replica policies, and implement self-management by coordinating on monitored status messages. The goal of "elasticity" is to support the pay-per-use model and so improve the utilization of system resources. This property is not yet well developed in typical NoSQL database systems, but it is a defining property of cloud systems. "Strong consistency" is mainly a requirement of the new applications. [4]
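One widely used way to let nodes join and leave at will without reshuffling all the data is consistent hashing. The text above does not name this technique, so the sketch below is a swapped-in illustration rather than a description of any particular system. The `HashRing` class, the node names, and the choice of 100 virtual nodes per server are all assumptions.

```python
import bisect
import hashlib


def _position(text):
    """Map a string to a point on the ring (a 64-bit integer)."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Toy consistent-hash ring with virtual nodes."""

    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes
        self._points = []  # sorted list of (position, node_name)
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        # Each server is placed on the ring at many pseudo-random points.
        for i in range(self.vnodes):
            bisect.insort(self._points, (_position(f"{node}#{i}"), node))

    def remove_node(self, node):
        self._points = [p for p in self._points if p[1] != node]

    def owner(self, key):
        """First node point at or after the key's position, wrapping around."""
        if not self._points:
            raise LookupError("ring has no nodes")
        index = bisect.bisect_left(self._points, (_position(key), ""))
        return self._points[index % len(self._points)][1]


keys = [f"user:{i}" for i in range(1000)]
ring = HashRing(["node-a", "node-b", "node-c"])
before = {key: ring.owner(key) for key in keys}
ring.add_node("node-d")
after = {key: ring.owner(key) for key in keys}
moved = [key for key in keys if before[key] != after[key]]
```

In this scheme a key either stays where it was or moves to the newly added node, so only part of the data has to be migrated when the cluster grows.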
References
1. Family graph of NoSQL databases. TechTarget Business Intelligence. 2014-08-06 [cited 2015-01-07].
2. Read the four families of NoSQL databases. Cloud-gen Storage [cited 2014-11-27].
3. Four types of NoSQL databases. Sina Blog [cited 2014-12-05].
4. Hugoo, Wang Yuete, Nietiao, et al. A review of NoSQL systems supporting big data management [J].