First, Introduction
NoSQL ("Not Only SQL") refers broadly to non-relational databases. With the rise of Web 2.0 sites, traditional relational databases struggled to cope, especially with very large, highly concurrent, fully dynamic social networking (SNS) sites, and exposed many problems that were hard to overcome. Non-relational databases, by contrast, developed rapidly because of their particular characteristics. NoSQL databases were created to address the challenges posed by large data sets containing many types of data, and big data applications in particular.
Although NoSQL has been a buzzword for only about a year, there is no denying that a second generation of the movement has begun. The early code was little more than an experiment, but today's systems are more mature and stable. There is also a sobering fact: the technology has matured so much that some good NoSQL data stores have had to be rewritten, and a few consider the result version 2.0. The categories below include some of the better-known tools for building fast, scalable stores for big data.
Second, Four Categories of NoSQL Databases
1. Key-Value Store
This type of database primarily uses a hash table in which each key has a pointer to a particular piece of data. For IT systems, the advantages of the key-value model are simplicity and ease of deployment. However, when only part of a value needs to be queried or updated, the key-value model becomes inefficient, because the store typically treats each value as an opaque unit. Examples: Tokyo Cabinet/Tyrant, Redis, Voldemort, Oracle BDB, Riak.
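The key-value model can be sketched in a few lines of Python, using a dict as the hash table. The `KeyValueStore` class and `update_field` helper are hypothetical names chosen for illustration, not the API of any product listed above. The sketch assumes values are opaque bytes, so changing one field means reading and rewriting the whole value:

```python
import json


class KeyValueStore:
    """Toy in-memory key-value store: a hash table from key to opaque bytes."""

    def __init__(self):
        self._table = {}  # a Python dict is a hash table

    def put(self, key, value):
        self._table[key] = value

    def get(self, key):
        return self._table.get(key)


def update_field(store, key, field, new_value):
    """Change one field of a JSON value; the store itself cannot do this."""
    record = json.loads(store.get(key))   # read and decode the whole value
    record[field] = new_value             # change a single field
    store.put(key, json.dumps(record).encode("utf-8"))  # write it all back


store = KeyValueStore()
store.put("user:1", json.dumps({"name": "Ann", "city": "Oslo"}).encode("utf-8"))
update_field(store, "user:1", "city", "Rome")
print(json.loads(store.get("user:1")))
```

Lookups by key are a single hash-table access, but the partial update costs a full read, decode, encode, and write of the value.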
2. Column Store Database
This type of database is typically used for distributed storage of massive amounts of data. Keys still exist, but each key points to multiple columns, and the columns are grouped into column families. Examples: Cassandra, HBase.
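As a rough illustration of a key pointing to multiple columns grouped by column family, the following Python sketch models a table as nested dictionaries. `ColumnFamilyTable` and its methods are hypothetical and only mimic the data layout; a real column store adds on-disk storage, distribution, and versioning that are not shown here.

```python
from collections import defaultdict


class ColumnFamilyTable:
    """Toy wide-column table: row key -> column family -> column -> value."""

    def __init__(self, families):
        self._families = set(families)  # families are declared up front
        self._rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row_key, family, column, value):
        if family not in self._families:
            raise KeyError(f"unknown column family: {family}")
        self._rows[row_key][family][column] = value

    def get_family(self, row_key, family):
        """Return all columns of one family for a row, as a plain dict."""
        if row_key not in self._rows:
            return {}
        return dict(self._rows[row_key].get(family, {}))


table = ColumnFamilyTable(families=["profile", "stats"])
table.put("user:1", "profile", "name", "Ann")
table.put("user:1", "profile", "city", "Oslo")
table.put("user:1", "stats", "logins", 42)
table.put("user:2", "profile", "name", "Bo")  # rows need not share columns

print(table.get_family("user:1", "profile"))
```

Reading one family touches only that family's columns, which is the property that makes this layout attractive for wide, sparse rows.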
3. Document Database
The document database was inspired by Lotus Notes and is similar to the key-value store described first. Its data model is the versioned document: semi-structured documents stored in a specific format such as JSON. A document database can be regarded as an upgraded key-value database that allows key-value pairs to be nested, and it can query on the contents of a document more efficiently than a key-value database can. Examples: CouchDB, MongoDB. SequoiaDB, a document database developed in China, has also been open-sourced.
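To make the nesting of key-value pairs concrete, here is a minimal Python sketch of a document collection that can be queried by a field inside a nested document. `DocumentCollection`, `find`, and the dotted-path convention are assumptions made for this example; they are not the query API of CouchDB, MongoDB, or SequoiaDB.

```python
_MISSING = object()  # sentinel for "path not present in this document"


def _lookup(document, path):
    """Follow a dotted path such as 'address.city' through nested dicts."""
    current = document
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return _MISSING
        current = current[part]
    return current


class DocumentCollection:
    """Toy document collection: id -> nested dict (a JSON-like document)."""

    def __init__(self):
        self._docs = {}

    def insert(self, doc_id, document):
        self._docs[doc_id] = document

    def find(self, path, value):
        """Return ids of documents whose dotted `path` equals `value`."""
        return [doc_id for doc_id, doc in self._docs.items()
                if _lookup(doc, path) == value]


users = DocumentCollection()
users.insert("u1", {"name": "Ann", "address": {"city": "Oslo", "zip": "0150"}})
users.insert("u2", {"name": "Bo", "tags": ["admin"]})  # different shape is fine
users.insert("u3", {"name": "Cy", "address": {"city": "Oslo"}})

print(users.find("address.city", "Oslo"))  # ['u1', 'u3']
```

Because the store understands the document's structure, it can filter on `address.city` directly. This sketch scans every document; real document databases usually add indexes on such fields.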
4. Graph Database
A graph database differs from the row-and-column, rigidly structured model of a SQL database: it uses a flexible graph model and can scale out across multiple servers. NoSQL databases have no standard query language comparable to SQL, so queries must be written against each database's own data model. Many NoSQL databases offer RESTful data interfaces or query APIs. Examples: Neo4j, InfoGrid, InfiniteGraph.
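The graph model, and the idea of querying through the data model rather than through SQL, can be sketched as follows. The `Graph` class, the edge labels, and the `within_hops` traversal are hypothetical and kept in memory on one machine; they do not reproduce the API of Neo4j or any other product.

```python
from collections import defaultdict, deque


class Graph:
    """Toy graph store: nodes connected by labeled, directed edges."""

    def __init__(self):
        self._edges = defaultdict(list)  # node -> [(label, neighbor), ...]

    def add_edge(self, source, label, target):
        self._edges[source].append((label, target))

    def neighbors(self, node, label):
        return [t for (lbl, t) in self._edges.get(node, []) if lbl == label]

    def within_hops(self, start, label, max_hops):
        """Breadth-first search over `label` edges, up to `max_hops` away."""
        seen = {start}
        queue = deque([(start, 0)])
        found = []
        while queue:
            node, depth = queue.popleft()
            if depth == max_hops:
                continue  # do not expand beyond the hop limit
            for nxt in self.neighbors(node, label):
                if nxt not in seen:
                    seen.add(nxt)
                    found.append(nxt)
                    queue.append((nxt, depth + 1))
        return found


g = Graph()
g.add_edge("ann", "FOLLOWS", "bo")
g.add_edge("bo", "FOLLOWS", "cy")
g.add_edge("cy", "FOLLOWS", "dee")
g.add_edge("ann", "LIKES", "post1")

print(g.within_hops("ann", "FOLLOWS", 2))  # ['bo', 'cy']
```

A "friends of friends" question like this is a short traversal in a graph model, whereas a relational schema would typically need one self-join per hop.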
In summary, NoSQL databases are a good fit when: (1) the data model is relatively simple; (2) the IT system needs more flexibility; (3) database performance requirements are high; (4) strong data consistency is not required; and (5) complex values can readily be mapped to a given key.
Third, Common Features
NoSQL has no precise scope or definition, but NoSQL databases share some common features:
No predefined schema: There is no need to define a data schema or table structure in advance. Each record may have different attributes and a different format, and data can be inserted without declaring its schema first.
Shared-nothing architecture: In contrast to a shared-everything architecture, in which all data sits on a storage area network, NoSQL systems usually partition the data and store it on the local disks of individual servers. Because reading from a local disk tends to be faster than reading over a network, this improves system performance.
Elastic scalability: Nodes can be added or removed dynamically while the system is running. No downtime for maintenance is required, and data can be migrated automatically.
Partitioning: Rather than storing all data on a single node, a NoSQL database partitions the data and spreads the records across multiple nodes, usually replicating each partition as well. This improves parallel performance and avoids a single point of failure.
Asynchronous replication: Unlike a RAID storage system, replication in NoSQL is often log-based and asynchronous. Data can therefore be written to a node as quickly as possible, without waiting on network transmission. The drawback is that consistency is not always guaranteed, and a small amount of data may be lost if a failure occurs.
BASE: In contrast to the strict ACID properties of transactions, NoSQL databases offer the BASE properties (Basically Available, Soft state, Eventual consistency), which amount to eventual consistency and soft transactions.
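Several of these features (partitioning, asynchronous log-based replication, and eventual consistency) can be seen together in a small Python sketch. Everything here is a simplifying assumption for illustration: the `Node` and `Cluster` classes are hypothetical, the partition is chosen by hashing the key modulo the node count, each key has exactly one replica on the next node, and `replicate()` stands in for a background process.

```python
import hashlib


class Node:
    """One server: local data plus a log of writes not yet replicated."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.pending_log = []  # (key, value) writes awaiting replication


class Cluster:
    def __init__(self, node_names):
        self.nodes = [Node(n) for n in node_names]

    def _primary_index(self, key):
        # Stable hash so the same key always maps to the same partition.
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return int(digest, 16) % len(self.nodes)

    def primary(self, key):
        return self.nodes[self._primary_index(key)]

    def replica(self, key):
        # Assumption: the next node in the list holds the copy.
        return self.nodes[(self._primary_index(key) + 1) % len(self.nodes)]

    def write(self, key, value):
        """Acknowledge after the local write; replication happens later."""
        node = self.primary(key)
        node.data[key] = value
        node.pending_log.append((key, value))

    def replicate(self):
        """Ship every node's pending log entries to the replica nodes."""
        for node in self.nodes:
            for key, value in node.pending_log:
                self.replica(key).data[key] = value
            node.pending_log.clear()


cluster = Cluster(["n0", "n1", "n2"])
cluster.write("user:1", "Ann")
stale = cluster.replica("user:1").data.get("user:1")  # not yet replicated
cluster.replicate()
fresh = cluster.replica("user:1").data.get("user:1")  # replicas converged
print(stale, fresh)  # None Ann
```

Between `write` and `replicate` the replica returns stale data, and if the primary failed in that window the pending write would be lost. That window is the trade-off described above: fast writes in exchange for eventual rather than immediate consistency.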
NoSQL databases have no unified architecture; the differences between two NoSQL databases can be far greater than those between two relational databases. Each NoSQL database has its own strengths, and a successful one must be especially well suited to certain situations or applications, where it far outperforms relational databases and other NoSQL databases.
Fourth, Applicable Scenarios
NoSQL databases are a better fit in the following cases:
1. The data model is relatively simple;
2. The IT system needs more flexibility;
3. Database performance requirements are high;
4. Strong data consistency is not required;
5. Complex values can readily be mapped to a given key.
Fifth, Future and Problems
Although most NoSQL data storage systems have been deployed in real-world applications, the current state of research still leaves many challenging problems open:
Existing key-value database products are mostly developed in-house for specific applications and lack generality.
Existing products support a limited set of features (transactions, for example, are not supported), which restricts the applications they can serve.
Some research results and improved NoSQL data storage systems have appeared, but they are solutions tailored to particular application needs, such as support for group transactions or elastic transactions. Few consider the generality of the system as a whole, and no systematic body of research has formed.
NoSQL lacks the strong theoretical foundations of relational databases (such as Armstrong's axioms), their mature techniques (such as heuristic query optimization strategies and the two-phase locking protocol), and standards such as the SQL language.
Currently, HBase is one of the most secure NoSQL database products, while most other NoSQL databases provide no built-in security. As NoSQL develops, however, more and more people are recognizing the importance of security, and some NoSQL products are beginning to provide security support.
With the development of cloud computing, the Internet, and related technologies, big data has become widespread, and many new applications have emerged in cloud environments, such as social networks, mobile services, and collaborative editing. These applications place new demands on massive data management (or cloud data management) systems, such as transaction support and system elasticity. The design goals of a massive data management system in the cloud computing era are scalability, elasticity, fault tolerance, self-management, and "strong consistency". Today's systems achieve scalability by allowing nodes to be added and removed at will, fault tolerance through replication policies, and self-management through monitoring and the coordination of status messages. The goal of elasticity is to support the pay-per-use model and thereby improve the utilization of system resources; it is poorly supported by typical NoSQL database systems, yet it is a defining characteristic of cloud systems. "Strong consistency" is mainly a requirement of the new applications.