I opened this NoSQL column almost a year ago and have not posted a single article in it, so today I am writing one to fill the gap. A great many people use NoSQL now, the term is no longer unfamiliar, and Chinese-language material on it is everywhere. Yet few sources answer the basic questions about NoSQL. Perhaps everyone pays more attention to applying the practical technologies and overlooks the essence of the concept.
What is NoSQL?
Baidu Encyclopedia: NoSQL refers to non-relational databases. Chinese name: non-relational database; foreign name: NoSQL = Not Only SQL.
Wikipedia: A NoSQL (originally referring to "non SQL" or "non relational") database provides a mechanism for storage and retrieval of data which is modeled in means other than the tabular relations used in relational databases.
In other words, a NoSQL database stores and retrieves data using a model other than the tabular relations of a relational database.
The Ultimate NoSQL Guide (nosql-database.org), which Wikipedia references, gives this definition: NoSQL DEFINITION:
Next Generation Databases mostly addressing some of the points: being non-relational, distributed, open-source and horizontally scalable.
That is, NoSQL denotes next-generation databases that mostly address the following points: being non-relational, distributed, open source, and horizontally scalable (scale-out).
The original intention has been modern web-scale databases. The movement began early 2009 and is growing rapidly. Often more characteristics apply such as: schema-free, easy replication support, simple API, eventually consistent / BASE (not ACID), a huge amount of data and more. So the misleading term "NoSQL" (the community now translates it mostly with "Not only SQL") should be seen as an alias to something like the definition above.
That is, the original intention was modern web-scale databases.
The movement began in early 2009 and is growing rapidly.
Further characteristics often apply, such as: schema-free (no predefined schema required), easy replication support, a simple API, eventual consistency / BASE (rather than ACID), and support for huge amounts of data.
So the misleading term "NoSQL" (which the community now mostly reads as "Not only SQL") should be seen as an alias for something like the definition above.
Origins
NoSQL has become hot and has grown quickly in recent years, but when did it actually start?
Wikipedia: Such databases have existed since the late 1960s, but did not obtain the "NoSQL" moniker until a surge of popularity in the early twenty-first century.
That is, such databases have existed since the late 1960s, but at the time they were not called "NoSQL".
Earlier workloads were simply better suited to relational databases, so most people neither needed NoSQL-type databases nor knew about them.
The term NoSQL first appeared in 1998 as the name of a lightweight, open-source relational database developed by Carlo Strozzi that did not expose the standard SQL interface. (Strozzi has argued that, because the current NoSQL movement departs from the relational model altogether, it would more appropriately have been called "NoREL" or something similar.)
In 2009, Johan Oskarsson of Last.fm organized a discussion on open-source distributed databases, and Eric Evans of Rackspace reintroduced the term NoSQL. At that point NoSQL mainly referred to database designs that were non-relational, distributed, and did not provide ACID guarantees.
The no:sql(east) conference, held in Atlanta in 2009, was a milestone; its slogan was "select fun, profit from real_world where relational=false;". The most common reading of NoSQL is therefore "non-relational", which emphasizes the advantages of key-value stores and document databases rather than simply opposing relational databases.
Why NoSQL Emerged
With the rise of Web 2.0 sites, traditional relational databases struggled to cope, especially with very large, highly concurrent, purely dynamic Web 2.0 sites of the SNS (social networking) type, and exposed many problems that were hard to overcome. Non-relational databases, thanks to their own characteristics, developed very rapidly. NoSQL databases were created to address the challenges posed by large-scale data sets with many data types, especially the challenges of big data applications.
Four Categories of NoSQL Databases
Key-value store. This type of database primarily uses a hash table in which a specific key points to a specific piece of data. For IT systems, the advantage of the key/value model is its simplicity and ease of deployment. However, if a DBA needs to query or update only part of a value, the key/value model becomes inefficient. Examples include Tokyo Cabinet/Tyrant, Redis, Voldemort, and Oracle BDB.
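To make the partial-update cost concrete, here is a minimal sketch of a key-value store, assuming an in-memory Python dict as the hash table and JSON as the value encoding. The names `store`, `put`, `get`, and `update_field` are hypothetical and do not come from any of the products listed above.

```python
import json

# Hypothetical in-memory key-value store: a hash table from key to an opaque value.
store: dict[str, bytes] = {}

def put(key: str, value: dict) -> None:
    """Serialize the whole value and store it under the key."""
    store[key] = json.dumps(value).encode("utf-8")

def get(key: str) -> dict:
    """Fetch and deserialize the whole value stored under the key."""
    return json.loads(store[key].decode("utf-8"))

def update_field(key: str, field: str, new_value) -> None:
    """Changing one field still means reading and rewriting the entire value."""
    value = get(key)          # read the full blob
    value[field] = new_value  # modify a small part of it
    put(key, value)           # write the full blob back

put("user:1", {"name": "Alice", "city": "Beijing", "bio": "x" * 1000})
update_field("user:1", "city", "Shanghai")
print(get("user:1")["city"])
```

Because the store treats each value as opaque, changing the short `city` field rewrites the whole record, including the long `bio` field.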
Column store. This type of database is typically used to handle massive amounts of data in distributed storage. Keys still exist, but each key points to multiple columns, and these columns are organized into column families. Examples include Cassandra and HBase (Riak is often listed here as well, although it is primarily a key-value store).
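The "key points to multiple columns, grouped by column family" layout can be pictured as nested maps. The following is only an illustrative sketch with hypothetical names (`table`, `put_cell`, `get_family`); it does not reflect the actual API or storage format of Cassandra or HBase.

```python
from collections import defaultdict

# row key -> column family -> column name -> value
table = defaultdict(lambda: defaultdict(dict))

def put_cell(row_key, family, column, value):
    """Write a single cell; rows do not need to share the same columns."""
    table[row_key][family][column] = value

def get_family(row_key, family):
    """Read only the columns of one family, without touching other families."""
    return dict(table[row_key][family])

put_cell("user:1", "profile", "name", "Alice")
put_cell("user:1", "profile", "city", "Beijing")
put_cell("user:1", "activity", "last_login", "2015-06-01")
put_cell("user:2", "profile", "name", "Bob")  # no "activity" family for this row

print(get_family("user:1", "profile"))
```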
Document database. Document databases were inspired by the Lotus Notes office software and resemble the key-value stores described first. The data model is the versioned document: semi-structured documents are stored in a specific format such as JSON. A document database can be regarded as an upgraded key-value database that allows key-value pairs to be nested, and it can generally be queried more efficiently than a key-value database. Examples include CouchDB and MongoDB. In China there is also the document database SequoiaDB, which has been open-sourced.
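As a rough illustration of nested key-value pairs and querying on an inner field, the sketch below filters a list of Python dicts by a dotted path. The helpers `get_path` and `find` are hypothetical; real document databases such as MongoDB or CouchDB have their own query interfaces and use indexes rather than a linear scan.

```python
documents = [
    {"_id": 1, "name": "Alice", "address": {"city": "Beijing", "zip": "100000"}},
    {"_id": 2, "name": "Bob", "address": {"city": "Shanghai"}, "tags": ["admin"]},
]

def get_path(doc, path):
    """Follow a dotted path such as 'address.city'; return None if it is absent."""
    current = doc
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current

def find(docs, path, expected):
    """Return every document whose value at the given path equals expected."""
    return [d for d in docs if get_path(d, path) == expected]

matches = find(documents, "address.city", "Shanghai")
print([d["name"] for d in matches])
```

Note that the two documents do not share the same fields, which is what "semi-structured" means in practice.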
Graph database. The graph structure of these databases differs from the rigid rows-and-columns structure of SQL databases: it uses a flexible graph model and can scale across multiple servers. NoSQL databases have no standard query language comparable to SQL, so queries have to be designed around the data model. Many NoSQL databases offer REST-style data interfaces or query APIs. Examples include Neo4j, InfoGrid, and InfiniteGraph.
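A graph model stores nodes and the edges between them, so relationship queries become traversals. The sketch below assumes a hypothetical "follows" graph held as an adjacency list and uses breadth-first search; it is not the query API of Neo4j or any other product named above.

```python
from collections import deque

# Adjacency list: node -> set of neighbours (hypothetical "follows" edges).
follows = {
    "alice": {"bob", "carol"},
    "bob": {"dave"},
    "carol": {"dave"},
    "dave": set(),
}

def within_hops(graph, start, max_hops):
    """Return the nodes reachable from start in at most max_hops edges."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in graph.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, depth + 1))
    return seen - {start}

print(sorted(within_hops(follows, "alice", 2)))
```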
In summary, NoSQL databases are a good fit in the following cases: 1) the data model is relatively simple; 2) a more flexible IT system is needed; 3) database performance requirements are high; 4) a high degree of data consistency is not required; 5) it is easy to map complex values to a given key.
Comparative Analysis of the Four Categories
Common Features
• Simple data model. Unlike distributed databases, most NoSQL systems employ a simpler data model in which each record has a unique key; the system supports atomicity only at the single-record level and does not support foreign keys or cross-record relationships. Restricting each operation to a single record greatly improves scalability, because a data operation can be carried out on one machine without the overhead of a distributed transaction.
• Separation of metadata and application data. NoSQL data management systems maintain two types of data: metadata and application data. Metadata is the mapping data used for system administration, such as the assignment of data partitions to nodes and replicas in a cluster. Application data is the business data that users store in the system. The system separates the two because they have different consistency requirements. For the system to function properly, the metadata must be consistent and up to date, whereas the consistency requirements for application data vary by application. To achieve scalability, NoSQL systems therefore manage the two types of data with different strategies. Some NoSQL systems have no metadata at all and solve the problem of mapping data to nodes in other ways.
• Weak consistency. NoSQL systems replicate application data and must keep the replicas consistent. This design makes replica synchronization expensive when data is updated; to reduce that overhead, weak consistency models such as eventual consistency and timeline consistency are widely used.
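The three features above can be sketched together in a toy model. This is an illustration under stated assumptions, not how any particular NoSQL system works: the node list, the `owner` hash function, the single-replica layout, and the `Cluster` class are all hypothetical. Here the mapping of keys to nodes is computed from a hash rather than stored as metadata, each write touches one record on one node, and replication is deferred so a replica can briefly return stale data.

```python
import hashlib

NODES = ["node-a", "node-b", "node-c"]  # hypothetical cluster

def owner(key: str) -> int:
    """Map a key to the index of its primary node using a stable hash."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % len(NODES)

class Cluster:
    def __init__(self):
        self.data = [dict() for _ in NODES]  # one local store per node
        self.pending = []                    # replication log not yet applied

    def write(self, key, value):
        primary = owner(key)
        self.data[primary][key] = value      # single-record update on one node
        replica = (primary + 1) % len(NODES)
        self.pending.append((replica, key, value))  # replicate later

    def read(self, key, from_replica=False):
        primary = owner(key)
        node = (primary + 1) % len(NODES) if from_replica else primary
        return self.data[node].get(key)

    def sync(self):
        """Apply queued updates in order; afterwards the replicas have converged."""
        for replica, key, value in self.pending:
            self.data[replica][key] = value
        self.pending.clear()

cluster = Cluster()
cluster.write("user:1", "v1")
stale = cluster.read("user:1", from_replica=True)  # replica has not seen the write
cluster.sync()
fresh = cluster.read("user:1", from_replica=True)  # replica has caught up
print(stale, fresh)
```

The window between `write` and `sync` is where a weakly consistent system trades freshness for cheaper updates.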
With these techniques, NoSQL copes well with the challenges of massive data. Compared with relational databases, the main advantages of NoSQL data storage and management systems are:
• Avoiding unnecessary complexity. Relational databases provide a wide variety of features and strong consistency, but many of those features are useful only in certain applications, and most are rarely used. NoSQL systems offer fewer features in exchange for better performance.
• High throughput. Some NoSQL systems have much higher throughput than traditional relational database management systems; for example, Google uses MapReduce to process 20 PB of data stored in Bigtable every day.
• High horizontal scalability on clusters of low-end hardware. NoSQL systems scale out well and, unlike clustering approaches for relational databases, scaling out does not cost much. Designing for low-end hardware saves users of NoSQL systems considerable hardware expense.
• Avoiding expensive object-relational mapping. Many NoSQL systems can store data objects directly, which avoids the cost of converting between the relational model in the database and the object model in the program.
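The object-relational mapping point in the last bullet can be illustrated with a small sketch. The `Order` and `OrderLine` classes are hypothetical, and JSON stands in for whatever document format a given NoSQL system uses.

```python
import json
from dataclasses import dataclass, asdict, field

@dataclass
class OrderLine:
    sku: str
    quantity: int

@dataclass
class Order:
    order_id: int
    customer: str
    lines: list = field(default_factory=list)

order = Order(1, "Alice", [OrderLine("A-100", 2), OrderLine("B-200", 1)])

# Document style: the nested object is stored as a single value.
document = json.dumps(asdict(order))

# Relational style: the same object is split into rows for two tables,
# and has to be reassembled with a join when it is read back.
order_row = (order.order_id, order.customer)
line_rows = [(order.order_id, line.sku, line.quantity) for line in order.lines]

# Reading the document back needs no join (the lines stay plain dicts here).
restored = Order(**json.loads(document))
print(restored.customer, len(restored.lines))
```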
Main Disadvantages
Although NoSQL databases provide high scalability and flexibility, they also have drawbacks, mainly:
• The data model and query language lack a rigorous mathematical foundation. SQL, a query language based on relational algebra and relational calculus, rests on a solid mathematical guarantee: even if a structured query is complex, it returns all the data that satisfies its conditions. Since NoSQL systems generally do not use SQL, some of the models they use lack a complete mathematical foundation. This is one of the main reasons the NoSQL landscape is comparatively chaotic.
• ACID properties are not supported. This is both an advantage and a drawback of NoSQL: transactions are still needed in many situations, and ACID properties keep a system's data correct even when an operation is interrupted.
ACID is an abbreviation for the four properties required for database transactions to execute correctly: atomicity, consistency, isolation, and durability. A database that supports transactions must have these four properties; otherwise the correctness of the data cannot be guaranteed during transaction processing, and a transaction is very likely to fail to meet the requirements of the parties involved.
• Limited functionality. Most NoSQL systems offer only simple functionality, which increases the burden on the application layer. For example, implementing ACID properties in the application layer is extremely painful for the programmers who write that code.
• No unified query model. NoSQL systems provide differing query models, which to some extent adds to the burden on developers.
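To show what the ACID discussion above means in practice, here is a minimal sketch of atomicity using SQLite (a relational database) from Python's standard library. The `accounts` table, the `transfer` helper, and the balances are hypothetical; the point is only that both updates commit together or neither does, which is the guarantee an application would have to rebuild by hand on a store without transactions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move money atomically: both updates commit, or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # the CHECK constraint failed; the credit was rolled back too

def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))

ok = transfer(conn, "alice", "bob", 30)       # succeeds
failed = transfer(conn, "alice", "bob", 500)  # would overdraw alice, so it is undone
print(ok, failed, balances(conn))
```

In the failing transfer, the credit to `bob` is applied first and then rolled back when the debit violates the constraint, so no money appears from nowhere.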
Conclusion
NoSQL may have been a gimmick at first, but as Web 2.0 took off, demand for non-relational databases grew rapidly, and such databases sprang up quickly, positioned either as alternatives to relational databases or as a family that goes beyond them. That is how NoSQL made its debut.
References:
Baidu Encyclopedia entry: NoSQL
Wikipedia: NoSQL
Big Data Management system: NoSQL database Past Life
The Ultimate NoSQL Guide (nosql-database.org)
The introduction to NoSQL about NoSQL