From relational database to non-relational database

Source: Internet
Author: User
1. Relational database

A relational database is a database that uses a relational model to organize data.

The relational model was first proposed by IBM researcher Dr. E. F. Codd in 1970. Over the following decades the concept was developed fully, and it gradually became the mainstream model for database structure.

In short, the relational model is a two-dimensional table model, and a relational database is a collection of data organized as two-dimensional tables and the relationships between them.

Common concepts in the relational model:

Relation: can be understood as a two-dimensional table. Each relation has a relation name, usually called the table name.

Tuple: can be understood as a row in a two-dimensional table; in a database it is often called a record.

Attribute: can be understood as a column in a two-dimensional table; in a database it is often called a field.

Domain: the range of values an attribute may take, that is, the value restriction on a column in a database.

Key: a set of attributes that uniquely identifies a tuple; in a database it is often called the primary key and consists of one or more columns.

Relation schema: a description of a relation, written as: relation name (attribute 1, attribute 2, ..., attribute n). In a database it is called the table structure.
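The concepts above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the student table, its columns, and the sample rows are hypothetical and do not come from the original text.

```python
import sqlite3

# In-memory database so the sketch leaves no files behind.
conn = sqlite3.connect(":memory:")

# Relation schema: student(id, name, age). Each column is an attribute,
# the CHECK constraint restricts the domain of "age", and "id" is the key.
conn.execute(
    """
    CREATE TABLE student (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        age  INTEGER CHECK (age BETWEEN 0 AND 150)
    )
    """
)

# Each inserted row is one tuple (record) of the relation.
conn.executemany(
    "INSERT INTO student (id, name, age) VALUES (?, ?, ?)",
    [(1, "Alice", 20), (2, "Bob", 22)],
)

# The key identifies exactly one tuple.
row = conn.execute(
    "SELECT id, name, age FROM student WHERE id = ?", (2,)
).fetchone()

# A second tuple with an existing key value is rejected by the database.
try:
    conn.execute("INSERT INTO student (id, name, age) VALUES (1, 'Carol', 19)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

Here the table name plays the role of the relation name, and the CREATE TABLE statement is the table structure that the relation schema becomes in a database.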

Advantages of relational databases:

Easy to understand: the two-dimensional table structure is very close to how the logical world is conceived, so the relational model is easier to understand than the network, hierarchical, and other models.

Easy to use: the common SQL language makes operating a relational database very convenient.

Easy to maintain: rich integrity constraints (entity integrity, referential integrity, and user-defined integrity) greatly reduce the probability of data redundancy and data inconsistency.

2. Relational database bottleneck

High-concurrency read and write requirements

Website user concurrency is very high, often reaching tens of thousands of read and write requests per second. For a traditional relational database, hard disk I/O is a major bottleneck.

Efficient reading and writing of massive data

The amount of data generated by a website is enormous, and for a relational database, querying a table that contains a massive amount of data is very inefficient.

High scalability and availability

In a web-based architecture, the database is the hardest component to scale horizontally. As the number of users and visits to an application grows, the database cannot extend its performance and load capacity simply by adding more hardware and service nodes, the way a web server or app server can. For many websites that need to provide uninterrupted 24-hour service, upgrading and expanding the database system is very painful and often requires downtime for maintenance and data migration.


For websites, many of the features of relational databases are no longer needed:

Transactional consistency

Relational databases incur a lot of overhead in maintaining transactional consistency, while many Web 2.0 systems today have low requirements for read and write consistency.

Real-time reads

In a relational database, a query issued immediately after inserting a piece of data is certain to read that data. Many web applications do not require such a high degree of real-time behavior. For example, after a message is posted, it is completely acceptable for it to become visible only a few seconds or even more than ten seconds later.

Complex SQL, especially multi-table join queries

Any web system with a large amount of data strongly avoids join queries across multiple large tables, as well as complex SQL report queries of the data-analysis type. SNS-type websites in particular avoid this situation from the requirements and product design stage onward. More often they need only primary key queries on a single table and simple conditional paging queries on a single table, so the role of SQL is greatly weakened.
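The two query shapes mentioned above, a primary key lookup and a simple conditional paging query on a single table, might look like the following sketch. It uses Python's built-in sqlite3 module; the post table and the helper names get_post and list_posts are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE post (id INTEGER PRIMARY KEY, author TEXT, body TEXT)")
# Ten sample posts: odd ids belong to "alice", even ids to "bob".
conn.executemany(
    "INSERT INTO post (id, author, body) VALUES (?, ?, ?)",
    [(i, "alice" if i % 2 else "bob", f"post {i}") for i in range(1, 11)],
)


def get_post(post_id):
    """Primary key lookup on a single table."""
    return conn.execute(
        "SELECT id, author, body FROM post WHERE id = ?", (post_id,)
    ).fetchone()


def list_posts(author, page, page_size):
    """Simple conditional query with paging; page numbers start at 1."""
    offset = (page - 1) * page_size
    rows = conn.execute(
        "SELECT id FROM post WHERE author = ? ORDER BY id LIMIT ? OFFSET ?",
        (author, page_size, offset),
    ).fetchall()
    return [r[0] for r in rows]
```

Neither query touches more than one table, which is why such workloads use only a small part of what SQL offers.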


In relational databases, the most important causes of poor performance are multi-table join queries and complex SQL report queries of the data-analysis type. To guarantee the ACID properties of the database, we must design as closely as possible to the normal forms it requires, and the tables in a relational database store a formatted data structure. Every tuple has the same set of fields: even if not every tuple needs all the fields, the database allocates all of them for each tuple. This structure makes joins between tables easier, but from another perspective it is also a factor in the performance bottleneck of relational databases.

3. NoSQL

The term NoSQL was first introduced by Carlo Strozzi in 1998 to refer to a lightweight, open source relational database he had developed that did not provide SQL. This definition is very different from the current one: as the name implies, it really did mean a database without SQL. But the development of NoSQL gradually moved away from that original intention. What we want is not "no SQL" but "no relational", that is, what we now usually call non-relational databases.

In early 2009, Johan Oskarsson organized a discussion on open source distributed databases, and in that discussion Eric Evans proposed the term NoSQL again, to refer to data storage systems that are non-relational, distributed, and generally do not guarantee the ACID properties. Eric Evans did not use the word NoSQL for its literal meaning of "no SQL". He simply felt that many classic relational databases have names of the form "**SQL", so he used the word "NoSQL" to signal a different positioning from relational databases.

Note: database transactions must have the ACID properties. ACID stands for atomicity, consistency, isolation, and durability.


Non-relational databases propose another idea, for example storing data as key-value pairs with no fixed structure. Each tuple can have different fields and can add key-value pairs of its own as needed, so it is not limited to a fixed structure, which can reduce some time and space overhead. Users can add whatever fields they need, so obtaining different pieces of information about a user does not require a join query across multiple tables as it would in a relational database. The query is completed simply by retrieving the corresponding value by ID. However, because a non-relational database has very few constraints, it cannot provide queries on field values the way the WHERE clause in SQL does, and it is difficult to reflect the integrity of the design. It is suitable only for storing relatively simple data; for data that requires more complex queries, a SQL database is more appropriate.
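The key-value idea can be illustrated with a plain Python dict standing in for the store. This is a sketch under stated assumptions, not the API of any real product: the key format, the field names, and the helpers put, get, and find_by_field are all hypothetical.

```python
# A dict stands in for a key-value store: each key maps to a value whose
# fields are not fixed in advance.
store = {}


def put(key, value):
    store[key] = value


def get(key):
    # Returns None when the key is absent.
    return store.get(key)


# Two records with different sets of fields; no shared schema is required.
put("user:1", {"name": "Alice", "email": "alice@example.com"})
put("user:2", {"name": "Bob", "nickname": "bobby", "city": "Hangzhou"})

# Retrieval by ID is a single lookup, with no join across tables.
user = get("user:2")


def find_by_field(field, wanted):
    # The store offers no equivalent of a SQL WHERE clause on field values,
    # so the application has to scan every value itself.
    return sorted(k for k, v in store.items() if v.get(field) == wanted)
```

The lookup by key is cheap, while find_by_field has to visit every record, which mirrors the trade-off described above.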


4. Relational database vs. non-relational database

The most important feature of a relational database is transactional consistency: read and write operations in a traditional relational database are transactional and have the ACID properties. This allows relational databases to be used in almost any system that requires consistency, such as a typical banking system.
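A minimal sketch of what transactional behavior means for a banking-style transfer, using Python's built-in sqlite3 module. The account table, the starting balances, and the helpers transfer and balances are hypothetical. The credit is deliberately applied before the debit so that a failed debit shows the earlier update being rolled back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(src, dst, amount):
    """Move amount from src to dst; both updates commit or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            # Violates the CHECK constraint if src would go negative.
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances():
    return dict(conn.execute("SELECT id, balance FROM account ORDER BY id"))
```

Calling transfer(1, 2, 30) moves the money, while a later transfer(1, 2, 500) fails and leaves both balances exactly as they were, which is the atomicity part of ACID.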

However, in web applications, and in SNS applications in particular, consistency is not so important. It is tolerable if user A and user B see inconsistent versions of user C's updates, or if two people see an update to the same friend's data a few seconds apart. The most important feature of a relational database is therefore of little use here, or at least not so important.

Conversely, the price a relational database pays for maintaining consistency is poor read and write performance. SNS applications such as Weibo and Facebook place very high demands on concurrent reads and writes, and relational databases can no longer cope. (On the read side, the traditional way to overcome this weakness and improve performance has been to add a layer of memcache to cache static web pages. In SNS applications the data changes too quickly, and memcache is powerless.) A new kind of structured data storage is therefore needed to replace the relational database.
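Why a cache layer helps with static pages but not with rapidly changing data can be sketched as follows. Plain dicts stand in for memcache and for the database, and the read and write helpers follow a simple cache-aside pattern chosen for this illustration; none of this is taken from the original text.

```python
# One dict stands in for the database, another for memcache.
database = {"page:home": "v1"}
cache = {}
stats = {"hits": 0, "misses": 0}


def read(key):
    """Cache-aside read: serve from the cache, fall back to the database."""
    if key in cache:
        stats["hits"] += 1
        return cache[key]
    stats["misses"] += 1
    value = database[key]
    cache[key] = value
    return value


def write(key, value):
    """Update the database and drop the now-stale cache entry."""
    database[key] = value
    cache.pop(key, None)


# Rarely changing content: one miss, then every read is a hit.
for _ in range(5):
    read("page:home")
static_stats = dict(stats)

# Rapidly changing content: every write invalidates the entry,
# so each following read goes back to the database.
stats.update(hits=0, misses=0)
for i in range(5):
    write("page:home", f"v{i + 2}")
    read("page:home")
dynamic_stats = dict(stats)
```

With unchanged content the cache absorbs four of the five reads. When every read is preceded by a write, all five reads miss, so the database carries the full load again.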

Another characteristic of relational databases is a fixed table structure, which makes them hard to extend. In an SNS, system upgrades and new features often mean huge changes to the data structure. Relational databases find this difficult to handle as well, so a new kind of structured data storage is needed.

Thus non-relational databases came into being. Because no single data structure can satisfy all of the new requirements, a non-relational database is strictly speaking not one kind of database but a collection of structured data storage methods.

It must be emphasized that persistent storage of data, especially of massive amounts of data, still requires a relational database, the veteran in this field.


5. Non-relational database classification

Because non-relational databases are inherently diverse and have existed for only a short time, the field, unlike that of relational databases, is not dominated by a handful of products. There are a great many non-relational databases, and most of them are open source.

Most of these databases are relatively simple in implementation. Apart from some common features, a large part of them were created for specific application requirements, so they perform very well for those applications. By structure and application scenario, they fall mainly into the following categories:

Key-value databases for high-performance concurrent reads and writes:

The main feature of a key-value database is very high concurrent read and write performance. Redis, Tokyo Cabinet, and Flare are representatives of this kind.

Document-oriented databases for access to massive data:

This kind of database can query data quickly within a massive amount of data. Typical representatives are MongoDB and CouchDB.

Distributed databases for scalability:

The problem this kind of database sets out to solve is the scalability weakness of traditional databases. It can adapt to growth in data volume and to changes in data structure.
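One common way to spread data over several nodes is to place each key by a hash of the key. The sketch below illustrates that general idea only and does not describe how any particular product works; the ShardedStore class is hypothetical, and each node is simulated with an in-memory dict.

```python
import hashlib


class ShardedStore:
    """Hypothetical key-value store that spreads keys over several nodes."""

    def __init__(self, node_count):
        # Each dict stands in for one storage node.
        self.nodes = [{} for _ in range(node_count)]

    def _node_for(self, key):
        # A stable hash, so the same key always maps to the same node.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.nodes)
        return self.nodes[index]

    def put(self, key, value):
        self._node_for(key)[key] = value

    def get(self, key):
        return self._node_for(key).get(key)


store = ShardedStore(node_count=4)
for i in range(100):
    store.put(f"user:{i}", {"id": i})
```

Each read or write touches only one node, so capacity grows with the number of nodes. Plain modulo placement like this remaps most keys when the node count changes, which is one reason real systems tend to use more elaborate placement schemes.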



