From Relational Databases to Document Databases

Source: Internet
Author: User
Tags: unique id

1. Introduction

Before document-oriented NoSQL databases appeared, many developers racked their brains for ways to get more out of relational database technology; now they may want to step outside that mindset. This article introduces the differences between relational databases and distributed document databases and offers some suggestions for application development.

2. Reasons for Change

People are usually reluctant to change, because change is painful unless it clearly solves a problem. With the growth of big data, changing the data model is becoming more and more necessary. In other words, the need for this shift keeps growing, because the big data era demands a great deal of flexibility from both the scaling model of the database and its data model.

2.1 Scaling Model

A relational database is a "scale-up" technology: to store more data or handle more I/O, the server has to be replaced with a larger one. Modern application architectures use "scale-out" instead: rather than buying a larger server, you add commodity servers, virtual machines, or cloud instances behind the load balancer. Capacity can just as easily be reduced when it is no longer needed. Scale-out is already widespread in the application logic tier; database technology has only recently caught up.

2.2 Data Model

The benefits of NoSQL scale-out deployments are widely recognized in the industry, but many people overlook how simple NoSQL data management is: no complex schema has to be designed up front, which matters as much for adoption as the scaling model. With a traditional relational database, you must define the schema before adding data. Every record then has to conform to that strictly defined schema, with a fixed number of columns and fixed data types. Changing the schema of a partitioned relational database can therefore be very troublesome. If your data collection and data management needs change frequently, this rigid schema constraint becomes a barrier. NoSQL databases (document, column-family, key-value, and so on) scale horizontally and need no predefined schema, so the schema does not have to change when requirements change. The rest of this article uses SequoiaDB to introduce document-oriented NoSQL database technology.
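The difference between a predefined schema and schemaless storage can be sketched with Python's standard library. This is only an illustration under stated assumptions, not SequoiaDB code: the relational side uses the built-in sqlite3 module, the document side is modelled as a plain list of dictionaries, and the table name `users` and all field names are made up for the example.

```python
import sqlite3

# Relational side: the schema must exist before any row is inserted.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (id, name) VALUES (1, 'alice')")

# A row with a new attribute is rejected until the schema itself is changed.
try:
    conn.execute("INSERT INTO users (id, name, city) VALUES (2, 'bob', 'Paris')")
    schema_error = None
except sqlite3.OperationalError as exc:
    schema_error = str(exc)

# Only after altering the table can the new attribute be stored.
conn.execute("ALTER TABLE users ADD COLUMN city TEXT")
conn.execute("INSERT INTO users (id, name, city) VALUES (2, 'bob', 'Paris')")
relational_rows = conn.execute(
    "SELECT id, name, city FROM users ORDER BY id"
).fetchall()

# Document side (hypothetical in-memory stand-in for a collection):
# each record carries its own fields, so no schema change is needed.
collection = []
collection.append({"_id": 1, "name": "alice"})
collection.append({"_id": 2, "name": "bob", "city": "Paris"})
```

In the relational half, the existing row ends up with an empty `city` column after the `ALTER TABLE`; in the document half, the first record simply never has that field.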

3. Data Model: Relational vs Document Type

This section compares how four records are stored under the relational and the document data models:

3.1 Relational Data Model

Every record stored in a relational database must follow a fixed schema: a fixed number of columns, each with a specific meaning and a specified data type. To store data of a different shape, the database schema has to be modified. Another feature of the relational model is "database normalization", in which large tables are broken down into smaller, related tables.

In this example, the database stores error log information. Each error record (one row in the left table) has three parts: the error number (err), the time the error occurred, and the data center (DC) where it occurred. To avoid repeating data center information, each error record points to the corresponding row in the right table, which holds the data center details. The left table therefore does not store the DC details itself, in keeping with relational normal forms.

In a relational model, records in multiple tables are often interlinked, and some data is shared by many records. The advantage is less duplicated data. The downside is that when one linked item changes, the records and tables associated with it are locked to prevent inconsistency. ACID transactions are complex in relational databases because the data is spread out. Even for a single record, this web of shared data makes moving relational data between servers complicated and slow, hurts read and write performance, and makes the data harder to distribute.

When storage was expensive and scarce, this trade-off was necessary. Storage prices today are far lower than they were 40 years ago, however, and in many cases the compromise is no longer needed. Spending more storage in exchange for better performance, or for the ability to spread the workload across multiple machines, is a better fit for today's applications.
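The normalized error-log layout described above can be sketched as follows. This is a minimal illustration using Python's built-in sqlite3 module; the table names, column names, and sample rows are assumptions made up for the example, since the original figure only describes an error table that points at a data center table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE datacenters (
    dc_id    INTEGER PRIMARY KEY,
    location TEXT NOT NULL
);
CREATE TABLE errors (
    err         INTEGER NOT NULL,
    occurred_at TEXT    NOT NULL,
    dc_id       INTEGER NOT NULL REFERENCES datacenters(dc_id)
);
INSERT INTO datacenters VALUES (1, 'Beijing'), (2, 'Shanghai');
INSERT INTO errors VALUES
    (404, '2016-08-17 10:00:00', 1),
    (500, '2016-08-17 10:05:00', 1),
    (503, '2016-08-17 10:07:00', 2);
""")

# Normalized: reading one complete error record needs a join across both tables.
joined = conn.execute("""
    SELECT e.err, e.occurred_at, d.location
    FROM errors AS e
    JOIN datacenters AS d ON d.dc_id = e.dc_id
    ORDER BY e.occurred_at
""").fetchall()

# Denormalized alternative: every record embeds its own data center details,
# trading some duplicated storage for records that can be read on their own.
denormalized = [
    {"err": err, "time": occurred_at, "dc": {"location": location}}
    for err, occurred_at, location in joined
]
```

The data center location is stored once in the normalized form and repeated per record in the denormalized form, which is the storage-for-independence trade discussed above.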

3.2 Document Data Model

The word "document" may seem strange here, but the "document data model" has little to do with the traditional meaning of the word. It is not a book, a letter, or an article; a "document" is a data record that "self-describes" the type and content of the data it contains. XML documents, HTML documents, and JSON documents all fall into this category. SequoiaDB is a document database that uses the JSON format and stores data like this:

    {
      "_id": { "$oid": "57b44b2b2b57085321000001" },
      "items": [
        {
          "shopid": 8224,
          "picture": "http://avatar.csdn.net/B/1/9/1_qq_16912651.jpg",
          "amount": 1,
          "price": "117.59",
          "itemname": "Coffee",
          "itemid": 194987
        },
        {
          "shopid": 9291,
          "attribute": [
            { "color": "Blue", "size": "M" },
            { "color": "Pink", "size": "M" }
          ],
          "picture": "http://avatar.csdn.net/B/1/9/1_qq_16912651.jpg",
          "price": "17.63",
          "itemname": "t-shirt",
          "itemid": 543514
        }
      ],
      "isactive": true,
      "uid": 123456
    }

As you can see, the data has no fixed shape, and each SequoiaDB record contains all of its own information with no external references: the record is "self-contained". This makes it easy to move a record to another server in its entirety, because everything it needs is inside it and there is no risk of related information in other tables being left behind. In addition, a move only has to operate on the record (document) being moved, rather than lock every linked table as in the relational model, so ACID guarantees are cheaper to provide and reads and writes become much faster.
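The "self-contained" property can be illustrated with a short Python sketch. It uses a trimmed version of the sample order document (field names written in lowercase, which is an assumption about the original) and plain `json` serialization as a stand-in for moving a record between servers; it does not use any SequoiaDB API. The helper `order_total` is hypothetical.

```python
import json
from decimal import Decimal

# Trimmed version of the sample order document.
order = {
    "_id": {"$oid": "57b44b2b2b57085321000001"},
    "items": [
        {"shopid": 8224, "amount": 1, "price": "117.59",
         "itemname": "Coffee", "itemid": 194987},
        {"shopid": 9291,
         "attribute": [{"color": "Blue", "size": "M"},
                       {"color": "Pink", "size": "M"}],
         "price": "17.63", "itemname": "t-shirt", "itemid": 543514},
    ],
    "isactive": True,
    "uid": 123456,
}

def order_total(doc):
    """Sum price * amount over the embedded items.

    A missing "amount" is treated as 1, which is an assumption made
    for this example because the second item has no such field.
    """
    return sum(Decimal(item["price"]) * item.get("amount", 1)
               for item in doc["items"])

# "Moving" the record amounts to shipping one serialized document;
# no rows from other tables have to travel with it.
wire_format = json.dumps(order)
received = json.loads(wire_format)
total = order_total(received)
```

Everything needed to compute the total arrives inside the one document, so the receiving side can work on it without consulting any other table.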

4. Applying the Document Data Model

You may need to set old habits aside for a while, but don't worry: what you already know still applies. Once you understand the different approaches, you can choose the one best suited to each problem.

4.1 Models

In an application, data objects are the core: the model layer in Model-View-Controller (MVC). When designing an application, you can now do without the object-relational mapping (ORM) layer. Instead of mapping each model onto tables and rows, store it as a JSON document, with each JSON document having a unique ID for easy lookup.

4.2 Keys

In a document database, the ID of each JSON document is its unique key, roughly equivalent to a primary key in a relational database. The ID is usually unique within a database "collection" (NoSQL databases use various names for what an RDBMS calls a "table", such as SequoiaDB's collections or Couchbase's buckets). Some NoSQL databases store data sorted by ID, so documents with similar IDs are naturally easier to retrieve together, which can greatly speed up processing of data that is often accessed together.
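A toy model of these two ideas, unique IDs and ID-sorted storage, is sketched below in Python. The `Collection` class and the `user:post` ID scheme are hypothetical and exist only for illustration; real document databases implement this with on-disk indexes rather than an in-memory list.

```python
from bisect import bisect_left, bisect_right

class Collection:
    """Toy collection: documents keyed by a unique _id, with IDs kept sorted."""

    def __init__(self):
        self._docs = {}
        self._sorted_ids = []

    def insert(self, doc):
        doc_id = doc["_id"]
        if doc_id in self._docs:
            # Like a primary key, an _id may appear only once per collection.
            raise KeyError(f"duplicate _id: {doc_id!r}")
        position = bisect_left(self._sorted_ids, doc_id)
        self._sorted_ids.insert(position, doc_id)
        self._docs[doc_id] = doc

    def get(self, doc_id):
        """Direct lookup by key; returns None if the ID is unknown."""
        return self._docs.get(doc_id)

    def id_range(self, low, high):
        """Return documents with low <= _id <= high, in ID order."""
        start = bisect_left(self._sorted_ids, low)
        end = bisect_right(self._sorted_ids, high)
        return [self._docs[i] for i in self._sorted_ids[start:end]]

posts = Collection()
for doc_id in ("user42:post:003", "user07:post:001",
               "user42:post:001", "user42:post:002"):
    posts.insert({"_id": doc_id, "text": "..."})

# IDs sharing a prefix sit next to each other, so one range scan
# fetches all posts of one user.
user42_posts = posts.id_range("user42:post:", "user42:post:\uffff")
```

Choosing IDs so that related documents share a prefix is one way to benefit from ID-sorted storage; whether it helps in practice depends on how the particular database orders and partitions its keys.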

4.3 Flexibility

Social networking sites keep growing in popularity, and as the user base grows, users post many different types of content. One person posts a landscape photo, another comments on current events, and another shares music to express their feelings. Faced with data this large and this varied, a relational model requires the schema to be modified constantly, which can significantly increase system load and processing time. This is where document storage shows its advantages: when data is complex and changeable, the document model keeps the data in its original shape, with no new tables or schema needed to accommodate it. Storage is direct and fast, and later reads can follow a "store whole, retrieve whole" pattern: there is no need, as in the relational model, to gather the records to display from several linked tables. In an RDBMS, you normalize the data as much as possible. In NoSQL, you are free to denormalize it.
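The "store whole, retrieve whole" idea can be sketched in a few lines of Python. The three post shapes, the field names, and the `render` helper are all hypothetical; a dictionary stands in for a collection.

```python
# Hypothetical posts of three different shapes stored side by side.
posts = {
    "p1": {"type": "photo", "user": "alice",
           "url": "http://example.com/landscape.jpg",
           "place": {"name": "West Lake", "city": "Hangzhou"}},
    "p2": {"type": "comment", "user": "bob",
           "text": "Thoughts on today's news"},
    "p3": {"type": "music", "user": "carol",
           "track": {"title": "Example Song", "artist": "Example Band"}},
}

def render(post_id):
    """Build the display string for one post from a single document read."""
    post = posts[post_id]  # one lookup returns everything needed; no joins
    if post["type"] == "photo":
        body = f'photo {post["url"]} at {post["place"]["name"]}'
    elif post["type"] == "music":
        body = f'track "{post["track"]["title"]}" by {post["track"]["artist"]}'
    else:
        body = post["text"]
    return f'{post["user"]}: {body}'
```

Adding a fourth kind of post would mean storing documents with a new `type` and extending `render`; nothing about the stored photo, comment, or music documents would need to change.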

4.4 Concurrency

Continuing the previous example: on a social network the volume of user activity is very large, and many people spend a lot of time on it every day. With the traditional relational data model, suppose two users publish posts at the same time that both link to the same "place", and one of them then goes back to edit their post. Because the post links to the "place" table, the system locks that table to ensure consistency and does not allow other users to make changes at the same time, so the other user temporarily cannot operate on the "place" table. With a document model, each person's post is a separate "document" that contains all the information for that post. Because documents are "self-contained", users change data only by modifying their own documents, without affecting anyone else. This allows a high level of concurrency.
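The locking granularity described above can be modelled with a small Python sketch. It is a toy under stated assumptions, not a description of how SequoiaDB or any relational engine actually locks data: the `DocumentStore` class is hypothetical and simply keeps one lock per document, so two users editing their own posts never wait on each other.

```python
import threading

class DocumentStore:
    """Toy store with one lock per document (an assumption for illustration)."""

    def __init__(self):
        self._docs = {}
        self._locks = {}
        self._registry_lock = threading.Lock()

    def insert(self, doc_id, doc):
        with self._registry_lock:
            self._docs[doc_id] = doc
            self._locks[doc_id] = threading.Lock()

    def update(self, doc_id, field, value):
        # Only the lock of the document being edited is taken,
        # so edits to other documents are not blocked.
        with self._locks[doc_id]:
            self._docs[doc_id][field] = value

    def get(self, doc_id):
        return self._docs[doc_id]

store = DocumentStore()
# Each post embeds its own copy of the place instead of linking to a shared row.
store.insert("post:alice", {"text": "hello", "place": {"name": "Central Park"}})
store.insert("post:bob", {"text": "hi", "place": {"name": "Central Park"}})

def edit_place(doc_id, new_place):
    store.update(doc_id, "place", {"name": new_place})

threads = [
    threading.Thread(target=edit_place, args=("post:alice", "Times Square")),
    threading.Thread(target=edit_place, args=("post:bob", "Brooklyn Bridge")),
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
```

In the relational scenario from the text, both edits would contend for the shared "place" table; here each edit touches only its own document, at the cost of storing the place information twice.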

5. Conclusion

Complex queries in the relational data model rely on a strictly consistent database schema, normalized data, and joins. Over the past 40 years, relational models and query technologies have matured and become familiar to many developers. However, changes in applications, users, and infrastructure have led application developers and architects to start choosing non-relational "NoSQL" database technology, and many argue that distributed document databases beat an RDBMS in several ways:

    • Near-unlimited horizontal scaling is easy to achieve with commodity machines, virtual machines, or cloud instances.
    • Adding data does not require a strict database schema, so the schema does not have to be modified when the shape of the data changes.
    • A variety of data models better supports modeling, storing, and querying complex data.
    • Denormalized data may use more space, but as storage prices continue to fall, the trade-off between storage space and read/write speed tilts further toward speed, and the resulting performance, scalability, and flexible data structures benefit the application across the board.

SequoiaDB's data model is a document model stored in JSON format, so SequoiaDB has the data flexibility and scalability associated with document and NoSQL databases. SequoiaDB's document data model not only simplifies data access but also greatly improves data flexibility. In an application, it avoids the troublesome schema design step and adapts well to the high-concurrency, real-time, and distributed requirements of the big data era.
