Data Segmentation Overview

Source: Internet
Author: User
Tags: one table
What is data segmentation? In simple terms, it means distributing the data stored in one database across multiple databases (hosts) according to some condition, so that the load of a single device is dispersed. Segmentation spreads the load of a single device and improves overall performance.

Segmentation also introduces problems: distributed transactions, cross-node joins, cross-node sorting and paging, and management of multiple data sources. The first three concern the server side, and the last concerns the client. These problems are easy to understand: once the data of one database is divided among multiple nodes, transactions become distributed, and joins, sorting, and paging have to cross nodes. The client, for its part, moves from a single data source to multiple data sources, so its operations become more complex as well.

Data segmentation (sharding) can be divided into two modes according to the type of segmentation rule. One distributes data to different databases by table, which is called vertical segmentation. The other splits the data of a single table across several databases according to some condition on the logical relationship of the data in the table, which is called horizontal segmentation. The specifics of vertical and horizontal segmentation are introduced later and are not covered at length here.

Vertical segmentation itself takes two forms. One is database-level segmentation, as described above, where different tables go to different databases. The other is table-level vertical segmentation, where the fields of one table are separated into more than one table. These tables can be placed on the same node or on different nodes, and the split tables are associated through specified fields. For table-level segmentation, if the database was designed well from the beginning, there is no need for it; put plainly, table-level vertical segmentation is really a matter of database design.

Three words are often used here: partitioning, segmentation, and fragmentation. Partitioning is a mechanism provided by the database itself that divides one physical file into multiple files. Segmentation and fragmentation can be understood as different translations of the same thing, but to be exact, I think they differ slightly. Segmentation is the idea of spreading the data of one database across different databases or hosts. Fragmentation can be thought of as the result of segmentation: each node or each database is a fragment.

The segmentation described above divides data among different databases or different hosts. Data can of course also be divided within the current database, which is where the concept of multi-tenancy comes in. Multi-tenancy has three levels:
Standalone database
A standalone database per tenant gives the highest security and the best independence, and backup and recovery are relatively simple, but the cost is the highest.
Shared database, isolated data schemas
Sharing the database while isolating each tenant's data schema falls between the other two levels.
Shared database, shared data schemas
A shared database with shared data schemas has the lowest isolation level and the lowest security, and data backup and recovery are the most difficult, but the cost is the lowest.
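
The three tenancy levels can be made concrete with a small sketch. It is only an illustration under stated assumptions: the names `TenancyLevel`, `Target`, `resolve_target`, `orders_query`, the `orders` table, and the `db_`/`schema_` naming scheme are all hypothetical and do not come from any particular product. It shows where a tenant's data would be looked up at each level, and that only the fully shared level needs a `tenant_id` filter in every query.

```python
from dataclasses import dataclass
from enum import Enum


class TenancyLevel(Enum):
    STANDALONE_DATABASE = 1        # one database per tenant
    SHARED_DB_ISOLATED_SCHEMA = 2  # one database, one schema per tenant
    SHARED_DB_SHARED_SCHEMA = 3    # one database, one schema, tenant_id column


@dataclass(frozen=True)
class Target:
    database: str              # which database to connect to
    schema: str                # which schema holds the tenant's tables
    needs_tenant_filter: bool  # True when rows of all tenants share one table


def resolve_target(level: TenancyLevel, tenant_id: str) -> Target:
    """Map a tenant to its storage location (hypothetical naming scheme)."""
    if level is TenancyLevel.STANDALONE_DATABASE:
        return Target(f"db_{tenant_id}", "public", False)
    if level is TenancyLevel.SHARED_DB_ISOLATED_SCHEMA:
        return Target("shared_db", f"schema_{tenant_id}", False)
    return Target("shared_db", "public", True)


def orders_query(level: TenancyLevel, tenant_id: str):
    """Return (database, sql, params) for listing one tenant's orders.

    tenant_id is assumed to be validated already, because it is used to
    build database and schema identifiers.
    """
    target = resolve_target(level, tenant_id)
    sql = f"SELECT id, total FROM {target.schema}.orders"
    params = ()
    if target.needs_tenant_filter:
        # Shared tables: isolation depends entirely on this filter.
        sql += " WHERE tenant_id = %s"
        params = (tenant_id,)
    return target.database, sql, params
```

In this sketch the shared-schema level is the only one where forgetting a `WHERE` clause would expose another tenant's rows, which is one way to read the "lowest isolation, lowest cost" trade-off described above.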

How is data segmentation implemented? There are many tools for it, which I group into three approaches.
The first is to use no tool at all and do the segmentation manually in the client. This is undoubtedly reinventing the wheel; performance and stability are not guaranteed, and the cost is higher.
The second is to use a client component. Since doing it by hand means reinventing the wheel, there are naturally components that handle these things; using one reduces cost and improves stability and security. A representative open source component is Dangdang's sharding-jdbc, whose documentation is very detailed.
The third is to use database middleware, which is the simplest approach and offers the best scalability. There are a number of mainstream database middleware products, including Alibaba's amoeba, Qihoo 360's atlas, and mycat, which is provided by an open source organization. Each has its own advantages.
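
To give a feel for what the first, manual approach involves, here is a minimal sketch of client-side routing. Everything in it is an assumption made for illustration: the database names, the `users`/`products`/`orders` tables, the choice of `user_id` modulo the shard count as the horizontal rule, and the function names. It is not how sharding-jdbc or any middleware works internally; those tools exist precisely so that this logic does not have to be written by hand.

```python
import heapq
from itertools import islice

# Vertical segmentation: each table lives in exactly one database.
TABLE_TO_DATABASE = {"users": "user_db", "products": "product_db"}

# Horizontal segmentation: rows of the orders table are spread over
# several databases according to a condition on the row (here user_id).
ORDER_SHARDS = ["order_db_0", "order_db_1", "order_db_2"]


def database_for_table(table: str) -> str:
    """Vertical rule: look up the database that owns a whole table."""
    return TABLE_TO_DATABASE[table]


def shard_for_order(user_id: int) -> str:
    """Horizontal rule: pick a shard from the row's user_id (modulo)."""
    return ORDER_SHARDS[user_id % len(ORDER_SHARDS)]


def page_across_shards(rows_per_shard, offset: int, limit: int) -> list:
    """Cross-node sorting and paging.

    Each inner list is assumed to be the rows one shard returned, already
    sorted, and long enough to cover offset + limit rows. The client has
    to merge all of them before it can skip and take a page.
    """
    merged = heapq.merge(*rows_per_shard)
    return list(islice(merged, offset, offset + limit))
```

For example, `shard_for_order(7)` selects `order_db_1`, and `page_across_shards([[1, 4, 7], [2, 5, 8], [3, 6, 9]], 3, 3)` returns `[4, 5, 6]`. Even this small sketch shows why paging gets expensive: every shard must return the first offset + limit rows for a single page to be assembled.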
