Three methods of scale-out of relational database

Source: Internet
Author: User

This is the Oracle Coherence 3.5 book, Chapter One: Achieving performance, Scalability, and availability Objectives, section II: Achieving Scalability, the database extends the part of the reading notes.

Traditional relational databases are difficult to scale, usually vertically scaled, but scale horizontally only when they reach a certain level.

The scale-out of the database supports three methods, namely master-slave replication, clustering, and sharding (sharding).

Master-slave replication

Master-slave Replication (Master-slave replication), the easiest to configure, minimal application changes, and can reduce the burden of the main library.
The primary database can read and write from the database only. The most common scenario is to implement read-write separation, or business separation, that is, running reports, backups, data warehouses and other applications.

The problem with this approach is that the data between the master and the slave is not fully synchronized and may read to two different versions. Another problem is that if only the main library accepts read and write, the main library will be overloaded sooner or later, so it is not really a scale out.

However, the delay of the master-slave database data, some business is acceptable. In addition, the use of some real-time replication tools such as Goldengate, from the library can also be written, this can be used from the library to do other business, so as to achieve horizontal expansion. This is also a new trend of master-slave replication.

Cluster (clustering)

A cluster is also known as a shared everything or shared disk schema. The best known is the Oracle RAC.
1 databases can have multiple instances to access the databases on the shared storage.
Each node can read and write, and from an application point of view, the code does not need to change. Load balancing is also automatic.

Problems with clustering include:
* Data synchronization in memory is required, data acceleration leads to competition, and scalability is affected.
* Difficult to set up and manage
* Because the storage is shared, the read operation cannot be extended indefinitely

Clusters are suitable for read-intensive applications, such as data warehouses and BI.

Shard (sharding)

Partitions (Partition) are inside the library, and Shards (sharding) are outside the library, also called sub-tables, and are the schema of shared nothing.
Sharding is going to split a large library into many smaller libraries. How to split business rules, you can split by user ID, by business. If a join is required, the related tables need to be placed in a library to avoid communication between the databases.

Sharding can also have two methods, vertical partitioning and horizontal partitioning.

Vertical partitioning is divided by business, referred to as the sub-Library, that is, different businesses use different libraries, unrelated to each other. Vertical partitioning to a certain extent and cannot be extended, then horizontal partitioning is required.

Horizontal partitioning splits a large table into small tables, each of which is located in a different library. Each one builds the same schema. Partition according to the hash value of the primary key.

The sharding of the problem is:
* Increases the complexity of the application code and needs to be routed to the correct shard.
* Late increase Shard need to modify application logic and need to migrate data
* Query and aggregation (aggregation) is no longer simple and requires a cross-library federated operation
* Master data and reference data need to be copied to all Shard to avoid cross-library operations. Master and reference data are static, but once modified, there may be data consistency issues.
* Cross-Library modification requires distributed transaction processing, which limits scalability. Therefore should be avoided as far as possible.
* Failure of a single shard may make the entire system unusable (not necessarily). Therefore, it is often necessary to reconfigure HA scenarios for each shard, such as master-slave replication.

Despite these shortcomings, sharding is widely used for some large sites, such as Google, EBay, Facebook, and Flickr.

When the pain are great, any medicine this reduces it is good, regardless of the side effects.
This sentence is a bit of a meaning.

Of course, there are some other new database architectures that can be scaled out, such as the extension of NoSQL for OLTP, and Hadoop for OLAP. However, this is beyond the scope of this article.

Extended Reading
    • Scaling out SQL Server
    • Relational Databases is not designed for scale | MarkLogic
    • Scaling Up and Out | Dr Dobb ' s

Three methods of scale-out of relational database

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.