Primary key solution for databases in a distributed environment

Last Update:2018-04-13 Source: Internet

Author: User

Database primary key scheme in distributed environment bitsCN.com

In MySQL applications, the primary key ID is usually an auto-increment column. How efficient is this? Without a cluster or master-slave architecture, it is usually the most convenient choice.

But in a cluster or master-slave environment, where this approach no longer works on its own, how can a unique ID be generated?

When you only use a single database, using an auto-incrementing primary key ID is undoubtedly the most suitable.

However, cluster and master-slave architectures raise some problems, such as keeping the primary key globally unique.

This section describes ways to create a primary key in a cluster environment other than a plain auto-increment ID.

1. Generate a GUID in the application and insert it into the sharded cluster together with the data. This is easy to implement and maintain. The disadvantages are that generating GUIDs adds computing cost to the application, that a GUID is relatively long and therefore occupies more database storage space, and that application development is required.

Note: The main advantage is simplicity. The disadvantage is wasted storage space.

2. An independent application generates a batch of unique IDs in a database in advance. Each application reads IDs through an interface, or directly from the database, and inserts them into the sharded cluster together with the data. The advantage is that the globally unique primary key is simple and easy to maintain. The disadvantage is that the implementation is complex and requires application development.

Note: The ID table must be queried and updated frequently, which affects performance when data is inserted.

3. Use the database's own auto-increment type (such as an auto_increment field in MySQL) or auto-increment object (such as a Sequence in Oracle) on a central database server to generate IDs, and insert them into the sharded cluster together with the data. There are no obvious advantages. The disadvantages are that the implementation is complicated and that overall availability depends on the central database server: once it crashes, no cluster can accept inserts. Application development is also required.

Note: Not recommended.

4. Combine a cluster ID field with an auto-increment (auto_increment type) field to form a composite unique primary key. It is easy to implement, easy to maintain, and transparent to applications. The disadvantages are that references and joins are more complicated because they need two fields, and that the primary key occupies more space. This side effect is especially noticeable with InnoDB, where every secondary index also stores the primary key.

Note: Although two fields are used, this method needs the least storage space, since only one smallint is added.

5. Set a different auto-increment starting point for each cluster, dividing the ID space into segments so that IDs are globally unique. Note that MySQL ignores auto_increment_offset when it is greater than auto_increment_increment, so with the default step of 1 the starting point has to be set per table with the AUTO_INCREMENT table option instead. When the data of a cluster grows too fast, use the command to move the starting position of the next ID and skip possible conflicts. The advantages are that the implementation is simple, that the cluster holding a row can be determined from the size of its ID, and that the scheme is transparent to applications. The disadvantage is that maintenance is relatively complex, because you need to watch the ID growth of every cluster closely.

Note: Readjusting once a segment fills up is too troublesome.

6. Set both the auto-increment starting point (auto_increment_offset) and the auto-increment step (auto_increment_increment) for each cluster, staggering the starting points. Choose a step larger than the number of sharded clusters you could plausibly ever need, so that the ID sequences of the clusters interleave without colliding and remain globally unique. It is easy to implement, easy to maintain in the future, and transparent to applications. The disadvantage is that the initial setup is relatively complicated.

Note: The offsets and step must be planned carefully to avoid overlap.
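
Option 6 can be illustrated with a small simulation. This is a minimal sketch in Python, not MySQL itself: the helper names `cluster_ids` and `cluster_of`, the count of 3 clusters, and the step of 10 are assumptions chosen for illustration. It only mimics how staggered values of auto_increment_offset combined with a shared auto_increment_increment produce interleaved ID sequences.

```python
from itertools import count, islice

STEP = 10      # assumed step: larger than any cluster count expected later
CLUSTERS = 3   # assumed number of sharded clusters today


def cluster_ids(offset, increment):
    """Yield IDs as a server configured with auto_increment_offset=offset
    and auto_increment_increment=increment would hand them out."""
    return count(start=offset, step=increment)


def cluster_of(id_value, step=STEP):
    """Recover the offset (and so the cluster) that produced an ID."""
    return (id_value - 1) % step + 1


# Take the first four IDs from each cluster (offsets 1, 2, 3).
generated = {
    offset: list(islice(cluster_ids(offset, STEP), 4))
    for offset in range(1, CLUSTERS + 1)
}

for offset, ids in generated.items():
    print(offset, ids)
```

Because the step is larger than the current number of clusters, offsets 4 through 10 stay free, so new clusters could be added later without renumbering existing rows.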

Problems caused by using UUID as the primary key:

For engines such as InnoDB that cluster data on the primary key, rows are stored in primary key order. Because UUIDs are not sequential, InnoDB generates heavy I/O pressure, so a UUID is not suitable as the physical primary key. It can still be used as the logical primary key while the physical primary key remains an auto-increment ID.

First, InnoDB physically orders rows by primary key. This is good news for an auto_increment int key, because the newest key always goes at the end. For a UUID it is bad news: UUIDs are unordered, so the position of each newly inserted key is unpredictable. It may land at the beginning or in the middle, and keeping the rows in primary key order then causes a large number of I/O operations, which inevitably hurts efficiency.

There are not many solutions to this problem. The common approach is to keep an auto_increment int as the primary key and add a UUID column with a unique index. What about foreign key associations? They use the UUID as well. In other words, the auto_increment int is only the formal primary key, while the UUID is the actual one. This way the int primary key does not waste much space, and you can continue to use UUIDs.
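
The difference in insert position can be made concrete with a rough simulation. This is only a sketch: a sorted Python list stands in for the clustered index, which greatly simplifies InnoDB's actual page structure, and the function name `count_tail_inserts` and the sample size are assumptions for illustration. It counts how many inserts land at the end of the ordered data.

```python
import bisect
import uuid


def count_tail_inserts(keys):
    """Insert keys into a sorted list and count how many land at the end."""
    ordered = []
    tail = 0
    for key in keys:
        pos = bisect.bisect_left(ordered, key)
        if pos == len(ordered):
            tail += 1  # appended: no existing entries had to move
        ordered.insert(pos, key)
    return tail


N = 1000  # assumed sample size

# Sequential keys, as an auto-increment column would produce.
sequential_tail = count_tail_inserts(range(1, N + 1))

# Random UUIDs: most of them fall somewhere in the middle.
random_tail = count_tail_inserts(uuid.uuid4().hex for _ in range(N))

print("sequential keys appended:", sequential_tail, "of", N)
print("random UUIDs appended:   ", random_tail, "of", N)
```

Every sequential key is appended, while only a small fraction of the random UUIDs are. Each non-append in the list model corresponds to an insert into the middle of already ordered data.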
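
The layout described above might look like the following sketch. It uses SQLite through Python's standard `sqlite3` module purely as a runnable stand-in for MySQL, so the DDL syntax differs slightly from what MySQL would need; the table names `orders` and `order_items` and their columns are hypothetical.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE orders (
    id   INTEGER PRIMARY KEY AUTOINCREMENT,  -- formal (physical) key
    uuid TEXT NOT NULL UNIQUE,               -- actual (logical) key
    note TEXT
);
CREATE TABLE order_items (
    id         INTEGER PRIMARY KEY AUTOINCREMENT,
    order_uuid TEXT NOT NULL REFERENCES orders(uuid),  -- association via UUID
    sku        TEXT
);
""")

# The application generates the UUID; the database assigns the integer id.
order_uuid = str(uuid.uuid4())
conn.execute("INSERT INTO orders (uuid, note) VALUES (?, ?)",
             (order_uuid, "first order"))
conn.execute("INSERT INTO order_items (order_uuid, sku) VALUES (?, ?)",
             (order_uuid, "sku-1"))
conn.commit()

# Join through the UUID rather than the integer id.
row = conn.execute(
    "SELECT o.id, i.sku FROM orders o "
    "JOIN order_items i ON i.order_uuid = o.uuid"
).fetchone()
print(row)
```

Rows are still appended in integer order, while other tables and other shards refer to the row only by its UUID.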

1. The simplest method

With four databases, the primary key of the first MySQL server starts at 1 and increases by 4 each time, the primary key of the second starts at 2 and increases by 4 each time, and so on.

2. Build a sequence server

2.1 N MySQL servers are used as sequence servers to prevent a single point of failure.

2.2 Each table on each server represents one sequence, and each table holds only one record (table-level locking, using the MyISAM engine).

2.3 The sequence on the first server starts at 1 and increases by N each time; the sequence on the second server starts at 2 and increases by N each time.

2.4 Obtain nextVal from the first server and update the stored value to nextVal plus N. If the first server cannot be reached, obtain the value from the second server, and so on.
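
Steps 2.1 to 2.4 can be sketched as follows. This is an in-memory illustration only: the `SequenceServer` class, its `available` flag, and the `next_id` helper are hypothetical stand-ins for real MySQL sequence servers, and no locking or persistence is modelled.

```python
class SequenceServer:
    """In-memory stand-in for one sequence server holding a single nextVal."""

    def __init__(self, start, step):
        self.next_val = start
        self.step = step
        self.available = True  # toggled to simulate an outage

    def next_value(self):
        if not self.available:
            raise ConnectionError("sequence server unavailable")
        value = self.next_val
        self.next_val += self.step  # step 2.4: store nextVal plus N
        return value


def next_id(servers):
    """Ask each server in order and fall back to the next one on failure."""
    for server in servers:
        try:
            return server.next_value()
        except ConnectionError:
            continue
    raise RuntimeError("no sequence server available")


N = 2  # assumed number of sequence servers
# Step 2.3: server i starts at i + 1 and increases by N.
servers = [SequenceServer(start=i + 1, step=N) for i in range(N)]

ids = [next_id(servers) for _ in range(3)]   # served by the first server
servers[0].available = False                 # simulate a crash
ids += [next_id(servers) for _ in range(2)]  # served by the second server
print(ids)
```

Because each server only ever hands out values from its own residue class modulo N, IDs remain unique even when requests switch between servers after a failure.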

Modifying MySQL's default auto-increment step

SET GLOBAL auto_increment_increment = 1;  -- set the sequence growth value
SHOW GLOBAL VARIABLES;  -- display all global variables
SHOW GLOBAL VARIABLES LIKE '%test%';  -- query the global variables containing the string "test"

This article is an English version of an article originally published in Chinese on aliyun.com and is provided for information purposes only.
