Golang Distributed ID Generation Service


Over the weekend I spent an evening writing an ID generation service in Go. GitHub address: Go-id-alloc.

As far as I can tell, distributed ID generation falls into two main schools, each with pros and cons; there is no perfect implementation.

1. The snowflake school.

It originated as Twitter's tweet ID generator. Because the timeline is sorted by publish time, the algorithm places a millisecond timestamp in the high bits of the ID, so IDs are ordered by time.

Sina Weibo also uses a similar ID generation algorithm. The benefit of snowflake is that it is decentralized, but it depends on clock accuracy: in the worst case the clock is rolled back and IDs repeat, and if NTP clock synchronization is not enabled, different nodes hold different times, which also disturbs the ordering of the feed. In my view it is only basically usable; once the clock rolls back by a large interval, the service is completely unavailable. Meituan has done some work in this area, mainly around detecting rollback and alerting; see their Leaf distributed ID generation system.

2. The MySQL school.

This school is widely used; its basic building block is MySQL's auto-increment primary key. Initially, to scale performance, you would deploy multiple MySQL instances and set a different starting ID on each, for horizontal scaling.

MySQL lets you customize a table's AUTO_INCREMENT attribute, which can be used to control the starting ID:

    CREATE TABLE `partition_1` (
      `ID` bigint(20) NOT NULL AUTO_INCREMENT,
      `meanless` tinyint(4) NOT NULL,
      PRIMARY KEY (`ID`),
      UNIQUE KEY `meanless` (`meanless`)
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8;

    ALTER TABLE `partition_1` AUTO_INCREMENT=1;

The ALTER statement sets the starting ID of the partition_1 table to 1; different tables can be given different starting IDs, e.g. AUTO_INCREMENT=2 for partition_2.

This alone is not enough, because by default the next auto-increment ID of partition_1 would be 2, which duplicates the ID assigned by the partition_2 table.

MySQL provides another setting, auto_increment_increment, which can be applied at the session level (SET auto_increment_increment=xxx;) or to the whole MySQL instance. It controls the step size. In the example above I set auto_increment_increment=2, and the two partition tables then allocate IDs as follows:

    • 1,3,5,7,9 ...
    • 2,4,6,8,10 ...

Notice that when the step size equals the number of partitions, ID conflicts are avoided, and taken as a whole the IDs grow upward together.

So how do you obtain the next ID? Inserting a new record for every ID makes the table grow without bound; a better approach is the REPLACE statement:

    REPLACE INTO partition_1 (`meanless`) VALUES (0);

Because meanless is a unique key, each REPLACE deletes the old row and inserts a new one: the ID field keeps growing while the table holds at most one record.

This is how my Go-id-alloc project implements ID self-increment, but it is not enough on its own. Database writes have a performance ceiling, and under heavier business load they will eventually become the bottleneck.

The improved scheme builds on the MySQL auto-increment principle, reducing the database's write pressure to a negligible level by fetching "number segments" in batches. The principle is explained below.

Using the layout above as the example, the two MySQL instances generate the following ID sequences:

    • 1,3,5,7,9 ...
    • 2,4,6,8,10 ...

Now suppose the segment size is 10000. When REPLACE produces id=1, the allocator is granted the segment [0, 10000). Likewise, when REPLACE produces id=3, the allocator is granted the segment [20000, 30000).

Re-expressed in terms of number segments, the two sequences look like this:

    • [0, 10000), [20000, 30000), [40000, 50000), [60000, 70000), [80000, 90000) ...
    • [10000, 20000), [30000, 40000), [50000, 60000), [70000, 80000), [90000, 100000) ...

Do you see the pattern? If the assigned ID is n, then the number segment is [(n–1) * size, n * size).

With number segments, we only need a service in front of MySQL: each ID the service obtains from MySQL grants it an exclusive number segment, and subsequent ID allocation requests are served directly from the in-memory segment. In addition, the service should fetch a new segment from MySQL before the in-memory one is exhausted.

By increasing the segment size, we reduce the frequency of database access and thereby raise the serving capacity of the whole ID allocation service.

MySQL failure

Because the ID sequence is stored in MySQL, losing MySQL data is intolerable. We usually run MySQL in master-slave mode to replicate data in real time, but there is always some replication lag: if the master crashes before its latest updates reach the slave, re-allocation after failover may hand out duplicate IDs.

Yes, this is a disadvantage of the MySQL scheme, just as snowflake has its own weaknesses. Usually, though, it can be mitigated by choosing a sufficiently large segment size: within a bounded master-slave delay (say, 1 minute), given the business request rate, the master can have executed at most N additional REPLACEs. Under that assumption, when promoting the slave we skip its starting ID forward by that many steps to make sure IDs do not repeat.

Usage Scenarios

The snowflake scheme suits time-ordered scenarios: outsiders cannot infer the total number of IDs allocated in a given day, and thus cannot guess a company's business volume. Its drawback is that the service becomes unavailable when the clock rolls back.

The MySQL scheme suits internal business: you have more control over the IDs (such as defining the starting ID), and it scales well enough for any business volume. Its drawbacks are the dependency on MySQL, and that the IDs follow a regular pattern, making the company's business volume easy to expose.
