Should the primary key be a UUID or an auto-increment ID?

Last Update:2016-06-06 Source: Internet

Author: User

Keywords Php java asp.net mysql oracle

Tags unique id


I have searched the internet for a long time. Some people say a UUID is better because it makes sharding across databases and tables, and merging data, easier. Others say auto-increment is better because its performance is much better than a UUID once a table holds a lot of data. Which one is actually better? Can someone with real production experience give a complete answer?

Reply content:

First, the performance of a UUID is not much worse than that of an auto-increment ID; it depends on the UUID generation algorithm. For example, MongoDB's ObjectId is a good UUID-style strategy. It is composed of timestamp + machine code + process code + incrementing counter, and the machine code and process code only need to be generated once, so an ObjectId costs just one timestamp lookup more than an auto-increment ID. In addition, an auto-increment ID key is a unique index, whereas a UUID can use a plain index rather than a unique one (relying on its own properties, the uniqueness check can be skipped), so its performance can be said to be no worse than an auto-increment ID.
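The composition described above can be sketched roughly as follows. This is a minimal illustration, not MongoDB's actual implementation: the function name `new_object_id`, the random stand-in for the machine code, and the 4 + 3 + 2 + 3 byte field widths (modeled on the older ObjectId layout) are all assumptions.

```python
import itertools
import os
import random
import struct
import time

# Computed once per process, as the answer describes.
_MACHINE = random.getrandbits(24).to_bytes(3, "big")  # stand-in for a machine code (assumption)
_PROCESS = (os.getpid() & 0xFFFF).to_bytes(2, "big")  # process code
_COUNTER = itertools.count(random.getrandbits(24))    # incrementing counter


def new_object_id() -> bytes:
    """Return a 12-byte ID: timestamp + machine code + process code + counter."""
    timestamp = struct.pack(">I", int(time.time()) & 0xFFFFFFFF)
    counter = (next(_COUNTER) & 0xFFFFFF).to_bytes(3, "big")
    return timestamp + _MACHINE + _PROCESS + counter


first = new_object_id()
second = new_object_id()
print(first.hex(), second.hex())
```

Per ID, only the timestamp read and the counter increment do any work; the middle five bytes are reused unchanged.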

Whether to use a UUID or an auto-increment ID mainly depends on whether the project is large enough and the data volume is big enough. In terms of convenience, an auto-increment ID is simple and needs no extra support, while a UUID is relatively troublesome: you have to choose a UUID algorithm, embed it in the program, and so on. In a very large system, however, a UUID works much better than an auto-increment ID. Which to choose depends on your own situation.

Oracle environment (I have no experience with other databases):
In the long run, an auto-increment NUMBER takes less space than a UUID (a RAW(16) value is 16 bytes, shown as 32 hex characters), but most tables hold tens of billions of rows or fewer, so the space difference is rarely decisive.
A UUID is less convenient to use than a NUMBER: when the UUID is stored as RAW(16), query conditions need a HEXTORAW conversion.
For the same number of rows, a UUID-based index takes up more space than a NUMBER index, and with large data volumes it causes more disk reads.
In a high-concurrency or RAC cluster environment, a UUID performs better than a NUMBER because the NUMBER key suffers from hot blocks (for NUMBER this can be solved by creating a reverse key index).
A UUID is safer to show in the front end; a NUMBER can be guessed.
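To make the RAW(16) and HEXTORAW point concrete, here is a small sketch of the two representations of the same UUID. The example value is arbitrary, and the comments about what the column and HEXTORAW would see are my reading of the answer, not Oracle code.

```python
import uuid

u = uuid.UUID("123e4567-e89b-12d3-a456-426614174000")  # arbitrary example value

raw_value = u.bytes  # the 16 bytes a RAW(16) column would hold
hex_text = u.hex     # the 32-character hex text a query would pass to HEXTORAW

print(len(raw_value))                        # 16
print(len(hex_text))                         # 32
print(bytes.fromhex(hex_text) == raw_value)  # True
```

The stored value is 16 bytes; the 32 is the length of its hex text form, which is what has to be converted in query conditions.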

If the ID is exposed to the user (as part of a URL), an auto-increment ID is friendlier.
The performance gap between the two is not large.
For ease of development, a UUID is more convenient than an auto-increment ID:
- A UUID avoids ID conflicts in a distributed setup; a database auto-increment ID does not.
- A UUID can be obtained before committing to the database, so no extra SELECT statement is needed.
- Not all databases support auto-increment IDs.
A UUID does not have to be kept in the standard 36-character form; it can be shortened, for example with Base64. Note that a full 128-bit UUID encodes to 22 Base64 characters, so getting down to 12-16 characters means giving up some bits.
The UUID scheme is preferred; if friendlier URLs are needed, use a sequence rather than an auto-increment ID.
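A minimal sketch of the Base64 shortening mentioned in this answer, using URL-safe Base64 with the padding stripped. The helper names `shorten` and `restore` are made up for illustration.

```python
import base64
import uuid


def shorten(u: uuid.UUID) -> str:
    """Encode the 16 raw bytes as URL-safe Base64 without '=' padding."""
    return base64.urlsafe_b64encode(u.bytes).rstrip(b"=").decode("ascii")


def restore(text: str) -> uuid.UUID:
    """Re-add the two padding characters and decode back to a UUID."""
    return uuid.UUID(bytes=base64.urlsafe_b64decode(text + "=="))


u = uuid.uuid4()
short = shorten(u)
print(len(str(u)), len(short))  # 36 22
print(restore(short) == u)      # True
```

With all 128 bits kept, the short form is 22 characters and round-trips losslessly.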

Most of the answers above compare UUID generation efficiency from the developer's perspective. From the perspective of database access efficiency, with InnoDB or TokuDB the answer is an auto-increment primary key. See this answer from an Oracle ACE:
Why InnoDB tables are recommended to use an auto-increment column as the primary key

Choose according to need and analyze each problem on its own terms. In short, an ordinary 32-character UUID works for the general case.

Auto-increment has an advantage for indexing (comparing integers is faster than comparing strings), but the advantage is limited. The main difference between the two is whether the ID is generated on the client side or the server side. A Java program is the client side relative to MySQL. A UUID lets the client generate a unique, non-duplicate ID by itself, which is important in a distributed system.
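As an illustration of client-side generation, the sketch below picks the key in the application before anything is written, then uses it for a parent row and a child row in one transaction. It uses an in-memory SQLite database purely so it runs anywhere; the `orders` and `order_items` tables are hypothetical.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id TEXT PRIMARY KEY, customer TEXT)")
conn.execute("CREATE TABLE order_items (id TEXT PRIMARY KEY, order_id TEXT, sku TEXT)")

# The client chooses the key itself, before talking to the database.
order_id = str(uuid.uuid4())

with conn:  # one transaction, no read-back of a server-generated key
    conn.execute("INSERT INTO orders VALUES (?, ?)", (order_id, "alice"))
    conn.execute(
        "INSERT INTO order_items VALUES (?, ?, ?)",
        (str(uuid.uuid4()), order_id, "sku-1"),
    )

row = conn.execute(
    "SELECT o.customer, i.sku FROM orders o JOIN order_items i ON i.order_id = o.id"
).fetchone()
print(row)  # ('alice', 'sku-1')
```

With a server-generated key, the child insert would first have to learn the parent's ID from the database.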

If you have no strong feeling either way, I suggest you use a UUID.

This depends on your choice of underlying storage. I am not familiar with Oracle, which uses ROWID underneath, so I will not comment on it. Others, such as MongoDB, have their own particularities, so a UUID is actually the appropriate choice there.

But if your underlying storage is MySQL and the storage engine happens to be InnoDB, I strongly recommend using a unique auto-increment number as your PK.

First, leaving aside how UUID and long differ in the number of CPU instructions and cycles needed to process them, the most important point is that InnoDB's underlying storage is a B+tree organized around a clustered index. All data query operations go through the clustered index (if the table defines a PK, the clustered index uses that PK by default).

If rows are written to an InnoDB table in the same order as the leaf nodes of the B+tree index, access efficiency is highest.
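A rough way to see the write-order effect is to insert keys into a sorted list and count how many land at the very end. This is only a toy model of a clustered index, not InnoDB itself: a tail insert stands in for appending to the last leaf page, and anything else stands in for an insert into the middle of the tree. The function name `count_tail_inserts` is made up.

```python
import bisect
import uuid


def count_tail_inserts(keys):
    """Insert keys into a sorted list; count how many land at the very end."""
    ordered = []
    tail = 0
    for key in keys:
        pos = bisect.bisect_right(ordered, key)
        if pos == len(ordered):
            tail += 1
        ordered.insert(pos, key)
    return tail


n = 2000
sequential = count_tail_inserts(range(1, n + 1))
random_keys = count_tail_inserts(uuid.uuid4().hex for _ in range(n))

print(sequential)   # 2000: every key is appended at the end
print(random_keys)  # far fewer: most keys land somewhere in the middle
```

Auto-increment keys always extend the tail, while random UUIDs scatter writes across the whole key range.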

Of course, if your system is not chasing the limits of performance, do whatever you like.

Why not customize your own primary key strategy?

With MySQL, use an auto-increment key; otherwise InnoDB still has to maintain a primary key index of its own.

On a single machine, MySQL's auto-increment ID is recommended.

In a distributed environment where you need globally unique IDs, use neither auto-increment nor a UUID.
In that case you need to build your own ID generator mechanism.
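The answer does not say what such a generator looks like. One common shape, shown here only as an assumption, is a Snowflake-style 64-bit ID made of a millisecond timestamp, a worker ID, and a per-millisecond sequence. The class name, bit widths, and epoch below are all illustrative choices.

```python
import threading
import time


class IdGenerator:
    """64-bit IDs: millisecond timestamp | 10-bit worker ID | 12-bit sequence."""

    EPOCH_MS = 1_400_000_000_000  # arbitrary custom epoch (assumption)

    def __init__(self, worker_id: int):
        if not 0 <= worker_id < 1024:
            raise ValueError("worker_id must fit in 10 bits")
        self.worker_id = worker_id
        self.last_ms = -1
        self.sequence = 0
        self.lock = threading.Lock()

    def next_id(self) -> int:
        with self.lock:
            now = int(time.time() * 1000)
            if now < self.last_ms:
                now = self.last_ms          # clock went backwards: reuse last timestamp
            if now == self.last_ms:
                self.sequence = (self.sequence + 1) & 0xFFF
                if self.sequence == 0:      # 4096 IDs already issued in this millisecond
                    now = self.last_ms + 1  # borrow the next millisecond
            else:
                self.sequence = 0
            self.last_ms = now
            return ((now - self.EPOCH_MS) << 22) | (self.worker_id << 12) | self.sequence


gen = IdGenerator(worker_id=1)
ids = [gen.next_id() for _ in range(5)]
print(ids)
```

IDs from one generator are strictly increasing, which suits a clustered index, and IDs from different generators cannot collide as long as each one gets a distinct `worker_id`.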

Note
I have run into a duplicate GUID, so using a GUID directly as a unique primary key is a potential problem!

(GUID is an implementation of the UUID standard)
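For a sense of scale, the usual birthday-bound approximation gives the chance of at least one collision among n randomly generated UUIDs. The sketch assumes version-4 UUIDs with 122 random bits and a sound random source; the original does not say which generator produced the duplicate, so this may not match that case.

```python
import math

RANDOM_BITS = 122  # a version-4 UUID has 122 random bits


def collision_probability(n: int) -> float:
    """Birthday-bound approximation: 1 - exp(-n^2 / (2 * 2^122))."""
    exponent = n * n / (2 * 2 ** RANDOM_BITS)
    return -math.expm1(-exponent)


print(collision_probability(10 ** 9))  # probability for one billion UUIDs
```

Under those assumptions the probability for a billion IDs is vanishingly small, so a duplicate seen in practice more plausibly points at the generator or its inputs than at bad luck. Keeping a unique constraint on the column is still a cheap safeguard.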

I feel this still depends on the specific scenario. If there is only one database server and the business logic is not complex, you can use an auto-increment ID, which keeps development simple. With only a small number of primary database servers you can still use auto-increment IDs, but some extra handling is needed.

In a distributed setup, that is, with multiple primary database servers, using auto-increment IDs makes development complex and the IDs become harder and harder to control. So why not use a UUID? After all, the performance of the two does not differ much.
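The extra handling for a small number of primary servers is not spelled out in the answer. One common approach, assumed here, is to give each server its own starting offset and a shared step equal to the number of servers, which is what MySQL's `auto_increment_offset` and `auto_increment_increment` settings are for. The sketch only simulates the resulting sequences; `server_ids` is a made-up helper.

```python
import itertools


def server_ids(offset: int, step: int):
    """Yield offset, offset + step, offset + 2*step, ... for one server."""
    return itertools.count(offset, step)


servers = 3
generators = [server_ids(offset, servers) for offset in range(1, servers + 1)]
batches = [list(itertools.islice(g, 4)) for g in generators]

print(batches)  # [[1, 4, 7, 10], [2, 5, 8, 11], [3, 6, 9, 12]]
```

The sequences never overlap, but the step has to be fixed up front, which is part of why this gets harder to control as servers are added.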



This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
