I searched online for a long time. Some people say UUID is better, because it makes splitting databases and tables (sharding) and merging data easier. Others say auto-increment is better, because once a table holds a lot of data it performs much better than UUID. Which one is actually better? Could someone with real production experience give a complete answer?
Reply content:
First, the performance of a UUID is not much worse than that of an auto-increment ID; it depends on the UUID generation algorithm. For example, MongoDB's ObjectId is a good UUID strategy. It is composed of timestamp + machine code + process code + auto-increment counter. The machine code and process code can be generated once and reused, so producing an ObjectId costs only one timestamp lookup more than producing an auto-increment ID. In addition, an auto-increment ID key needs a unique index, while a UUID column can use a plain index without a uniqueness constraint (relying on its properties, the uniqueness check can be skipped), so its performance can be said to be no worse than that of an auto-increment ID.
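The composition described above can be sketched in Python. This is only an illustration of the "timestamp + machine code + process code + counter" idea as stated in this answer, not the actual MongoDB driver code; the class name `ObjectIdLikeGenerator`, the byte widths, and the use of random bytes as a stand-in for the machine code are all assumptions.

```python
import os
import struct
import threading
import time


class ObjectIdLikeGenerator:
    """Hypothetical 12-byte ID: 4-byte timestamp + 3-byte machine part
    + 2-byte process part + 3-byte counter (layout assumed for illustration)."""

    def __init__(self):
        # The machine and process parts are computed once and reused for every ID.
        self._machine = os.urandom(3)  # stand-in for a real machine code
        self._process = struct.pack(">H", os.getpid() & 0xFFFF)
        self._counter = int.from_bytes(os.urandom(3), "big")
        self._lock = threading.Lock()

    def next_id(self) -> str:
        # Per ID, the only work is a counter increment and one clock read.
        with self._lock:
            self._counter = (self._counter + 1) & 0xFFFFFF
            counter = self._counter
        timestamp = struct.pack(">I", int(time.time()) & 0xFFFFFFFF)
        raw = timestamp + self._machine + self._process + counter.to_bytes(3, "big")
        return raw.hex()


generator = ObjectIdLikeGenerator()
print(generator.next_id())
print(generator.next_id())
```

Because the timestamp comes first, IDs generated later sort after earlier ones at one-second granularity, which is part of why this kind of ID behaves better in an index than a fully random UUID.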
Whether to use a UUID or an auto-increment ID mainly depends on whether the project is large enough and the amount of data big enough. In terms of convenience, an auto-increment ID is simple and needs no extra support, while a UUID is relatively troublesome: you have to choose a UUID algorithm, embed it in the program, and so on. But in a huge system, UUID works much better than auto-increment ID. How to choose depends on your own situation; pick according to your needs.
Oracle environment (I have no experience with other databases):
In the long run, an auto-increment NUMBER takes less space than a UUID (a RAW(16) value is 16 bytes, or 32 characters when written as hex), but for an ordinary table with no more than tens of billions of rows the difference in space is limited;
UUID is less convenient to use than NUMBER: when the UUID is stored as RAW(16), query conditions need a HEXTORAW conversion;
With the same number of rows, a UUID-based index takes up more space than a NUMBER-based one, so with large amounts of data it causes more disk reads;
In a high-concurrency or RAC cluster environment, where hot blocks appear, UUID performs better than NUMBER (for NUMBER this problem can be solved by creating a reverse key index);
UUID is more secure when shown in the front end; a NUMBER can be guessed.
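The sizes involved in the list above are easy to check. The short Python sketch below only illustrates the binary, hex, and canonical text lengths of a UUID; it does not talk to Oracle, and `bytes.fromhex` is used merely as a rough analogue of what a hex-to-raw conversion does.

```python
import uuid

u = uuid.uuid4()

raw_bytes = u.bytes   # binary form, what a 16-byte raw column would hold
hex_text = u.hex      # hexadecimal text form without hyphens
canonical = str(u)    # standard text form with four hyphens

print(len(raw_bytes), len(hex_text), len(canonical))  # 16 32 36

# Converting the hex text back to binary, roughly what a hex-to-raw step does.
assert bytes.fromhex(hex_text) == raw_bytes
```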
If the ID is exposed to the user (as part of a URL), an auto-increment ID is friendlier.
The performance gap between the two is not large.
For ease of development, a UUID is more convenient than an auto-increment ID.
UUIDs avoid ID conflicts in a distributed setup; database auto-increment IDs do not.
A UUID is known before the row is committed to the database, so no extra SELECT is needed to fetch it.
Not all databases support auto-increment IDs.
A UUID does not have to be kept in the standard 36-character form; it can be shortened with an encoding such as Base64 (22 characters for the full 128 bits; getting down to 12 to 16 characters means giving up some bits).
UUID is the preferred scheme; if friendlier URLs are needed, use a sequence rather than an auto-increment ID.
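Two of the points above, generating the ID on the client before anything is written and shortening it with Base64, can be sketched in Python as follows. The helper names `short_uuid` and `from_short` are made up for this example, and the URL-safe Base64 alphabet is just one possible choice.

```python
import base64
import uuid


def short_uuid(u: uuid.UUID) -> str:
    """Encode the 16 raw bytes as URL-safe Base64 and drop the '=' padding."""
    return base64.urlsafe_b64encode(u.bytes).rstrip(b"=").decode("ascii")


def from_short(s: str) -> uuid.UUID:
    """Restore the padding and decode back to a UUID."""
    padding = "=" * (-len(s) % 4)
    return uuid.UUID(bytes=base64.urlsafe_b64decode(s + padding))


# The ID exists before any INSERT is sent, so related rows can reference it
# without a round trip to the database.
new_id = uuid.uuid4()
short = short_uuid(new_id)

print(len(str(new_id)), len(short))  # 36 22
assert from_short(short) == new_id
```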
Most of the answers above compare UUID generation efficiency from the developer's point of view. From the point of view of database access efficiency, with InnoDB or TokuDB the answer is an auto-increment primary key. See this answer by an Oracle ACE:
"Why InnoDB tables are recommended to use an auto-increment column as the primary key"
Choose according to your needs and analyze each specific problem on its own terms. In short, the common 32-character UUID is good enough for general use.
Auto-increment has an advantage for indexes (comparing integers is faster than comparing strings), but the advantage is limited. The main difference between the two is whether the ID is generated on the client side or the server side. A Java program is a client relative to MySQL. A UUID lets the client generate a unique, non-duplicate ID by itself, which is important in a distributed system.
If you have no strong preference, I suggest using UUID.
This depends on your choice of underlying storage. I am not familiar with Oracle; it uses ROWID underneath, so I will not comment. Others such as MongoDB have their own particularities, so a UUID is appropriate there.
But if your underlying storage is MySQL and the storage engine happens to be InnoDB, then I strongly recommend a unique auto-increment number as your PK.
First, leaving aside how many CPU instructions and cycles it takes to compare a UUID versus a long, the most important point is that InnoDB's underlying storage is a B+ tree organized as a clustered index. All data queries are done through the clustered index (if the table defines a PK, the clustered index is built on that PK by default).
If rows are written to an InnoDB table in the same order as the leaf nodes of the B+ tree index, access efficiency is highest.
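To get a feel for the write-order argument, here is a rough Python simulation. It is not InnoDB: a sorted list stands in for the ordered leaf level of the index, and the function `count_mid_inserts` (a name invented here) simply counts how many new keys have to be placed before the current maximum instead of being appended at the end.

```python
import bisect
import uuid


def count_mid_inserts(keys):
    """Count keys that land somewhere before the end of the sorted sequence."""
    ordered = []
    mid_inserts = 0
    for key in keys:
        pos = bisect.bisect_left(ordered, key)
        if pos < len(ordered):
            # In a real B+ tree this would touch an existing page,
            # and may force a page split if that page is full.
            mid_inserts += 1
        ordered.insert(pos, key)
    return mid_inserts


n = 10_000
sequential_keys = range(1, n + 1)                    # auto-increment style
random_keys = [uuid.uuid4().hex for _ in range(n)]   # random UUID style

print("sequential:", count_mid_inserts(sequential_keys))
print("random uuid:", count_mid_inserts(random_keys))
```

With sequential keys every insert is an append, so the count is zero; with random UUIDs almost every insert falls in the middle, which is the access pattern the answer warns about.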
Of course, if your system is not chasing the limits of performance, use whichever you like.
Why not customize your own primary key strategy?
With MySQL, use auto-increment; otherwise InnoDB still has to maintain a primary key index by itself.
On a single machine, I recommend MySQL's auto-increment ID.
In a distributed environment where you need globally unique IDs, I recommend neither auto-increment nor UUID.
In that case you need to build your own ID generator mechanism.
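The answer does not say what such a custom generator looks like. One widely used shape is a Snowflake-style 64-bit ID made of a millisecond timestamp, a worker ID, and a per-millisecond sequence. The Python sketch below is a minimal version of that idea under assumed parameters: the class name, the epoch, and the 10/12-bit split are all choices made for this example, and handing out distinct `worker_id` values is left to the deployment.

```python
import threading
import time


class SnowflakeLikeGenerator:
    """Hypothetical ID layout: timestamp | 10-bit worker | 12-bit sequence."""

    EPOCH_MS = 1_600_000_000_000  # arbitrary custom epoch (assumption)
    WORKER_BITS = 10
    SEQUENCE_BITS = 12

    def __init__(self, worker_id: int):
        if not 0 <= worker_id < (1 << self.WORKER_BITS):
            raise ValueError("worker_id out of range")
        self._worker_id = worker_id
        self._last_ms = -1
        self._sequence = 0
        self._lock = threading.Lock()

    def next_id(self) -> int:
        with self._lock:
            now = int(time.time() * 1000)
            if now < self._last_ms:
                # Clock moved backwards: keep using the last timestamp.
                now = self._last_ms
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & ((1 << self.SEQUENCE_BITS) - 1)
                if self._sequence == 0:
                    # Sequence used up for this millisecond: wait for the next one.
                    while now <= self._last_ms:
                        now = int(time.time() * 1000)
            else:
                self._sequence = 0
            self._last_ms = now
            shift = self.WORKER_BITS + self.SEQUENCE_BITS
            return (((now - self.EPOCH_MS) << shift)
                    | (self._worker_id << self.SEQUENCE_BITS)
                    | self._sequence)


generator = SnowflakeLikeGenerator(worker_id=1)
print(generator.next_id())
```

IDs from one worker are strictly increasing, which keeps InnoDB inserts close to append-only, and IDs from different workers cannot collide as long as the worker IDs differ.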
Note
I have run into a duplicate GUID before, so using a GUID directly as a unique primary key is a potential problem!
(GUID is an implementation of the UUID standard)
I think this still depends on the specific scenario. If there is only one database server and the business logic is not complex, auto-increment IDs are fine and keep development simple. With only a small number of primary database servers you can still use auto-increment IDs, but some extra handling is needed.
For a distributed setup, that is, many primary database servers, auto-increment IDs make development complex and become harder and harder to control. So why not use UUID? After all, the performance of the two is not far apart.
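One common form of the "extra handling" for a few primaries is to give each server its own starting offset and a shared step, so their ID streams interleave without overlapping (MySQL exposes this through the `auto_increment_offset` and `auto_increment_increment` system variables). The Python sketch below only simulates the resulting number sequences; it is not MySQL configuration, and the function name `id_stream` is made up.

```python
from itertools import count, islice


def id_stream(offset: int, increment: int):
    """IDs one primary would hand out with the given offset and step."""
    return count(start=offset, step=increment)


primaries = 3
streams = [id_stream(offset, primaries) for offset in range(1, primaries + 1)]
first_ids = [list(islice(stream, 4)) for stream in streams]

print(first_ids)  # [[1, 4, 7, 10], [2, 5, 8, 11], [3, 6, 9, 12]]
```

The step is fixed by the number of primaries, so adding a fourth server later means changing the step everywhere, which is the kind of control problem that pushes larger setups toward UUIDs.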