MySQL primary key selection

Source: Internet
Author: User

While looking into UUIDs recently, I collected the following material:

 

http://www.mysqlops.com/2011/09/10/innodb-primary.html

For the InnoDB engine, four kinds of values may be used as the primary key in our production environment:

(1) an auto-increment sequence;

(2) a value generated by the UUID() function;

(3) the unique account name registered by the user, a string of 40 characters;

(4) incrementing values generated by a dedicated mechanism, such as a sequence generator;

Next, we will analyze the advantages and disadvantages of these four types of attributes as the table primary key:

(1) Auto-increment sequence: new values are appended in ascending order. The integer type makes primary key comparison cheap, and the storage footprint is small, typically a four-byte INT or an eight-byte BIGINT. For horizontal data splitting, two parameters of the mysqld instance can be set: auto_increment_increment and auto_increment_offset. The only drawback is that the auto-increment counter is protected by a table-level lock. In the 5.0 series this lock could become a bottleneck under heavy concurrent writes because of the way it was released, but the 5.1 series improved the mechanism and the problem has largely disappeared;

(2) UUID() function: the value is part random and part fixed by rule, and it is unordered. Values generated on the same server are 77.8% identical, which corresponds to 28 of the 36 characters. Each value is 36 characters long and, in UTF-8 encoding, occupies 36 bytes of storage. Horizontal data splitting is supported without special settings;

(3) Account name registered by the user: a string. The value depends on user input, so the data is essentially unordered and the string length varies; only the minimum and maximum lengths can be enforced, through front-end validation. Horizontal splitting is supported without special settings;

(4) Sequence generator: similar to an auto-increment sequence, but it requires additional development work and a separate service. In return it avoids the table-level lock of auto-increment sequences, improves concurrency, and supports horizontal data splitting better;

(5) Dual-master replication carries a probabilistic risk: when a write has succeeded on the master but has not yet been replicated to the online standby server, a problem can occur. Manual intervention is then required, and there is no simple, reasonable way to automate it. None of the four methods above can avoid this;
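To make the horizontal-splitting remark in point (1) more concrete, here is a minimal Python sketch of the arithmetic behind auto_increment_increment and auto_increment_offset. It is a toy model, not MySQL's implementation; the function name `auto_increment_ids` and the three-server setup are assumptions made for illustration.

```python
from itertools import islice


def auto_increment_ids(offset, increment):
    """Yield offset, offset + increment, offset + 2 * increment, ...

    Toy model of the values a server hands out when it is configured with
    auto_increment_offset=offset and auto_increment_increment=increment.
    """
    value = offset
    while True:
        yield value
        value += increment


# Hypothetical setup: three servers share one logical table,
# each with its own offset and a common increment of 3.
SERVERS = 3
ids_per_server = {
    offset: list(islice(auto_increment_ids(offset, SERVERS), 5))
    for offset in range(1, SERVERS + 1)
}

for offset, ids in ids_per_server.items():
    print(f"offset={offset}: {ids}")
```

Because every server steps by the same increment from a different starting point, the generated key sets never overlap, which is what makes the scheme usable for splitting data across instances.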

Weighing the advantages and disadvantages of the four kinds of values against the qualities a good primary key should have: if horizontal splitting, and the extra setup trouble it brings, is not a concern, the auto-increment sequence is the best choice of primary key. The user's registered account name can also serve as the primary key if it is required to be unique and non-empty. If horizontal splitting is a concern, the values produced by an easy-to-use, reliable sequence generator architecture are the best choice of primary key;

 

A detailed look at the MySQL UUID() function

http://www.mysqlops.com/2011/03/01/mysql-uuid.html

MySQL can generate unique values with either the UUID() function or an auto-increment sequence. What is the difference between them? Here is a comparison of their features, similarities, and differences:

• Both can generate unique values;

• UUID() generates values that are unique across time and space, whereas an auto-increment sequence only generates values that are unique within one table, and the column must be paired with a primary key or unique index to enforce that uniqueness;

• The implementations differ. A UUID combines random parts with rule-based parts, while an auto-increment sequence steps a single counter upward;

• UUID() produces a string with a fixed length of 36 characters, while an auto-increment sequence produces an integer whose size is determined by the column definition;

Next, we will look in detail at the values produced by the UUID() function:

root@localhost: (none) 06:09:40> SELECT UUID(), UUID(), LENGTH(UUID()), CHAR_LENGTH(UUID())\G
*************************** 1. row ***************************
UUID(): de7ee638-4322-11e0-85ab-842b2b4a7e75
UUID(): de7ee642-4322-11e0-85ab-842b2b4a7e75
LENGTH(UUID()): 36
CHAR_LENGTH(UUID()): 36
1 row in set (0.00 sec)

The output above shows that:

• Within a single SQL statement, each call to UUID() returns a different value;

• The value consists of five groups separated by hyphens;

• The last two groups stay the same across calls and executions. The fourth group changes when the mysqld server is shut down and restarted, and then stays fixed until the next restart. The fifth group never changes on the same machine;

• Both the byte length and the character length are 36 (note: the default character set is utf8);

The following describes the components of UUID values:

• The first three groups are derived from a timestamp;

• The fourth group preserves temporal uniqueness in case the timestamp loses monotonicity;

• The fifth group is an IEEE 802 node number, which provides uniqueness in space. If no node number is available, for example because the host has no NIC or because there is no known way to read the hardware address on that system, a random number is substituted and spatial uniqueness can no longer be guaranteed. Even then, the chance of a duplicate value is very small.
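The layout described above can be checked against the two sample values from the earlier output. The following is a small Python sketch using the standard `uuid` module; it assumes the values follow the RFC 4122 version-1 layout with a UTC timestamp, and the helper name `describe` is made up for this example.

```python
import uuid
from datetime import datetime, timedelta

# Version-1 timestamps count 100-nanosecond ticks since 1582-10-15 (RFC 4122).
GREGORIAN_EPOCH = datetime(1582, 10, 15)


def describe(text):
    """Split a version-1 UUID string into timestamp, clock sequence and node."""
    u = uuid.UUID(text)
    created = GREGORIAN_EPOCH + timedelta(microseconds=u.time // 10)
    return {
        "version": u.version,
        "ticks": u.time,            # groups 1-3, reassembled into one number
        "created_utc": created,     # assumes the server clock fed in UTC
        "clock_seq": u.clock_seq,   # group 4 without the variant bits
        "node": f"{u.node:012x}",   # group 5
    }


first = describe("de7ee638-4322-11e0-85ab-842b2b4a7e75")
second = describe("de7ee642-4322-11e0-85ab-842b2b4a7e75")

for info in (first, second):
    print(info)

# Gap between the two calls in the same statement, in 100 ns ticks.
print(second["ticks"] - first["ticks"])
```

The two values share the same clock sequence and node and differ only in the timestamp, which is consistent with the observation that only the leading group changes between calls.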

UUID() and replication:

UUID() is a nondeterministic function, so it is not safe under the STATEMENT mode of MySQL replication, but it works under the MIXED and ROW modes. This can be verified with two test configurations, using the 5.1 series as an example.

Test with statement-based replication:

tx_isolation = REPEATABLE-READ
binlog_format = STATEMENT

Test with row-based or mixed replication:

tx_isolation = REPEATABLE-READ
binlog_format = MIXED (or ROW)

Execute the same SQL statement on the master server:

INSERT INTO test_uuid (username) VALUES (UUID());

Then compare the values stored in the table on the master and slave servers. You will find that the master and slave data are inconsistent in statement mode and consistent in row/mixed mode;

Suggestion: if you need the UUID() function in a replicated setup, you must use row-based or mixed replication.
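The reason behind this suggestion can be illustrated with a toy model in Python. It is only a sketch of the idea, not of MySQL's replication code: `run_statement` is a hypothetical stand-in for the INSERT above, and `uuid.uuid4()` stands in for any nondeterministic function.

```python
import uuid


def run_statement(table):
    """Stand-in for INSERT ... VALUES (UUID()): the function is evaluated locally."""
    value = str(uuid.uuid4())
    table.append(value)
    return value


# Statement-based: the replica re-executes the statement,
# so the nondeterministic function runs a second time.
master_stmt, replica_stmt = [], []
run_statement(master_stmt)
run_statement(replica_stmt)

# Row-based: the replica receives the row the master actually wrote.
master_row, replica_row = [], []
written = run_statement(master_row)
replica_row.append(written)

print("statement-based consistent:", master_stmt == replica_stmt)
print("row-based consistent:", master_row == replica_row)
```

Replaying the statement produces a fresh value on the replica, while shipping the written row keeps both sides identical.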

Glossary:

A function that can return different values when called with the same input parameters, whether at the same moment or several times within one SQL statement, is called a nondeterministic function.

Note:

MySQL 5.1.* and later also provide a variant of UUID() named UUID_SHORT(), which generates a 64-bit unsigned integer, for example:

root@localhost: (none) 02:46:42> SELECT UUID_SHORT()\G
*************************** 1. row ***************************
UUID_SHORT(): 6218676250261585921
1 row in set (0.00 sec)
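The MySQL manual describes the UUID_SHORT() value as the low byte of server_id shifted left by 56 bits, plus the server startup time in seconds shifted left by 24 bits, plus a counter. Assuming that layout, the sample value above can be taken apart with a few lines of Python; the function name `decode_uuid_short` is hypothetical.

```python
from datetime import datetime, timezone


def decode_uuid_short(value):
    """Split a UUID_SHORT() value into (server_id low byte, startup seconds, counter)."""
    server_id_low = value >> 56                    # top 8 bits
    startup_seconds = (value >> 24) & 0xFFFFFFFF   # next 32 bits
    counter = value & 0xFFFFFF                     # low 24 bits
    return server_id_low, startup_seconds, counter


server_id_low, startup_seconds, counter = decode_uuid_short(6218676250261585921)
startup = datetime.fromtimestamp(startup_seconds, tz=timezone.utc)

print(server_id_low, startup_seconds, counter)
print(startup.isoformat())
```

Under this reading, the sample decodes to a startup time in January 2011 and a counter of 1, which fits a value generated shortly after the server started.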

Postscript:

The value generated by the UUID() function is not well suited as the primary key of an InnoDB table. For details, see the article on InnoDB primary key selection.

 

 

Is UUID a good primary key? That is the question.

http://mlxia.iteye.com/blog/279059

Author: Lao Wang

MySQL is about the only database I know reasonably well. Probably more than 99% of MySQL users use an auto-increment ID as the primary key. This is understandable, because MySQL's auto-increment ID is efficient and convenient. So what do the remaining 1% use as the primary key? Perhaps a home-grown KeyGenerator, or the UUID discussed below.

It is said that in Oracle circles anyone who uses an auto-increment ID as the primary key gets looked down on, and that the most natural choice of primary key there is UUID. I am not familiar with Oracle and cannot vouch for these claims.

First, what is a UUID? In short, a UUID is a number generated on one machine that is guaranteed to be unique across all machines in the same time and space. A UUID algorithm may use information such as the NIC MAC address, IP address, host name, and process ID to ensure this uniqueness.

If your MySQL version is not too old, type SELECT UUID(); and the output is a UUID, as shown below:

mysql> SELECT UUID();
+--------------------------------------+
| UUID()                               |
+--------------------------------------+
| 54b4c01f-dce0-102a-a4e0-462c07a00c5e |
+--------------------------------------+

Now you should have a more intuitive understanding of UUID. Let's take a look at the advantages and disadvantages of UUID.

Advantages:

It guarantees independence: the program can be migrated between different databases without being affected.
The generated IDs are unique not only within a table but also across databases. This is especially important when you want to split the database.

Disadvantages:

Compared to the INT type, it takes more space to store a UUID.
URLs that contain a UUID are long and unfriendly.

Here is my view on these shortcomings. The extra storage does not worry me much: hard disk space is about the cheapest thing there is, so this one can be ignored. As for URLs looking unfriendly, I think that is inertia caused by an INT habit. Compared with INT, UUID is in fact the most natural choice of primary key (note the adjective "natural"; I trust you see what I mean). Besides, in many cases a URL does not need to be friendly at all. Take an e-commerce site that uses INT-friendly URLs: its order URL probably looks like /order.php/id/123. That is friendly, but arguably too friendly, to the point of being insecure. Suppose I place an order in the morning and its URL is /order.php/id/1000, then place another in the evening and its URL is /order.php/id/2000. I can now estimate that the site takes about 1,000 orders a day, and from that even estimate its sales volume, figures that are often important commercial secrets. With UUIDs there is no such concern.

Efficiency?

If none of the so-called shortcomings above really holds, then the only remaining question about using UUID as the primary key is efficiency. It is said that PostgreSQL and some other databases have a dedicated UUID type, and that in such databases a UUID primary key has no efficiency problem. Unfortunately MySQL has no such type; to store a UUID as a primary key in MySQL, CHAR(36) is generally used to simulate it. Since this is not a native UUID type, how efficient it is as a primary key remains to be tested. In addition, the efficiency of a UUID primary key depends heavily on how the UUID algorithm is implemented.
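One reason the algorithm and storage format matter is ordering: a time-based UUID stored as text starts with the fastest-changing part of the timestamp, so consecutive values do not sort in the order they were generated. The Python sketch below illustrates this and contrasts CHAR(36) with a hypothetical 16-byte layout that puts the high timestamp bits first. The rearranged layout is an assumption for illustration, not something the author tested; it is similar in spirit to the swap option that MySQL 8.0 later added in UUID_TO_BIN(). The helper names and the 300-second spacing are made up.

```python
import uuid

NODE = 0x842B2B4A7E75             # node taken from the sample UUID shown earlier
CLOCK_SEQ = 0x05AB                # clock sequence from the same sample
BASE_TICKS = 135181805861922360   # that sample's timestamp, in 100 ns ticks


def uuid1_at(ticks):
    """Build a version-1 UUID for a given timestamp (100 ns ticks since 1582-10-15)."""
    time_low = ticks & 0xFFFFFFFF
    time_mid = (ticks >> 32) & 0xFFFF
    time_hi_version = ((ticks >> 48) & 0x0FFF) | 0x1000
    clock_seq_hi_variant = ((CLOCK_SEQ >> 8) & 0x3F) | 0x80
    clock_seq_low = CLOCK_SEQ & 0xFF
    return uuid.UUID(fields=(time_low, time_mid, time_hi_version,
                             clock_seq_hi_variant, clock_seq_low, NODE))


def time_ordered_key(u):
    """Rearrange the 16 raw bytes so the high timestamp bits come first."""
    b = u.bytes
    return b[6:8] + b[4:6] + b[0:4] + b[8:]


# Five UUIDs generated 300 seconds apart, listed in generation order.
generated = [uuid1_at(BASE_TICKS + i * 3_000_000_000) for i in range(5)]

as_text = [str(u) for u in generated]
as_keys = [time_ordered_key(u) for u in generated]

print("text length:", len(as_text[0]), "binary length:", len(as_keys[0]))
print("text sorts in generation order:", sorted(as_text) == as_text)
print("binary key sorts in generation order:", sorted(as_keys) == as_keys)
```

The sketch only shows the ordering and size difference; whether that translates into a measurable gain on a given MySQL setup still has to be benchmarked.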

I originally wanted to insert 1,000,000 rows on my own computer to test this. Unfortunately the hard drive light stayed on the whole time, and I was worried the disk would fail. The disk itself is not worth much, but all my important data is on it, and losing it would be a huge loss. So I had to give up the test.

So I do not know how efficient it is to use a UUID (stored as CHAR(36)) as the primary key in MySQL. Sorry -_-!!!

 
