How to select the primary key of a Database

Source: Internet
Author: User

Meaningless primary key 

I want to stress that a primary key should carry no business meaning. Some readers may disagree. Take an order table: it has an "Order Number" field which, in business terms, must be unique and identifies each record. Even so, I do not recommend using the order number as the primary key, because any field with business meaning can have that meaning change. For example, the order number may work well at first, and then the customer says: "An order can be voided and regenerated, and the new order must keep the original order number." At that point the original primary key is in trouble. So a field that is unique today is not necessarily a safe primary key. I recommend creating a new field dedicated to serving as the primary key, one that does not appear in the business logic and has no business meaning. A key of this kind does add some complexity to the program, so the decision depends on the size of the system. For a small project that is unlikely to grow much, using a field that is actually unique as the primary key is acceptable.
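The voided-order scenario can be sketched as follows. This is only an illustration under assumptions: it uses Python's built-in sqlite3 module as a stand-in database, and the table names, column names, and the `void_and_reissue` helper are hypothetical, not taken from any real system.

```python
import sqlite3

def build_orders_db():
    """Create an in-memory database where `id` is a meaningless surrogate key."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE orders (
            id       INTEGER PRIMARY KEY,  -- surrogate key, no business meaning
            order_no TEXT NOT NULL,        -- business value, allowed to repeat
            status   TEXT NOT NULL
        );
        CREATE TABLE order_items (
            id       INTEGER PRIMARY KEY,
            order_id INTEGER NOT NULL REFERENCES orders(id),
            product  TEXT NOT NULL
        );
    """)
    return conn

def void_and_reissue(conn, order_id):
    """Void an order and create a replacement that reuses the same order number."""
    (order_no,) = conn.execute(
        "SELECT order_no FROM orders WHERE id = ?", (order_id,)
    ).fetchone()
    conn.execute("UPDATE orders SET status = 'void' WHERE id = ?", (order_id,))
    cur = conn.execute(
        "INSERT INTO orders (order_no, status) VALUES (?, 'active')", (order_no,)
    )
    return cur.lastrowid

conn = build_orders_db()
first = conn.execute(
    "INSERT INTO orders (order_no, status) VALUES ('SO-1001', 'active')"
).lastrowid
conn.execute(
    "INSERT INTO order_items (order_id, product) VALUES (?, 'widget')", (first,)
)
second = void_and_reissue(conn, first)
```

Both orders share the order number, yet each has its own `id`, so the item row that points at the voided order is left untouched.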

Necessity of primary key
  

Some readers may not agree that every database table must have a primary key, but in my view each table should have one, whether it is a single-column key or a two-column (composite) key. The presence of a primary key reflects the integrity of the table structure: every record needs a field, or set of fields, that distinguishes it uniquely. The primary key is mainly used as the target of foreign keys in other tables and to locate a record when modifying or deleting it. Without a primary key, these operations become very troublesome.

Primary Key Selection
  

Now let us consider what should serve as the primary key of a table. I should say up front that primary key design has no final answer. Everyone has their own approach, and even the same person may apply different primary key design principles in different projects.
First: the business number as the primary key
This approach uses the unique "number" field from the actual business as the primary key. It is recommended for small projects because it keeps the project simple, but it can cause trouble in use. For example, when a number has to be changed, many associated tables may be involved and, as Uncle Li said, "the consequences are very serious". In addition, as mentioned above, the business may one day require that numbers be allowed to repeat, and we cannot know in advance how the business will change.
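To make the "number modification" problem concrete, here is a small sketch, again using SQLite through Python's sqlite3 module. The schema and the `renumber` helper are hypothetical. It assumes the child tables were declared with ON UPDATE CASCADE; without that clause the renumbering would have to be written by hand for each associated table.

```python
import sqlite3

def build_natural_key_db():
    """Schema where the business number itself is the primary key."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
    conn.executescript("""
        CREATE TABLE orders (order_no TEXT PRIMARY KEY);
        CREATE TABLE order_items (
            id       INTEGER PRIMARY KEY,
            order_no TEXT NOT NULL REFERENCES orders(order_no) ON UPDATE CASCADE
        );
        CREATE TABLE shipments (
            id       INTEGER PRIMARY KEY,
            order_no TEXT NOT NULL REFERENCES orders(order_no) ON UPDATE CASCADE
        );
    """)
    return conn

def renumber(conn, old_no, new_no):
    """Change a business number; every referencing row must be rewritten too."""
    conn.execute(
        "UPDATE orders SET order_no = ? WHERE order_no = ?", (new_no, old_no)
    )

conn = build_natural_key_db()
conn.execute("INSERT INTO orders (order_no) VALUES ('A-1')")
conn.execute("INSERT INTO order_items (order_no) VALUES ('A-1')")
conn.execute("INSERT INTO shipments (order_no) VALUES ('A-1')")

renumber(conn, "A-1", "B-7")
item_no = conn.execute("SELECT order_no FROM order_items").fetchone()[0]
shipment_no = conn.execute("SELECT order_no FROM shipments").fetchone()[0]
```

One logical change to the order touches a row in every table that references it, which is the cost the surrogate key approach avoids.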
Second: automatically numbered primary keys
Many people use this approach: create an ID field and let the database increment it automatically. It is convenient and satisfies the principle that a primary key should have no business meaning. The advantages are that the database assigns the number automatically and quickly; the values grow incrementally, so a clustered primary key stores rows in order, which helps retrieval; a numeric type takes little space, sorts easily, and is convenient to pass around in a program; and adding records by other means (entering them manually, inserting them with other tools, or importing data from an old system) is convenient because there is no need to worry about duplicate primary keys.
Disadvantages: the disadvantage comes from the same source as the advantage. Because the value grows automatically, it is awkward to insert a record with a specified ID by hand. This matters most during integration with other systems: when importing data, it is hard to guarantee that the IDs of the original system do not conflict with existing primary keys (assuming the old system also uses numeric keys). If the other system's primary key is not numeric, things are worse: the primary key data type has to change, which forces changes in the related tables as well, with equally serious consequences. Even when the other system is numeric too, you may want to prefix the old keys with an "o" (old) during import to tell old data from new, and an auto-increment numeric column cannot hold that prefix.
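The import conflict can be reproduced with two small databases that both use auto-increment keys. This is a sketch under assumptions: SQLite stands in for both systems, and the `customers` table, `make_system`, and `import_rows` are hypothetical names chosen for the example.

```python
import sqlite3

def make_system():
    """One independent system with an auto-increment primary key."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT NOT NULL)"
    )
    return conn

def import_rows(target, rows):
    """Import (id, name) rows keeping their original ids; return ids that collided."""
    collisions = []
    for row_id, name in rows:
        try:
            target.execute(
                "INSERT INTO customers (id, name) VALUES (?, ?)", (row_id, name)
            )
        except sqlite3.IntegrityError:
            collisions.append(row_id)
    return collisions

new_system = make_system()
old_system = make_system()
for name in ("Ann", "Bob"):
    new_system.execute("INSERT INTO customers (name) VALUES (?)", (name,))
for name in ("Carl", "Dee", "Eve"):
    old_system.execute("INSERT INTO customers (name) VALUES (?)", (name,))

old_rows = old_system.execute(
    "SELECT id, name FROM customers ORDER BY id"
).fetchall()
collided = import_rows(new_system, old_rows)
```

Both systems started counting from 1, so the first two old rows collide with existing keys and only the third can be imported with its original ID.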
Third: Max plus one
Because of the problems with automatic numbering, some people generate the key themselves. The key is still numeric, but auto-increment is removed: on each insert, the program reads the current Max value and adds one. This avoids the problems of automatic numbering but has an efficiency cost: if the table is very large, evaluating Max() on every insert can hurt performance. More serious is concurrency: if two users read the same Max at the same time, both try to insert the same ID after adding one. This is a lesson already learned the hard way.
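The concurrency problem is easy to reproduce even without real threads. The sketch below simulates the bad interleaving sequentially: two clients both read the maximum before either one writes. The `tickets` table and `next_id_by_max` are hypothetical names, and SQLite is used only as a convenient stand-in.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tickets (id INTEGER PRIMARY KEY, title TEXT NOT NULL)")
conn.execute("INSERT INTO tickets (id, title) VALUES (1, 'first')")

def next_id_by_max(conn):
    """Max plus one: read the current maximum id and add one."""
    (current_max,) = conn.execute(
        "SELECT COALESCE(MAX(id), 0) FROM tickets"
    ).fetchone()
    return current_max + 1

# Both clients read the maximum before either of them inserts.
id_a = next_id_by_max(conn)
id_b = next_id_by_max(conn)

conn.execute("INSERT INTO tickets (id, title) VALUES (?, 'from A')", (id_a,))
try:
    conn.execute("INSERT INTO tickets (id, title) VALUES (?, 'from B')", (id_b,))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    # The primary key constraint catches the duplicate; without it the table
    # would silently hold two rows with the same id.
    duplicate_rejected = True
```

Here the primary key constraint turns the duplicate into an error that client B must handle by retrying. The read and the insert have to happen as one atomic step for Max plus one to be safe.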
Fourth: Self-made plus one
To address the efficiency of Max plus one, some people build their own counter: a dedicated table with two fields, the table name and the current sequence value. When inserting into a table, first look up the current value for that table and add one. This can still run into concurrency problems, which can be avoided with a lock: lock before generating the value, take the value, then unlock, so that no two callers generate a value at the same time. This is much faster than Max plus one. There is still a problem, though: when integrating with other systems, it is hard to keep the value in the sequence table consistent with the imported table, and every numeric type shares the old-data "o" import problem mentioned above. For that reason, with self-made plus one the primary key can be made a character type instead of a numeric one. I rather recommend the character-type variant, since a character-type primary key can handle many unexpected situations.
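The lock, read, increment, unlock cycle described above might look like the following. This is a minimal in-process sketch, not a database implementation: the `SequenceTable` class is hypothetical, it keeps the (table name, current value) pairs in a dictionary rather than a real table, and it uses a `threading.Lock`, which only protects callers inside one process.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

class SequenceTable:
    """In-memory stand-in for a (table name, current sequence value) table."""

    def __init__(self):
        self._current = {}
        self._lock = threading.Lock()

    def next_value(self, table_name):
        """Lock, read the current value, add one, store it, then unlock."""
        with self._lock:
            value = self._current.get(table_name, 0) + 1
            self._current[table_name] = value
            return value

sequences = SequenceTable()

# Many concurrent callers ask for the next "orders" key at once.
with ThreadPoolExecutor(max_workers=8) as pool:
    ids = list(pool.map(lambda _: sequences.next_value("orders"), range(1000)))
```

Because the read and the increment happen under the same lock, no two callers receive the same value. In a real system with several application servers the lock would have to live in the database itself, for example as a row lock on the sequence table inside a transaction.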
Fifth: GUID primary key
A better primary key at present is the GUID. I still recommend a character-type primary key, but with the value generated as a GUID. A GUID can be generated automatically by the database or by the program, and in practice the key values do not repeat, which solves the system integration problem: when data keyed by GUIDs from several systems is imported together, the keys do not collide. Old data marked with "o" can still be distinguished, and generation is very efficient. In .NET you can call System.Guid.NewGuid() directly, and in SQL Server you can use NewID().
Advantages:
Compared with an IDENTITY column, a uniqueidentifier column lets you obtain the new row's ID in advance through the NewID() function, which is very convenient for subsequent processing in the application.
It makes the database easier to port. Other databases do not necessarily have an IDENTITY column, whereas a GUID column can be converted to a character column in another database, and GUID values generated by the application can be saved to the database without affecting existing data.
It simplifies database initialization. If the application needs to load initial data, an IDENTITY column is more troublesome to handle, while a uniqueidentifier column needs no special handling and can be loaded directly with T-SQL.
It is convenient for permanently identifying certain objects or constants, such as a class ClassID, an object instance ID, or the contact, service interface, and tModel identifiers in UDDI.
Disadvantages:
A GUID value is long, hard to remember and type, and is random and unordered.
A GUID value is 16 bytes, which is large compared with a 4-byte integer. Using a uniqueidentifier key in the database can therefore have two negative effects: more storage space and slower index operations.
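The integration advantage and the size cost can both be seen in a short sketch. It uses Python's `uuid.uuid4()` as the GUID generator and SQLite as a stand-in for the two systems; the `customers` table and the helper functions are hypothetical. The key is stored as text here, which matches the character-type recommendation but takes 36 characters rather than the 16 raw bytes.

```python
import sqlite3
import uuid

def make_system():
    """One independent system whose primary key is a GUID stored as text."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id TEXT PRIMARY KEY, name TEXT NOT NULL)")
    return conn

def add_customer(conn, name):
    """Generate the key in the application, so it is known before the INSERT."""
    new_id = str(uuid.uuid4())
    conn.execute("INSERT INTO customers (id, name) VALUES (?, ?)", (new_id, name))
    return new_id

system_a, system_b = make_system(), make_system()
ids_a = [add_customer(system_a, n) for n in ("Ann", "Bob")]
ids_b = [add_customer(system_b, n) for n in ("Carl", "Dee")]

# Merge system B into system A, keeping every original key.
rows = system_b.execute("SELECT id, name FROM customers").fetchall()
system_a.executemany("INSERT INTO customers (id, name) VALUES (?, ?)", rows)
merged_count = system_a.execute("SELECT COUNT(*) FROM customers").fetchone()[0]

raw_size = len(uuid.uuid4().bytes)   # size of the binary GUID in bytes
text_size = len(str(uuid.uuid4()))   # size of the hyphenated text form in characters
```

The merge needs no renumbering because the two systems never produce the same key in practice. The price is the key width: every index entry and every foreign key carries the full GUID.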
I am not claiming that GUID is the best choice either. Each of the methods above has its place in different situations; the point of weighing their advantages and disadvantages is to give you a reference during design. These are only my own thoughts, and my knowledge has its limits, so I welcome discussion.
Source: bean sprout blog, URL: http://www.aichengxu.com/article/MySQL/193_10.html
