Primary Key Design in Databases

Source: Internet
Author: User
The primary key is an unavoidable part of database table design, yet many people do not think deeply about it. Because the design of the primary key affects the design of the entire database, it deserves attention.

Necessity of a primary key

Some people do not insist that every table have a primary key, but in my view each table should have one, whether it is a single-column key or a composite key. The existence of a primary key represents the integrity of the table structure: every record needs a field that identifies it uniquely. The primary key is used mainly for foreign key associations from other tables and for modifying and deleting a specific record. Without a primary key, these operations become very troublesome.

Meaningless primary keys

I stress that the primary key should carry no business meaning. Some may disagree. For example, an order table has an "Order Number" field, which in practice should be unique and can identify a record. Even so, I do not recommend using the order number as the primary key, because a field with business meaning can have its meaning changed. Everything goes smoothly at first; then the customer says, "An order can be voided and regenerated, and the new order must keep the original order number." At that point the original primary key is in danger. So a field that is unique today is not necessarily suitable as a primary key. Instead, I recommend creating a new field to serve as the primary key, one that is not reflected in the business logic and has no practical meaning. Such a key does increase the complexity of the program, so whether to use it depends on the size of the actual system.
For small projects that are unlikely to grow much, using an actual unique business field as the primary key is also acceptable.

Primary key selection

Now consider what should serve as the primary key of a table. I should say up front that there is no final word on primary key design: each designer has their own approach, and even the same designer may follow different principles on different projects.

First: a business number as the primary key

This method uses a unique field from the actual business, the "number", as the primary key. It is acceptable in small projects because it keeps the project simple, but it can cause trouble in use. For example, when a number has to be modified, many associated tables may be involved and, as Uncle Li said, "the consequences are very serious". In addition, as mentioned above, the business may later require that duplicate numbers be allowed, and we cannot know in advance how the business will change.

Second: automatically numbered primary keys

Many people use this method: create an ID field and let it increase automatically. It is convenient and satisfies the principle of a meaningless primary key. The advantages are that the database assigns the numbers itself, which is fast; the values grow incrementally, so a clustered primary key is stored in order, which helps retrieval; a numeric type takes little space, sorts easily, and is convenient to pass around in a program; and records can be added (entered manually, inserted with other tools, or imported from an old system) without worrying about duplicate primary keys.
Disadvantage: the disadvantage is really the flip side of the advantage. Because the values grow automatically, it is difficult to insert a record with a specified ID. This matters most when integrating with other systems: when importing data, it is hard to guarantee that the IDs of the original system do not conflict with existing primary keys (assuming the old system also uses numeric keys). If the other system's primary key is not numeric, things are worse: the primary key data type has to be changed, which forces changes to the related tables as well, with equally serious consequences. Even if the other system is numeric, you may want to prefix the old keys with an "O" (old) during import to distinguish old data from new, and an auto-incrementing numeric column cannot hold such a value.

Third: max plus one

Because of these problems with automatic numbering, some people generate the key themselves. The key is still numeric, but auto-increment is removed; on insert, the program reads the current maximum value and adds one. This avoids the problems of automatic numbering but has an efficiency problem: when the table is very large, max() can also affect efficiency. More serious is the concurrency problem: if two users read the same maximum at the same time, both add one and both insert the same ID. This lesson has already been learned the hard way.

Fourth: self-made plus one

To address the efficiency of max plus one, some people keep their own counter: a special table whose fields are the table name and the current sequence value. To insert a row into a table, first look up the current value for that table and add one. Concurrency is still a concern here, but it can be handled with locking.
When the value is generated, we first take the lock, read the value, and then release the lock, so that no two callers generate a value at the same time. This is much faster than max plus one. But there is still a problem: when integrating with other systems, it is hard to keep the maximum value in the counter table consistent with the imported table, and every numeric key type shares the problem of importing "O"-prefixed old data described above. For that reason, the self-made plus one method can be combined with a character-type primary key. I recommend this combination, because a character-type primary key can cope with many unexpected situations.

Fifth: GUID primary keys

At present, a better choice is a GUID. I still recommend a character-type primary key, but with the value generated as a GUID. A GUID can be generated by the database or by the program, and its values do not repeat, which solves the system integration problem: when the GUID values of several systems are imported together, they do not collide, and even old data carrying an "O" prefix can be told apart. Generation is efficient: in .NET you can call System.Guid.NewGuid() directly, and in SQL you can use NEWID().

Advantages: compared with an IDENTITY column, a uniqueidentifier column lets the application use NEWID() to know the ID of a new row in advance, which makes subsequent processing much more convenient. The database is easier to port: other databases do not necessarily have an IDENTITY column, whereas a GUID column can be carried over to another database as a character column, and GUID values generated by the application can be stored there without affecting the original data. Database initialization is easier too: if the application needs to load some initial data, an IDENTITY column requires special handling, while a uniqueidentifier column needs none and the data can be loaded directly with T-SQL.
It is also easy to permanently identify certain objects or constants, such as a class's ClassID, an object's instance ID, or the identifiers of contacts, service interfaces, and tModel definitions in UDDI.

Disadvantages: a GUID value is long and hard to remember or type. It is also random and unordered, and at 16 bytes it is considerably larger than a 4-byte integer. Using a uniqueidentifier key in the database can therefore have two negative effects: increased storage space and slower index operations.

I am not claiming that a GUID is always the best choice either. Each of the methods above has advantages and disadvantages, and any of them can be the right one in a particular situation; I have laid them out for reference during design. These are only my own thoughts, and my knowledge has its limits, so I hope readers will share their ideas.
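
To make the concurrency problem of the "max plus one" option concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name `item` and the helper `next_id_by_max` are hypothetical, and the two "users" are simulated on a single connection purely to fix the order of events; this is an illustration, not a measurement of a real system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO item (id, name) VALUES (1, 'first')")

def next_id_by_max(db):
    """Read the current maximum id and add one (the 'max plus one' rule)."""
    (current_max,) = db.execute("SELECT COALESCE(MAX(id), 0) FROM item").fetchone()
    return current_max + 1

# Both users compute their next id before either of them has inserted.
id_for_user_a = next_id_by_max(conn)
id_for_user_b = next_id_by_max(conn)

conn.execute("INSERT INTO item (id, name) VALUES (?, ?)", (id_for_user_a, "from A"))
try:
    conn.execute("INSERT INTO item (id, name) VALUES (?, ?)", (id_for_user_b, "from B"))
    collided = False
except sqlite3.IntegrityError:
    # The second insert reuses the same id and violates the primary key.
    collided = True
```

Because both reads happen before either insert, both users arrive at the same value, which is the repeated ID the text warns about.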

This article is from the 51cto.com technical blog.

1. Should a GUID be used as the primary key?

Using a GUID as the primary key has advantages and disadvantages. The advantage is that a GUID is globally unique under any circumstances. This is its biggest strength, and it makes data import easy: when you import data from another system, you do not have to worry about primary key conflicts. The disadvantages are that a GUID value is too complex to remember, which matters because we sometimes identify a record by remembering its key, and that the value is long, which affects database efficiency. GUIDs are also not generated in any particular order. In a database that stores rows physically ordered by the primary key, inserting a record near the front may force the records after it to be moved back, which lowers insert efficiency. Therefore, use GUIDs with caution.
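
As a small illustration of the import advantage and the size cost described above, the following sketch generates GUID keys in the application with Python's standard uuid module and stores them in a text column. The table `customer` and the helper `new_key` are hypothetical names, and SQLite stands in for whatever database is actually used.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id TEXT PRIMARY KEY, name TEXT)")

def new_key():
    """Return a random GUID (UUID version 4) as a 36-character string."""
    return str(uuid.uuid4())

# The key is known before the row exists, so related rows can be prepared in advance.
customer_id = new_key()
conn.execute("INSERT INTO customer (id, name) VALUES (?, ?)", (customer_id, "Alice"))

# Rows "imported" from another system bring their own GUIDs and are inserted unchanged.
imported_rows = [(new_key(), "Bob"), (new_key(), "Carol")]
conn.executemany("INSERT INTO customer (id, name) VALUES (?, ?)", imported_rows)

row_count = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
key_size_bytes = len(uuid.UUID(customer_id).bytes)  # raw size of one GUID
```

The raw value is 16 bytes, and stored as text it takes 36 characters, which is the storage cost the paragraph above refers to.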

2. Should auto-increment be used?

Because a primary key must be unique, many people use auto-increment. I consider this approach undesirable. It is presumably chosen so that records can be inserted without creating primary key values by hand. That sounds convenient, but in practice the trouble far outweighs the so-called convenience.

First, data import is inconvenient. Data often has to be imported from another system, and an auto-incrementing primary key does not allow the IDs of the original table to be imported, so the original key values are lost.

Second, consider master-detail tables such as orders, where the primary key of the order (master) table is generated automatically. Saving an order has to write the master and detail tables in the same transaction. You must first insert the order, then retrieve the primary key that was generated for it, and then save the details with that value as their foreign key. This makes the process complicated and raises the question of how to handle the transaction: if an error occurs while saving the details after the order has been inserted, the master record has to be removed again, which is annoying. After a successful insert, the maximum value also has to be read back, which is wasteful. With many records this is slow, and with concurrent inserts the value read back may be the wrong one.

Because of these problems, I recommend not using auto-increment.
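
The save sequence described above (insert the order, read back its generated key, insert the details, undo everything on failure) can be sketched as follows. This is only an illustration with Python's sqlite3 module; the tables `orders` and `order_detail` and the function `save_order` are hypothetical, and the sketch reads the key back through the driver's `lastrowid` attribute, which is an assumption about tooling and not something the original text describes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY AUTOINCREMENT, customer TEXT NOT NULL);
    CREATE TABLE order_detail (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        order_id INTEGER NOT NULL REFERENCES orders(id),
        product TEXT NOT NULL
    );
""")

def save_order(db, customer, products):
    """Insert an order and its detail rows; undo everything if any step fails."""
    try:
        cur = db.execute("INSERT INTO orders (customer) VALUES (?)", (customer,))
        order_id = cur.lastrowid  # key generated by this cursor's insert
        db.executemany(
            "INSERT INTO order_detail (order_id, product) VALUES (?, ?)",
            [(order_id, product) for product in products],
        )
        db.commit()
        return order_id
    except sqlite3.Error:
        db.rollback()  # also removes the order row inserted above
        raise

first_id = save_order(conn, "Alice", ["pen", "paper"])
try:
    save_order(conn, "Bob", ["ink", None])  # None violates NOT NULL on product
except sqlite3.IntegrityError:
    pass
orders_saved = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
details_saved = conn.execute("SELECT COUNT(*) FROM order_detail").fetchone()[0]
```

In this sketch the failed second order leaves no master row behind, because the insert into `orders` and the inserts into `order_detail` belong to the same transaction.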

3. Should the int type be used as the primary key?

In the past we used the int type, since all primary keys were numbers. But we also know that a value made of digits is not necessarily numeric data; a phone number is one example. The advantage of an int primary key is speed: inserts and queries may be faster than with other types. However, I think the speed advantage is not very noticeable. Take varchar(15) as an example: a table that is physically ordered by its primary key is kept sorted by that key, so even character data is placed at its proper physical position at insert time. In other words, it is the insert that may be slowed down, while the effect on later queries is not very noticeable.

Not using the int type for the primary key does not mean that digits cannot be stored in it. I suggest storing only digits in the primary key, because such values sort faster than values that mix in letters. The reason for adopting a character type is to prepare for future data import. When data is one day imported from other tables, a specific letter can be added to the primary keys of the imported data to avoid conflicts with the existing keys, for example an "N" in front of each imported key. This removes the need to worry about whether the primary key of the imported table is numeric or character.
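
The prefixing idea can be sketched in a few lines of Python. The function `merge_with_prefix` and the sample key values are hypothetical; the point is only that a character-type key leaves room for a letter that keeps imported keys apart from existing ones.

```python
def merge_with_prefix(existing_keys, imported_keys, prefix="N"):
    """Return the combined key list, prefixing each imported key to keep it distinct."""
    merged = list(existing_keys)
    seen = set(existing_keys)
    for key in imported_keys:
        new_key = prefix + str(key)  # works for numeric and character source keys alike
        if new_key in seen:
            raise ValueError(f"key collision after prefixing: {new_key}")
        seen.add(new_key)
        merged.append(new_key)
    return merged

existing = ["1001", "1002", "1003"]
imported = [1002, 1003, 1004]  # numeric keys from another system; two overlap
merged = merge_with_prefix(existing, imported)
```

Here the overlapping values 1002 and 1003 end up as "N1002" and "N1003", so they no longer clash with the existing keys.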

4. Should a business number be used as the primary key?

This is a common issue. The primary key design principle is that the primary key should not have any business meaning, and this principle is very important. Some people think that a business number is unique in itself and can therefore serve as the primary key, but this may cause trouble later, because a field with real-world meaning can always end up being modified. The biggest taboo for a primary key is changing its value, which can have serious and unpredictable consequences. A student ID, for example, is supposed never to change, yet the possibility still exists.

Another case is a field that looks unique but should actually allow repetition. An order number, for example, should be unique. But suppose an order has to be voided for some reason and its status is marked as "cancel"; an order with the same number can then be entered again. Although only one valid order carries that number, the database has to allow duplicates. Therefore, I recommend creating a primary key with no business meaning for the table, such as ID.
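
The voided-order case can be sketched with a meaningless `id` column next to the business `order_no`. This is a minimal illustration with Python's sqlite3 module; the table layout, the sample number "SO-001", and the partial unique index that allows only one valid order per number are assumptions added for the example, not part of the original text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (
        id       INTEGER PRIMARY KEY,           -- meaningless key, no business use
        order_no TEXT NOT NULL,                 -- business number, may repeat
        status   TEXT NOT NULL DEFAULT 'valid'
    );
    -- At most one valid order per order number; voided rows are not restricted.
    CREATE UNIQUE INDEX one_valid_order ON orders(order_no) WHERE status = 'valid';
""")

# Original order, later voided, then entered again under the same number.
conn.execute("INSERT INTO orders (id, order_no) VALUES (1, 'SO-001')")
conn.execute("UPDATE orders SET status = 'cancel' WHERE id = 1")
conn.execute("INSERT INTO orders (id, order_no) VALUES (2, 'SO-001')")

rows = conn.execute("SELECT id, order_no, status FROM orders ORDER BY id").fetchall()
```

Both rows share the order number, yet each is still addressed unambiguously by its `id`, so foreign keys in other tables never need to change.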

To sum up, in my own designs the primary key is a character type and auto-increment is not used. When a record is added, the system generates the primary key value, which generally consists entirely of digits. The rules for generating the value can be defined as needed. If there are no special requirements, you can define a field that stores the current value to keep keys unique: add one to it each time a key is generated, and save it back. This is faster than finding the maximum value in the table.
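
The stored-counter rule from the summary can be sketched as follows, again with Python's sqlite3 module. The table `key_counter`, the function `next_key`, and the zero-padded width of 8 digits are hypothetical choices made for this example; the explicit `BEGIN IMMEDIATE` is one way to take the write lock before the counter is touched, assuming SQLite as the database.

```python
import sqlite3

# isolation_level=None lets the code issue BEGIN/COMMIT itself.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE key_counter (table_name TEXT PRIMARY KEY, current_value INTEGER NOT NULL)"
)
conn.execute("INSERT INTO key_counter VALUES ('orders', 0)")

def next_key(db, table_name, width=8):
    """Add one to the stored counter, save it back, and return it as a digit string."""
    db.execute("BEGIN IMMEDIATE")  # take the write lock before touching the counter
    try:
        db.execute(
            "UPDATE key_counter SET current_value = current_value + 1 WHERE table_name = ?",
            (table_name,),
        )
        row = db.execute(
            "SELECT current_value FROM key_counter WHERE table_name = ?", (table_name,)
        ).fetchone()
        if row is None:
            raise KeyError(table_name)
        db.execute("COMMIT")
    except Exception:
        db.execute("ROLLBACK")
        raise
    return str(row[0]).zfill(width)  # fixed width keeps digit strings in numeric order

first = next_key(conn, "orders")
second = next_key(conn, "orders")
```

Padding to a fixed width is a design choice of this sketch: it keeps all-digit character keys sorting in the same order as the numbers they represent.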

