Thoughts on logical primary key, business primary key, and composite primary key

Source: Internet
Author: User
Tags sql server books

Reproduced from: http://blog.csdn.net/sunrise918/article/details/5575054

Over the past few days I have been thinking about logical primary keys, business primary keys, and composite primary keys, and I searched the Internet for related discussions. What follows is a summary based on SQL Server; other databases (Oracle, MySQL, DB2, ...) should be similar. These are only preliminary thoughts. If you spot any mistakes, please point them out so that I can reconsider and correct them.


Definitions (some are taken from SQL Server Books Online):

Primary key: A table usually has a column or combination of columns whose values uniquely identify each row in the table. Such a column or set of columns is called the primary key of the table and is used to enforce the entity integrity of the table.

Foreign key (FK): A column or combination of columns used to establish and enforce a link between the data in two tables. When the column or columns that hold one table's primary key values are referenced by a column or columns in another table, a link is created between the two tables, and the referencing column becomes a foreign key in the second table.

Clustered index: A clustered index sorts and stores the data rows of the table based on their key values. Each table can have only one clustered index, because the data rows themselves can be stored in only one order.

Non-clustered index: A non-clustered index contains the index key values and row locators that point to the storage location of the table data. You can create multiple non-clustered indexes on a table or indexed view. Non-clustered indexes are generally designed to improve the performance of frequently used queries that are not covered by the clustered index.

Auto-number column (identity column): For each table, you can create one identity column that contains sequence values generated by the system. These values uniquely identify each row in the table.

Business primary key (natural primary key): A field that carries business meaning is used as the primary key of the table. This is called a "natural primary key".

Logical primary key (surrogate primary key): A field that has nothing to do with the business information in the table is used as its primary key. This is called a "surrogate primary key".

Composite primary key (joint primary key): A combination of two or more fields is used as the primary key.


Principle Analysis:

The main reason for using a logical primary key is that once a business primary key changes, everything in the system that is associated with that key must be modified, and the more places that reference it, the larger the change. With a logical primary key, only the business logic tied to the business key itself needs to change, which limits the impact. Changes in business logic are inevitable, because "the only constant is change": no company stays the same, and no business stays the same forever. Typical examples are the upgrade of ID card numbers and the replacement of driver's license numbers with ID card numbers. In addition, duplicate ID card numbers do exist in practice, which is hard to handle if the ID card number is the primary key. Of course, there are many ways to cope with change. One of them is to build a new system to keep pace with the times, which is indeed good business for software companies.

Another reason for using a logical primary key is that a business primary key may be too large to transmit, process, and store efficiently. I think a logical primary key should be considered once the business primary key exceeds 8 bytes: int is 4 bytes and bigint is 8 bytes, while a business primary key is usually a string. Even at the same size, an 8-byte bigint is more efficient to transmit and process than an 8-byte string. Imagine how the comparisons code = "12345678" and id = 12345678 differ at the assembly level. Of course, a logical primary key is not necessarily an int or bigint, a business primary key is not necessarily a string (it may be an int or a datetime), and the data being transmitted is not necessarily the primary key. Each case needs its own analysis, but the principle is similar, and only the general situation is discussed here. If other tables reference this primary key, they must store it as well, so their storage overhead differs too. The referencing columns are usually foreign keys, or are indexed to speed up lookups, which adds further storage overhead that also needs case-by-case analysis.
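To make the size argument concrete, here is a minimal Python sketch that compares the raw byte size of an integer key with that of a string key. It uses the struct module only as a rough stand-in for fixed-width integer storage. The 18-character code and the row count are made-up assumptions, and real SQL Server rows carry extra overhead that is not modeled here.

```python
import struct

# Hypothetical keys: a surrogate integer and an 18-character business code.
surrogate_id = 12345678
business_code = "EMP-2010-000012345"  # made-up identifier, 18 characters

int_bytes = len(struct.pack("<i", surrogate_id))     # 4-byte signed integer
bigint_bytes = len(struct.pack("<q", surrogate_id))  # 8-byte signed integer
code_bytes = len(business_code.encode("ascii"))      # 1 byte per character

# Every referencing row stores the key again, so the gap grows with row count.
referencing_rows = 1_000_000  # assumed number of rows in referencing tables
extra_bytes = (code_bytes - bigint_bytes) * referencing_rows

print(int_bytes, bigint_bytes, code_bytes, extra_bytes)
```

Under these assumptions the string key costs 10 more bytes than a bigint in every referencing row, before any index on the referencing column is counted.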

Another reason for using a logical primary key is that join queries on an int or bigint foreign key are faster than join queries on a string foreign key. The principle is the same as above.

Another reason for using a logical primary key is that users or maintenance staff may enter a business primary key incorrectly. For example, suppose RMB is mistakenly entered as rxb. With a business primary key, every reference in other tables stores the wrong value, and correcting it is troublesome. With a logical primary key, only the one row that holds the business value needs to be corrected. Cascading updates can also solve the problem, but not every reference can be cascaded.
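The RMB/rxb case can be sketched as follows. This is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for SQL Server, and the currency and payment tables and their columns are hypothetical names chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE currency (
        currency_id INTEGER PRIMARY KEY,   -- logical (surrogate) key
        code        TEXT NOT NULL UNIQUE   -- business key, entered by users
    );
    CREATE TABLE payment (
        payment_id  INTEGER PRIMARY KEY,
        currency_id INTEGER NOT NULL REFERENCES currency(currency_id),
        amount      INTEGER NOT NULL
    );
    INSERT INTO currency (currency_id, code) VALUES (1, 'rxb');  -- typo for RMB
    INSERT INTO payment (currency_id, amount) VALUES (1, 100), (1, 250);
""")

# Correcting the typo touches exactly one row; payment rows are not rewritten.
cur = conn.execute("UPDATE currency SET code = 'RMB' WHERE code = 'rxb'")
rows_changed = cur.rowcount

# Every payment now resolves to the corrected code through the unchanged id.
codes = [row[0] for row in conn.execute(
    "SELECT c.code FROM payment AS p "
    "JOIN currency AS c ON c.currency_id = p.currency_id "
    "ORDER BY p.payment_id")]
print(rows_changed, codes)
```

Had payment stored the code itself as its foreign key, the same correction would have required rewriting every payment row or relying on a cascading update.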

The main reason for using a business primary key is that a logical primary key adds a field that has nothing to do with the business, while users usually search by business fields (such as an employee ID or a book's ISBN). Besides the index on the logical primary key, an index must then also be added on those business fields, which slows down data modification and increases storage overhead. For basic data whose business meaning rarely changes, a business primary key is therefore a good choice. On the other hand, basic data sees relatively few inserts, deletes, and updates, so the cost of the extra index is small. If you are worried about future changes in the business logic, you can still consider a logical primary key. Each case needs its own analysis.

Another reason given for using a business primary key is that users operate through business fields, so a logical primary key would require extra mapping and conversion. I think this worry is unnecessary. You can query by the business key directly to get the result, and you need not deal with the logical primary key unless the business key is not unique. Moreover, if the logical primary key is chosen at design time, the code uses it as the primary key throughout: the same key is transmitted, processed, and stored across the system, so there is nothing to convert. Conversion only arises when an existing system built on business primary keys must be changed to logical primary keys. I have not thought of any other scenario that requires conversion.

Another reason for using a business primary key is that in a banking system, safety matters more than performance. A business primary key serves as both the primary key and redundant data, which avoids the loss of association that a logical primary key can cause. If the association between a master table and its detail tables is lost for some reason, the bank faces irreparable losses. To guard against this, the business key must be stored redundantly in important tables, and the simplest way to do so is to use the business key as the primary key directly, for example the ID card number, passbook number, or card number. Banking systems therefore usually require business primary keys, not for performance but for safety.

The main reason for using a composite primary key is related to the use of a business primary key: a single field is often not enough to serve as the business key. For example, if the name field alone is not unique, a birthday field is added. A composite primary key is inefficient, for much the same reasons as a large business primary key. In addition, any table that needs to reference this table must include every field of the composite key. This is a storage problem as well as a performance problem. You could regard it as reasonable redundancy that makes queries more convenient, but the gain does not seem worth the cost.

Another reason for using a composite primary key arises with relational tables. A relational table must reference the primary keys of two entity tables to represent the relationship between them, and those two keys can be combined into a composite primary key. If two entities can be related more than once, a sequence field can be added to the composite key. However, this brings back the disadvantages of a business primary key. Alternatively, you can add a logical primary key to the relational table, which avoids those disadvantages and makes the table easier for other tables to reference.
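A relational table keyed by two entity keys plus a sequence field might look like the sketch below. It again uses Python's sqlite3 module in place of SQL Server, and the student, course, and enrollment tables are hypothetical examples rather than anything from a real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE course  (course_id  INTEGER PRIMARY KEY, title TEXT NOT NULL);
    -- Relational table: the two entity keys plus a sequence number form the
    -- key, so the same pair of entities can be related more than once.
    CREATE TABLE enrollment (
        student_id INTEGER NOT NULL REFERENCES student(student_id),
        course_id  INTEGER NOT NULL REFERENCES course(course_id),
        seq        INTEGER NOT NULL,
        PRIMARY KEY (student_id, course_id, seq)
    );
    INSERT INTO student VALUES (1, 'Li');
    INSERT INTO course  VALUES (10, 'Databases');
    INSERT INTO enrollment VALUES (1, 10, 1);
    INSERT INTO enrollment VALUES (1, 10, 2);  -- second relationship, new seq
""")

try:
    conn.execute("INSERT INTO enrollment VALUES (1, 10, 2)")  # exact duplicate
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

enrollment_count = conn.execute("SELECT COUNT(*) FROM enrollment").fetchone()[0]
print(duplicate_rejected, enrollment_count)
```

Any table that later needs to reference a row of enrollment would have to carry all three columns, which is the cost that a separate logical primary key on the relational table avoids.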

In general, most people on the Internet favor logical primary keys, and probably few would support composite primary keys for entity tables. Supporters of business primary keys often mistakenly assume that a logical primary key must be meaningful to users. In fact, a logical primary key is used only inside the system and does not need to be known to users.


Conclusions or inferences:

1. Avoid business primary keys where possible and use logical primary keys instead.

2. If you do use a business primary key, make sure that the probability of change in the business logic behind it is zero, that the key is not too large, and that users cannot modify it.

3. Apart from relational tables, try not to use composite primary keys.


Best practices for using logical primary keys:

1. Choose a type that is just large enough. Assume that the system lifecycle is at most 100 years and pick the data type of the logical primary key according to the following table. If you are not sure, use int.

Data volume       Data type         Data size  Generation frequency                        Remarks
< 256             tinyint           1 byte     about 2.5 entries/year                      Low frequency, unreliable, not recommended
< 32,768          smallint          2 bytes    27 entries/month                            Low frequency, use with caution
< 2.1 billion     int               4 bytes    40 entries/minute                           Meets most situations
< 9.22 x 10^18    bigint            8 bytes    2.92 million entries/millisecond            Meets the vast majority of situations
>= 9.22 x 10^18   uniqueidentifier  16 bytes   effectively unlimited (2^128 possible values)  Suitable for distributed and highly concurrent applications
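The generation frequencies for the integer types can be checked with a few lines of arithmetic. The sketch below assumes a 100-year lifetime and an average year of 365.25 days, and takes tinyint as 0 to 255, which is its range in SQL Server.

```python
# Largest value of each SQL Server integer type.
max_values = {
    "tinyint": 2**8 - 1,     # 0..255 (tinyint is unsigned in SQL Server)
    "smallint": 2**15 - 1,
    "int": 2**31 - 1,
    "bigint": 2**63 - 1,
}

LIFETIME_YEARS = 100      # system lifetime assumed for the estimate
DAYS_PER_YEAR = 365.25    # assumption: average year length

months = LIFETIME_YEARS * 12
minutes = LIFETIME_YEARS * DAYS_PER_YEAR * 24 * 60
milliseconds = minutes * 60 * 1000

# Sustained insert rate at which each type would be exhausted after 100 years.
tinyint_per_year = max_values["tinyint"] / LIFETIME_YEARS
smallint_per_month = max_values["smallint"] / months
int_per_minute = max_values["int"] / minutes
bigint_per_ms = max_values["bigint"] / milliseconds

print(round(tinyint_per_year, 2), round(smallint_per_month, 1),
      round(int_per_minute, 1), round(bigint_per_ms / 1e6, 2))
```

The results come out at roughly 27 entries per month for smallint, 40 per minute for int, and 2.92 million per millisecond for bigint.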

2. Keys are generally generated with an auto-increment (identity) column or with the newid() function.

3. Primary key fields are generally named "table name + ID", which makes them easy to recognize and convenient for table joins.

4. If the table is used in a distributed application, consider auto-increment columns with different start values and the same step size. For example, with three databases deployed in different places, you can design them as follows:

Start value  Step size
1            10
2            10
3            10

The step size is set to 10 to leave room for future expansion. This keeps primary keys unique across the databases and makes merging the data easy.
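The effect of distinct start values with a shared step can be simulated in plain Python. This is a sketch of the numbering scheme only, not of SQL Server's identity mechanism, and identity_values is a hypothetical helper written for this example.

```python
from itertools import count, islice

STEP = 10  # shared step size

def identity_values(seed, step=STEP):
    """Yield seed, seed + step, seed + 2 * step, ... without end."""
    return count(seed, step)

# Three hypothetical databases with start values 1, 2 and 3.
generated = {seed: list(islice(identity_values(seed), 5)) for seed in (1, 2, 3)}

# Merge the keys from all three databases and check for collisions.
merged = [key for keys in generated.values() for key in keys]
no_collisions = len(merged) == len(set(merged))

# A fourth site added later can take start value 4 without touching the others.
fourth = list(islice(identity_values(4), 5))
still_unique = not set(fourth) & set(merged)

print(generated[1], no_collisions, still_unique)
```

With a step of 10, up to ten sites can each own one residue class of key values, so rows merged from any of them keep distinct keys.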

5. If you need high concurrency or need to migrate table data, you can use the uniqueidentifier type with the newid() function.

6. Consider creating a unique index on the business primary key to enforce its uniqueness as the business requires.

7. If queries on the business primary key are performance-critical, you can create the clustered index on the business primary key and give the logical primary key only a primary key constraint backed by a non-clustered index.

8. A composite primary key may be used for a relational table, but not for an entity table.
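Practices 3 and 6 can be combined in one small sketch: a logical primary key named after the table, plus a unique index on the business key. Python's sqlite3 module stands in for SQL Server here, and the book table, the index name, and the ISBN value are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE book (
        book_id INTEGER PRIMARY KEY,  -- logical primary key, "table name + ID"
        isbn    TEXT NOT NULL,        -- business key
        title   TEXT NOT NULL
    );
    CREATE UNIQUE INDEX ux_book_isbn ON book (isbn);
    INSERT INTO book (isbn, title) VALUES ('978-0-00-000000-2', 'First title');
""")

# The unique index rejects a second row with the same business key.
try:
    conn.execute(
        "INSERT INTO book (isbn, title) VALUES ('978-0-00-000000-2', 'Copy')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Users search by the business key; the logical key comes back for internal use.
book_id = conn.execute(
    "SELECT book_id FROM book WHERE isbn = ?", ("978-0-00-000000-2",)
).fetchone()[0]
print(duplicate_rejected, book_id)
```

The business key stays unique and searchable, while other tables reference only book_id.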
