Introduction to primary keys and foreign keys in SQL databases

Source: Internet
Author: User

I. What are primary keys and foreign keys?

A record in a relational database has several attributes. If an attribute group (note: a group, which may contain one attribute or several) uniquely identifies a record, that group can serve as a primary key:

Student table (student number, name, gender, class)

Each student's student number is unique, so the student number is a primary key.

User table (user name, password, logon level)

Each user name is unique, so the user name is a primary key.

Lab usage record table (card number, student number, name, serial number)

No single attribute in the record table uniquely identifies a record, but the combination of student number and name does, so the attribute group (student number, name) is a primary key.

The student number in the record table is not, by itself, the primary key of the record table, but it corresponds to the student number in the student table, which is the primary key of the student table. The student number in the record table is therefore a foreign key that references the student table.

Primary and foreign keys are defined mainly to maintain the integrity of a relational database. To summarize:

A primary key is the unique identifier of a record. For example, a record may contain an ID number, a name, and an age. Only the ID number uniquely identifies a person; the other attributes may be duplicated, so the ID number is the primary key.

A foreign key associates one table with another. It is a field that identifies a record in the other table, and it is used to maintain data consistency. For example, if a field in table A is the primary key of table B, that field can be a foreign key of table A.
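The two definitions above can be sketched with a small runnable example. It uses Python's built-in sqlite3 module rather than SQL Server, and the table and column names (student, lab_record, student_no, and so on) are illustrative assumptions based on the tables described earlier, not names taken from a real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE student (
    student_no TEXT PRIMARY KEY,   -- unique for every student
    name       TEXT NOT NULL,
    gender     TEXT,
    class      TEXT
);
CREATE TABLE lab_record (
    card_no    TEXT NOT NULL,
    student_no TEXT NOT NULL REFERENCES student(student_no),  -- foreign key
    name       TEXT NOT NULL,
    serial_no  INTEGER,
    PRIMARY KEY (student_no, name)  -- the attribute group named in the text
);
""")

conn.execute("INSERT INTO student VALUES ('S001', 'Alice', 'F', 'Class 1')")
conn.execute("INSERT INTO lab_record VALUES ('C100', 'S001', 'Alice', 1)")


def try_insert(sql):
    """Run an INSERT and report whether a key constraint rejected it."""
    try:
        conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


# A second student with the same student_no violates the primary key.
duplicate_pk = try_insert("INSERT INTO student VALUES ('S001', 'Bob', 'M', 'Class 2')")
# A record pointing at a student that does not exist violates the foreign key.
orphan_fk = try_insert("INSERT INTO lab_record VALUES ('C101', 'S999', 'Carol', 2)")

print(duplicate_pk)
print(orphan_fk)
```

Both inserts are rejected: the first because the primary key must be unique, the second because the foreign key must match an existing row in the referenced table.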

II. Differences between primary keys, foreign keys, and indexes

What is the difference between a primary key, a foreign key, and an index?

Definition:
• Primary key: uniquely identifies a record; values cannot be duplicated and cannot be null.
• Foreign key: a column of one table that is the primary key of another table; values can be duplicated and can be null.
• Unique index: the field has no duplicate values, but it can hold a null value.

Purpose:
• Primary key: ensures data integrity.
• Foreign key: establishes relationships with other tables.
• Index: speeds up queries and sorting.

Number allowed:
• Primary key: a table can have only one.
• Foreign key: a table can have multiple foreign keys.
• Unique index: a table can have multiple unique indexes.

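What is the difference between a clustered index and a nonclustered index? A table can have only one clustered index. A unique index is not necessarily a clustered index, and in SQL Server a clustered index is not strictly required to be unique either, although the clustered index created by default for a primary key is unique.

In a clustered index, the data rows are stored directly in the leaf pages of the index. In a nonclustered index, the index pages hold only the index keys together with pointers to the rows, which are stored in separate data pages.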

III. Design principles for primary keys and foreign keys

Primary keys and foreign keys are the glue that binds multiple tables into an effective relational database. Their design has a decisive impact on the performance and availability of the physical database. The database schema must be converted from a theoretical logical design into an actual physical design, and the structure of the primary and foreign keys is the crux of that process. Once the database is in production these keys are difficult to change, so it is both necessary and worthwhile to design them carefully during development.

Primary key:

A relational database relies on primary keys; they are the cornerstone of the physical schema. At the physical level a primary key has only two uses:

1. Uniquely identify a row.

2. Serve as an object that foreign keys can reference effectively.

Based on these two uses, here are some of the principles I follow when designing primary keys at the physical level:

1. The primary key should be meaningless to the user. If a user looks at the data in a join table that represents a many-to-many relationship and complains that it is useless, that proves its primary key is well designed.

2. The primary key should be a single column, to improve the efficiency of join and filter operations.

Note: People who use composite keys usually justify themselves with two reasons, both of which are wrong. The first is that the primary key should carry real-world meaning; however, giving the primary key meaning only makes it easier for people to corrupt the database. The second is that, in a join table describing a many-to-many relationship, the two foreign keys can be used together as the primary key. I oppose this practice as well, because composite primary keys often lead to degenerate foreign keys: when the join table becomes the parent of another child table, its composite key becomes part of that child table's primary key, and that child table may in turn become the parent of yet another table and pass its key along again. The further down the chain a table sits, the more columns its primary key contains.

3. Never update the primary key. Because the primary key has no purpose other than uniquely identifying a row, there is no reason to update it. If the primary key needs to be updated, the principle that it should be meaningless to the user has been violated.

Note: This principle does not apply to data that must be reorganized during data conversion or when merging multiple databases.

4. The primary key should not contain dynamically changing data such as timestamps, creation-time columns, or modification-time columns.

5. The primary key should be generated automatically by the computer. If a person intervenes in creating a primary key, the key takes on a meaning beyond identifying a single row. Once that line is crossed, there may be a motive to modify the primary key, and the key mechanism the system uses to link and manage rows falls into the hands of people who do not understand database design.
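As a rough illustration of principles 1, 2, and 5, the sketch below gives a many-to-many join table a single generated key column and uses a separate uniqueness constraint to forbid duplicate pairs. It uses Python's sqlite3 module; the student, course, enrollment, and grade tables are hypothetical names chosen for the example, not part of the original text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE student (student_id INTEGER PRIMARY KEY, name  TEXT NOT NULL);
CREATE TABLE course  (course_id  INTEGER PRIMARY KEY, title TEXT NOT NULL);

-- Join table for the many-to-many relationship: one generated, meaningless
-- key column, plus a UNIQUE constraint that still forbids duplicate pairs.
CREATE TABLE enrollment (
    enrollment_id INTEGER PRIMARY KEY,
    student_id    INTEGER NOT NULL REFERENCES student(student_id),
    course_id     INTEGER NOT NULL REFERENCES course(course_id),
    UNIQUE (student_id, course_id)
);

-- A child of the join table references a single column, so keys do not
-- get wider as the chain of tables grows.
CREATE TABLE grade (
    grade_id      INTEGER PRIMARY KEY,
    enrollment_id INTEGER NOT NULL REFERENCES enrollment(enrollment_id),
    score         REAL NOT NULL
);
""")

student_id = conn.execute("INSERT INTO student (name) VALUES ('Alice')").lastrowid
course_id = conn.execute("INSERT INTO course (title) VALUES ('Databases')").lastrowid
enrollment_id = conn.execute(
    "INSERT INTO enrollment (student_id, course_id) VALUES (?, ?)",
    (student_id, course_id),
).lastrowid
conn.execute(
    "INSERT INTO grade (enrollment_id, score) VALUES (?, ?)", (enrollment_id, 91.5)
)
conn.commit()

# List the columns of grade that act as foreign keys (column 3 is "from").
grade_fk_columns = [row[3] for row in conn.execute("PRAGMA foreign_key_list(grade)")]
print(grade_fk_columns)
```

With a composite key on enrollment, the grade table would have had to carry both student_id and course_id; with the single-column key it carries only enrollment_id.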

IV. Database primary key selection strategies

When we build a database, we need to specify a primary key for each table: an attribute or attribute group that uniquely identifies a row in the table. A table can have only one primary key, but it can have multiple candidate indexes. Because a primary key uniquely identifies a row, it ensures that updates and deletions do not hit the wrong record. Other fields can also help us resolve sharing conflicts during these operations, but that is not discussed here. Beyond this, the primary key often forms a referential integrity constraint together with foreign keys to prevent inconsistent data. The primary key therefore plays a very important role in database design.

Common primary key selection strategies are:

• Auto-increment field

• Manually incremented field

• UniqueIdentifier

• "COMB (Combine)" type

1. Auto-increment field

Many database designers prefer auto-increment fields because they are simple to use. With an auto-increment field we can add data without supplying a primary key value; when the record is inserted, the database system automatically assigns a value and guarantees that it is never duplicated. With SQL Server, we can also read the @@IDENTITY global variable after inserting the record to obtain the primary key value the system assigned.

Although the auto-increment field saves a lot of tedious work, it has a potential problem: in buffered (disconnected) data mode it is difficult to fill in primary and foreign key values in advance.

Suppose there are two tables:

Order (OrderID, OrderDate)

OrderDetail (OrderID, LineNum, ProductID, Price)

The OrderID in the Order table is an auto-increment field. Suppose we need to enter an order, which means inserting one record into the Order table and several records into the OrderDetail table. Because OrderID in the Order table is an auto-increment field, we cannot know its value before the record is actually inserted into the database; only after the update do we learn what value the database assigned. This leads to the following contradiction:

First, to put the correct value into the OrderID field of OrderDetail, we must update the Order table first to obtain the OrderID assigned by the system, then fill the OrderDetail rows with that OrderID, and finally update the OrderDetail table. However, to guarantee data consistency, Order and OrderDetail must be updated together under transaction protection, so that both tables succeed or fail together. In buffered mode these two requirements clearly conflict.
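The ordering constraint described above can be sketched as follows. This is only an illustration under stated assumptions: it uses Python's sqlite3 module in connected mode, where the parent insert and the child inserts can share one transaction, and SQLite's lastrowid plays the role that @@IDENTITY plays in SQL Server. The save_order helper and the sample values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE "Order" (
    OrderID   INTEGER PRIMARY KEY AUTOINCREMENT,
    OrderDate TEXT NOT NULL
);
CREATE TABLE OrderDetail (
    OrderID   INTEGER NOT NULL REFERENCES "Order"(OrderID),
    LineNum   INTEGER NOT NULL,
    ProductID INTEGER NOT NULL,
    Price     REAL NOT NULL,
    PRIMARY KEY (OrderID, LineNum)  -- assumed key for the example
);
""")


def save_order(conn, order_date, lines):
    """Insert one order and its detail lines in a single transaction."""
    with conn:  # commits on success, rolls back if any statement raises
        cur = conn.execute('INSERT INTO "Order" (OrderDate) VALUES (?)', (order_date,))
        order_id = cur.lastrowid  # known only after the parent row is inserted
        conn.executemany(
            "INSERT INTO OrderDetail (OrderID, LineNum, ProductID, Price) "
            "VALUES (?, ?, ?, ?)",
            [
                (order_id, line_num, product_id, price)
                for line_num, (product_id, price) in enumerate(lines, start=1)
            ],
        )
    return order_id


order_id = save_order(conn, "2024-01-15", [(501, 9.99), (502, 24.50)])
detail_count = conn.execute(
    "SELECT COUNT(*) FROM OrderDetail WHERE OrderID = ?", (order_id,)
).fetchone()[0]
print(order_id, detail_count)
```

The detail rows cannot be prepared completely in advance: their OrderID is available only after the parent insert has run, which is exactly what makes pre-filling keys in a disconnected buffer awkward.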

In addition, when we need to replicate data between multiple databases (SQL Server's data publication and subscription mechanism allows us to replicate data between databases), auto-increment fields may cause primary key conflicts when the data is merged. Imagine the Order table in one database being replicated into the Order table of another database: should OrderID auto-increment or not?
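A minimal sketch of that conflict, assuming two independent SQLite databases stand in for the two sites (the make_db helper and the site_a/site_b names are hypothetical). Each site hands out its own auto-increment values, so both start at the same number.

```python
import sqlite3


def make_db():
    """Create an independent in-memory database with an auto-increment Order table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'CREATE TABLE "Order" ('
        "OrderID INTEGER PRIMARY KEY AUTOINCREMENT, OrderDate TEXT NOT NULL)"
    )
    return conn


site_a, site_b = make_db(), make_db()
site_a.execute('INSERT INTO "Order" (OrderDate) VALUES (?)', ("2024-01-15",))
site_b.execute('INSERT INTO "Order" (OrderDate) VALUES (?)', ("2024-01-16",))

# Copy site B's rows into site A while keeping their original key values.
rows_from_b = site_b.execute('SELECT OrderID, OrderDate FROM "Order"').fetchall()
try:
    site_a.executemany(
        'INSERT INTO "Order" (OrderID, OrderDate) VALUES (?, ?)', rows_from_b
    )
    merge_result = "merged"
except sqlite3.IntegrityError:
    merge_result = "primary key conflict"

print(merge_result)
```

Both sites assigned the same first OrderID to different orders, so the merge fails unless the keys are renumbered, which would in turn break any detail rows that reference them.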

ADO.NET allows us to mark a field in a DataSet as an auto-increment field, but remember that this auto-increment value is only a placeholder: when the database is updated, the value generated by the database automatically replaces the value assigned by ADO.NET. To avoid confusing users, it is therefore recommended to set both the auto-increment seed and the increment in ADO.NET to -1, so that placeholder values are negative and cannot be mistaken for keys assigned by the database. In ADO.NET we can also create a DataRelation between the two tables. When tables in a cascading relationship are updated, a change to the key in one table is then propagated automatically to the corresponding key in the other table, which greatly reduces the trouble auto-increment fields cause when updating two related tables.

2. Manually incremented field

Since auto-increment fields can cause this much trouble, we might consider a manually incremented field instead. This means we maintain the primary key value ourselves, usually by creating a separate table that stores the current key value. Continuing the example above, we create a new table called IntKey with two fields, KeyName and KeyValue. It works like a Hashtable: given a KeyName, we can look up the current KeyValue and then increment it ourselves. In SQL Server you can write a stored procedure like the following so that obtaining a key value happens automatically:

CREATE PROCEDURE [GetKey]
    @KeyName char(10),
    @KeyValue int OUTPUT
AS
-- Increment the stored value and return the new value in one atomic UPDATE
UPDATE IntKey SET @KeyValue = KeyValue = KeyValue + 1
WHERE KeyName = @KeyName
GO

By calling this stored procedure we obtain the latest key value and ensure that no duplicates occur. If OrderID is set up as a manually incremented field, our program can proceed as follows: first call the stored procedure to obtain an OrderID, then use that OrderID to fill both the Order table and the OrderDetail table, and finally update the two tables under transaction protection.
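The key-table idea can be sketched outside SQL Server as well. The example below is an assumption-laden illustration using Python's sqlite3 module: the get_key function is a hypothetical counterpart of the stored procedure, and because SQLite has no equivalent of the combined assignment in the UPDATE above, it runs an UPDATE followed by a SELECT inside one transaction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE IntKey (KeyName TEXT PRIMARY KEY, KeyValue INTEGER NOT NULL)")
conn.execute("INSERT INTO IntKey VALUES ('OrderID', 0)")
conn.commit()


def get_key(conn, key_name):
    """Increment the counter stored for key_name and return the new value."""
    with conn:  # the UPDATE and the SELECT run in one transaction
        cur = conn.execute(
            "UPDATE IntKey SET KeyValue = KeyValue + 1 WHERE KeyName = ?", (key_name,)
        )
        if cur.rowcount != 1:
            raise KeyError(key_name)  # no counter registered under this name
        row = conn.execute(
            "SELECT KeyValue FROM IntKey WHERE KeyName = ?", (key_name,)
        ).fetchone()
    return row[0]


first_id = get_key(conn, "OrderID")
second_id = get_key(conn, "OrderID")
print(first_id, second_id)
```

Because the key is known before any order row is written, the Order row and its OrderDetail rows can be prepared together and then saved in a single transaction.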

When replicating data between databases, using a manually incremented field as the primary key ensures that no key conflicts occur during the merge, as long as we assign a different range of key values to each database. However, a manually incremented field adds network round trips: we need an extra database access to obtain the current key value, which increases the load on the network and the database, and in a slow or disconnected network environment this has serious drawbacks. Maintaining the key manually also means dealing with concurrency conflicts and similar issues, which adds to the complexity of the system.

3. Using UniqueIdentifier

SQL Server provides the uniqueidentifier data type together with a generation function, NEWID(); calling NEWID() produces a unique uniqueidentifier value. A uniqueidentifier occupies 16 bytes in the database, and the probability of a duplicate is so small that it can be regarded as zero. The brace-enclosed hexadecimal strings we often see in the Windows registry are in fact uniqueidentifier values; Windows uses them to identify COM components and interfaces and so prevent duplication. In .NET a uniqueidentifier is called a GUID (Globally Unique Identifier). In C#, you can generate a GUID with the following statement:

Guid u = System.Guid.NewGuid();

For the Order and OrderDetail program above, choosing uniqueidentifier as the primary key avoids the extra network round trip mentioned earlier. The program generates GUIDs directly to fill in the primary key, without having to worry about duplicates.
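A rough sketch of this approach, assuming Python's uuid.uuid4() as a stand-in for NEWID() or Guid.NewGuid() and SQLite as the database (the sample values are made up). The key is generated on the client, so the parent row and its child rows can be fully prepared before anything is sent to the database.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE "Order" (OrderID TEXT PRIMARY KEY, OrderDate TEXT NOT NULL);
CREATE TABLE OrderDetail (
    OrderID   TEXT NOT NULL REFERENCES "Order"(OrderID),
    LineNum   INTEGER NOT NULL,
    ProductID INTEGER NOT NULL,
    Price     REAL NOT NULL,
    PRIMARY KEY (OrderID, LineNum)  -- assumed key for the example
);
""")

# The key exists before any database access, so every row can be built up front.
order_id = str(uuid.uuid4())
order_row = (order_id, "2024-01-15")
detail_rows = [(order_id, 1, 501, 9.99), (order_id, 2, 502, 24.50)]

with conn:  # both tables are written in one transaction
    conn.execute('INSERT INTO "Order" VALUES (?, ?)', order_row)
    conn.executemany("INSERT INTO OrderDetail VALUES (?, ?, ?, ?)", detail_rows)

saved_lines = conn.execute(
    "SELECT COUNT(*) FROM OrderDetail WHERE OrderID = ?", (order_id,)
).fetchone()[0]
print(order_id, saved_lines)
```

No round trip is needed to learn the key, which is the advantage over both the auto-increment and the manually incremented strategies.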

The uniqueidentifier field also has serious drawbacks. First, it is 16 bytes long, four times the size of an integer, and consumes a lot of storage space. More seriously, uniqueidentifier values are generated in no particular order, so building an index on them (most databases index the primary key) is a very time-consuming operation. Experiments inserting the same amount of data have found uniqueidentifier keys to be slower than integer keys, so for efficiency reasons, avoid using uniqueidentifier values as primary keys where possible.

4. Using the "COMB (Combine)" type

Since each of the three primary key strategies above has its own shortcomings, is there a better way? The answer is yes. By using the COMB type (databases have no built-in COMB type; it was designed by Jimmy Nilsson in his article "The Cost of GUIDs as Primary Keys"), you can find a good balance among the three.

The basic design of the COMB data type is as follows. Since uniqueidentifier values lack order, which makes indexing inefficient and hurts system performance, we can combine two things: keep the first 10 bytes of the uniqueidentifier and use the last 6 bytes to represent the time (DateTime) at which the GUID was generated. Combining time information with the uniqueidentifier preserves its uniqueness while adding order, which improves indexing efficiency. Some may worry that reducing the uniqueidentifier to 10 bytes will lead to duplicates. There is no need to worry: the time precision of the last 6 bytes reaches 1/300 of a second, and the probability that two COMB values are generated within the same 1/300 of a second and also share identical first 10 bytes is almost nil. In SQL Server, this idea can be implemented with the following SQL statements:


DECLARE @aGuid uniqueidentifier
SET @aGuid = CAST(CAST(NEWID() AS binary(10))
    + CAST(GETDATE() AS binary(6)) AS uniqueidentifier)
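The same layout can be sketched on the client side. The function below is a hypothetical illustration in Python, not Nilsson's implementation: it keeps 10 random bytes from a GUID and appends a 6-byte big-endian millisecond timestamp, which is an assumed encoding that differs from the bytes SQL Server produces for GETDATE().

```python
import time
import uuid


def new_comb(timestamp_ms=None):
    """Build a COMB-style value: 10 random bytes followed by a 6-byte timestamp."""
    if timestamp_ms is None:
        timestamp_ms = time.time_ns() // 1_000_000  # current time in milliseconds
    random_part = uuid.uuid4().bytes[:10]
    # Keep the low 48 bits so the timestamp always fits in 6 bytes.
    time_part = (timestamp_ms % (1 << 48)).to_bytes(6, "big")
    return uuid.UUID(bytes=random_part + time_part)


# Fixed timestamps make the ordering of the trailing bytes easy to inspect.
earlier = new_comb(1_700_000_000_000)
later = new_comb(1_700_000_000_001)

print(earlier)
print(later)
print(earlier.bytes[10:] < later.bytes[10:])
```

The random part keeps values from different clients distinct, while the trailing time bytes grow with the clock, so values generated later carry larger trailing bytes and new keys tend to cluster at one end of an index that orders on those bytes.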

In tests, COMB keys were still slower than int keys for search, insert, update, and delete operations, but faster than uniqueidentifier keys.

The above is a brief introduction to primary keys and foreign keys in SQL databases. If it contains any errors, please bear with us!

