Database Normalization and Design techniques

Database Normalization and Design techniques _ database Other

Last Update:2017-01-18 Source: Internet

Author: User

Tags oracle database

Developer on Alibaba Coud: Build your first app with APIs, SDKs, and tutorials on the Alibaba Cloud. Read more ＞

In the design of dynamic website, the importance of database design is self-evident. If the design is improper, the query is very difficult, the performance of the program will be affected. Whether you're using MySQL or an Oracle database, you can make your PHP code more readable and extensible by formalizing the table design, which also improves the performance of your application.
Simply put, normalization is the elimination of redundancy and uncoordinated dependencies when designing tables. In this article, I'll go through five incremental processes to tell you the regularization techniques you should know in your design. Thus, a feasible and efficient database is established. This article also gives a detailed analysis of the types of relationships available.

This assumes that we want to create a table of user information that stores the user's name, company, company address, and some personal favorites or URLs. At the beginning, you might define a table structure as follows:

0 State Form

Users
Name company Company_address URL1 URL2
Joe ABC 1 Work Lane abc.com xyz.com
Jill XYZ 1 Job Street abc.com xyz.com

Since no normalization has been done, we refer to this form of table as a table in 0 State form. Notice where the URL1 and URL2 fields---if we need a third URL in the application? So you need to add one more column to the table, which is obviously not a good idea. If you want to create an extensible system, consider using the first formalized form and applying it to the table.

First Level regularization form

1. Eliminate repeated groups in each table
2. Create a separate form for each set of related data
3. Use a primary key to identify each set of related data

The above table clearly violates the first rule above, then what does the third key mean? Quite simply, it simply adds a unique, automatically incremented integer value to each record. With this value, you can distinguish between records with the same two names. By applying the first level normalization form, we get the following table:

Users
UserId name company company_address URL
1 Joe ABC 1 Work Lane abc.com
1 Joe ABC 1 Work Lane xyz.com
2 Jill XYZ 1 Job Street abc.com
2 Jill XYZ 1 Job Street xyz.com

Now our table can be said to be in the form of the first level of normalization, it has solved the restrictions of the URL field, but this process has brought a new problem. Every time we insert a record in the user table, we have to repeat all the company and user data. This not only makes the database larger than it was before, but it is prone to error. It is therefore subject to a second level of regularization.
Second level regularization form

1. To create a separate table for fields that apply to more than one record
2. To associate the values of these tables with a foreign key

We put the value of the URL in a separate table so that we can add more data at a later time without having to worry about producing duplicate values. We also associate these fields with primary key values:

Users
UserId Name Company Company_address
1 Joe ABC 1 Work Lane
2 Jill XYZ 1 Job Street

URLs
Urlid Reluserid URL
1 1 abc.com
2 1 xyz.com
3 2 abc.com
4 2 xyz.com

As shown above, we have created a separate table in which the primary key userid in the users table is now associated with the foreign key reluserid in the URL table. It seems that the situation has improved markedly. But what if we have to include an employee record for ABC company? Or more, 200? So we have to reuse the company name and address, which is obviously not redundant. So we will apply the third level regularization method:

Third-level regularization form

1. Eliminate fields that are not dependent on the key

The company name and address are not related to the user ID, so they apply with their own company ID:

Users
UserId name Relcompid
1 Joe 1
2 Jill 2

Companies
Compid company Company_address
1 ABC 1 Work Lane
2 XYZ 1 Job Street

URLs
Urlid Reluserid URL
1 1 abc.com
2 1 xyz.com
3 2 abc.com
4 2 xyz.com

In this way we associate the primary key comid in the companies table with the foreign key named Relcompid in the users table, and even if we add 200 employees to the ABC company, there is only one record in the companies. Our users and URLs tables can be constantly expanded without worrying about inserting unnecessary data. Most developers believe that three-step normalization is enough, and that the database design can easily handle the burden of the entire enterprise, which is true in most cases.

We can pay attention to the URL field-do you notice the redundancy of the data? If the HTML page that gives user input to these URL data is a text box, can be entered arbitrarily, this is not a problem, two users enter the same favorites less probability, however, if it is through a drop-down menu, only let the user select two URL input, or more. In this case, our database can also be optimized at the next level--step fourth, for most developers, this step is ignored, because it relies on a very special relationship-a many-to-many relationship that has not been encountered in our applications yet.
Data relationships

Before defining the fourth normalization form, I'd like to start by mentioning three basic data relationships: one-to-one, One-to-many and Many-to-many. Let's look back through the first normalized users table. If we put the URL field in a separate table, and each time we insert a record in the Users table, we insert a row in the URLs table. We'll get a one-to-one relationship: each row in the user table will find the corresponding row in the URLs table. For our applications, this is neither practical nor standard.

Then take a look at the second normalization example. For each user record, our table allows a record of multiple URLs to be associated with it. This is a one-to-many relationship, which is a very common relationship.

It's a little more complicated for a many-to-many relationship. In our third form of normalization, one of our users is related to a lot of URLs, and we want to change the structure to allow multiple users to relate to multiple URLs so that we can get a many-to-many structure. Before we go into this discussion, let's see what happens to the table structure.

Users
UserId name Relcompid
1 Joe 1
2 Jill 2

Companies
Compid company Company_address
1 ABC 1 Work Lane
2 XYZ 1 Job Street

URLs
Urlid URL
1 abc.com
2 xyz.com

Url_relations
Relationid Relatedurlid Relateduserid
1 1 1
2 1 2
3 2 1
4 2 2

To further reduce the redundancy of the data, we apply the fourth level normalization form. We created a rather strange url_relations table with fields that are either the primary key or the foreign key. With this table, we can eliminate duplicate items from the list of URLs. The following are specific requirements for the fourth formalized form:

Fourth normalization form

1. In a many-to-many relationship, a separate entity cannot be stored in the same table

Since it applies only to many-to-many relationships, most developers can ignore this rule. In some cases, however, it is very useful, and this is the case where we have improved the URLs table by separating the same entities and moving the relationships into their own tables.

To make it easier for you to understand, for example, here's a SQL statement that will select all of the URLs that belong to Joe:

SELECT name, URL from users, URLs, url_relations WHERE Url_relations.relateduserid = 1 and Users.userid = 1 and Urls.urlid = Url_relations.relatedurlid

If we want to traverse everyone's personal and URL information, we can do this:

SELECT name, URL from users, URLs, url_relations WHERE users.userid = url_relations.relateduserid and urls.urlid = Url_rel Ations.relatedurlid

Fifth level regularization form

There is also a formalized form, which is not common, somewhat esoteric, and is not necessary in most cases. Its principles are:

1. The original form must be rebuilt by the form it separates from

The advantage of using this rule is that you can make sure that no extra columns are introduced into the detached table, and that all of the table structures you create are as big as their actual needs. It's a good habit to apply this rule, but you won't need to use it unless you're dealing with a very large data.

Hopefully this article will be useful to you and can help you apply these regularization rules to all projects. You might want to know where these methods come from, I can tell you that the previous three regularization provisions are in 1972, Dr. E.f Codd in his paper "Further regularization of the database model of the relationship" proposed, the rest of the provisions are based on the later set theory and relational mathematicians theoretical. Comment: The so-called material level will be reversed, it is sometimes not good to divide a table into details because it requires a variety of associations between the tables, which makes the query more complex and less efficient, and these regularization provisions can be referenced in practical applications, depending on the size of the project and, if necessary, some testing. To design a more reasonable table structure.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More