Database Design Principles: Normalization

Source: Internet
Author: User
Summary

The relational database is a widely used kind of database today. Designing one is the process of organizing and structuring data, and the core problem is the design of the relational model. For a small database, the table structure is easy to manage. As a project grows, however, the database and its table structure become more complex, and the SQL statements we write tend to become clumsy and inefficient. Worse, a poorly defined table structure can leave data incomplete when it is updated. It is therefore worth learning the normalization process, which guides the design of table structures and reduces redundant data, improving storage efficiency, data integrity, and scalability. This article introduces the database normalization process through a concrete example.

Preface

The purpose of this article is to explain the principles of normalized database design through a detailed example. In DB2, a concise and clearly structured table design is very important. With a normalized table structure, later data maintenance does not run into insert, delete, or update anomalies. An unreasonable table structure, by contrast, not only causes problems when the database is used and maintained, but may also store a great deal of unnecessary redundant information and waste system resources.

Designing a normalized database means following the normal forms, the normative principles of database design. Much of the material on normal forms, however, presents them through a large number of formulas, which makes them hard for designers to understand and apply. This article therefore uses concrete examples to describe the three normal forms as plainly as possible, and shows how to apply them in a real project.

Normalization

A key step in designing, operating, and maintaining a database is making sure that data is correctly distributed across its tables. The right data structure not only makes access to the database easier, it also greatly simplifies the rest of the application (queries, forms, reports, code, and so on). The formal name for proper table design is "database normalization". Later sections illustrate it through a worked example. Refer to Appendix 1 for the definition of the normal forms.

Data redundancy

Data should contain as little redundancy as possible, which means duplicated data should be minimized. For example, a department employee's phone number should not be stored in more than one table, because the phone number is an attribute of the employee. Too much redundant data takes up more physical space and complicates data maintenance and consistency checking. When the employee's phone number changes, every table that holds a copy has to be updated, and if one of them is missed, the data becomes inconsistent.

Normalization example

For ease of illustration, this article uses one sample table and analyzes the normalization process step by step.

First, let's create the initial table.

    CREATE TABLE SAMPLE (
        Prjnum      INTEGER NOT NULL,
        Prjname     VARCHAR(200),
        Emynum      INTEGER NOT NULL,
        Emyname     VARCHAR(200),  -- length missing in the source; 200 assumed
        Salcategory CHAR(1),
        Salpackage  INTEGER
    ) IN USERSPACE1;

    ALTER TABLE SAMPLE
        ADD PRIMARY KEY (Prjnum, Emynum);

    -- The salary package for category 'A' is garbled in the source; 2000 is assumed.
    INSERT INTO SAMPLE (Prjnum, Prjname, Emynum, Emyname, Salcategory, Salpackage)
    VALUES (100001, 'TPMS', 200001, 'Johnson',   'A', 2000),
           (100001, 'TPMS', 200002, 'Christine', 'B', 3000),
           (100001, 'TPMS', 200003, 'Kevin',     'C', 4000),
           (100002, 'TCT',  200001, 'Johnson',   'A', 2000),
           (100002, 'TCT',  200004, 'Apple',     'B', 3000);

Table 1-1
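For readers without a DB2 instance, the sample table can be recreated in an in-memory SQLite database through Python's standard sqlite3 module. This is only a sketch under stated assumptions: SQLite stands in for DB2, the tablespace clause is dropped, the VARCHAR length of Emyname is assumed, and the salary package of 2000 for category 'A' is assumed because that value is not legible in the source listing. The sketch also counts the distinct values in each column, which gives a quick view of how much data is repeated.

```python
import sqlite3

# Rows of Table 1-1. The salary package for category 'A' (2000) is an
# assumption; the remaining values follow the listing.
ROWS = [
    (100001, "TPMS", 200001, "Johnson", "A", 2000),
    (100001, "TPMS", 200002, "Christine", "B", 3000),
    (100001, "TPMS", 200003, "Kevin", "C", 4000),
    (100002, "TCT", 200001, "Johnson", "A", 2000),
    (100002, "TCT", 200004, "Apple", "B", 3000),
]
COLUMNS = ("Prjnum", "Prjname", "Emynum", "Emyname", "Salcategory", "Salpackage")


def build_sample(conn):
    """Create the SAMPLE table with its composite primary key and load ROWS."""
    conn.execute(
        """
        CREATE TABLE SAMPLE (
            Prjnum      INTEGER NOT NULL,
            Prjname     VARCHAR(200),
            Emynum      INTEGER NOT NULL,
            Emyname     VARCHAR(200),
            Salcategory CHAR(1),
            Salpackage  INTEGER,
            PRIMARY KEY (Prjnum, Emynum)
        )
        """
    )
    conn.executemany("INSERT INTO SAMPLE VALUES (?, ?, ?, ?, ?, ?)", ROWS)


conn = sqlite3.connect(":memory:")
build_sample(conn)

# A column with fewer distinct values than the table has rows stores
# repeated data.
total = conn.execute("SELECT COUNT(*) FROM SAMPLE").fetchone()[0]
distinct = {}
for column in COLUMNS:
    query = f"SELECT COUNT(DISTINCT {column}) FROM SAMPLE"
    distinct[column] = conn.execute(query).fetchone()[0]

print("rows:", total)
for column, count in distinct.items():
    print(column, "distinct values:", count)
```

Every column ends up with fewer distinct values than there are rows, which is the redundancy the rest of the article sets out to remove.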

Table 1-1 has six fields, and every field contains duplicate values, which means the table has a data redundancy problem. This can cause anomalies during data operations such as deletes and updates, so the table needs to be normalized.

First normal form

Referring to the definition of the normal forms, we find that the table already meets the requirements of first normal form:

1. every field holds a single attribute that cannot be divided further;

2. no row is duplicated;

3. there is a primary key, and all other attributes depend on it;

4. every attribute of the primary key is defined.

In fact, current relational database management systems (DBMS) enforce first normal form when a table is created, so the sample table already satisfies it. To go further, we first need to identify the primary key of table 1-1. The attribute pair <project number, employee number> is the primary key, and all other attributes depend on it.

Converting from first normal form to second normal form

According to the definition of second normal form, converting a table to second normal form means eliminating partial dependencies.

In table 1-1, <project name> depends only on <project number>, which is just one part of the primary key. Likewise, the non-key attributes <employee name>, <salary category>, and <salary package> depend only on <employee number>, the other part of the primary key.
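These partial dependencies can be checked against the sample rows with a small helper. The sketch below is illustrative only: the `determines` function is a hypothetical helper, the salary package of 2000 for category 'A' is an assumed value, and a check on five rows shows only that the data is consistent with a dependency, not that the dependency holds in general.

```python
# Rows of Table 1-1; the salary package 2000 for category 'A' is assumed.
COLUMNS = ("Prjnum", "Prjname", "Emynum", "Emyname", "Salcategory", "Salpackage")
ROWS = [
    (100001, "TPMS", 200001, "Johnson", "A", 2000),
    (100001, "TPMS", 200002, "Christine", "B", 3000),
    (100001, "TPMS", 200003, "Kevin", "C", 4000),
    (100002, "TCT", 200001, "Johnson", "A", 2000),
    (100002, "TCT", 200004, "Apple", "B", 3000),
]


def determines(rows, lhs, rhs):
    """Return True if rows that agree on the lhs columns also agree on rhs."""
    lhs_idx = [COLUMNS.index(name) for name in lhs]
    rhs_idx = [COLUMNS.index(name) for name in rhs]
    seen = {}
    for row in rows:
        key = tuple(row[i] for i in lhs_idx)
        value = tuple(row[i] for i in rhs_idx)
        # Remember the first rhs value seen for this key; any later
        # mismatch means the dependency does not hold.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Part of the key is enough to determine these non-key attributes.
project_partial = determines(ROWS, ["Prjnum"], ["Prjname"])
employee_partial = determines(
    ROWS, ["Emynum"], ["Emyname", "Salcategory", "Salpackage"]
)
# Counter-check: the project number alone does not determine the employee name.
unrelated = determines(ROWS, ["Prjnum"], ["Emyname"])

print(project_partial, employee_partial, unrelated)
```

The first two checks succeed while the counter-check fails, matching the description of attributes that depend on only one part of the composite key.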

In its current form, table 1-1 has the following potential problems:

1. Data redundancy: every field contains duplicate values;

2. Update anomaly: changing a value of the <project name> field, such as "TPMS", requires updating every row that holds that value;

3. Insert anomaly: if a new project named TPT is created but no employee has joined it yet, <employee number> is empty. Because that field is part of the primary key, the record cannot be inserted.
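The update and insert anomalies can be reproduced in a few lines. The sketch below uses Python's sqlite3 module with an in-memory SQLite database as a stand-in for DB2. The salary package of 2000 for category 'A', the replacement project name 'TPMS-2', and the project number 100003 for the TPT project are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE SAMPLE (
        Prjnum      INTEGER NOT NULL,
        Prjname     VARCHAR(200),
        Emynum      INTEGER NOT NULL,
        Emyname     VARCHAR(200),
        Salcategory CHAR(1),
        Salpackage  INTEGER,
        PRIMARY KEY (Prjnum, Emynum)
    )
    """
)
conn.executemany(
    "INSERT INTO SAMPLE VALUES (?, ?, ?, ?, ?, ?)",
    [
        # Salary package 2000 for category 'A' is an assumed value.
        (100001, "TPMS", 200001, "Johnson", "A", 2000),
        (100001, "TPMS", 200002, "Christine", "B", 3000),
        (100001, "TPMS", 200003, "Kevin", "C", 4000),
        (100002, "TCT", 200001, "Johnson", "A", 2000),
        (100002, "TCT", 200004, "Apple", "B", 3000),
    ],
)

# Update anomaly: renaming one project has to touch every row that repeats
# its name ('TPMS-2' is a made-up new name).
cursor = conn.execute(
    "UPDATE SAMPLE SET Prjname = 'TPMS-2' WHERE Prjname = 'TPMS'"
)
rows_updated = cursor.rowcount

# Insert anomaly: a project with no employees has no Emynum, but Emynum is
# part of the primary key and declared NOT NULL, so the insert is rejected.
# The project number 100003 is hypothetical.
try:
    conn.execute("INSERT INTO SAMPLE (Prjnum, Prjname) VALUES (100003, 'TPT')")
    insert_rejected = False
except sqlite3.IntegrityError:
    insert_rejected = True

print("rows touched by one rename:", rows_updated)
print("employee-less project rejected:", insert_rejected)
```

A single logical change (one project name) turns into several row updates, and a perfectly valid fact (a project exists) cannot be recorded at all until an employee is assigned.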
