Database Normalization three Paradigm application examples

Source: Internet
Author: User
Tags definition empty new set one table zip

Why is normalization important? At present, many databases have not been standardized for a variety of reasons. Some of these reasons are explained in this article and a claim form for an insurance company is normalized with different forms of normalization (normal form). The changes in the table and the additional tables added during this process make the database more efficient, less error, and easier to maintain.

Database normalization is the practice of optimizing the structure of tables and organizing data into tables so that data can be made more explicit. Normalization allows you to change business rules, requirements, and data without having to reconstruct the entire system.

By changing the way that data is stored-just a little bit-and changing the program that accesses that information, you can eliminate many of the opportunities for error or spam data and reduce the amount of effort necessary to update the information.

One of the real problems of the company can be summed up in a word "we usually do it". We generally like to store information in that way, and we generally allow people to write any information to <insert field name>; the way we typically program. This is usually a bad thing, especially for young and studying companies. However, when there is a new system and a better way to complete the task, sometimes the phrase "in that way the task is done well" may need to be explored and revised again. Canonical data is one of the useful ways companies often use.

While using data for COBOL programs, such as those familiar to any COBOL programmer, storing them in a relational database is similar to storing them in a flat file, the method stored in a flat file is not the best way to accomplish a task. Especially because you do not understand the difference between the two or fear change, but simply bring the idea of the past into the present way.

Note: Dictionary.com is so defined to be standardized: "To make it standard, especially to cause it to conform to a certain standard or specification." "or" mandatory acceptance of a certain standard. " Webopedia that normalization is "the process of organizing data to minimize redundancy in relational database design." Normalization typically involves dividing a database into two or more tables and defining relationships between tables. The goal is to isolate the data so that a field can be added, deleted, and modified only in one table, and then passed through the defined relationships to the remaining tables in the database. I like this definition better.

Terms

Before you know about an insurance company in the real world, you need to know some of the terms that are used in the discussion. When working with databases, especially when dealing with canonicalization issues, a new set of keywords mentioned in the following sections are useful:

· Relationship (relation): Essentially, a relationship is a two-dimensional table or array that contains rows and columns.

· Association (Relationship): An association is a method of associating data between different tables. The association exists between the data items that form different entities and the table entity itself, which constitutes the basic core problem of database normalization. There are three basic types of data associations, and it is important to understand them:

One-to-one (1:1): A one-to-one association means that any given instance of each (rather than most) is closely aligned with an instance of another entity. Each person has only one correct fingerprint is unique. Each phone number is accurately aligned with an independent private customer who pays the bill (not the company). Everyone in the United States has only one social security number.

One-to-many (1:m): A one-to-many association means that one instance of a given entity can be associated with 0 instances of another entity, one instance, or multiple instances. Each person may not have a child, have a child or many children. Everyone may not have a car, a car or many cars.

Many-to-many (m:n): a Many-to-many Association (0, one, or more instances of a given entity associated with 0, one, or more instances of another entity) is a straightforward simulation of a complex association that is often decomposed into multiple 1:m associations. Because multiple families are mixed together, one or more children may not have parents (orphans), one parent (single parent), more than one parent (two still together or divorced two parents, or divorced and remarried parents). A house or property may be transferred to a person or persons, who (one or more) may have one or more houses or property in their wills.

· Attribute: A property is considered to be a modifiable attribute or feature of certain components in a program or database and can be set to a different value or a relationship or a column in a table.

· Tuple:tuple is a set of values or value attributes that are sorted in a relational or non relational database: a row in a relationship.

· Delete exception: Delete exception refers to the data caused by intentional deletion of other data contradictory or unexpected data (information) loss.

· Insert Exception: Insert exception refers to the inability to add information to the database due to lack or lack of data.

· Update exception: Update exception refers to data inconsistency due to data redundancy or incomplete update of redundant data.

· Decomposition of a relationship: the decomposition of a relationship is the decomposition of a relationship into multiple relationships so that the relationship conforms to a higher paradigm.

· Data redundancy: Data redundancy means that there is no necessary duplication of data in the database.

· Data integrity: Data integrity refers to the consistency of data in a database. It is important to ensure data integrity so that users know that the data they rely on is correct, and that the results of their queries and procedures are accurate and expected.

· Atomic value: An atomic value is a value that is neither a set of values that can be further split nor a repeating group. Each column has a full value, but only one value--the value cannot be decomposed into parts that are either used by the database or accessed by users of the database.

· Referential integrity rules: Referential integrity rules mean that values stored in Non-empty external health must be key data items in a relationship.

· External health: An external health is a set of attributes (one or more columns) in a relationship and is also a primary key in some (same or other) relationship. It is a logical link between relationships. The external health reference to its own relationship is called a recursive external health.

· Functional dependencies: Functional dependencies mean that the value of one property in a row is determined by the value of another property in that row. This usually occurs between the primary key (which makes a unique piece of information for a row) and other information for that row. The combination of city and state relies on zip (postal) code, even though many zip codes in a given state are associated with a city. Every legal person in the United States relies on his social security number.

· Deterministic: function relies on the left attribute to determine the value of other attributes in the row (the ZIP code determines the city and state; The social security number determines the person's identity; the license number and the state determine the owner of the car).

· Entity integrity rules: an entity integrity rule means that the key attribute of a row may be empty (if you have a zip code in a city; If you have a car, you have a license number).

· Constraints: A constraint is a rule that qualifies a value in a database. The telephone number must be numeric, the number of dollars must be a number, the state must be a lawful states or province; the country must be a lawful state; the date cannot be February 31.

Now that you know a lot of relevant terminology, we can look at the meaning of the specification in the relevant terminology. The following example is not a typical employee-manager-department example or student-professor-course-provided sample. I will demonstrate a hypothetical database of insurance companies. The tables in the database are much more complex than those used in this example, but they are similar to what people encounter.

Figure 1 shows the non-standard definition of the claims (claim) table. Although there are many more tables in an insurance company's database, these tables provide us with some background, through which we can see normalization and its branches. Keep in mind that the examples in each chapter have only a partial column, simplifying the example and making it easy for you to see what is changing.

Claim_num, Occurance_num, Claim_status, Accdnt_yr, Accdnt_dt, Reported_dt, Entered_dt, Claim_dt1, CLAIM_DT2, CLAIM_DT3, CLAIM_DT4, Claim_dt4, Claim_dt5, Claim_dt6, Claim_dt7, Claim_dt8, Claim_dt9, claim_dt10, Closed_dt, DEATH_DT, ASSIG Ned_dt, ADJSTER_CD, Adjuster_name, AGENT_CD, AWARD_CD, CAUSE_CD, Cause_desc, LOCATION, SITE, COVERAGE_CD, Coverage_ DESC, Ded_recov, Deductible_remain, Paid_1, Reserved_1, Paid_2, Reserved_2, Paid_3, Reserved_3, Paid_4, RESERVED_4 , Paid_5, Reserved_5, Paid_6, Reserved_6, paid_7, reserved_7, Paid_8, Reserved_8, Paid_9, Reserved_9, PAID_10, Reserved_10, LEGAL_FLG, KEY1, KEY2, KEY3, KEY4, KEY5, KEY6, KEY7, KEY8, KEY9, KEY10, SEVERITY_CD, Policy_num, Payment_n UM, SSN, State, Actvy_dt, Entry_dt, ADMIN_CD, Admin_desc, Reopen_dt, Insured_name, insured_address, Insured_phone, INSURE D_city, Insured_state, Insured_zip, Claimant_name, claimant_address, claimant_city, Claimant_state, CLAIMANT_ZIP, Claimant_phone, Special_dt_1, SPECial_dt_2, Special_dt_3, Special_dt_4, Special_dt_5, Special_dt_6, special_dt_7, Special_dt_8, SPECIAL_DT_9, SPECIAL_D T_10, GROSS_PD, policy_id

Figure 1: Columns of the not-normalized claim table

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.