Personal Summary of Database Design

Source: Internet
Author: User

First, what is database design
    1. Database design concepts
      Database design means constructing an optimal data storage model for a business system, based on that system's specific needs and the selected DBMS. It covers establishing the structure of the tables in the database and the relationships between them, so that the application's data can be stored effectively and the stored data can be queried efficiently.
    2. Design comparison
      Good design                                             Poor design
      Reduces data redundancy                                 Contains a lot of redundant data
      Avoids data maintenance anomalies                       Suffers from insert, update and delete anomalies
      Saves storage space (sometimes trades space for time)   Wastes a lot of storage space
      Accesses data efficiently                               Accesses data inefficiently
Second, the database design steps
  1. Preliminary remarks
    Before designing a database, keep in mind that the design as a whole should follow several normal forms, although in practice they are applied flexibly. Overall, a database design should achieve: no operation anomalies, efficient access, and relatively economical use of space (at present, space is the cheapest of the three).
    Supplementary concept: operation anomalies
    These are the insert, update and delete anomalies that a poorly designed table suffers from.

    Supplementary concept: data redundancy
    Data redundancy means that the same data is stored in more than one place, or that a column in a table can be computed from other columns. Redundant data makes consistency harder to maintain.
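
As a rough illustration of why redundancy threatens consistency, the sketch below keeps a customer's phone number on every order row and then updates only one copy. The row layout, the helper names and the new phone number are hypothetical and only serve the example.

```python
# Hypothetical redundant rows: the customer's phone is repeated on every order row.
orders = [
    {"order_no": "001", "item": "Excavator", "customer": "Tom", "phone": "028-88886666"},
    {"order_no": "001", "item": "Impact drill", "customer": "Tom", "phone": "028-88886666"},
]


def update_phone_first_match(rows, customer, new_phone):
    """A careless update that changes only the first matching row."""
    for row in rows:
        if row["customer"] == customer:
            row["phone"] = new_phone
            return


def phones_for(rows, customer):
    """Return the distinct phone numbers stored for one customer."""
    return {row["phone"] for row in rows if row["customer"] == customer}


update_phone_first_match(orders, "Tom", "028-11112222")

# Two different phone numbers now exist for the same customer: an update anomaly.
inconsistent = len(phones_for(orders, "Tom")) > 1
print(inconsistent)
```

If the phone number were stored once, in a separate customer record, this kind of partial update could not happen.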
  2. Basic steps in design
    1). Requirements analysis
    The main questions of requirements analysis are: What data does the business system contain? What attributes does this data have? What are the characteristics of the data and its attributes?
    2). Logical design
    Use an ER diagram tool to build the logical model of the database: first, transform the requirements into a logical model; second, express that model as ER diagrams. The logical design is independent of the chosen DBMS, so it should be suitable for any DBMS.
    3). Physical design
    This phase is DBMS-specific, so first select the DBMS, then convert the logical model from the second step into a physical model.
    It involves creating the tables in the database and choosing the column types. In MySQL, follow these guidelines:
    The data type of a column affects both the storage space the data needs and the performance of queries. When more than one data type would work for a column, prefer numeric types first, then date or binary types, and string types last. Among data types of the same category, prefer the one that takes up less space as long as it satisfies the business requirements.
    Some field space usage in MySQL:

    Additional points to note:

      • When comparing data (WHERE, JOIN, ORDER BY), string processing is slower than numeric processing for the same data.
      • In MySQL, the utf8 character set (utf8mb3) uses up to 3 bytes per character; utf8mb4 uses up to 4.
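
The space argument can be checked outside the database. The following sketch only measures byte lengths in Python, so it approximates rather than reproduces MySQL's on-disk format; the sample ID value is hypothetical.

```python
import struct

# The same value stored as a number and as text (hypothetical sample value).
user_id = 1234567890
as_int = struct.pack("<I", user_id)      # 4-byte unsigned integer
as_text = str(user_id).encode("ascii")   # one byte per digit

print(len(as_int), len(as_text))

# UTF-8 is variable-width: 1 byte for ASCII, 3 bytes for most CJK characters,
# and 4 bytes for characters outside the Basic Multilingual Plane.
samples = ["a", "\u4e2d", "\U0001F600"]
sizes = [len(ch.encode("utf-8")) for ch in samples]
print(sizes)
```

The 4-byte case is why a column that must hold such characters needs utf8mb4 rather than the 3-byte utf8 character set.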

    4). Maintenance and optimization

      • Create new tables for new requirements. (Note: during the initial design it is not recommended to reserve fields of undetermined purpose in a table. This does not really make the database more extensible; it costs as much as adding a new field later, or even more.)
      • Index optimization
      • Splitting large tables, which can be done in two ways:
        1. Horizontal split: controls the length of the table, that is, the number of rows. Each resulting table holds less data.
        2. Vertical split: the database stores data in pages. The narrower each row, the more rows fit on a page and the more efficient the IO (the data lives on disk, so the more useful content each IO returns, the better). Vertical splitting is therefore usually applied to very wide tables. The number of rows does not change after the split, but the number of tables increases and each table becomes narrower.
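
The two kinds of split can be sketched with Python's built-in sqlite3 module standing in for MySQL. The table names, the shard count and the modulo routing rule are assumptions made for this example, not something prescribed by the text above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
SHARDS = 2  # hypothetical number of horizontal shards

# Horizontal split: identical columns, rows spread over orders_0 and orders_1.
for i in range(SHARDS):
    conn.execute(
        f"CREATE TABLE orders_{i} (order_id INTEGER PRIMARY KEY, user_id INTEGER)"
    )


def shard_for(user_id):
    """Pick the shard table by user_id modulo the shard count."""
    return f"orders_{user_id % SHARDS}"


for order_id, user_id in [(1, 10), (2, 11), (3, 12)]:
    conn.execute(f"INSERT INTO {shard_for(user_id)} VALUES (?, ?)", (order_id, user_id))

# Vertical split: a wide table becomes a narrow, frequently read table plus a
# detail table for the bulky column; both share the same primary key.
conn.execute("CREATE TABLE product (product_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE product_detail (product_id INTEGER PRIMARY KEY, description TEXT)"
)
conn.execute("INSERT INTO product VALUES (1, 'Excavator')")
conn.execute("INSERT INTO product_detail VALUES (1, 'long description text')")

rows_in_shard_0 = conn.execute("SELECT COUNT(*) FROM orders_0").fetchone()[0]
rows_in_shard_1 = conn.execute("SELECT COUNT(*) FROM orders_1").fetchone()[0]
print(rows_in_shard_0, rows_in_shard_1)
```

After a horizontal split every query has to know the routing rule, and after a vertical split reading all columns needs a join on the shared key, so both splits trade some simplicity for smaller tables.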
Third, the normal forms of the database
    1. First normal form (1NF)
      Concept: If every field value in a database table is an atomic value that cannot be decomposed further, the table satisfies the first normal form.
      Role: ensures that each column remains atomic
      Example: Normally an address needs only one field. But if the business system needs to access the country, province and city separately, the address field should be split into those parts to satisfy 1NF.
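
A minimal sketch of the address example, using Python's sqlite3 module in place of MySQL. The table name, column names and sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One atomic column per address part instead of a single free-text address.
conn.execute(
    """
    CREATE TABLE user_address (
        user_id  INTEGER PRIMARY KEY,
        country  TEXT NOT NULL,
        province TEXT NOT NULL,
        city     TEXT NOT NULL,
        street   TEXT NOT NULL
    )
    """
)
conn.executemany(
    "INSERT INTO user_address VALUES (?, ?, ?, ?, ?)",
    [
        (1, "China", "Sichuan", "Chengdu", "1 Example Road"),
        (2, "China", "Sichuan", "Mianyang", "2 Example Road"),
        (3, "China", "Zhejiang", "Hangzhou", "3 Example Road"),
    ],
)

# Filtering by province is a plain equality test; no string parsing is needed.
users_in_sichuan = [
    row[0]
    for row in conn.execute(
        "SELECT user_id FROM user_address WHERE province = ? ORDER BY user_id",
        ("Sichuan",),
    )
]
print(users_in_sichuan)
```

With a single address column the same query would need pattern matching on free text, which is both slower and more error-prone.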
    2. Second normal form (2NF)
      Concept: On top of 1NF, the second normal form requires that every column in a table depends on the whole primary key, not on just part of it (this mainly concerns composite primary keys). In other words, one table should store only one kind of data; several kinds of data should not be stored in the same table. Also: every 1NF table with a single-column primary key satisfies the second normal form.
      Role: ensures that every column in the table depends on the whole primary key
      Example: In an order table, an order may contain several products, so the order number and the product number together form the composite primary key, as shown in the following table:
Order number  Product number  Product name  Quantity  Unit     Unit price  Customer  Contact
001           1               Excavator     1         set      12000.00    Tom       028-88886666
001           2               Impact drill  8         piece    260.00      Tom       028-88886666
002           3               Forklift      3         vehicle  32000.00    John Doe  028-88886666

Problem: This table uses the order number and the product number as its composite primary key. The product name, unit, unit price and similar information do not depend on the whole primary key, only on the product number. Likewise, the customer and contact depend only on the order number. This violates the second normal form.
Solution: Split the table into three: one table for order information, one for the items in each order, and one for product information.
Order Information

Order number  Customer  Contact
001           Tom       028-88886666
002           John Doe  028-88886666

Order Items

Order number  Product number  Quantity
001           1               1
001           2               8
002           3               3

Product Information

Product number  Product name  Unit     Unit price
1               Excavator     set      12000.00
2               Impact drill  piece    260.00
3               Forklift      vehicle  32000.00
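
The three-table split can be written down as a schema. This is a sketch using Python's sqlite3 module rather than MySQL; the table and column names are assumptions, and the unit column is left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (
        order_no TEXT PRIMARY KEY,
        customer TEXT,
        contact  TEXT
    );
    CREATE TABLE product (
        product_no INTEGER PRIMARY KEY,
        name       TEXT,
        price      REAL
    );
    CREATE TABLE order_item (
        order_no   TEXT REFERENCES orders(order_no),
        product_no INTEGER REFERENCES product(product_no),
        quantity   INTEGER,
        PRIMARY KEY (order_no, product_no)
    );
    INSERT INTO orders VALUES ('001', 'Tom', '028-88886666');
    INSERT INTO orders VALUES ('002', 'John Doe', '028-88886666');
    INSERT INTO product VALUES (1, 'Excavator', 12000.00);
    INSERT INTO product VALUES (2, 'Impact drill', 260.00);
    INSERT INTO product VALUES (3, 'Forklift', 32000.00);
    INSERT INTO order_item VALUES ('001', 1, 1);
    INSERT INTO order_item VALUES ('001', 2, 8);
    INSERT INTO order_item VALUES ('002', 3, 3);
    """
)

# Joining the three tables rebuilds the original wide rows, so the split lost nothing.
wide_rows = conn.execute(
    """
    SELECT oi.order_no, p.name, oi.quantity, p.price, o.customer
    FROM order_item AS oi
    JOIN orders  AS o ON o.order_no = oi.order_no
    JOIN product AS p ON p.product_no = oi.product_no
    ORDER BY oi.order_no, oi.product_no
    """
).fetchall()
print(wide_rows)
```

Only order_item keeps the composite key, and its single non-key column (quantity) depends on both parts of that key.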

3. Third normal form (3NF)
Concept: On top of 2NF, a table is in the third normal form if no non-key field has a transitive functional dependency on any candidate key.
Role: ensures that every column depends on the primary key directly rather than indirectly
Example: When designing an order table, use the customer number as a foreign key that links the order table to the customer table, instead of adding other customer information (name, company, etc.) to the order table. This reduces data redundancy (although in real-world development some customer information is often duplicated in the order table to make queries easier).
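
A small sketch of the foreign-key design described in the example, again with sqlite3 standing in for MySQL. The table names, column names and sample values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT,
        company     TEXT
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customer(customer_id)
    );
    INSERT INTO customer VALUES (1, 'Tom', 'Example Co');
    INSERT INTO orders VALUES (100, 1);
    INSERT INTO orders VALUES (101, 1);
    """
)

# The company name lives in exactly one row, so one UPDATE is seen by every order.
conn.execute("UPDATE customer SET company = 'Renamed Co' WHERE customer_id = 1")

companies = conn.execute(
    """
    SELECT o.order_id, c.company
    FROM orders AS o
    JOIN customer AS c ON c.customer_id = o.customer_id
    ORDER BY o.order_id
    """
).fetchall()
print(companies)
```

The price of this design is the join on every read, which is the cost the denormalization discussion later in the article weighs against redundancy.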
4. Boyce-Codd normal form (BCNF)
Concept: On top of 3NF, a table is in BCNF if no field at all, key fields included, has a transitive functional dependency on any candidate key. In particular, when the key is composite, there must be no functional dependency between the fields that make up the key.
Role: ensures that every column, key columns included, depends on the key directly rather than indirectly
Example: Consider a table of suppliers and supplier contacts in which a contact can work for only one supplier, while a supplier may supply several products. Supplier + contact is chosen as the composite primary key, yet there is a dependency between these two fields.
Problem table:

Supplier              Product ID  Supplier contact  Quantity
Beverage One factory  1           Tom               10
Beverage One factory  2           John Doe          20
Beverage One factory  1           Harry             10

This table has the following dependency between its key fields: Supplier contact -> Supplier
Solution:
Suppliers

Supplier              Product ID  Quantity
Beverage One factory  1           10
Beverage One factory  2           20

Supplier Contacts

Supplier              Supplier contact
Beverage One factory  Tom
Beverage One factory  John Doe
Beverage One factory  Harry
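
One way to look for a dependency between key fields is to test it against sample data. The helper below is a hypothetical sketch: it can show that the rows of the problem table are consistent with "contact determines supplier", but a sample can only refute a functional dependency, never prove it for all future data.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if every value of column `lhs` maps to a single value of `rhs`."""
    seen = {}
    for row in rows:
        key, value = row[lhs], row[rhs]
        # setdefault stores the first value seen for this key and returns it afterwards.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Rows of the problem table above.
rows = [
    {"supplier": "Beverage One factory", "product_id": 1, "contact": "Tom", "quantity": 10},
    {"supplier": "Beverage One factory", "product_id": 2, "contact": "John Doe", "quantity": 20},
    {"supplier": "Beverage One factory", "product_id": 1, "contact": "Harry", "quantity": 10},
]

contact_determines_supplier = fd_holds(rows, "contact", "supplier")
supplier_determines_contact = fd_holds(rows, "supplier", "contact")
print(contact_determines_supplier, supplier_determines_contact)
```

The asymmetric result matches the business rule in the example: a contact belongs to one supplier, while a supplier can have several contacts.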
Fourth, the realm of the master: no set moves beats set moves

To design a good database we need to follow the normal forms, but a design that follows them strictly will sometimes cost performance. A few points to consider:
-Joins between tables carry a performance cost. Storing data in many tables to reduce redundancy therefore tends to slow down queries, and in Internet applications the read-to-write ratio is roughly 3:1 or even 4:1.
-Reducing the number of joins between tables reduces disk IO and improves read efficiency.
-Denormalization must be done in moderation. Anything taken too far becomes counterproductive.
Example of a denormalized design:
Tables designed according to the normal forms:
User table (user ID, name, phone, address, zip code)
Order table (order ID, user ID, order time, payment type, order status)
Order item table (order ID, product ID, product quantity, product price)
Product table (product ID, name, description, expiry time)
Denormalized tables:
User table (user ID, name, phone, address, zip code)
Order table (order ID, user ID, order time, payment type, order status, order price, name, address, phone)
Order item table (order ID, product ID, product quantity, product price, product name, expiry time)
Product table (product ID, name, description, expiry time)

Benefits of the denormalized design above:
To look up the purchase information for an order in the normalized design, we take the order number, find the user ID in the order table, query the user table for the user information, then look up the product IDs in the order item table, and finally query the product table for the product information. That touches 4 tables. With the denormalized design, the order table already provides the order and user information, and the order item table provides the item and product information. Going from 4 tables to 2 is a worthwhile trade of space for time.
Of course, the redundant copies must be kept in sync whenever the underlying data is updated.
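
The two-table lookup can be sketched as follows, with sqlite3 standing in for MySQL. The column names are English renderings of the fields listed above, a few columns are omitted for brevity, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Denormalized order table: user name, address and phone are copied in.
    CREATE TABLE orders (
        order_id   INTEGER PRIMARY KEY,
        user_id    INTEGER,
        order_time TEXT,
        user_name  TEXT,
        address    TEXT,
        phone      TEXT
    );
    -- Denormalized order item table: the product name is copied in.
    CREATE TABLE order_item (
        order_id     INTEGER,
        product_id   INTEGER,
        quantity     INTEGER,
        price        REAL,
        product_name TEXT,
        PRIMARY KEY (order_id, product_id)
    );
    INSERT INTO orders VALUES (1, 7, '2020-01-01 10:00:00', 'Tom', 'Chengdu', '028-88886666');
    INSERT INTO order_item VALUES (1, 1, 1, 12000.00, 'Excavator');
    INSERT INTO order_item VALUES (1, 2, 8, 260.00, 'Impact drill');
    """
)

# One join over two tables returns the whole purchase record.
purchase = conn.execute(
    """
    SELECT o.user_name, i.product_name, i.quantity
    FROM orders AS o
    JOIN order_item AS i ON i.order_id = o.order_id
    WHERE o.order_id = 1
    ORDER BY i.product_id
    """
).fetchall()
print(purchase)
```

The copied columns are a snapshot taken when the order is written. Whether a later change to the user or product tables should be propagated to existing orders is a business decision that the application has to handle explicitly.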
