How developers can design databases effectively

Source: Internet
Author: User

Database design plays an important role in the software development process. In a blog post, the Chinese developer Meteorseed draws on his own practical experience to summarize the aspects of relational database design that deserve attention, including Codd's rules, the design stages, the design principles, and naming rules.

Meteorseed argues that the database should be designed by developers early in the project, while later tuning calls for a DBA: "A developer who is proficient in OOP and ORM often designs a database that is more reasonable and adapts better to changing requirements." He cites the 12 rules of E. F. Codd, the father of relational databases, as guiding principles for database design:

  1. Information rule
    All information in a relational database is represented in exactly one way: as values in tables.
  2. Guaranteed access rule
    The combination of table name, primary key value, and column name guarantees access to every data item.
  3. Systematic treatment of null values
    Null values are supported and handled in a systematic way, independent of data type.
  4. Dynamic online catalog based on the relational model
    The database description is self-describing: at the logical level it is represented in the same way as ordinary data. That is, the database must contain system tables that describe its own structure, or the description must be held in tables that users can access.
  5. Comprehensive data sublanguage rule
    A relational database system may support several languages and several modes of terminal use, but there must be at least one language whose statements can be expressed as character strings in a well-defined syntax and that fully supports all of the following: data definition, view definition, data manipulation, integrity constraints, authorization, and transactions. (In practice this language is SQL.)
  6. View updating rule
    All views that are theoretically updatable must also be updatable by the system.
  7. High-level insert, update, and delete
    The ability to handle a base relation or a derived relation as a single operand applies not only to retrieving data but also to inserting, updating, and deleting it. That is, rows are treated as sets in insert, update, and delete operations.
  8. Physical data independence
    Applications and terminal activities remain logically unaffected however the storage representation or access methods of the data change.
  9. Logical data independence
    Applications and terminal activities remain logically unaffected when information-preserving changes are made to the tables.
  10. Integrity independence
    Integrity constraints specific to a relational database must be definable in the relational data sublanguage and storable in the catalog, not in application programs.
  11. Distribution independence
    The RDBMS's data manipulation sublanguage must keep applications and terminal activities logically unaffected, whether or not the data is physically distributed and whenever the distribution strategy changes.
  12. Nonsubversion rule
    If a relational database system supports a low-level (single-record-at-a-time) language, that language cannot violate or bypass the integrity rules or constraints imposed through the higher-level (multiple-records-at-a-time) language. In other words, users cannot violate the database constraints by any route.
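
A few of these rules can be observed directly in any SQL database. The following is a minimal sketch using Python's built-in sqlite3 module; the employee table and its rows are hypothetical and are not taken from Meteorseed's post. It touches rules 2, 3, 4, and 7 only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (
        id     INTEGER PRIMARY KEY,
        name   TEXT NOT NULL,
        salary INTEGER              -- NULL stands for "unknown" (rule 3)
    );
    INSERT INTO employee (id, name, salary) VALUES
        (1, 'Ann', 5000),
        (2, 'Bob', NULL),
        (3, 'Cid', 4000);
""")

# Rule 2: table name + primary key value + column name reach one data item.
name = conn.execute(
    "SELECT name FROM employee WHERE id = ?", (2,)
).fetchone()[0]

# Rule 3: NULL is treated systematically; aggregates such as AVG skip it.
avg_salary = conn.execute("SELECT AVG(salary) FROM employee").fetchone()[0]

# Rule 4: the catalog is itself a table that ordinary SQL can query.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
)]

# Rule 7: a single UPDATE statement operates on a whole set of rows.
cur = conn.execute(
    "UPDATE employee SET salary = salary + 100 WHERE salary IS NOT NULL"
)
updated_rows = cur.rowcount

print(name, avg_salary, tables, updated_rows)
```

SQLite is used here only because it ships with Python; how closely a given DBMS follows each rule varies by product.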

Meteorseed divides database design into the planning stage, the conceptual stage, the logical stage, the implementation stage, and the physical stage. On design principles, he shares his experience in the following areas:

      • Reduce the reliance on database functionality
        Functionality should be implemented by the program, not by the database. If a feature is implemented in the database and the DBMS is later replaced by one that is less powerful and cannot provide that feature, the code has to be modified anyway. To guard against this, implement functionality in the program and make the database responsible only for data storage, which keeps coupling to a minimum.
      • Principles for defining entity relationships
        When defining a relationship between one entity and another, consider the following:
        • Entities involved: identify all the entities that take part in the relationship.
        • Ownership: consider whether one entity "owns" another entity.
        • Cardinality: consider how many instances of one entity are associated with an instance of another entity.

Relationship and number of tables

        • Describing a 1:1 relationship requires at least 1 table.
        • Describing a 1:n relationship requires at least 2 tables.
        • Describing an n:m relationship requires at least 3 tables.
      • Each column holds a single value
        To represent the coordinates (0,0), use two columns rather than storing "0,0" in one column.
      • Order of columns
        The order of the columns makes no difference to the table, but by convention columns are arranged in the order "primary key + foreign keys + entity data + non-entity data", which makes the table noticeably more readable.
      • Defining primary keys and foreign keys
        A table must define its primary key and its foreign keys (if it has any). Defining primary and foreign keys is not only an RDBMS requirement but also a development requirement. Almost all code generators need this information to generate code for common methods (including SQL and references), so defining primary and foreign keys is a must during development. The qualifier "during development" matters because many teams, for performance reasons, remove all foreign keys from the database once extensive testing has shown that referential integrity has no major defects. Meteorseed's view is that foreign keys should be kept as long as performance is not a problem, and that even when it is, the SQL should be optimized rather than the foreign key constraints dropped.
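
The table-count, single-value-column, column-order, and key-definition principles can be sketched in one small schema. This is an illustration only, written with Python's built-in sqlite3 module. The warehouse, shelf, product, and shelf_product tables are hypothetical names chosen for the example, not part of Meteorseed's post.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite does not enforce foreign keys unless this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
    -- 1:n needs two tables: one warehouse has many shelves.
    CREATE TABLE warehouse (
        warehouse_id INTEGER PRIMARY KEY,
        name         TEXT NOT NULL
    );
    -- Column order: primary key, foreign key, entity data, non-entity data.
    CREATE TABLE shelf (
        shelf_id     INTEGER PRIMARY KEY,
        warehouse_id INTEGER NOT NULL REFERENCES warehouse(warehouse_id),
        x            INTEGER NOT NULL,   -- coordinates stored as two columns,
        y            INTEGER NOT NULL,   -- not as one '0,0' string
        created_at   TEXT DEFAULT CURRENT_TIMESTAMP
    );
    -- Many-to-many needs three tables: shelf, product, and a junction table.
    CREATE TABLE product (
        product_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL
    );
    CREATE TABLE shelf_product (
        shelf_id   INTEGER NOT NULL REFERENCES shelf(shelf_id),
        product_id INTEGER NOT NULL REFERENCES product(product_id),
        PRIMARY KEY (shelf_id, product_id)
    );
""")

conn.execute("INSERT INTO warehouse (warehouse_id, name) VALUES (1, 'North')")
conn.execute("INSERT INTO shelf (shelf_id, warehouse_id, x, y) VALUES (10, 1, 0, 0)")
conn.execute("INSERT INTO product (product_id, name) VALUES (100, 'bolt'), (101, 'nut')")
conn.execute("INSERT INTO shelf_product (shelf_id, product_id) VALUES (10, 100), (10, 101)")

# Separate x and y columns can be filtered without any string parsing.
shelves_at_origin = conn.execute(
    "SELECT shelf_id FROM shelf WHERE x = 0 AND y = 0"
).fetchall()

# The junction table resolves the many-to-many relationship.
product_names = [row[0] for row in conn.execute(
    "SELECT p.name FROM product AS p "
    "JOIN shelf_product AS sp ON sp.product_id = p.product_id "
    "WHERE sp.shelf_id = ? ORDER BY p.name", (10,)
)]

# The foreign key rejects a shelf that points at a non-existent warehouse.
try:
    conn.execute("INSERT INTO shelf (shelf_id, warehouse_id, x, y) VALUES (11, 999, 1, 1)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(shelves_at_origin, product_names, orphan_rejected)
```

With the constraint in place the database refuses the orphan row itself, which is the referential-integrity guarantee that is lost when foreign keys are removed for performance.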
      • Choosing keys

Artificial keys and natural keys. An artificial key is an attribute that does not occur naturally in the entity and is imposed as needed, such as a GUID; it has no meaning to the entity. A natural key is a natural attribute of the entity, such as a national ID number. Advantages of an artificial key: the key value never changes, and it is always stored in a single column. Disadvantage of an artificial key: because it is a unique value with no practical meaning, it cannot by itself prevent duplicate rows. Meteorseed recommends using artificial keys throughout, for the following reasons:

        • In the design phase we cannot predict what the code will really need, so it is simpler to stop guessing at keys and use artificial keys.
        • An artificial key is responsible only for handling entity relationships and not for describing any attribute, so this design highly decouples entity relationships from entity content.

Another of Meteorseed's suggestions is that every table should also have a natural key that is meaningful to the user. In special cases where no single such attribute can be found, a composite key can be used. He does not use this key as the unique identifier in his programs, but it is useful when querying the database directly. Another drawback of artificial keys concerns query performance, so the form of the artificial key (the column type) needs to be chosen carefully:

        • Auto-increment integer: the type is lightweight and queries are more efficient, but the range of values is limited.
        • GUID: queries are less efficient than with a numeric type, but the range of values is effectively unlimited, and it is friendlier to developers.
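
The two forms of artificial key, combined with a user-meaningful natural key, might look like the following. This is a sketch under stated assumptions: the citizen_int and citizen_guid tables and the national_id column are hypothetical, and SQLite (via Python's sqlite3 and uuid modules) stands in for whatever DBMS is actually used.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Artificial key as an auto-increment integer.
    CREATE TABLE citizen_int (
        id          INTEGER PRIMARY KEY AUTOINCREMENT,
        national_id TEXT NOT NULL UNIQUE,   -- natural key, kept unique
        name        TEXT NOT NULL
    );
    -- Artificial key as a GUID stored as text.
    CREATE TABLE citizen_guid (
        id          TEXT PRIMARY KEY,
        national_id TEXT NOT NULL UNIQUE,
        name        TEXT NOT NULL
    );
""")

# Auto-increment: the database assigns the key value.
cur = conn.execute(
    "INSERT INTO citizen_int (national_id, name) VALUES (?, ?)", ("A-001", "Ann")
)
int_key = cur.lastrowid

# GUID: the application can generate the key before the row is stored.
guid_key = str(uuid.uuid4())
conn.execute(
    "INSERT INTO citizen_guid (id, national_id, name) VALUES (?, ?, ?)",
    (guid_key, "A-001", "Ann"),
)

# The artificial key alone would accept this duplicate person;
# the UNIQUE constraint on the natural key is what rejects it.
try:
    conn.execute(
        "INSERT INTO citizen_int (national_id, name) VALUES (?, ?)",
        ("A-001", "Ann again"),
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(int_key, guid_key, duplicate_rejected)
```

The program keeps using id as the identifier, while national_id remains available for direct queries and guards against repeated rows.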

Smart keys and non-smart keys. A smart key is a key whose value carries additional information: it is encoded according to an agreed coding specification, so that certain information can be read from the key value itself. A non-smart key is a simple, meaningless value such as an auto-increment number or a GUID. The smart key is a double-edged sword. Developers favor key values that carry information, and programs come to rely on the data embedded in them. Database administrators and designers dislike smart keys because they pose a potential risk to the database. As mentioned earlier, one principle of database design is not to combine values with independent meanings into a single column but to use multiple independent columns. Database designers would rather have developers obtain a smart key by concatenating several columns, exposed to developers as a composite primary key, than have them decompose the value of a single column. Developers should accept this design, but many do not understand the trade-offs between the two approaches. Meteorseed considers a smart key held in a single column risky, because at design time we may fail to foresee that the coding rules could change later. For example, if the values of one component of the smart key are used up, the rule or the length has to change, and a change to the coding rules breaks the program's validation and its smart-key parsing, which is the last thing system operators want to see. Meteorseed therefore recommends that, if smart keys are needed, they be encapsulated in the business logic layer (as read-only properties) and not implemented in the persistence layer, which avoids these problems.
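
One way to read this recommendation is sketched below in Python. The Order class, its three fields, and the "region-year-sequence" format are all hypothetical. The point is only that the components are stored as independent values and the smart key is derived on demand, never persisted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    # Each component would be persisted as its own column.
    region_code: str
    year: int
    sequence: int

    @property
    def smart_key(self) -> str:
        """Read-only smart key assembled in the business logic layer."""
        return f"{self.region_code}-{self.year}-{self.sequence:06d}"


order = Order(region_code="SH", year=2024, sequence=42)
key = order.smart_key
print(key)

# The property has no setter, so the key cannot be assigned directly.
try:
    order.smart_key = "SH-2024-999999"
    assignment_blocked = False
except AttributeError:
    assignment_blocked = True
```

If the coding rule changes later, for example to a wider sequence number, only the property body changes; the stored columns and any existing rows are left alone.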

In addition, Meteorseed discusses design principles covering whether to allow NULLs, attribute splitting, normalization (normal forms), choosing data types, and optimizing for concurrency. For more information, see Meteorseed's original blog post.
