Design principles and development skills for large databases

Source: Internet
Author: User

As computer technology is increasingly widely used in various fields of the national economy, while computer hardware is constantly miniaturization, application systems are evolving towards a more complex and large-scale direction. Database is the core of the entire system. Its design is directly related to the efficiency of system execution and system stability. Therefore, in software system development, database design should follow the necessary database Paradigm Theory to reduce redundancy and ensure data integrity and correctness. Only by designing a proper database model on a suitable database product can the programming and maintenance of the entire system be reduced and the actual operating efficiency of the system be improved. Although developers of small or medium-sized projects can easily use the Paradigm Theory to design a set of qualified databases, for a software project that contains large databases, you must have a complete set of design principles and skills.

1. Set up a Data Group
 
Large Databases have many data elements, so it is necessary to set up a special data group in design. Because the Database Designer is not necessarily a user, it is impossible to fully consider the data elements in the system design. After the database is designed, it is often difficult to find the required database table, therefore, it is best for a data group to be composed of a project backbone familiar with the business.
 
The function of a Data Group is not to design a database, but to extract the basic data elements of the system based on the Requirements Analysis of other similar systems and review the database. The review includes checking whether new database elements are complete and whether all business needs can be fulfilled; analyzing and transforming old databases (if existing systems exist; review, control, and necessary adjustments of database design.

Ii. Design Principles

1. standard naming. All Database names, table names, and domain names must follow the unified naming rules and must be described to facilitate the design, maintenance, and query.
 
2. Reference of control fields. During the design, you can select appropriate database design management tools to facilitate the distributed design of developers and centralized audit management of data groups. Uniform Naming rules are adopted. If the design field already exists, it can be directly referenced; otherwise, it should be re-designed.
 
3. Duplicate control of database tables. During the design process, if most fields are found to already exist, the developer should doubt whether the database table designed already exists. By querying the database table where the field is located and the corresponding designer, you can check whether the database table is indeed repeated.
 
4. Concurrency Control. Concurrency Control should be implemented in the design, that is, for the same database and table, only one person has control at the same time, and others can only query.
 
5. necessary discussions. After the database design is complete, the data team should discuss with relevant personnel and familiarize themselves with the database through the discussion, so as to control the existing problems in the design or obtain the necessary information for the database design.
 
6. Review by the data team. The revision and modification of database tables must be reviewed by the data team to ensure they meet the necessary requirements.
 
7. header file processing. After each data modification, the data team should modify the corresponding header file (which can be automatically completed by the management software) and notify the relevant developers to make corresponding program modifications.

Iii. design skills

1. Split tables with large data volume by category. For frequently-used tables (such as some parameter tables or code comparison tables), we should minimize the number of records in the table because of their high usage frequency. For example, the Bank's account owner table was originally designed as a table, although it can facilitate program design and maintenance, but after analysis, it is found that because of the large amount of data, it will affect the Rapid Positioning of data. If the owner account table is designed as the current owner account, the regular owner account, and the public owner account, the query efficiency can be greatly improved.
 
2. index design. For large database tables, reasonable indexes can improve the operation efficiency of the entire database. In the index design, fields with fewer duplicate values should be selected for index fields. When searching fields with compound indexes, you should pay attention to the order in which compound index fields are created. For example, if you create a composite index for a flow meter with more than 50 thousand records in order of date and serial number, because the number of duplicate values of dates in the table is close to the number of records in the table, the time taken to query with a sequential number is close to 3 seconds. If an index is created using a sequential number as the index field for the same query, it takes less than 1 second. Therefore, in the design of large databases, only reasonable selection of index fields can effectively improve the operational efficiency of the entire database.
 
3. Optimize data operations. In large databases, it is worth noting how to improve data operation efficiency. For example, every time you add a service to the database flow table, You must retrieve the flow number from the flow control table and add the value of the flow number to one. Under normal circumstances, the response speed of a single operation is normal, but when it is used for batch business processing, the speed will be significantly slowed down. After analysis, we found that the table should be locked every time the flow number value in the flow control table is added for a while, and the table is the core of the entire system operation, and may be locked by other processes during the operation, this slows down the entire transaction operation. The solution to this problem is to apply for a serial number in batches based on the total number of batch businesses and update the flow control tabulation to speed up batch business processing. Another example is table insertion optimization. For a large number of business processing, if you use a common Insert statement when inserting a database table, the speed will be very slow. The reason is that it takes a long time to insert an I/O operation to a table. After improvement, I/O operations can be performed after the pages are full in the buffer format such as Put statement to improve efficiency. When a large database table is deleted, the Delete statement is usually used directly. Although this statement can be used for small table operations, however, deletion of large tables may be slow or even fail due to large transactions. The solution is to remove the transaction, but the more effective method is to perform the Drop operation and then perform reconstruction.
 
4. Adjust database parameters. Adjusting database parameters is a process of accumulating experience and should be completed by experienced system administrators. Taking the Informix database as an example, too few locks will cause the lock table to fail; too few logical logs will cause the insertion of large tables to fail, all these problems should be adjusted according to the actual situation.
 
5. Necessary tools. In the whole database development and design process, you can first develop some small application tools, for example, the header file of the database table, the initialization of inserted data, the function encapsulation of data insertion, Error Tracking, or automatic display are automatically generated to improve the efficiency of database design and development.
 
6. Avoid long transactions. Deleting or inserting a single large table will lead to large transactions. The solution is to adjust the parameters or split the files during insertion. For a long transaction that consists of a series of small transaction sequence operations (such as the day-end transaction of the bank transaction system), a series of operations can complete the entire transaction, however, the disadvantage is that the entire transaction may be too large to be completed, or it may take too long to redo the transaction due to unexpected events. A better solution is to break down the entire transaction into several smaller transactions, and then the application controls the entire system process. In this way, if a transaction fails, you only need to redo the transaction, which can save time and avoid long transactions.
 
7. Advance as appropriate. With the rapid development of computer technology, database design must be proactive, not only to meet the current application requirements, but also to consider future business development, at the same time must be conducive to the expansion or increase of application system processing functions.

Compared with small and medium databases, the design and development of large databases are much more complex. Therefore, in the design and development process, in addition to following the database Paradigm Theory and increasing system consistency and integrity, distributed design should also be carried out according to specific situations in general, and the basic principles of centralized control and unified audit should be closely grasped to ensure the compact database design structure, balanced distribution, and rapid positioning. In terms of database operations, certain skills should be used to improve the execution efficiency of the entire application system, and appropriate advances should be taken to adapt to the changing application and system development requirements.

Turn: http://blog.csdn.net/creazyfrog/archive/2007/10/22/1836725.aspx

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.