Some principles needing attention in design of large database

Source: Internet
Author: User
Keywords Network programming Database synthesis
Tags .mall analysis application application system based client cost data

A good database product is not equal to have a good application system, if can not design a reasonable database model, not only will increase the client and Server section program programming and maintenance of the difficulty, and will affect the actual performance of the system. Generally speaking, in an MIS system analysis, design, test and trial run stage, because the data quantity is small, designers and testers often only notice the realization of the function, but it is difficult to notice the weak performance, until the system put into actual operation for some time, only to find that the performance of the system is decreasing, At this point to consider the improvement of system performance will cost more human and material resources, and the entire system also inevitably formed a patching project. Based on the experience of designing and using database for many years, the author puts forward some design guidelines for the reference of colleagues.

Named specifications

Different database products have different requirements on the naming of objects, therefore, the naming of various objects in the database, the code of the background program should be in a case-sensitive form, a variety of objects named length not more than 30 characters, so that the application system to adapt to different databases.

The cautious use of cursors (Cursor)

Cursors provide a step-by-step scan of a particular set, typically using cursors to traverse data line by row, and to perform different operations based on different conditions of the extracted data. Especially for multiple tables and large table-defined cursors (large data set) loops are easy to get the program into a long wait and even panic, the author of a city "Housing Provident Fund Management System" in the day end account rolling disbursements interest processing, a 100,000-account cursor processing causes the program to enter an indefinite special ( After the calculation takes 48 hours to complete) (Hardware environment: alpha/4000 128mram,sco Unix, Sybase 11.0), after different conditions to use different update statements can be completed within 20 minutes. Examples are as follows:

Declare mycursor Cursor FOR select Count_no from Count

Open MyCursor

Fetch MyCursor into @vcount_no

while (@ @sqlstatus =0)

Begin

If @vcount_no = ' Condition 1

Operation 1

If @vcount_no = ' Condition 2

Operation 2

......

Fetch MyCursor into @vcount_no

End

......

......

To

Update COUNT set operation 1 for condition 1

Update COUNT set operation 2 for condition 2

......

......

In some cases, it is sometimes necessary to use a cursor, you can also consider the qualifying data rows into a temporary table, and then the temporary table to define cursors to operate, can be significantly improved performance. The author in a certain city, "Telecommunication toll system" Database background program design, to a table (30,000 lines of eligible 30-line data) cursor operation (Hardware environment: PC Server, PII266 64Mram, NT4.0 mobile phone 6.5). Examples are as follows:

Create #tmp/* Define temporary table * *

(Field 1

Field 2

......

)

Insert into #tmp select * from Total where

Condition/* Total 30,000 lines meet the conditions only dozens of lines * *

Declare mycursor Cursor FOR SELECT * FROM #tmp

/* Define cursor on temporary table/*

......

Guidelines for using Indexes (index)

Creating an index typically has the following two purposes: maintaining the uniqueness of indexed columns and providing fast access to the data in the table. Large databases have two indexes--clustered and nonclustered--and a table without a clustered index stores data by heap structure. All of the data is added to the end of the table, and the table with the clustered index, whose data is physically stored in the order of the cluster key, is allowed to have only one cluster index, so according to the B-tree structure, You can understand that adding any index increases the speed of querying by indexed columns, but reduces the performance of INSERT, update, and delete operations, especially if the fill factor (fill Factor) is larger. So the table with more indexes frequently inserts, updates, deletes operation, the table and index should set a small fill factor, in order to leave more free space in each data page, reduce page segmentation and the work of the Organization.

Consistency and integrity of data

To ensure consistency and completeness of the database, designers tend to design too many table associations (relation) to reduce data redundancy as much as possible. Inter-table Association is a mandatory measure that, when established, inserts, updates, and deletes for the parent table (parent table) and child table (children table) will occupy the system overhead, and it is best not to associate the Identify property field as a primary key with the child table. If the data redundancy is low, the integrity of the data is easy to be guaranteed, but it increases the operation of the connection query between tables, so it is necessary to improve the response time of the system. Using Rules and Constraints (Check) to prevent system operators from accidentally entering errors that cause data is another common tool for designers, however, unnecessary rules and constraints can also take up unnecessary overhead of the system, and it is important to note that constraints are faster than rules for validating data. All of these, designers in the design phase should be based on the type of system operation, frequency of balanced consideration.

The trap of the business

A transaction is a set of actions that are completed at once. Although these operations are a single operation, SQL Server can ensure that the set of operations is either complete or not done at all. It is this characteristic of large database that makes the integrality of data be guaranteed greatly.

As we all know, SQL Server provides implicit transaction control for each independent SQL statement, allowing the data operations of each DML to be fully committed or rolled back, but SQL Server also provides explicit transaction control statements

Begin TRANSACTION Start a transaction

Commit TRANSACTION Commit a transaction

ROLLBACK TRANSACTION rollback a transaction

Transactions can be nested and can be retrieved through the global variable @ @trancount to a connected transaction nesting level. That requires special attention and makes it easy for programmers to make mistakes, each display or implied thing starts with the variable plus 1, the commit of each transaction causes the variable to be reduced by 1, and the rollback of each transaction causes the variable to be 0, and only if the variable is a 0 o'clock transaction commit (when the last commit statement), The physical data is then written to disk.

Database Performance Tuning

In the case of computer hardware configuration and network design, the factors that affect the performance of the application system are database performance and client programming. Most database designers use two-step database design: First, logical design, and then physical design. The database logic design eliminates all redundant data, improves the data throughput speed, guarantees the data integrality, and clearly expresses the relationship between the data elements. and for the association query between multiple tables (especially large data tables), its performance will be reduced, but also improve the programming of the client program, so the physical design needs to compromise, according to business rules, determine the size of the associated table data, access frequency of data items, It is necessary to improve the data redundancy design for the frequent association query of such data table.

Selection of data types

The reasonable choice of data type has a great influence on the performance and operation of the database, and there are a lot of books on this aspect, which mainly introduces some experience.

The Identify field should not be associated with other tables as the primary key of the table, which will affect the data migration of the table.

The Text and image fields are pointer-type data and are used primarily for storing binary large objects (BLOBs). The operation of this type of data is slower than other data types, so avoid using it.

The advantage of date-type fields is that there are many date function supports, so it is very simple to compare and subtract the date. However, a query operation that uses a date as a condition will also use a function, which is much slower than other data types, because the server cannot optimize the query with the advanced performance strategy and can only scan through each row with a function as a condition for the query.

For example, to query all records for 1998 years from DATA_TAB1, which has a date field named date.

Select * from DATA_TAB1 where DATEPART (yy,date) =1998

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.