Experience on optimization of database design

Source: Internet
Author: User
Tags: table definition

Use cursors (cursor) with caution

A cursor provides a means of scanning a result set row by row: the program fetches each row in turn and performs different operations depending on conditions found in the extracted data. Looping a cursor over a multi-table join or a large table (a large data set), however, can easily leave the program in a long wait or even hang it. In a city's "Housing Provident Fund Management System", during the day-end rollover processing of non-interest-bearing accounts, the author found that processing a cursor over 100,000 accounts left the program waiting indefinitely (still unfinished after a measured 48 hours; hardware environment: Alpha/4000, 128MB RAM, SCO Unix, Sybase 11.0). After the program was rewritten to use UPDATE statements, the same processing completed within 20 minutes. The example is as follows:

declare mycursor cursor for select count_no from count

open mycursor

fetch mycursor into @vcount_no

while (@@sqlstatus = 0)

begin

    if @vcount_no = '...'    /* condition 1 */
        /* operation 1 */

    if @vcount_no = '...'    /* condition 2 */
        /* operation 2 */

    ...

    fetch mycursor into @vcount_no

end

...

This was rewritten as plain UPDATE statements:

update count set /* operation 1 */ where /* condition 1 */

update count set /* operation 2 */ where /* condition 2 */

...
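The same contrast can be sketched in Python with SQLite. The table name `count_tbl`, the status values, and the two "conditions" below are all invented stand-ins for the author's Sybase code; the point is only that one set-based UPDATE per condition replaces the per-row loop:

```python
import sqlite3

def make_db():
    """Build a small in-memory table standing in for the account table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE count_tbl (count_no TEXT PRIMARY KEY, status TEXT)")
    conn.executemany("INSERT INTO count_tbl VALUES (?, ?)",
                     [(f"A{i:04d}", "open" if i % 2 else "closed")
                      for i in range(1000)])
    return conn

# Cursor-style processing: fetch every row and decide per row (slow on large sets)
c1 = make_db()
rows = c1.execute("SELECT count_no, status FROM count_tbl").fetchall()
for count_no, status in rows:
    if status == "open":        # "condition 1"
        c1.execute("UPDATE count_tbl SET status='active' WHERE count_no=?",
                   (count_no,))
    elif status == "closed":    # "condition 2"
        c1.execute("UPDATE count_tbl SET status='archived' WHERE count_no=?",
                   (count_no,))

# Set-based processing: one UPDATE per condition lets the server loop internally
c2 = make_db()
c2.execute("UPDATE count_tbl SET status='active'   WHERE status='open'")
c2.execute("UPDATE count_tbl SET status='archived' WHERE status='closed'")

# Both approaches leave the table in exactly the same state
assert (c1.execute("SELECT count_no, status FROM count_tbl ORDER BY count_no").fetchall()
        == c2.execute("SELECT count_no, status FROM count_tbl ORDER BY count_no").fetchall())
```

On a real server the set-based version also saves one client/server round trip per row, which is where most of the 48-hours-to-20-minutes difference comes from.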

In cases where a cursor is unavoidable, consider first moving the qualifying rows into a temporary table and then defining the cursor on that temporary table; this can improve performance significantly. In the back-end programming of a city's "Telecom Billing System", the author performed a cursor operation on the qualifying rows (30 rows out of 30,000) of a table (hardware environment: PC server, Pentium II 266, 64MB RAM, Windows NT 4.0, MS SQL Server 6.5).

Examples are as follows:

create table #tmp    /* define temp table */

(field1 ...,

field2 ...,

... )

insert into #tmp select * from total where <condition>

declare mycursor cursor for select * from #tmp    /* define the cursor on the temp table */

...
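A Python/SQLite sketch of the same pattern (the `total` table, its columns, and the `field1 < 100` condition are hypothetical): copy only the qualifying rows into a temporary table, then iterate over the small copy instead of the 30,000-row original.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE total (field1 INTEGER, field2 TEXT)")
conn.executemany("INSERT INTO total VALUES (?, ?)",
                 [(i, f"row{i}") for i in range(30000)])

# Copy only the qualifying rows into a temp table ...
conn.execute("CREATE TEMP TABLE tmp AS SELECT * FROM total WHERE field1 < 100")

# ... then do the row-by-row work against the much smaller temp table
processed = [field2 for (field1, field2) in conn.execute("SELECT * FROM tmp")]
print(len(processed))  # 100 qualifying rows instead of 30,000
```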

How to use indexes (index)

Creating an index generally serves two purposes: maintaining the uniqueness of the indexed columns and providing a strategy for fast access to the data in the table. Large databases offer two kinds of index: clustered and non-clustered. A table without a clustered index is stored as a heap structure, with all new data appended at the end of the table; in a table with a clustered index, the data is physically stored in the order of the clustered-index key, and because the index is a B-tree over the physical row order, a table allows only one clustered index. Adding any index speeds up queries on the indexed columns, but at the same time it degrades the performance of insert, update, and delete operations, especially when the fill factor is large. So for tables with many indexes that receive frequent inserts, updates, and deletes, create the tables and indexes with a smaller fill factor, leaving more free space in each data page and thereby reducing page splits and reorganization.
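The query-speedup side of the trade-off can be seen directly in SQLite's query plans (the `account` table and `idx_count_no` index are invented for the demonstration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (count_no TEXT, balance REAL)")
conn.executemany("INSERT INTO account VALUES (?, ?)",
                 [(f"A{i:05d}", 0.0) for i in range(1000)])

# Without an index, an equality search must scan the whole table
before = conn.execute("EXPLAIN QUERY PLAN "
                      "SELECT * FROM account WHERE count_no = 'A00500'").fetchone()[3]

conn.execute("CREATE INDEX idx_count_no ON account(count_no)")

# With the index, the same query becomes an index search
after = conn.execute("EXPLAIN QUERY PLAN "
                     "SELECT * FROM account WHERE count_no = 'A00500'").fetchone()[3]

print(before)  # a full-table SCAN
print(after)   # a SEARCH ... USING INDEX idx_count_no
```

The other side of the trade-off (slower inserts, page splits, fill factor) is a server-level storage concern that SQLite does not expose, so it is not shown here.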

Consistency and integrity of data

To ensure database consistency and integrity, designers tend to create many inter-table associations (relations) so as to reduce data redundancy as far as possible. An inter-table association is an enforced measure: once established, every insert, update, and delete on the parent table and child tables incurs system overhead, and it is best not to use an Identity column as the primary key for associating with child tables. When data redundancy is low, data integrity is easy to guarantee, but join queries between tables increase; to improve the system's response time, a reasonable amount of data redundancy is therefore necessary. Using rules (rule) and check constraints to prevent operators from entering invalid data is another common tool for designers, but unnecessary rules and constraints also cost the system overhead; note that constraints validate data faster than rules do. In the design phase, the designer should weigh all of these against the type and frequency of the system's operations.
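A minimal sketch of a check constraint doing this kind of validation, in Python with SQLite (the `payment` table and the `amount > 0` rule are hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The check constraint rejects bad input at the database, before it reaches the table
conn.execute("CREATE TABLE payment (count_no TEXT NOT NULL, "
             "amount REAL CHECK (amount > 0))")

conn.execute("INSERT INTO payment VALUES ('A1', 25.0)")      # accepted
rejected = False
try:
    conn.execute("INSERT INTO payment VALUES ('A2', -5.0)")  # violates the check
except sqlite3.IntegrityError:
    rejected = True

print(rejected)  # True: the invalid row never entered the table
```

Each such constraint is evaluated on every insert and update of the table, which is exactly the overhead the paragraph above warns about when constraints are unnecessary.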

The trap of transactions

A transaction is a set of operations completed as a single unit: although it contains multiple individual operations, SQL Server guarantees that the set either completes in full or not at all. It is this feature of large-scale databases that gives data integrity its strong guarantee.

As is well known, SQL Server provides implicit transaction control for each individual SQL statement, so the data operations of each DML statement are either fully committed or fully rolled back. SQL Server also provides explicit transaction control statements:

BEGIN TRANSACTION    starts a transaction

COMMIT TRANSACTION   commits a transaction

ROLLBACK TRANSACTION rolls back a transaction

Transactions can be nested, and the connection's current nesting level can be retrieved through the global variable @@trancount. Note that each explicit or implicit BEGIN TRANSACTION adds 1 to this variable, each COMMIT subtracts 1, and a ROLLBACK resets it to 0; physical data is written to disk only when a commit brings the variable to 0 (the outermost COMMIT statement).
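The all-or-nothing behavior itself can be demonstrated in Python with SQLite (the `count_tbl` table and the simulated mid-transaction failure are invented; SQLite has no @@trancount, so only atomicity is shown, not nesting):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE count_tbl (count_no TEXT, balance REAL)")
conn.execute("INSERT INTO count_tbl VALUES ('A1', 100.0)")
conn.commit()

try:
    # "with conn" wraps the block in a transaction:
    # commit on success, roll back on any exception
    with conn:
        conn.execute("UPDATE count_tbl SET balance = balance - 50 "
                     "WHERE count_no = 'A1'")
        raise RuntimeError("simulated failure mid-transaction")
except RuntimeError:
    pass

balance = conn.execute("SELECT balance FROM count_tbl "
                       "WHERE count_no = 'A1'").fetchone()[0]
print(balance)  # 100.0 -- the partial update was rolled back
```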

Selection of data types

A reasonable choice of data types has a great influence on the performance and operation of the database. Much has been written on this subject; here the author mainly shares a few pieces of experience:

1. An Identity column should not be used as the table's primary key for associations with other tables, as this complicates data migration of the table.

2. Text and image columns hold pointer-type data, used mainly to store binary large objects (BLOBs). Operations on these types are slower than on other data types, so avoid them where possible.

3. The advantage of date-type columns is the wealth of date functions that support them, making it very simple to compare dates and compute date differences. However, a query whose condition applies a function to a date column is much slower than one on other data types, because when a function appears in the query condition the server cannot use its advanced access strategies to optimize the query and must fall back to a table scan over every row.
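The same effect is visible in SQLite query plans (the `bill` table, its index, and the year-1999 condition are invented for the demonstration): wrapping the column in a date function forces a scan, while an equivalent range condition on the bare column can use the index.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bill (bill_date TEXT, amount REAL)")
conn.execute("CREATE INDEX idx_bill_date ON bill(bill_date)")

# Function applied to the column: the optimizer cannot use the index
p1 = conn.execute("EXPLAIN QUERY PLAN SELECT * FROM bill "
                  "WHERE strftime('%Y', bill_date) = '1999'").fetchone()[3]

# Equivalent range condition on the bare column: index search
p2 = conn.execute("EXPLAIN QUERY PLAN SELECT * FROM bill "
                  "WHERE bill_date >= '1999-01-01' "
                  "AND bill_date < '2000-01-01'").fetchone()[3]

print(p1)  # a full-table SCAN
print(p2)  # a SEARCH ... USING INDEX idx_bill_date
```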
