Recommendations for database design and SQL optimization

Source: Internet
Author: User

1, in the program, to ensure that the implementation of the function on the basis of the minimum number of access to the database; Through the search parameters, minimize the number of access to the table, minimize the result set, thereby reducing the burden on the network, can separate operations as far as possible separate processing, improve each response speed In the Data window when using SQL, try to put the index used in the first column of the selection; the structure of the algorithm is as simple as possible; when querying, do not use wildcards such as the SELECT * from T1 statement, use a few columns to select a few columns such as: Select Col1,col2 from T1 , as far as possible, limit the number of result sets as possible: SELECT TOP col1,col2,col3 from T1, because in some cases the user does not need that much data. Rather than using database cursors in your app, cursors are a useful tool, but they require more overhead than regular, set-oriented SQL statements, and fetch data in a specific order.

2, avoid the use of incompatible data types. For example, float and int, char and varchar, binary, and varbinary are incompatible. Incompatible data types may make the optimizer unable to perform some optimizations that could otherwise have been performed. For example:

SELECT name from employee WHERE Salary > 60000

In this statement, such as the salary field is a money type, it is difficult for the optimizer to optimize it because 60000 is an integer number. We should convert an integer into a coin type when programming, rather than wait for a run-time conversion.

3. Try to avoid function or expression operations on the field in the Where clause, which will cause the engine to discard the full table scan using the index. such as:

SELECT * from T1 WHERE f1/2=100

should read:

SELECT * from T1 WHERE f1=100*2

SELECT * from RECORD WHERE SUBSTRING (card_no,1,4) = ' 5378 '

should read:

SELECT * from RECORD WHERE card_no like ' 5,378% '

SELECT Member_number, first_name, last_name from members

WHERE DATEDIFF (Yy,datofbirth,getdate ()) > 21

should read:

SELECT Member_number, first_name, last_name from members

Where dateOfBirth < DATEADD (Yy,-21,getdate ()) that is, any action on a column causes a table scan, which includes database functions, evaluation expressions, and so on, to move the operation to the right of the equals sign whenever possible.

4. Avoid operators such as! = or ">, is null," is NOT null, "in," not "and so on, because this makes the system unusable with indexes and can only search the data in the table directly . For example: The SELECT ID from the employee WHERE ID! = ' B% ' optimizer will not be able to determine the number of rows in the fatal row by index, so you need to search all rows of the table.

5. Use numeric fields as much as possible , and some developers and database managers like to design fields that contain numeric information as character types, which can degrade query and connection performance and increase storage overhead. This is because the engine is processing the query and connecting back to each character in a string by comparison, and for a digital type it only needs to be compared once.

6, reasonable use exists,not exists clause. is as follows:

1.SELECT SUM (T1. C1) from T1 where ((SELECT COUNT (*) from T2 where t2.c2=t1.c2>0)

2.SELECT SUM (T1. C1) from T1where EXISTS (SELECT * from T2 WHERE T2. C2=t1. C2)

Both produce the same result, but the latter is obviously more efficient than the former. Because the latter does not produce a large number of locked table scans or index scans.

If you want to verify that there is a record in the table, do not use COUNT (*) as inefficient and waste server resources. Can be replaced with exists. Such as:

IF (SELECT COUNT (*) from table_name WHERE column_name = ' xxx ')

Can be written as:

IF EXISTS (SELECT * FROM table_name WHERE column_name = ' xxx ')

It is often necessary to write a t_sql statement that compares a parent result set and a child result set to find out if there are records in the parent result set that are not in the child result set, such as:

1.SELECT A.hdr_key from Hdr_tbl a--tbl A means that TBL uses alias a instead

Where not EXISTS (SELECT * from dtl_tbl b WHERE a.hdr_key = B.hdr_key)

2.SELECT A.hdr_key from Hdr_tbl a

Left JOIN dtl_tbl B-a.hdr_key = B.hdr_key WHERE B.hdr_key is NULL

3.SELECT Hdr_key from Hdr_tbl

WHERE Hdr_key not in (SELECT Hdr_key from DTL_TBL)

Three kinds of writing can get the same correct results, but the efficiency is reduced in turn.

7, try to avoid in the indexed character data, the use of non-beginning letter search . This also makes the engine unusable with indexes.

See the following example:

SELECT * from T1 WHERE NAME like '%l% '

SELECT * from T1 WHERE substing (name,2,1) = ' L '

SELECT * from T1 WHERE NAME like ' l% '

Even though the name field is indexed, the first two queries are still unable to take advantage of the indexing to speed up the operation, and the engine has to perform the task by one-by-one operations on all tables. The third query can use an index to speed up operations.

8, sub-use connection conditions, in some cases, there may be more than one connection between the two tables, then in the WHERE clause in the connection condition complete write, it is possible to greatly improve the query speed .

Cases:

SELECT SUM (A.amount) from account a,card B WHERE a.card_no = b.card_no

SELECT SUM (A.amount) from account a,card B WHERE a.card_no = B.card_no and A.account_no=b.account_no

The second sentence will be much faster than the first sentence.

9. Eliminate sequential access to large table row data

Although there are indexes on all the check columns, some forms of where clauses force the optimizer to use sequential access. Such as:

SELECT * FROM Orders WHERE (customer_num=104 and order_num>1001) OR

order_num=1008

The workaround can use the set to avoid sequential access:

SELECT * FROM Orders WHERE customer_num=104 and order_num>1001

UNION

SELECT * FROM Orders WHERE order_num=1008

This enables the query to be processed using the index path.

10. Regular expressions to avoid difficulties

The LIKE keyword supports wildcard matching, which is technically called a regular expression. But this is a particularly time-consuming match. Example: SELECT * from the customer WHERE zipcode like "98_ _ _"

Even if an index is established on the ZipCode field, sequential scanning is also used in this case. Such as

Change the statement to select * from Customer where zipcode > "98000" to execute the query

will use the index to query, obviously will greatly increase the speed.

11. Use the view to speed up the query

Sorting a subset of tables and creating views can sometimes speed up queries. It helps to avoid multiple sorting

Operations, and in other ways simplifies the work of the optimizer. For example:

SELECT cust.name,rcvbles.balance,......other Columns

From Cust,rcvbles

WHERE cust.customer_id = rcvlbes.customer_id

and rcvblls.balance>0

and cust.postcode> "98000"

ORDER by Cust.name

If the query is to be executed multiple times and more than once, all unpaid customers can be found on a

View and sort by customer's name:

CREATE VIEW DBO. V_cust_rcvlbes

As

SELECT cust.name,rcvbles.balance,......other Columns

From Cust,rcvbles

WHERE cust.customer_id = rcvlbes.customer_id

and rcvblls.balance>0

ORDER by Cust.name

Then query in the view in the following way:

SELECT * from V_cust_rcvlbes

WHERE postcode> "98000"

The rows in the view are less than the rows in the primary table, and the physical order is the required order, reducing the disk

I/O, so the query effort can be greatly reduced.

12, you can use between do not use in

SELECT * from T1 WHERE ID in (10,11,12,13,14)

Change to:

SELECT * from T1 WHERE ID between and 14

Because in causes the system to not use the index, it can only search the data in the table directly.

13, distinct do not need GROUP by

SELECT OrderID from Details WHERE UnitPrice > Ten GROUP by OrderID

Can be changed to:

SELECT DISTINCT OrderID from Details WHERE UnitPrice > 10

14. Partial Utilization Index

1.SELECT EmployeeID, FirstName, LastName

From names

WHERE dept = ' Prod ' or city = ' Orlando ' or division = ' food '

2.SELECT EmployeeID, FirstName, LastName from names WHERE dept = ' PROD '

UNION All

SELECT EmployeeID, FirstName, LastName from names WHERE city = ' Orlando '

UNION All

SELECT EmployeeID, FirstName, LastName from names WHERE Division = ' Food '

If the Dept column has an index, query 2 can partially take advantage of the index, and query 1 cannot.

15, you can use union all do not use union

UNION all does not execute the SELECT DISTINCT function, which reduces a lot of unnecessary resources

16, do not write some do not do anything about the query

such as: SELECT COL1 from T1 WHERE 1=0

SELECT COL1 from T1 WHERE col1=1 and col1=2

This type of death code does not return any result sets, but it consumes system resources.

17, try not to use the SELECT INTO statement.

The SELECT inot statement causes the table to lock and prevent other users from accessing the table.

18. Force the query optimizer to use an index if necessary

SELECT * from T1 WHERE nextprocess = 1 and ProcessID in (8,32,45)

Change to:

SELECT * from T1 (INDEX = ix_processid) WHERE nextprocess = 1 and ProcessID in (8,32,45)

The query optimizer will forcibly use the index Ix_processid to execute the query.

19, although the update, DELETE statement is basically fixed, but also to the UPDATE statement to the point of recommendation:

A) Try not to modify the primary key field.

b) When modifying a varchar field, try to replace it with the value of the same length content.

c) Minimize the update operation for the table that contains the update trigger.

d) Avoid update columns that will be copied to other databases.

e) Avoid the update of columns with many indexes.

f) Avoid the column of the update in the WHERE clause condition.

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.