Common SQL Server specifications

Last Update:2016-03-08 Source: Internet

Author: User

Tags exception handling joins server memory

Developer on Alibaba Coud: Build your first app with APIs, SDKs, and tutorials on the Alibaba Cloud. Read more ＞

Common field type selection 1. Varchar/nvarchar data types are recommended for character types 2. Money data type is recommended for monetary value 3. Scientific counting recommended numeric data type 4. Self-growth logo recommended to use the bigint data type (a large amount of data, with the type of int will not fit, then the transformation will be troublesome) 5. Time type recommended for datetime data type 6. Prohibit the use of text, ntext, image old data type 7. Prohibit the use of XML data types, varchar (max), nvarchar (max)Constraints and indexes

Each table must have a primary key

• Each table must have a primary key to enforce entity integrity

• Single table can have only one primary key (not allowed for null and duplicate data)

• Use single-field primary keys whenever possible

Foreign keys are not allowed

• Foreign keys increase the complexity of table structure changes and data migration • Foreign keys have an impact on insert, update performance, need to check for primary foreign KEY constraints • Data integrity by program control

Null property

• The newly added table, all fields are forbidden null ( why doesn't the new table allow Nulls? allowing null values increases the complexity of the application. You have to add specific logic code to prevent the occurrence of various unexpected bugs The three-valued logic, all queries of the equals sign ("=") must increase the isnull's judgment. Null=null, Null!=null, not (Null=null), not (null!=null) are unknown, not trueFor example, if the data in the table: you want to find all the data except the name equals AA, then you inadvertently use the

SELECT * from nulltest WHERE name<> ' AA '

It turns out that it's not the same as expected, in fact it only finds Name=bb and doesn't find name=null data records.

So how do we find all the data except the name equals AA, only with the IsNull function?

SELECT * from Nulltest WHERE ISNULL (name,1) <> ' AA '

But you may not know IsNull can cause serious performance bottlenecks., so it is often best to restrict user input at the application level, ensuring that users enter valid data and then query. • Old table New Add field, need to allow null (avoid full table data Update, long-term lock cause blocking)(This is mainly considering the improvement of the previous table) index design Guidelines • You should create indexes on columns that are used frequently in the WHERE clause • You should create indexes on columns that are frequently used for joining tables • You should create an index on the columns that are used frequently in the ORDER by clause • You should not create indexes on small tables (tables that use only a few pages) because a full table scan operation may be faster than a query executed with an index • The number of single-table indexes is no more than 6 • Do not give low-selectivity characters Jianjian single-column indexes • Take advantage of unique constraints • The index contains no more than 5 fields (including include columns)Do not create a single-column index for a low-selectivity field SQL Server is required for the selectivity of the index field, and if the selectivity is too low, SQL Server discards the field that does not fit to create the index: Gender, 0/1, True/false The field that is appropriate to create the index: ORDERID, Full use of unique indexes such as UID

A unique index provides SQL Server with information that ensures that a column has absolutely no duplicate values, and when Query Analyzer finds a record through a unique index, it exits immediately without continuing to look up the index

Table index number not exceeding 6

There are no more than 6 table indexes (this rule was developed by Ctrip's DBA after a trial ...). • Index speeds up query speed, but affects write performance • Index of a table should be combined with all the SQL that is associated with the table. The principle of combining indexes is that the better the filter is, the better the top of the index and the more indexes will not only increase the compilation time, but also affect the database select the best execution plan SQL query · prohibit complex operations in the database • Disable the use of select * • Prohibit the use of functions or calculations on indexed columns • Disable the use of Cursors • Disable the use of triggers • Prohibit specifying indexes in queries • The variable/parameter/associated field type must match the field type • Parametric queries • Limit number of joins • Limit the length of SQL statements and the number of in clauses • Avoid large transaction operations as much as possible • Turn off affected row count information to return • Nolock must be added to the SELECT statement unless necessary • Replace union with Union all • Querying large amounts of data using page or top • Recursive query hierarchy limits NOT exists instead of in • Temporary tables and table variables • Use local variables to select the Moderate execution plan • Avoid using the OR operator as much as possible • Increased transaction exception handling mechanism • Output columns use two-segment naming format

Prohibit complex operations in the database XML parsing • String similarity comparison • String search (Charindex) • Complex operations are done on the terminal prohibit use of SELECT * reduce memory consumption and network bandwidth • Give the query optimizer the opportunity to read the required columns from the index · Error in query when table structure changes

Prohibit the use of functions or calculations on indexed columns

In the WHERE clause, if the index is part of a function, the optimizer will no longer use the index for full table scanning

Assuming an index is built on the field Col1, the following scenarios will not work with the index:

Abs[col1]=1

[Col1]+1>9

Let me give you an example.

Queries like the above will not be able to use the Printtime index on the o_orderprocess table, so we apply the query SQL as shown below

Prohibit the use of functions or calculations on indexed columns

Assuming an index is built on the field Col1, the following scenarios will be available to the index:

[col1]=3.14

[col1]>100

[Col1] Between 0 and 99

[Col1] Like ' abc% '

[Col1] In (2,3,5,7)

Index problem for like Query 1. [Col1] Like "abc%"--index seek this uses the index query 2. [Col1] Like "%abc%"--index scan and this is not used in index query 3. [Col1] Like "%abc"--index scan This is also not used for index queries I think from the top and three examples, you should understand that it is best not to use fuzzy matching in front of like conditions, otherwise you will not use index query. Disable cursors • Relational databases are suitable for set operations, that is, the set of result sets determined by the WHERE clause and the selection column, which is a way to provide a non-collection operation. In general, cursors implement functionality that is often equivalent to a loop implementation of the client. • Cursors are a very large drain on database resources (especially memory and lock resources) by placing the result set in server memory and by looping through a single processing record. (plus the cursor is really more complex, it is not good to use, try to use it sparingly) prohibit the use of triggers

The trigger is opaque to the application (the application level does not know when the trigger will be triggered, nor does it occur, and it feels inexplicable ...). ）

Prohibit specifying indexes in queries

With (INDEX=XXX) (in the query we specify the index is generally used with (INDEX=XXX))

• As the data changes, the index performance specified by the query statement may not be optimal • The index should be transparent to the application, such as the deletion of the specified index will result in query error, not conducive to troubleshooting • New index cannot be used immediately, must be published code to take effect variable/parameter/ The associated field type must be the same as the field type (which I was not concerned about before)

Avoid the type conversion of additional CPU consumption, caused by the large table scan is particularly serious

Look at the above two pictures, I think I do not have to explain the explanation, everyone should have made it clear.

If the database field type is varchar, the best type in the application is specified as ansistring and its length is explicitly specified

If the database field type is char, the best type in the application is specified as Ansistringfixedlength and its length is explicitly specified

If the database field type is nvarchar, the best type in the application is specified as a string and its length is explicitly specified

parameterized queries

You can parameterize query SQL in the following ways:

sp_executesql prepared queries stored procedures use a diagram to illustrate, haha. Limit join Number • The number of table joins for a single SQL statement cannot exceed 5 • The number of joins can cause the query parser to go wrong execution plan • Excessive joins consume a large amount of constraints in the compile execution plan • The number of conditions in the IN clause The clause includes a very large number of values (thousands) that may consume resources and return an error 8623 or 8632, requiring that the number of conditions in the IN clause be limited to 100 to avoid large transaction operations • Start transactions only when data needs to be updated, reduce resource lock holding time • Increase transaction exception capture preprocessing mechanisms · Prohibit the use of distributed transactions on the database diagram to illustrate that we should not after the 1000 rows of data are updated to commit tran, you think you are updating the 1000 rows of data when the exclusive resources cause other transactions can not be processed. Closes the affected row count information back

Display setting set NOCOUNT on in SQL statements, cancellation of affected row count information returned, reduced network traffic

Unless necessary SELECT statements must be added Nolock

Unless necessary, try to have all the SELECT statements added Nolock

Specifies that dirty reads are allowed. No shared locks are published to prevent other transactions from modifying the data read by the current transaction, and exclusive locks set by other transactions do not prevent the current transaction from reading the locked data. Allowing dirty reads can result in more concurrent operations, but at the cost of reading data modifications that are later rolled back by other transactions. This may cause errors in your transaction, display data that has never been submitted to the user, or cause the user to see the record two times (or not see the record at all)

Replace union with UNION all

Union will reorder the SQL result set, increasing CPU, memory and other consumption

Querying large amounts of data using page or top

Reasonable limit record return number, avoid IO, network bandwidth bottleneck

Recursive query hierarchy limit

Use Maxrecursion to prevent unreasonable recursive CTE from entering an infinite loop

Temporary tables and table variables

Use local variables to select moderation Execution plan

In a stored procedure or query, a table with a very uneven data distribution is accessed, which often causes the stored procedure or query to use suboptimal and even poor execution plans, causing high CPU and large IO read problems, and using local variables to prevent the execution plan from going wrong.

In the way of local variables, SQL does not know the value of this local variable at compile time, when SQL will "guess" a return value based on the general distribution of the data in the table. The resulting plan is the same regardless of the value of the variable that the user is substituting for when invoking the stored procedure or statement. Such a plan will generally be more moderate, not necessarily the best plan, but generally not the worst plan

L if the local variable in the query uses the inequality operator, the Query Analyzer uses a simple 30% calculation to estimate
Estimated rows = (total rows *)/100 L If the equality operator is used in the local variables in the query, the Query Analyzer uses: Accuracy * table records totals to estimate
Estimated rows = Density * Total Rows

Try to avoid using the OR operator

For the OR operator, a full table scan is usually used, considering that the decomposition into multiple queries is implemented with Union/union all, which confirms that the query can go to the index and return fewer result sets

Increase the transaction exception handling mechanism application to do unexpected processing, timely do rollback. Set connection Properties "set XACT_ABORT on" output column using two-segment naming format

Two-segment naming format: Table name. Field Name

A tsql with a join relationship, the field must indicate which table the field belongs to, or the program compatibility error of ambiguous column name may occur after a future table structure change

Architecture Design · read/write separation Schema Decoupling • Data life cycleRead/write separation, design and read-write separation, even if read and write the same library, to facilitate rapid expansion of reading features read into real-time read and can be deferred to read the corresponding to the writing and reading library • Read/write separation should consider automatically switch to write-side schema decoupling when read is not available

Prohibit cross-Library join

Data life cycle

Archiving of large-scale periodic sub-libraries based on the frequency of use of data

Physical separation of the main library/archive library

Table of log type should be partitioned or divided into tables

For large tables to be partitioned, partition operations divide tables and indexes across multiple partitions, and partition switching enables fast new and old partition substitution, faster data cleanup and significantly reduced IO resource consumption

Frequently written tables that require partitioning or sub-tables

Self-growth with latch Lock

The latch is SQL Server's own internal application and control, the user has no way to intervene, to ensure the consistency of the memory data structure, lock level is page-level lock

Common SQL Server specifications

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More