Overview
How do you write SQL queries that perform well? There are two levers: create sensible indexes, and write efficient SQL statements.
Fundamentals of Indexing
Indexes are either clustered or nonclustered. A table can have only one clustered index but many nonclustered indexes, and the reason follows from how indexes work.
Whether a table has no index, a clustered index, or a nonclustered index, it holds the same rows. So what actually differs between these cases?
The difference is in how the rows are organized.
A table with no index is just a block of data.
A table with a clustered index forms a tree whose leaf nodes are the data rows themselves.
A table with only a nonclustered index keeps its data as a separate block, and the index forms its own tree; each leaf entry of that tree holds the row ID of the corresponding row in the data block.
A table with both a clustered and a nonclustered index forms two trees: the clustered index tree, which contains the data, and a separate nonclustered index tree. Note that in this case each leaf entry of the nonclustered index stores the clustered index key, and the row is then found by looking that key up in the clustered index tree.
Because the rows themselves can be arranged into only one tree, a table can have only one clustered index, while it can have many nonclustered indexes.
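The lookup paths described above can be sketched with a toy model. This is only an illustration under loud assumptions: sorted Python lists stand in for the index trees, list positions stand in for row IDs, and the table, column names and helper functions are all invented for the example. A real engine uses paged B-trees.

```python
import bisect

# Toy model only: sorted lists stand in for B-trees. Rows are (id, name, city).
rows = [(3, "carol", "NY"), (1, "alice", "LA"), (2, "bob", "SF")]

# Case 1: no clustered index, plus a nonclustered index on name.
heap = list(rows)  # rows in arrival order; the list position acts as the row ID
name_ix_heap = sorted((r[1], rid) for rid, r in enumerate(heap))

def find_in_heap(name):
    """Seek the name index, then follow the row ID into the data block."""
    i = bisect.bisect_left(name_ix_heap, (name,))
    if i < len(name_ix_heap) and name_ix_heap[i][0] == name:
        return heap[name_ix_heap[i][1]]
    return None

# Case 2: clustered index on id, plus a nonclustered index on name.
clustered = sorted(rows)  # the data itself is kept in id order
clustered_keys = [r[0] for r in clustered]
# Leaf entries of the nonclustered index hold the clustered key, not a row ID.
name_ix_clustered = sorted((r[1], r[0]) for r in clustered)

def seek_clustered(key):
    """Seek the clustered index; the leaf reached is the row itself."""
    i = bisect.bisect_left(clustered_keys, key)
    if i < len(clustered_keys) and clustered_keys[i] == key:
        return clustered[i]
    return None

def find_in_clustered(name):
    """Seek the name index, then seek the clustered index with the key found."""
    i = bisect.bisect_left(name_ix_clustered, (name,))
    if i < len(name_ix_clustered) and name_ix_clustered[i][0] == name:
        return seek_clustered(name_ix_clustered[i][1])
    return None

print(find_in_heap("bob"))
print(find_in_clustered("bob"))
```

Both functions reach the same row, but the second one needs two seeks: one in the nonclustered tree and one in the clustered tree.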
How do we measure the performance of a SQL query?
Once we have written a SQL query, how do we know whether it performs well or badly?
One way is to run it directly in the SQL Server query window; the status bar then shows the result, including the execution time in seconds. A second way is to run SET STATISTICS IO ON to report resource consumption and SET STATISTICS TIME ON to report timing; with both enabled, running the statement again prints the figures in the Messages view next to the Results view. A third way is the SQL Server Profiler tool (Tools > SQL Server Profiler), which monitors statements as they execute and is better suited to cases where we cannot easily extract the SQL statements from our program.
How do I create a reasonable index?
The methods above tell us which SQL statements are slow, and those are the ones we want to change. To optimize a slow statement, though, we must know where the problem lies rather than change it blindly. For that we need another tool: the execution plan view (Query > Show Execution Plan). The execution plan shows whether or not a statement used an index when it ran. A table can be accessed in three ways: Table Scan, Index Scan, and Index Seek.
Table Scan is a full scan of the table, reading every row in the data block. Index Scan is a full scan of an index tree: with a clustered index the data is available directly, because the leaves are the data rows; with a nonclustered index the engine reads the clustered index key from each leaf and then uses that key to find the actual row in the clustered index tree. Index Seek is a search of the index tree: because the index is a tree, a search algorithm (binary search, for example) can locate specific rows quickly. Index Seek also comes in clustered and nonclustered forms, which work as described above and are not repeated here.
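To watch the three access methods appear in a plan, here is a small runnable sketch. It rests on an assumption worth stating loudly: it uses SQLite through Python's sqlite3 module as a stand-in, not SQL Server. SQLite reports SCAN and SEARCH in its EXPLAIN QUERY PLAN output where SQL Server's plan view shows Table Scan, Index Scan and Index Seek, and the exact wording differs between SQLite versions. The orders table, the ix_orders_customer index and the plan helper are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [(f"cust{i % 100}", float(i)) for i in range(1000)],
)

def plan(sql):
    """Return the detail text of each step in SQLite's query plan."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

query = "SELECT * FROM orders WHERE customer = 'cust7'"

# No index on customer yet: every row has to be examined (a table scan).
table_scan = plan(query)

conn.execute("CREATE INDEX ix_orders_customer ON orders (customer)")

# Same statement, now answered by searching the index tree (a seek).
index_seek = plan(query)

# No filter, but the index supplies the order: the whole index is read (an index scan).
index_scan = plan("SELECT customer FROM orders ORDER BY customer")

for name, steps in [("table scan", table_scan), ("index seek", index_seek), ("index scan", index_scan)]:
    print(name, "->", steps)
```

The statement text is identical before and after the index is created; only the plan changes, which is why the plan view is the right place to check whether an index is actually being used.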
The execution plan shows which parts of a SQL statement do not use an index; we can then change the statement or create a suitable index. Keep in mind that an index is not always worthwhile: on a table with very little data, going through an index (let alone maintaining many of them) may be no faster than a plain full table scan. So when optimizing, first try to make the statement use the existing indexes; only if that cannot be achieved do we decide, based on the amount of data, whether a new index is needed.
How can I write efficient SQL query statements?
From the above we know that our SQL statements should lean on the existing indexes as much as possible. But how do we make sure the existing indexes can actually be used? You might say: just create more indexes, one on every column, and they are bound to be used. Let us set other costs aside and assume that a statement which uses an index is fast. Even so, creating an index on every column does not guarantee that a statement will use Index Seek when it runs, because the system may not find a reasonable Index Seek plan when it analyzes the statement. So how should we write our SQL statements so that the system arrives at a plan that uses Index Seek, and so that we get the best execution plan?
1. Optimization of query columns
Select only the columns you need. Do not use SELECT * unless you really use every column, because SELECT * is the same as selecting all columns.
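One concrete reason a narrow column list can help is sketched below. Again this assumes SQLite via Python's sqlite3 as a stand-in for SQL Server, and the table, index and helper names are hypothetical. When every selected column is already in the index, SQLite answers the query from the index alone and reports a covering index; SELECT * forces an extra trip to the table rows, much like the lookup from a nonclustered index into the clustered index tree.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL, note TEXT)"
)
conn.execute("CREATE INDEX ix_orders_customer_amount ON orders (customer, amount)")
conn.executemany(
    "INSERT INTO orders (customer, amount, note) VALUES (?, ?, ?)",
    [(f"cust{i % 100}", float(i), "n/a") for i in range(1000)],
)

def plan(sql):
    """Return the detail text of each step in SQLite's query plan."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

# Only indexed columns are requested: the index alone can answer the query.
narrow = plan("SELECT customer, amount FROM orders WHERE customer = 'cust7'")

# SELECT * also needs the note column, so each index hit is followed by a row lookup.
wide = plan("SELECT * FROM orders WHERE customer = 'cust7'")

print("narrow:", narrow)
print("wide:  ", wide)
```

Both statements seek the same index; the difference is whether the engine must also fetch the full row for every match.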
2. Optimization of WHERE conditions
Do not operate on the columns used in conditions: do not apply functions to them, convert their data types, or use them in expressions (field + 'AAA'), and avoid conditions that cannot be matched directly against the index (LIKE '%a', NOT ..., OR ..., and so on). All of these greatly reduce the effectiveness of the index or prevent it from being used at all.
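The effect of operating on a condition column can be seen in a small sketch. It assumes SQLite via Python's sqlite3 rather than SQL Server, so plan wording differs, and the users table, ix_users_age index and plan helper are invented for the example. The two queries below ask for the same rows, but only the one that leaves the column untouched can search the index.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.execute("CREATE INDEX ix_users_age ON users (age)")
conn.executemany(
    "INSERT INTO users (name, age) VALUES (?, ?)",
    [("alice", 30), ("bob", 25), ("carol", 30)],
)

def plan(sql):
    """Return the detail text of each step in SQLite's query plan."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

# The expression wraps the column, so the index on age cannot be searched.
slow_sql = "SELECT name FROM users WHERE age + 1 = 31"
# Moving the arithmetic to the constant side leaves the bare column to match the index.
fast_sql = "SELECT name FROM users WHERE age = 31 - 1"

slow_plan = plan(slow_sql)
fast_plan = plan(fast_sql)

# Both forms return the same rows; only the access path differs.
slow_rows = sorted(conn.execute(slow_sql).fetchall())
fast_rows = sorted(conn.execute(fast_sql).fetchall())

print(slow_plan, slow_rows)
print(fast_plan, fast_rows)
```

The same rewrite applies to functions and type conversions: keep the column bare on one side of the comparison and move the computation to the other side.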
3. Other
Do not use cursors; where possible, replace a cursor with a set-based query. To check whether data exists, use EXISTS instead of COUNT(*).
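Both pieces of advice can be sketched briefly, again assuming SQLite via Python's sqlite3 as a stand-in; the products table and the variable names are made up for the example. A single set-based UPDATE replaces a row-by-row loop, and EXISTS answers the question "is there any row?" without having to count them all. In SQL Server the same check is usually written as IF EXISTS (...).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, category TEXT, price REAL)")
conn.executemany(
    "INSERT INTO products (category, price) VALUES (?, ?)",
    [("book", 10.0), ("book", 20.0), ("toy", 5.0)],
)

# Set-based: one statement updates every matching row,
# instead of fetching rows one at a time and updating each in a loop.
conn.execute("UPDATE products SET price = price * 2 WHERE category = 'book'")
prices = conn.execute("SELECT id, price FROM products ORDER BY id").fetchall()

# COUNT(*) has to count every matching row just to compare the total with zero.
has_toy_by_count = conn.execute(
    "SELECT COUNT(*) FROM products WHERE category = ?", ("toy",)
).fetchone()[0] > 0

# EXISTS only needs to know whether at least one matching row is there.
has_toy_by_exists = conn.execute(
    "SELECT EXISTS (SELECT 1 FROM products WHERE category = ?)", ("toy",)
).fetchone()[0] == 1

has_game_by_exists = conn.execute(
    "SELECT EXISTS (SELECT 1 FROM products WHERE category = ?)", ("game",)
).fetchone()[0] == 1

print(prices, has_toy_by_count, has_toy_by_exists, has_game_by_exists)
```

The two existence checks always agree on the answer; the point of EXISTS is that the engine is free to stop at the first matching row.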
Summary
Having an index is not automatically good, and having no index is not automatically bad. Whether to create an index depends on the amount of data, and whether a particular column should be indexed depends on how often that column appears as a query condition. Good SQL statements not only lean on the indexes but also avoid fetching unnecessary data and running unnecessary executions.