SQL Server DBA Tuning Diary (1) -- counting rows in very large tables: optimization and principles


Problem description


In our production database, one table holds data on the order of 1 billion rows, another on the order of 10 billion, and several other tables are also very large. I did not know the tables were this big when I first took over, so I did what I had always done and ran COUNT(*) to get the row counts. The result was terrible: after more than 30 minutes nothing had come back, and in the end I could only cancel the query. I then adopted a different way of getting the row count. The solution first; it uses the following SQL:

SELECT OBJECT_NAME(id) AS TableName, indid, rows, rowcnt
FROM sys.sysindexes
WHERE id = OBJECT_ID('TableName')
  AND indid IN (0, 1);
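
As a side note (not part of the original solution): on SQL Server 2005 and later, the same lazily-maintained row count can also be read from the newer catalog views. A minimal sketch, assuming the table of interest is count_Test:

-- Metadata row count per partition; index_id 0 = heap, 1 = clustered index
SELECT SUM(p.rows) AS approx_rows
FROM sys.partitions AS p
WHERE p.object_id = OBJECT_ID('count_Test')
  AND p.index_id IN (0, 1);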

Problem simulation
 

I then set up a simulation and tried to analyze, from first principles, why there is such a large gap between COUNT(*) and querying the sysindexes view.

 

Before running the simulation we need test data, so I created a test table and inserted 100 million rows of test data.

The statement to create a test table is as follows:

DROP TABLE count_Test;   -- assumes the table already exists; skip this line on a first run
CREATE TABLE count_Test
(
    id      BIGINT,
    name    VARCHAR(20),
    phoneNo VARCHAR(11)
);
      

Because of the volume involved we obviously cannot insert the rows by hand, so I wrote a stored procedure to insert the 100 million rows. To simulate somewhat realistic data, the values are random strings. The stored procedure for inserting the test data is as follows:

CREATE PROCEDURE pro_Count_Test
AS
BEGIN
    SET STATISTICS IO ON;
    SET STATISTICS TIME ON;
    SET NOCOUNT ON;

    WITH Seq (id, name, phoneNo) AS
    (
        -- anchor member: the first row
        SELECT CAST(1 AS BIGINT),
               CAST('name_' + RIGHT('000000000' + CAST(CAST(RAND(CHECKSUM(NEWID())) * 100000000 AS INT) AS VARCHAR), 9) AS VARCHAR(20)),
               CAST('13' + RIGHT('000000000' + CAST(CAST(RAND(CHECKSUM(NEWID())) * 100000000 AS INT) AS VARCHAR), 9) AS VARCHAR(11))
        UNION ALL
        -- recursive member: keep adding rows until 100 million
        SELECT id + 1,
               CAST('name_' + RIGHT('000000000' + CAST(CAST(RAND(CHECKSUM(NEWID())) * 100000000 AS INT) AS VARCHAR), 9) AS VARCHAR(20)),
               CAST('13' + RIGHT('000000000' + CAST(CAST(RAND(CHECKSUM(NEWID())) * 100000000 AS INT) AS VARCHAR), 9) AS VARCHAR(11))
        FROM Seq
        WHERE id < 100000000
    )
    INSERT INTO count_Test (id, name, phoneNo)
    SELECT id, name, phoneNo
    FROM Seq
    OPTION (MAXRECURSION 0);

    SET STATISTICS IO OFF;
    SET STATISTICS TIME OFF;
END

We then execute the stored procedure to insert the test data. SQL Server Management Studio shows the elapsed time in the lower-right corner of the query window, but to make it more explicit we wrap the call in a statement that records the time ourselves:

DECLARE @d DATETIME;
SET @d = GETDATE();
PRINT 'Start execution of stored procedure ...';
EXEC pro_Count_Test;

SELECT [stored procedure execution time (ms)] = DATEDIFF(ms, @d, GETDATE());

 

After a wait of 47 minutes and 29 seconds the insert completed. The statistics for the inserted rows are shown in Figure 1 and the space they occupy in Figure 2. We can now start testing the difference in efficiency between COUNT(*) and sysindexes.

 

Figure 1 Statistics for inserting 100 million rows of data





Figure 2 Space occupied by 100 million rows of data
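
If you want to reproduce the row count and space numbers in Figures 1 and 2 without relying on screenshots, sp_spaceused reports them directly (an addition for reference, not something used in the original tests):

-- Row count plus reserved / data / index / unused space for the test table
EXEC sp_spaceused 'count_Test';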

First, test COUNT(*) with no index at all; the statement is as follows:

DECLARE @d DATETIME;
SET @d = GETDATE();
SELECT COUNT(*) FROM count_Test;
SELECT [statement execution time (ms)] = DATEDIFF(ms, @d, GETDATE());
 

During the test, memory usage soared to 96%, which by itself suggests very poor efficiency. The query took 1 minute and 42 seconds (Figure 3). Looking at the execution plan (Figure 4), it is clear that a full table scan is performed and that most of the cost is spent there.

 

Figure 3 Execution time of COUNT(*) without an index





Figure 4 COUNT(*) execution plan without an index

Next, test sysindexes with no index at all; the statement is as follows:

DECLARE @d DATETIME;
SET @d = GETDATE();
SELECT OBJECT_NAME(id) AS TableName, indid, rows, rowcnt
FROM sys.sysindexes
WHERE id = OBJECT_ID('count_Test')
  AND indid IN (0, 1);
SELECT [statement execution time (ms)] = DATEDIFF(ms, @d, GETDATE());

The query took 450 milliseconds (Figure 5). Looking at the execution plan (Figure 6), we can see a clustered index scan (on the system metadata, since count_Test itself has no index yet), and essentially all of the cost goes there.

 

Figure 5 Execution time of the sysindexes query without an index





Figure 6 Execution plan of the sysindexes query without an index



With the no-index tests done, we move on to the indexed cases. First, create an ordinary (nonclustered) index on the id column; the statement is as follows:

CREATE INDEX idx_nor_count_test_id ON count_Test (id);

Memory and CPU usage were both quite high while the ordinary index was being built, at one point reading 97%. Creating the index took 34 minutes and 58 seconds, the data file grew to 6.71 GB (7,046,208 KB), and the log file did not change. The execution plan is shown in Figure 7:

 

Figure 7 Execution plan for creating the ordinary index



Now run the COUNT(*) test with the ordinary index in place; the statement is the same as the COUNT(*) test without an index. It took 1 minute and 9 seconds, faster than without the index. The execution plan (Figure 8) shows a nonclustered index scan, and that is where most of the cost goes.

 

Figure 8 COUNT(*) execution plan with the ordinary index



Run the sysindexes test with the ordinary index in place; the statement is the same as the sysindexes test without an index. It took 290 milliseconds, also less than without the index. The execution plan (Figure 9) has not changed.

 

Figure 9 sysindexes execution plan with the ordinary index



With the ordinary-index test complete, we now test a clustered index. Drop the ordinary index and create a clustered index on the id column; the statement is as follows:

DROP INDEX idx_nor_count_test_id ON count_Test;
CREATE CLUSTERED INDEX idx_clu_count_test_id ON count_Test (id);
 

Creating the clustered index took 25 minutes and 53 seconds, and the data file occupies 9.38 GB (9,839,680 KB).

 

With the clustered index in place, the COUNT(*) test (same statement as before) took 4 minutes and 8 seconds. The execution plan (Figure 10) shows the clustered index is used, and that is where most of the cost goes.

 

Figure 10 COUNT(*) test with the clustered index



With the clustered index in place, the sysindexes test (same statement as before) took 790 milliseconds. The execution plan (Figure 11) remains unchanged.

 

Figure 11 sysindexes test with the clustered index



With the clustered-index test complete, we now test a nonclustered index. Drop the clustered index and create a nonclustered index; the statement is as follows:

DROP INDEX idx_clu_count_test_id ON count_Test;
CREATE NONCLUSTERED INDEX idx_nonclu_count_test ON count_Test (id);
 

Dropping the clustered index took 16 minutes and 37 seconds, creating the nonclustered index took 40 minutes and 20 seconds, and the data file occupies 9.38 GB (9,839,680 KB).

 

With the nonclustered index in place, the COUNT(*) test (same statement as before) took 6 minutes and 59 seconds. The execution plan (Figure 12) shows the nonclustered index is used, and most of the cost goes there.

 

Figure 12 COUNT(*) test with the nonclustered index



With the nonclustered index in place, the sysindexes test (same statement as before) took 413 milliseconds. The execution plan (Figure 13) remains unchanged.

 

Figure 13 sysindexes test with the nonclustered index



Next we run some combined tests: ordinary index plus clustered index, ordinary index plus nonclustered index, and ordinary index plus clustered index plus nonclustered index. For the first combination, drop the nonclustered index and then create the ordinary index and the clustered index; the statements are as follows:

DROP INDEX idx_nonclu_count_test ON count_Test;
CREATE INDEX idx_nor_count_test_id ON count_Test (id);
CREATE CLUSTERED INDEX idx_clu_count_test_id ON count_Test (id);

The drop took 1 second and the space did not change. Creating the clustered index and the ordinary index took 1:57:27, and the data file occupies 12.9 GB (13,541,440 KB).

 

With the ordinary index and the clustered index in place, the COUNT(*) test (same statement as before) took 5 minutes and 27 seconds. The execution plan (Figure 14) shows the ordinary index is used, and most of the cost goes there.

 

Figure 14 COUNT(*) test with the clustered and ordinary indexes



With the ordinary index and the clustered index in place, the sysindexes test (same statement as before) took 200 milliseconds. The execution plan (Figure 15) remains unchanged.

 

Figure 15 sysindexes test with the clustered and ordinary indexes



Then test the combination of ordinary index and nonclustered index. Drop the clustered index and create a nonclustered index:

DROP INDEX idx_clu_count_test_id ON count_Test;
CREATE NONCLUSTERED INDEX idx_nonclu_count_test ON count_Test (id);

Dropping the clustered index took 1:23:10, creating the nonclustered index took 6 minutes and 50 seconds, and the data file occupies 12.9 GB.

 

With the ordinary index and the nonclustered index in place, the COUNT(*) test (same statement as before) took 52 seconds. The execution plan (Figure 16) shows the nonclustered index is used, and most of the cost goes there.

 

Figure 16 COUNT(*) test with the nonclustered and ordinary indexes



With the ordinary index and the nonclustered index in place, the sysindexes test (same statement as before) took 203 milliseconds. The execution plan (Figure 17) remains unchanged.

 

Figure 17 sysindexes test with the nonclustered and ordinary indexes



Finally, test the combination of ordinary index, clustered index and nonclustered index. The ordinary and nonclustered indexes already exist, so we create the clustered index; the statement is as follows:

CREATE CLUSTERED INDEX idx_clu_count_test_id ON count_Test (id);
 

Creating this index took 1:11:21, and the data file occupies 16.3 GB (17,116,224 KB).

 

With the ordinary, clustered and nonclustered indexes in place, the COUNT(*) test (same statement as before) took 2 minutes and 51 seconds. The execution plan (Figure 18) shows the nonclustered index is used, and most of the cost goes there.

 

Figure 18 COUNT(*) test with the ordinary, clustered and nonclustered indexes



With the ordinary, clustered and nonclustered indexes in place, the sysindexes test (same statement as before) took 203 milliseconds. The execution plan (Figure 19) remains unchanged.

 

Figure 19 sysindexes test with the ordinary, clustered and nonclustered indexes



As a final variation on the sysindexes query, adding the condition indid > 1 takes 86 milliseconds (Figure 20), and using indid = 1 takes 23 milliseconds (Figure 21). The space occupied by the data and log files after all the tests is shown in Figure 22.
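
For reference, the two variants mentioned above are simply the original sysindexes query with its indid filter changed; the article only describes them, so the statements below are a reconstruction:

-- indid > 1: rows for nonclustered indexes only
SELECT OBJECT_NAME(id) AS TableName, indid, rows, rowcnt
FROM sys.sysindexes
WHERE id = OBJECT_ID('count_Test') AND indid > 1;

-- indid = 1: the clustered index row only
SELECT OBJECT_NAME(id) AS TableName, indid, rows, rowcnt
FROM sys.sysindexes
WHERE id = OBJECT_ID('count_Test') AND indid = 1;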



Figure 20 Execution time with indid > 1



Figure 21 Execution time with indid = 1





Figure 22 Space occupied by the data and log files after all tests



As we can see, the two approaches are worlds apart in efficiency: COUNT(*) is slow no matter which indexes exist, while sysindexes is comparatively fast in every case.

 

Principle analysis
 

So let me try to analyze why the gap between COUNT(*) and sysindexes is so large. I started with the help documentation, which describes sys.sysindexes like this: "Contains one row for each index and table in the current database. XML indexes are not supported in this view. Partitioned tables and indexes are not fully supported in this view; use the sys.indexes catalog view instead." In SQL Server, every table has at least one row in the sys.sysindexes system view, and the rows and rowcnt fields of that row record the total number of records in the table. Note that these fields are updated periodically, not in real time, so the total obtained this way is not an exact value; in practice, though, the difference between this value and the exact count is usually small. If you just want a quick, rough estimate of a table's row count, this method is recommended. If you need the real-time count, you can first run DBCC UPDATEUSAGE (DatabaseName, 'TableName') WITH COUNT_ROWS to force these fields to be refreshed, and then run the query above; the value you then get is the current row count.
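
A minimal sketch of that refresh-then-query pattern, assuming the database is named db_test_wgb (the name used later in this article) and the table is count_Test:

-- Force rows/rowcnt in sysindexes to be recounted; this can itself be slow on a huge table
DBCC UPDATEUSAGE ('db_test_wgb', 'count_Test') WITH COUNT_ROWS;

-- The metadata row count is now accurate
SELECT OBJECT_NAME(id) AS TableName, indid, rows, rowcnt
FROM sys.sysindexes
WHERE id = OBJECT_ID('count_Test') AND indid IN (0, 1);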

 

Under a cost-based optimizer (CBO), COUNT(*) counts records roughly like this: as the data is read, each row found increments the count by one until the total is returned. Without an index, COUNT(*) performs a full table scan, and for a table with this much data a full table scan is bound to be slow. With an index, an index scan is used instead, which is comparatively fast. If you count rows with COUNT(*) and want the result back sooner, you can create a clustered index on the table. Ordinary indexes can be built on many columns, but a table can have only one clustered index, so obviously it should not be spent lightly. How should we picture a clustered index? Think of it as the pinyin index of a dictionary: looking up a word becomes much faster. The next question is: once a clustered index exists, when is it used and when is it not? For a bare SELECT COUNT(*) without a WHERE clause to use an index, two conditions must be met: the optimizer must be cost-based, and the column must have the NOT NULL attribute. If the WHERE clause references only indexed columns, without applying built-in functions to them or involving other join conditions, the index will generally be used. Another question: why is a clustered index fast? An index is organized as a balanced tree. The leaf nodes of a clustered index are the data themselves, whereas the leaf nodes of a nonclustered index are still index entries that carry a pointer to the data: when the table has a clustered index, a nonclustered leaf stores the clustered index key; when it does not, it stores a bookmark of the form File ID : Page ID : Row ID. That is why queries on a table with a clustered index can be very fast. In summary, without an index COUNT(*) must scan the whole table, and that is why it is slow.
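
A small sketch (not from the original tests) of how to see the cost difference for yourself: SET STATISTICS IO / TIME print the page reads and elapsed time to the Messages tab, so you can compare runs before and after creating an index:

SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- With no index this is a full table scan; with an index, the narrowest
-- index is scanned instead and the 'logical reads' figure drops sharply.
SELECT COUNT(*) FROM count_Test;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;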

 

The next question, then, is why sysindexes is faster. An index exists for retrieval; it is not something a table is required to have. A table's index consists of multiple pages, and together these pages form a tree structure, the B-tree (balanced tree) we usually speak of. Its components are: the root node (root), which points to lower-level pages and logically partitions the table's records; the non-leaf nodes (the branch level, Non-Leaf Level), which in turn point to smaller units; and the leaf nodes (Leaf Level). Root, non-leaf and leaf nodes all live on index pages, and the branch and leaf entries ultimately lead to the data pages. The root node corresponds to the root field of the sysindexes table, which records the physical location (that is, a pointer) of the non-leaf level; the non-leaf nodes sit between the root and the leaves and hold pointers to the leaf nodes; and the leaf nodes finally lead to the data pages, the bottom of the B-tree. In sysindexes the fields we need to pay attention to are root and indid. Figure 23 shows how the official documentation explains these two fields:
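
To look at those two fields on the test table directly, here is a simple illustrative query (not from the original article):

-- One row per index; root points at the root page, indid identifies the index type
SELECT name, indid, root, rows, rowcnt
FROM sys.sysindexes
WHERE id = OBJECT_ID('count_Test');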

 

Figure 23 The official documentation's explanation of the root and indid fields



From this we learn that an index ID of 0 means a heap, i.e. a table with no clustered index that is read by full table scan; 1 means the clustered index; and values greater than 1 mean nonclustered indexes. The root field is not used for a full table scan, only when an index exists. With a clustered index, the data pages sit at the leaf level and the index pages at the non-leaf levels; because the records are ordered by the clustered index key, the clustered key leads directly to the specific data page. Accessing a table with a clustered index proceeds roughly as follows: first, sysindexes is consulted, and an indid of 1 shows that the table has a clustered index; then, starting from root, the non-leaf (branch) nodes are followed to locate which leaf-level page holds the value being looked for; the matching entry on that leaf page is the data record itself; and the record is returned to the client. Likewise, when we ask how many records a table contains, the filter we use is indid IN (0, 1), which covers both ordinary tables (here meaning tables without a clustered index) and tables with a clustered index. Because sysindexes keeps a row count for every table, whether it is a heap or has a clustered index, the result comes back very quickly. As mentioned above, that value is not necessarily exact; whether you need the real-time count or the periodically refreshed one depends on what you need the count for.
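
If you want to see those B-tree levels on the test table, sys.dm_db_index_physical_stats (SQL Server 2005 and later) returns one row per level in DETAILED mode. A sketch for reference; note that DETAILED reads every page, so it is itself expensive on a 100-million-row table:

SELECT index_id, index_type_desc, index_level, page_count, record_count
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('count_Test'), NULL, NULL, 'DETAILED')
ORDER BY index_id, index_level DESC;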

 

In summary, COUNT(*) without an index is slow because it performs a full table scan, while sysindexes is fast because the row count is read directly from the view.

 

As an aside about loading the data: at first I used a WHILE loop to insert 1 billion rows. After more than two hours of waiting I had to stop and switched to a recursive CTE to insert the data. The CTE approach is very efficient -- the data file grew at close to 2 MB/s -- but 1 billion rows was still too much, so I cut the target to 10 million. Inserting 10 million rows took 4 minutes and 52 seconds, with the data file at 470 MB and the log file at 2.3 GB, but at that size the effect on the counting tests was not visible, so I settled on 100 million rows. Inserting 100 million rows took 47 minutes and 29 seconds, with the data file at 4.54 GB and the log file at 33.28 GB. From these data points it is clear that each extra order of magnitude multiplies the insert time, and by how much depends on many factors: how idle the system is, the CPU and I/O load on the machine, whether the inserted rows occupy consistent space, and so on.

There is one more thing to understand here: why is the CTE approach so fast? A common table expression (CTE), introduced in SQL Server 2005, can be viewed as a temporary result set that can be referenced multiple times by the following SELECT, INSERT, UPDATE, DELETE or MERGE statement, and it usually makes the statement clearer and more concise. The insert in this article uses a recursive CTE, whose principle is this: first, the CTE is split into an anchor member and a recursive member; second, the anchor member runs and produces the first result set R0; third, the recursive member runs with the previous result set Ri as input and Ri+1 as output; fourth, the third step repeats until an empty set is returned; fifth, the final result set is the UNION ALL of R0 through Rn. Readers familiar with programming know that recursion can be extremely efficient, and the same holds for inserting data this way: before switching to the CTE the data file grew at a few KB per second, and with the CTE it grew at nearly 2 MB per second.

Having seen why the CTE is so fast, here is one more tip about reclaiming space. After DROP TABLE count_Test, the space used by the data and log files is not actually released. If you then run DBCC SHRINKDATABASE (db_test_wgb) (note: db_test_wgb is the database name), you will see the data and log files shrink from tens of gigabytes to a few megabytes, similar in spirit to a table SHRINK in Oracle. It must be stressed that you should never run this statement casually against a production database, or you will regret it. Remember!
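
A minimal, self-contained illustration of the recursive pattern described above, generating just 10 rows so it can be run anywhere; OPTION (MAXRECURSION 0) lifts the default limit of 100 recursion levels, which the 100-million-row insert depends on:

WITH Seq (id) AS
(
    SELECT 1              -- anchor member: produces R0
    UNION ALL
    SELECT id + 1         -- recursive member: takes Ri, produces Ri+1
    FROM Seq
    WHERE id < 10
)
SELECT id
FROM Seq
OPTION (MAXRECURSION 0);  -- 0 = no recursion limit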

 

As a final note, this article draws on two articles by Jiang Min: "Do software developers really understand SQL indexes (clustered indexes)" and "Do software developers really understand SQL indexes (index principles)". It also refers to Song Yunjian's article "Advanced T-SQL Query: Common Table Expressions (CTE) Explained". If you want to understand how indexes work, I strongly recommend Jiang Min's article on index principles. For what an IAM is, readers can consult Microsoft's official documentation on managing the space used by objects.



  My email: wgbno27@gmail.com
  Sina Weibo: @jutdb
  Public platform: JustOracle (ID: justoracle)
  Database technical exchange group: 336882565 (when joining, verify with "From CSDN XXX", where XXX is your CSDN user name)
  All is well
  April 6, 2014
  By Robin Wen
