Memory cost for SQL Server to execute some statements

Source: Internet
Author: User
Tags sql server query
Document directory
  • 1.1 perform full table selection or low-selectivity Selection
  • 1.2 perform highly selective Selection
  • 2.1 perform full table selection or low-selectivity Selection
  • 2.2 perform highly selective Selection
  • 3.1 perform full table selection or low-selectivity Selection
  • 3.2 perform highly selective Selection
  • 4.1 perform full table selection or low-selectivity Selection
  • 4.2 perform highly selective Selection
  • 5.1 perform full table selection or low-selectivity Selection
  • 5.2 perform highly selective Selection
  • 6.1 perform full table selection or low-selectivity Selection
  • 6.2 perform highly selective Selection
As we all know, the performance criterion for SQL Server to execute SQL statements is the size of IO reads. Without violating this principle, this article also analyzes the memory changes of SQL Server during execution of some SQL statements.

First, briefly describe the memory usage characteristics of SQL Server. In addition to programs (SQL Server engine), the memory occupied by SQL Server mainly includes cached data (Buffer) and Execution Plan (Cache ). SQL Server stores data on 8 KB pages. This is the same size as the storage page of SQL Server data on the disk. When SQL Server executes an SQL statement, if the required data is already in its memory, it directly reads the data from the memory buffer and performs necessary operations and then outputs the execution result. If the data is not in the memory, first read the data from the disk into the memory Buffer. In general, the I/O logic reads in SQL Performance Indicators correspond to the number of pages read from the memory buffer, while the number of I/O physical reads corresponds to the number of pages read from the disk.

Note: The following tests can also be performed on a development and testing server shared by multiple users, because the memory occupied by a table can be seen separately. However, for convenience, I conducted this experiment on a separate database that confirmed no other concurrent tasks, therefore, the memory changes are caused by every SQL statement executed.

Let's first look at a simple example. Create the following table:

Create Table P_User

(UserMobileStatus int not null,

Inclueno int not null,

LastOpTime DateTime Not NULL

)

Insert certain data to the table:

Declare @ I int

Set @ I = 28000

WHILE @ I <29000

BEGIN

Insert Into P_User

Select @ I % 2, @ I, GetUTCDate ()

Set @ I = @ I + 1

END

Then we first execute the following in the query Analyzer:

Set Statistics IO ON

Press Ctrl + M to display the actual execution plan.

Now, we can start our experiment. To accurately observe the changes in each SQL statement, we first clear the data memory occupied by SQL Server before executing the first SQL statement:

CHECKPOINT

GO

DBCC DROPCLEANBUFFERS

This will clear the data buffer occupied by SQL Server (this statement is used with caution on the production Server, because it will lead to slow execution of subsequent SQL statements within a period of time ).

Test 1: Execute SQL statement 1.1 on a table without indexes to select a full table or select a table with low selectivity.

Select * From P_User

The SQL Execution Plan shows that Table Scan is generated because the Table does not have any indexes. IO statistics are as follows:

(1000 row (s) affected)

Table 'P _ user '. 1 scan count, 4 logical reads, 4 physical reads, 0 pre-reads, 0 lob logical reads, 0 lob physical reads, and 0 lob pre-reads.

Let's take a look at the situation in the database memory.

First, query the database_id of the database we operate on:

Select database_id From sys. databases Where name = 'testgdb'

Then use the database_id to view the memory in the table:

SELECT * FROM sys. dm_ OS _buffer_descriptors bd

WHERE database_id = 5

Order by allocation_unit_id, page_id

The result is as follows:

The result shows that four Data_Page pages appear in memory, except for the required management pages (one PFS_Page and one IAM_Page. This is the same as the result in I/O statistics: logical reading is 4, physical reading is 4. Because the entire table is read, it indicates that the number of data pages occupied by all data in the P_User table is exactly 4. The total number of row_count data rows on these four data pages can also be verified as 1000.

In the above example, if you do not clear the data buffer and execute the SQL statement again, you can see that the memory has not changed, but the logic read remains unchanged, but the physical read is changed to 0, because you no longer need to read data from the disk.

1.2 perform highly selective Selection

In addition, if no index is available, modify the preceding example:

Select Top 1 * From P_Order or Select * From P_Order Where limit Eno = 28502

As you can see, the system also needs to read all data pages to the memory.

If you Select Top 1 * From P_Order Where limit Eno = 28502, you may read only part of the data pages to the memory. However, because no index is available, the data is stored in the stack unordered, and the result is unstable. It is also possible to read all data pages to the memory.

Test 2: When a clustered index is created, run SQL statement 2.1 to select a full table or select a table with low selectivity.

Modify the table structure and create a clustered index on the MobileNo field. Then execute the SQL statement again. The execution plan is changed to clustered index scan. IO statistics message:

(1000 row (s) affected)

Table 'P _ user '. 1 scan count, 6 logical reads, 1 physical read, 4 preread, 0 lob logical reads, 0 lob physical reads, and 0 lob preread.

Here, logical reads are changed to 6 times.

The memory is as follows:

In the memory, a non-leaf clustered index page is added, and the leaf-level clustered index is put together with the data.

In addition, you can view the index level of the table:

SELECT database_id, object_id, index_id, index_level, page_count, record_count

FROM sys. dm_db_index_physical_stats

(DB_ID (N 'testgdb '), OBJECT_ID (N 'dbo. p_user'), NULL, NULL, 'detail ');

The result shows that the clustered index of the table is divided into two levels.

Therefore, logical read is increased by 2 -- (because Clustered Index Scan occurs, except that the Clustered Index page at the root level occupies 1 time, from the root-level clustered index to the leaf-level clustered index, it also occupies an additional 1 logical read ).

Another change is that only one physical read occurs, that is, reading the clustered index page at the root level, and loading the other four data pages into the memory Buffer through pre-read instead of physical read from the disk. This makes SQL Execution less costly when clustered indexes exist.

2.2 perform highly selective Selection

When a clustered index is created, the following changes are beneficial to the performance:

For statements such as Select Top 1 * From P_Order or Select * From P_Order Where distinct Eno = 28702, only the page of the final record is read into the memory when clustered indexes exist.

Test 3: When a non-clustered index is created, run SQL statement 3.1 to select the entire table or select the table with low selectivity.

If you replace the clustered index of the same field in the table with a non-clustered index, you can see the following features:

Executing a full table scan is similar to not having any indexes. All data pages are read to the memory. In this case, the SQL Server query engine cannot use non-clustered indexes.

3.2 perform highly selective Selection

The page where only the final data is located will be read to the memory. The query plan shows that SQL Server uses INDEX SEEK on non-clustered indexes, and then obtains the row where the data actually resides through lookup (except when the INDEX overwrites, because you do not need to locate the actual data rows ).

Test 4: Execute Nested Loop Join

Before testing, prepare another table and data.

Create Table P_Order

(UserStatus int not null,

Inclueno int not null,

Sid int Not NULL,

LastSubTime DateTime

)

Insert data:

Declare @ I int

Set @ I = 20000

WHILE @ I <30000

BEGIN

Insert Into P_Order

Select @ I % 2, @ I, @ i-19999, GetUTCDate ()

Set @ I = @ I + 1

END

As you can see, when a full table scan is performed, 10000 data entries in the table occupy a total of 38 memory data pages.

4.1 perform full table selection or low-selectivity Selection

Select * From P_Order

Inner Loop JOIN P_User B on a. Worker Eno = B. Worker Eno

For this highly selective choice, SQL Server does not execute Loop Join by default. Therefore, a mandatory join prompt is used.

If neither table has any index, we can see that:

All data pages of the two tables are loaded into the memory. The logical read cost is as high as 60 thousand times-every record in the P_Order table will be traversed in the P_User table.

When one of the tables has clustered indexes, although logical reads have been greatly reduced compared to the previous 60 thousand times, it still reaches 20 thousand times. In addition, the order of Join Operations has a great impact on query performance. The preceding table in the SQL statement is used as the external input of the join, and the following table is used as the internal input of the join.

When both tables have clustered indexes, logical reads still reach thousands of times (depending on the size of the final output data), but the comparison has been greatly improved. In addition, only the data in the table that needs to be output is read into the memory Buffer.

4.2 perform highly selective Selection

Run the following SQL statement:

Select * From P_Order

Inner merge JOIN P_User B on a. Condition Eno = B. Condition Eno

Where A. Required Eno = 28913

If neither table has any index, both tables perform a full table scan. Read all data pages to the memory. The overall logical read depends on the number of data pages in the two tables.

If a table has clustered or non-clustered indexes, Index Seek is executed for the table, and full table scan is performed for the other table. In the memory data buffer, a table is read-only into the data page of the final data, and a table is read into all data pages. Logical reading depends on the order of the table in the join and the number of data pages in the non-index table.

When both tables have clustered indexes, the logical read is the smallest, and each table has only two to three times. In addition, only the data to be output is read into the memory page. When both tables have non-clustered indexes, the logical read and memory resources consumed are similar.

Test 5: Execute Merge Join5.1 to select the entire table or select the table with low selectivity.

Execute SQL:

Select * From P_Order

Inner merge JOIN P_User B on a. Condition Eno = B. Condition Eno

If neither table has any index, both tables must be scanned. All data must be read into the memory page.

The logical reading is approximately equal to the total number of data pages in two tables. Temporary tables will be used during SQL Server processing.

Similar to the case where only one table has clustered indexes, SQL Server uses temporary tables during processing. And read all data pages to the memory.

If both tables have clustered indexes, although the data in both tables will be read into the memory page, the logical reading has been greatly reduced, it is equal to the total number of data pages in one table plus the number of data pages finally output. SQL Server does not need to use temporary tables during processing.

5.2 perform highly selective Selection

For such highly selective SQL statements, SQL Server will prompt that an execution plan cannot be generated.

Select * From P_Order

Inner merge JOIN P_User B on a. Condition Eno = B. Condition Eno

Where A. Required Eno = 28913

But you can execute:

Select * From P_Order

Inner merge JOIN P_User B on a. Condition Eno = B. Condition Eno

Where A. inclueno <= 28001 (Note: there are only two final results)

This is a low-selective statement with few final results. As described above, using netsted loop connection may be more efficient in this case.

Test 6: Run Hash Join6.1 to select a full table or select a table with low selectivity.

If two tables have no indexes and the join prompt is not specified, SQL Server uses hash join by default. For Join operations between two tables, if both tables have clustered indexes, SQL Server uses Merge Join by default.

Execute SQL:

Select * From P_Order

Inner hash JOIN P_User B on a. Worker Eno = B. Worker Eno

When hash join is used, all data pages are read to the memory regardless of whether the two tables have indexes. SQL Server uses a temporary table for processing. The logical reading is approximately equal to the total number of data pages in two tables.

6.2 perform highly selective Selection

Similar to the highly selective selection of merge join execution, it cannot be executed directly:

Select * From P_Order

Inner merge JOIN P_User B on a. Condition Eno = B. Condition Eno

Where A. Required Eno = 28913

However, you can execute a script with very few results as follows:

Select * From P_Order

Inner merge JOIN P_User B on a. Condition Eno = B. Condition Eno

Where A. inclueno <= 28001 (Note: there are only two final results)

However, the netsted loop connection is more efficient.

Test Summary

The main significance of this test is that by analyzing the specific memory changes combined with the execution plan, IO reading and other information, you can better understand the SQL Server SQL statement execution process.

In addition, it also verifies some experience gained by analyzing the IO reading and execution plans of SQL statements:

(1) When performing a single table query, if it is a highly selective query, a non-clustered index or clustered index should be created (non-clustered index is recommended, which is independent of data storage ). For low-selectivity queries, a clustered index is required.

(2) When executing a join query, if the final output results are few, nested loop join is suitable. If there are many output results, a clustered index is created, however, the query in merge join mode can achieve good performance. For hash join with lower performance, it is recommended to improve query performance by converting to merge join or nested loop join.

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.